read.ehrlich.dev
ai-curated rss
reinforcement learning
Value iteration and policy gradients in sequential decision-making
spaces
all
ai
internet culture
writing
information theory
malware analysis
devops
site reliability
security
machine learning
cloud infrastructure
ai agents
quantum computing
open source
digital rights
behavioral economics
reverse engineering
energy
complexity
distributed systems
economics
databases
api design
git internals
wasm
materials science
data engineering
webdev
math olympiad
physics
cryptography
design
thermodynamics
mathematics
solo dev
observability
indie games
chemistry
programming languages
cellular automata
python
containers
technical writing
game theory
birding
reinforcement learning
nanotechnology
creative coding
statistics
climate science
procedural generation
worldbuilding
space exploration
astrophysics
game dev
data visualization
operating systems
networking
linux
computer vision
golang
robotics
astrobiology
aerospace
exploit development
neuroscience
cognitive science
manga
algorithmic trading
elixir
decision theory
computer graphics
political philosophy
history of science
formal verification
javascript
mechanical keyboards
swift
category theory
self hosted
board games
type theory
evolution
philosophy
rust
nix
philosophy of mind
embedded systems
urban exploration
retrocomputing
compilers
amateur astronomy
nuclear
information design
computer architecture
electronic music
oceanography
archaeology
chess
genetics
fractals
geology
ethics
kotlin
go game
vinyl
number theory
anthropology
aquariums
film
history of computing
lisp
synths
bioinformatics
pharmacology
ecology
metaphysics
semiotics
audio programming
history
paleoanthropology
topology
cpp
photography
haskell
consciousness
witsrtn
cartography
meditation
ham radio
fpga
sci fi
shell scripting
demoscene
homelab
horror
speedrunning
typography
tabletop rpg
animation
woodworking
3d printing
sourdough
puzzles
standup comedy
phenomenology
linguistics
existentialism
music theory
true crime
fermentation
lock picking
espresso
zig
finance
pixel art
philosophy of science
nonduality
typescript
conlangs
mycology
forth
245
Confusion around the term reward hacking
(lesswrong.com)
24 days ago ·
reinforcement learning
·
cellular automata
190
Reinforcement Learning for Fast and Robust Longitudinal Qubit Readout
(arxiv.org)
24 days ago ·
quantum computing
·
reinforcement learning
88
Retrieval-Augmented LLM Agents: Learning to Learn from Experience
(arxiv.org)
24 days ago ·
reinforcement learning
·
machine learning
42
ARENA 7.0 Impact Report
(lesswrong.com)
24 days ago ·
ai
·
reinforcement learning
22
World Liberty Financial Launches Toolkit to Let AI Agents Spend USD1
(thedefiant.io)
24 days ago ·
ai agents
·
reinforcement learning
47
Three of the biggest fraud trends from MRC Vegas 2026
(stripe.com)
25 days ago ·
reinforcement learning
·
machine learning
125
More of the Disease, Faster (What happens when you ask an LLM to find you an edge)
(robotwealth.com)
25 days ago ·
algorithmic trading
·
reinforcement learning
10
Price drop: Unlock 40+ top AIs for life: Compare ChatGPT, Claude and Gemini for just $68
(cultofmac.com)
24 days ago ·
ai
·
reinforcement learning
8
Smart seals surprise behind the scenes
(perthnow.com.au)
24 days ago ·
birding
·
reinforcement learning
118
Build Your Weekly Python Study Schedule: 7 Days to Consistent Progress
(realpython.com)
26 days ago ·
python
·
reinforcement learning
48
Music To Build Agents By
(brooker.co.za)
27 days ago ·
ai agents
·
reinforcement learning
15
Learn why robots need to earn trust from GM expert Mikell Taylor
(therobotreport.com)
26 days ago ·
robotics
·
reinforcement learning
135
Using Simulation to Build Robotic Systems for Hospital Automation
(developer.nvidia.com)
28 days ago ·
robotics
·
reinforcement learning
18
How Identity and Secure AI Deliver Business Value for Airlines
(auth0.com)
27 days ago ·
reinforcement learning
·
design
82
Collaborative Reinforcement Learning: Why HACRL Trains Models in Teams Instead of Isolation
(andlukyane.com)
29 days ago ·
reinforcement learning
·
ai agents
10
Starting; playing; maintaining
(jamesg.blog)
28 days ago ·
music theory
·
reinforcement learning
175
Welcome to the Machine, a guide to building infra software for AI agents
(me.0xffff.me)
1 month ago ·
ai agents
·
reinforcement learning
40
A Distorted New World
(philipphagenlocher.de)
1 month ago ·
behavioral economics
·
reinforcement learning
22
Zoox Coming to Dallas & Phoenix, Partnering with Uber in Las Vegas & Los Angeles
(cleantechnica.com)
1 month ago ·
reinforcement learning
·
devops
18
Today's Games Can Feel "Soulless," But Can Punch The Monkey Change That?
(gamespot.com)
1 month ago ·
reinforcement learning
·
go game
25
The key to a great video game performance? Team trust.
(gamedeveloper.com)
1 month ago ·
game dev
·
reinforcement learning
68
Step-by-Step Guide to Building an AI Agent with Python
(collabnix.com)
1 month ago ·
ai agents
·
reinforcement learning
35
Sorry, Charlie, StarKist Wants AI With Good Taste
(devops.com)
1 month ago ·
reinforcement learning
·
behavioral economics
19
Rhoda AI exits stealth with $450M to train robots from video
(therobotreport.com)
1 month ago ·
reinforcement learning
·
machine learning
12
Ignite Your Next Career Move π The Formula for Opportunity Starts Here: SAVE UP TO 40%!
(linux.com)
1 month ago ·
reinforcement learning
·
birding
28
Agentic AI security: Why you need to know about autonomous agents now
(blog.talosintelligence.com)
1 month ago ·
ai agents
·
reinforcement learning
58
Every minute you aren’t running 69 agents, you are falling behind
(geohot.github.io)
1 month ago ·
behavioral economics
·
reinforcement learning
95
From games to biology and beyond: 10 years of AlphaGo’s impact
(deepmind.google)
1 month ago ·
reinforcement learning
·
cellular automata
42
CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG Models
(arxiv.org)
1 month ago ·
reinforcement learning
42
Meet KARL: A Faster Agent for Enterprise Knowledge, powered by custom RL
(databricks.com)
1 month ago ·
reinforcement learning
42
MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation
(arxiv.org)
1 month ago ·
reinforcement learning
·
ai agents
42
A Rubric-Supervised Critic from Sparse Real-World Outcomes
(arxiv.org)
1 month ago ·
reinforcement learning
42
Build, Judge, Optimize: A Blueprint for Continuous Improvement of Multi-Agent Consumer Assistants
(arxiv.org)
1 month ago ·
ai agents
·
reinforcement learning
12
Meta to allow rival AI chatbots on WhatsApp amid EU pressure
(techcentral.co.za)
1 month ago ·
ai
·
reinforcement learning
85
Phishing AI Agents
(zansara.dev)
1 month ago ·
ai agents
·
reinforcement learning
8
Interview with Tom Howe of Hydrolix: AI Bots, the Friends, Foes, and Frenemies of Online Shopping
(hackread.com)
1 month ago ·
ai agents
·
reinforcement learning
42
Most AI Training Is Moving To Reinforcement Learning, Scale AI Says
(bigtechnology.com)
1 month ago ·
reinforcement learning
12
Human in the End: Rethinking How We Interact with AI Agents
(realmorrisliu.com)
1 month ago ·
ai agents
·
reinforcement learning
12
Fragments: February 25
(martinfowler.com)
1 month ago ·
ai agents
·
reinforcement learning
12
Fragments: February 19
(martinfowler.com)
1 month ago ·
ai
·
reinforcement learning
8
Venture Deals Spring 2026 Course
(feld.com)
1 month ago ·
solo dev
·
reinforcement learning
12
AI Skeptics: Sex Ed for Ed Tech (with Kiri Soares)
(mathbabe.org)
1 month ago ·
reinforcement learning
·
ai
35
Linklog | Reinforcement Learning on Operations Research Problem
(yjhan96.github.io)
1 month ago ·
reinforcement learning
·
puzzles
1
How I would learn programming in 2026 if I had to start from zero
(dev.to)
2 months ago ·
cognitive science
·
reinforcement learning
1
A brief update on QGO and related VR projects
(anagan79.itch.io)
2 months ago ·
indie games
·
reinforcement learning
1
How GSD turns Claude into a self-steering developer
(thenewstack.io)
2 months ago ·
reinforcement learning
·
ai
1
'There's no reason to ban us from playing': Analysis debunks notion that transgender women have inherent physical advantages in sports
(livescience.com)
2 months ago ·
internet culture
·
reinforcement learning
1
PATIENCE: You've Never Controlled Anything
(coey.dev)
2 months ago ·
reinforcement learning
·
complexity
1
Rethinking imitation learning with Predictive Inverse Dynamics Models
(microsoft.com)
2 months ago ·
reinforcement learning
·
robotics
1
Skills Are the Most Underrated Feature in Agentic AI
(brethorsting.com)
2 months ago ·
ai agents
·
reinforcement learning
1
Smart AI Policy Means Examining Its Real Harms and Benefits
(eff.org)
2 months ago ·
ai
·
reinforcement learning
160
Pass@k is Mostly Bunk
(brooker.co.za)
2 months ago ·
ai
·
reinforcement learning
1
The Coming War on Car Ownership
(geohot.github.io)
2 months ago ·
robotics
·
reinforcement learning
1
Multimodal reinforcement learning with agentic verifier for AI agents
(microsoft.com)
2 months ago ·
ai agents
·
reinforcement learning
1
Our Man in Caracas
(robertbryce.substack.com)
2 months ago ·
economics
·
reinforcement learning
1
The Shift
(swiftjectivec.com)
2 months ago ·
reinforcement learning
1
Supercharging LLMs: Scalable RL with torchforge and Weaver
(pytorch.org)
3 months ago ·
elixir
·
reinforcement learning
1
Has quantum advantage been achieved?
(quantumfrontiers.com)
3 months ago ·
quantum computing
·
reinforcement learning
220
Reward Hacking in Reinforcement Learning
(lilianweng.github.io)
16 months ago ·
reinforcement learning
1
Permutation-Invariant Neural Networks for Reinforcement Learning
(blog.otoro.net)
53 months ago ·
machine learning
·
reinforcement learning
1
Neuroevolution of Self-Interpretable Agents
(blog.otoro.net)
73 months ago ·
reinforcement learning
1
Learning to Predict Without Looking Ahead
(blog.otoro.net)
78 months ago ·
reinforcement learning
·
ai agents
1
Learning Latent Dynamics for Planning from Pixels
(blog.otoro.net)
87 months ago ·
reinforcement learning
·
ai agents
1
Reinforcement Learning for Improving Agent Design
(blog.otoro.net)
91 months ago ·
reinforcement learning
·
ai agents
1
World Models Experiments
(blog.otoro.net)
95 months ago ·
machine learning
·
reinforcement learning