read.ehrlich.dev
ai-curated rss
reinforcement learning
Value iteration and policy gradients in sequential decision-making
spaces
all
ai
sci fi
3d printing
aerospace
ai agents
algorithmic trading
amateur astronomy
animation
anthropology
api design
aquariums
archaeology
astrobiology
astrophysics
audio programming
behavioral economics
bioinformatics
birding
board games
cartography
category theory
cellular automata
chemistry
chess
climate science
cloud infrastructure
cognitive science
compilers
complexity
computer architecture
computer graphics
computer vision
conlangs
consciousness
containers
cpp
creative coding
cryptography
data engineering
data visualization
databases
decision theory
demoscene
design
devops
digital rights
distributed systems
ecology
economics
electronic music
elixir
embedded systems
energy
espresso
ethics
evolution
existentialism
exploit development
fermentation
film
finance
formal verification
forth
fpga
fractals
game dev
game theory
genetics
geology
git internals
go game
golang
ham radio
haskell
history
history of computing
history of science
homelab
horror
indie games
information design
information theory
internet culture
javascript
kotlin
linguistics
linux
lisp
lock picking
machine learning
malware analysis
manga
materials science
math olympiad
mathematics
mechanical keyboards
meditation
metaphysics
music theory
mycology
nanotechnology
networking
neuroscience
nix
nonduality
nuclear
number theory
observability
oceanography
open source
operating systems
paleoanthropology
pharmacology
phenomenology
philosophy
philosophy of mind
philosophy of science
photography
physics
pixel art
political philosophy
procedural generation
programming languages
puzzles
python
quantum computing
reinforcement learning
retrocomputing
reverse engineering
robotics
rust
security
self hosted
semiotics
shell scripting
site reliability
solo dev
sourdough
space exploration
speedrunning
standup comedy
statistics
swift
synths
tabletop rpg
technical writing
thermodynamics
topology
true crime
type theory
typescript
typography
urban exploration
vinyl
wasm
webdev
witsrtn
woodworking
worldbuilding
writing
zig
245
Confusion around the term reward hacking
(lesswrong.com)
2 months ago ·
reinforcement learning
·
cellular automata
190
Reinforcement Learning for Fast and Robust Longitudinal Qubit Readout
(arxiv.org)
2 months ago ·
quantum computing
·
reinforcement learning
88
Retrieval-Augmented LLM Agents: Learning to Learn from Experience
(arxiv.org)
2 months ago ·
reinforcement learning
·
machine learning
42
ARENA 7.0 Impact Report
(lesswrong.com)
2 months ago ·
ai
·
reinforcement learning
22
World Liberty Financial Launches Toolkit to Let AI Agents Spend USD1
(thedefiant.io)
2 months ago ·
ai agents
·
reinforcement learning
47
Three of the biggest fraud trends from MRC Vegas 2026
(stripe.com)
2 months ago ·
reinforcement learning
·
machine learning
125
More of the Disease, Faster (What happens when you ask an LLM to find you an edge)
(robotwealth.com)
2 months ago ·
algorithmic trading
·
reinforcement learning
10
Price drop: Unlock 40+ top AIs for life: Compare ChatGPT, Claude and Gemini for just $68
(cultofmac.com)
2 months ago ·
ai
·
reinforcement learning
8
Smart seals surprise behind the scenes
(perthnow.com.au)
2 months ago ·
birding
·
reinforcement learning
118
Build Your Weekly Python Study Schedule: 7 Days to Consistent Progress
(realpython.com)
2 months ago ·
python
·
reinforcement learning
48
Music To Build Agents By
(brooker.co.za)
2 months ago ·
ai agents
·
reinforcement learning
15
Learn why robots need to earn trust from GM expert Mikell Taylor
(therobotreport.com)
2 months ago ·
robotics
·
reinforcement learning
135
Using Simulation to Build Robotic Systems for Hospital Automation
(developer.nvidia.com)
2 months ago ·
robotics
·
reinforcement learning
18
How Identity and Secure AI Deliver Business Value for Airlines
(auth0.com)
2 months ago ·
reinforcement learning
·
design
82
Collaborative Reinforcement Learning: Why HACRL Trains Models in Teams Instead of Isolation
(andlukyane.com)
2 months ago ·
reinforcement learning
·
ai agents
10
Starting; playing; maintaining
(jamesg.blog)
2 months ago ·
music theory
·
reinforcement learning
175
Welcome to the Machine, a guide to building infra software for AI agents
(me.0xffff.me)
2 months ago ·
ai agents
·
reinforcement learning
40
A Distorted New World
(philipphagenlocher.de)
2 months ago ·
behavioral economics
·
reinforcement learning
22
Zoox Coming to Dallas & Phoenix, Partnering with Uber in Las Vegas & Los Angeles
(cleantechnica.com)
2 months ago ·
reinforcement learning
·
devops
18
Today's Games Can Feel "Soulless," But Can Punch The Monkey Change That?
(gamespot.com)
2 months ago ·
reinforcement learning
·
go game
25
The key to a great video game performance? Team trust.
(gamedeveloper.com)
2 months ago ·
game dev
·
reinforcement learning
68
Step-by-Step Guide to Building an AI Agent with Python
(collabnix.com)
2 months ago ·
ai agents
·
reinforcement learning
35
Sorry, Charlie, StarKist Wants AI With Good Taste
(devops.com)
2 months ago ·
reinforcement learning
·
behavioral economics
19
Rhoda AI exits stealth with $450M to train robots from video
(therobotreport.com)
2 months ago ·
reinforcement learning
·
machine learning
12
Ignite Your Next Career Move π The Formula for Opportunity Starts Here: SAVE UP TO 40%!
(linux.com)
2 months ago ·
reinforcement learning
·
birding
28
Agentic AI security: Why you need to know about autonomous agents now
(blog.talosintelligence.com)
2 months ago ·
ai agents
·
reinforcement learning
58
Every minute you aren’t running 69 agents, you are falling behind
(geohot.github.io)
2 months ago ·
behavioral economics
·
reinforcement learning
95
From games to biology and beyond: 10 years of AlphaGo’s impact
(deepmind.google)
2 months ago ·
reinforcement learning
·
cellular automata
42
CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG Models
(arxiv.org)
2 months ago ·
reinforcement learning
42
Meet KARL: A Faster Agent for Enterprise Knowledge, powered by custom RL
(databricks.com)
2 months ago ·
reinforcement learning
42
MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation
(arxiv.org)
2 months ago ·
reinforcement learning
·
ai agents
42
A Rubric-Supervised Critic from Sparse Real-World Outcomes
(arxiv.org)
2 months ago ·
reinforcement learning
42
Build, Judge, Optimize: A Blueprint for Continuous Improvement of Multi-Agent Consumer Assistants
(arxiv.org)
2 months ago ·
ai agents
·
reinforcement learning
12
Meta to allow rival AI chatbots on WhatsApp amid EU pressure
(techcentral.co.za)
2 months ago ·
ai
·
reinforcement learning
85
Phishing AI Agents
(zansara.dev)
2 months ago ·
ai agents
·
reinforcement learning
8
Interview with Tom Howe of Hydrolix: AI Bots, the Friends, Foes, and Frenemies of Online Shopping
(hackread.com)
2 months ago ·
ai agents
·
reinforcement learning
42
Most AI Training Is Moving To Reinforcement Learning, Scale AI Says
(bigtechnology.com)
3 months ago ·
reinforcement learning
12
Human in the End: Rethinking How We Interact with AI Agents
(realmorrisliu.com)
3 months ago ·
ai agents
·
reinforcement learning
12
Fragments: February 25
(martinfowler.com)
3 months ago ·
ai agents
·
reinforcement learning
12
Fragments: February 19
(martinfowler.com)
3 months ago ·
ai
·
reinforcement learning
8
Venture Deals Spring 2026 Course
(feld.com)
3 months ago ·
solo dev
·
reinforcement learning
12
AI Skeptics: Sex Ed for Ed Tech (with Kiri Soares)
(mathbabe.org)
3 months ago ·
reinforcement learning
·
ai
35
Linklog | Reinforcement Learning on Operations Research Problem
(yjhan96.github.io)
3 months ago ·
reinforcement learning
·
puzzles
1
How I would learn programming in 2026 if I had to start from zero
(dev.to)
3 months ago ·
cognitive science
·
reinforcement learning
1
A brief update on QGO and related VR projects
(anagan79.itch.io)
3 months ago ·
indie games
·
reinforcement learning
1
How GSD turns Claude into a self-steering developer
(thenewstack.io)
3 months ago ·
reinforcement learning
·
ai
1
'There's no reason to ban us from playing': Analysis debunks notion that transgender women have inherent physical advantages in sports
(livescience.com)
3 months ago ·
internet culture
·
reinforcement learning
1
PATIENCE: You've Never Controlled Anything
(coey.dev)
3 months ago ·
reinforcement learning
·
complexity
1
Rethinking imitation learning with Predictive Inverse Dynamics Models
(microsoft.com)
3 months ago ·
reinforcement learning
·
robotics
1
Skills Are the Most Underrated Feature in Agentic AI
(brethorsting.com)
3 months ago ·
ai agents
·
reinforcement learning
1
Smart AI Policy Means Examining Its Real Harms and Benefits
(eff.org)
3 months ago ·
ai
·
reinforcement learning
160
Pass@k is Mostly Bunk
(brooker.co.za)
4 months ago ·
ai
·
reinforcement learning
1
The Coming War on Car Ownership
(geohot.github.io)
4 months ago ·
robotics
·
reinforcement learning
1
Multimodal reinforcement learning with agentic verifier for AI agents
(microsoft.com)
4 months ago ·
ai agents
·
reinforcement learning
1
Our Man in Caracas
(robertbryce.substack.com)
4 months ago ·
economics
·
reinforcement learning
1
The Shift
(swiftjectivec.com)
4 months ago ·
reinforcement learning
1
Supercharging LLMs: Scalable RL with torchforge and Weaver
(pytorch.org)
4 months ago ·
elixir
·
reinforcement learning
1
Has quantum advantage been achieved?
(quantumfrontiers.com)
4 months ago ·
quantum computing
·
reinforcement learning
220
Reward Hacking in Reinforcement Learning
(lilianweng.github.io)
18 months ago ·
reinforcement learning
1
Permutation-Invariant Neural Networks for Reinforcement Learning
(blog.otoro.net)
55 months ago ·
machine learning
·
reinforcement learning
1
Neuroevolution of Self-Interpretable Agents
(blog.otoro.net)
75 months ago ·
reinforcement learning
1
Learning to Predict Without Looking Ahead
(blog.otoro.net)
80 months ago ·
reinforcement learning
·
ai agents
1
Learning Latent Dynamics for Planning from Pixels
(blog.otoro.net)
88 months ago ·
reinforcement learning
·
ai agents
1
Reinforcement Learning for Improving Agent Design
(blog.otoro.net)
92 months ago ·
reinforcement learning
·
ai agents
1
World Models Experiments
(blog.otoro.net)
97 months ago ·
machine learning
·
reinforcement learning