reinforcement learning

245

Confusion around the term reward hacking (lesswrong.com)

4 months ago · reinforcement learning · cellular automata
190

Reinforcement Learning for Fast and Robust Longitudinal Qubit Readout (arxiv.org)

4 months ago · quantum computing · reinforcement learning
88

Retrieval-Augmented LLM Agents: Learning to Learn from Experience (arxiv.org)

4 months ago · reinforcement learning · machine learning
42

ARENA 7.0 Impact Report (lesswrong.com)

4 months ago · ai · reinforcement learning
22

World Liberty Financial Launches Toolkit to Let AI Agents Spend USD1 (thedefiant.io)

4 months ago · ai agents · reinforcement learning
47

Three of the biggest fraud trends from MRC Vegas 2026 (stripe.com)

4 months ago · reinforcement learning · machine learning
125

More of the Disease, Faster (What happens when you ask an LLM to find you an edge) (robotwealth.com)

4 months ago · algorithmic trading · reinforcement learning
10

Price drop: Unlock 40+ top AIs for life: Compare ChatGPT, Claude and Gemini for just $68 (cultofmac.com)

4 months ago · ai · reinforcement learning
8

Smart seals surprise behind the scenes (perthnow.com.au)

4 months ago · birding · reinforcement learning
118

Build Your Weekly Python Study Schedule: 7 Days to Consistent Progress (realpython.com)

4 months ago · python · reinforcement learning
48

Music To Build Agents By (brooker.co.za)

4 months ago · ai agents · reinforcement learning
15

Learn why robots need to earn trust from GM expert Mikell Taylor (therobotreport.com)

4 months ago · robotics · reinforcement learning
135

Using Simulation to Build Robotic Systems for Hospital Automation (developer.nvidia.com)

4 months ago · robotics · reinforcement learning
18

How Identity and Secure AI Deliver Business Value for Airlines (auth0.com)

4 months ago · reinforcement learning · design
82

Collaborative Reinforcement Learning: Why HACRL Trains Models in Teams Instead of Isolation (andlukyane.com)

4 months ago · reinforcement learning · ai agents
10

Starting; playing; maintaining (jamesg.blog)

4 months ago · music theory · reinforcement learning
175

Welcome to the Machine, a guide to building infra software for AI agents (me.0xffff.me)

4 months ago · ai agents · reinforcement learning
40

A Distorted New World (philipphagenlocher.de)

4 months ago · behavioral economics · reinforcement learning
22

Zoox Coming to Dallas & Phoenix, Partnering with Uber in Las Vegas & Los Angeles (cleantechnica.com)

4 months ago · reinforcement learning · devops
18

Today's Games Can Feel "Soulless," But Can Punch The Monkey Change That? (gamespot.com)

4 months ago · reinforcement learning · go game
25

The key to a great video game performance? Team trust. (gamedeveloper.com)

4 months ago · game dev · reinforcement learning
68

Step-by-Step Guide to Building an AI Agent with Python (collabnix.com)

4 months ago · ai agents · reinforcement learning
35

Sorry, Charlie, StarKist Wants AI With Good Taste (devops.com)

4 months ago · reinforcement learning · behavioral economics
19

Rhoda AI exits stealth with $450M to train robots from video (therobotreport.com)

4 months ago · reinforcement learning · machine learning
12

Ignite Your Next Career Move π The Formula for Opportunity Starts Here - SAVE UP TO 40%! (linux.com)

4 months ago · reinforcement learning · birding
28

Agentic AI security: Why you need to know about autonomous agents now (blog.talosintelligence.com)

4 months ago · ai agents · reinforcement learning
58

Every minute you aren’t running 69 agents, you are falling behind (geohot.github.io)

4 months ago · behavioral economics · reinforcement learning
95

From games to biology and beyond: 10 years of AlphaGo’s impact (deepmind.google)

4 months ago · reinforcement learning · cellular automata
42

CTRL-RAG: Contrastive Likelihood Reward Based Reinforcement Learning for Context-Faithful RAG Models (arxiv.org)

4 months ago · reinforcement learning
42

Meet KARL: A Faster Agent for Enterprise Knowledge, powered by custom RL (databricks.com)

4 months ago · reinforcement learning
42

MAGE: Meta-Reinforcement Learning for Language Agents toward Strategic Exploration and Exploitation (arxiv.org)

4 months ago · reinforcement learning · ai agents
42

A Rubric-Supervised Critic from Sparse Real-World Outcomes (arxiv.org)

4 months ago · reinforcement learning
42

Build, Judge, Optimize: A Blueprint for Continuous Improvement of Multi-Agent Consumer Assistants (arxiv.org)

4 months ago · ai agents · reinforcement learning
12

Meta to allow rival AI chatbots on WhatsApp amid EU pressure (techcentral.co.za)

4 months ago · ai · reinforcement learning
85

Phishing AI Agents (zansara.dev)

4 months ago · ai agents · reinforcement learning
8

Interview with Tom Howe of Hydrolix: AI Bots, the Friends, Foes, and Frenemies of Online Shopping (hackread.com)

4 months ago · ai agents · reinforcement learning
42

Most AI Training Is Moving To Reinforcement Learning, Scale AI Says (bigtechnology.com)

4 months ago · reinforcement learning
12

Human in the End: Rethinking How We Interact with AI Agents (realmorrisliu.com)

4 months ago · ai agents · reinforcement learning
12

Fragments: February 25 (martinfowler.com)

5 months ago · ai agents · reinforcement learning
12

Fragments: February 19 (martinfowler.com)

5 months ago · ai · reinforcement learning
8

Venture Deals Spring 2026 Course (feld.com)

5 months ago · solo dev · reinforcement learning
12

AI Skeptics: Sex Ed for Ed Tech (with Kiri Soares) (mathbabe.org)

5 months ago · reinforcement learning · ai
35

Linklog | Reinforcement Learning on Operations Research Problem (yjhan96.github.io)

5 months ago · reinforcement learning · puzzles
1

How I would learn programming in 2026 if I had to start from zero (dev.to)

5 months ago · cognitive science · reinforcement learning
1

A brief update on QGO and related VR projects (anagan79.itch.io)

5 months ago · indie games · reinforcement learning
1

How GSD turns Claude into a self-steering developer (thenewstack.io)

5 months ago · reinforcement learning · ai
1

'There's no reason to ban us from playing': Analysis debunks notion that transgender women have inherent physical advantages in sports (livescience.com)

5 months ago · internet culture · reinforcement learning
1

PATIENCE: You've Never Controlled Anything (coey.dev)

5 months ago · reinforcement learning · complexity
1

Rethinking imitation learning with Predictive Inverse Dynamics Models (microsoft.com)

5 months ago · reinforcement learning · robotics
1

Skills Are the Most Underrated Feature in Agentic AI (brethorsting.com)

5 months ago · ai agents · reinforcement learning
1

Smart AI Policy Means Examining Its Real Harms and Benefits (eff.org)

5 months ago · ai · reinforcement learning
160

Pass@k is Mostly Bunk (brooker.co.za)

6 months ago · ai · reinforcement learning
1

The Coming War on Car Ownership (geohot.github.io)

6 months ago · robotics · reinforcement learning
1

Multimodal reinforcement learning with agentic verifier for AI agents (microsoft.com)

6 months ago · ai agents · reinforcement learning
1

Our Man in Caracas (robertbryce.substack.com)

6 months ago · economics · reinforcement learning
1

The Shift (swiftjectivec.com)

6 months ago · reinforcement learning
1

Supercharging LLMs: Scalable RL with torchforge and Weaver (pytorch.org)

6 months ago · elixir · reinforcement learning
1

Has quantum advantage been achieved? (quantumfrontiers.com)

6 months ago · quantum computing · reinforcement learning
220

Reward Hacking in Reinforcement Learning (lilianweng.github.io)

20 months ago · reinforcement learning
1

Permutation-Invariant Neural Networks for Reinforcement Learning (blog.otoro.net)

57 months ago · machine learning · reinforcement learning
1

Neuroevolution of Self-Interpretable Agents (blog.otoro.net)

77 months ago · reinforcement learning
1

Learning to Predict Without Looking Ahead (blog.otoro.net)

82 months ago · reinforcement learning · ai agents
1

Learning Latent Dynamics for Planning from Pixels (blog.otoro.net)

90 months ago · reinforcement learning · ai agents
1

Reinforcement Learning for Improving Agent Design (blog.otoro.net)

94 months ago · reinforcement learning · ai agents
1

World Models Experiments (blog.otoro.net)

98 months ago · machine learning · reinforcement learning
