AI Papers: A Deep Dive

paperdive.ai

Episodes 119
Avg. Duration 25m
Activity Highly Active
Since May 2026
Latest Episode Jun 2026

Publishing Details

Schedule Hourly
Format Episodic
Consistency 38%
Hosting d2jqfgn4f9ert4.cloudfront.net

About This Podcast

Long-form deep dives into new research on artificial intelligence, AI agents, and the engineering practice of building them, one paper per episode. We unpack the motivating problem, how the method actually works, the math that matters, what the experiments do and don't show, and the strongest critique against the result. The goal isn't a five-minute summary; it's the kind of conversation you'd have with a colleague who actually read the paper. Topics span large language models, autonomous agents, agentic coding, reinforcement learning for agent training, evaluation and benchmarks, alignment, and the practical engineering decisions that make agentic systems actually work in production. Most papers are pulled from arXiv, often within days of release. Hosted by AI voices generated with ElevenLabs. Episode scripts are produced by a multi-stage Claude pipeline working from a close reading of the source paper. New episodes daily.

Podcasting 2.0 Features

transcript

Recent Episodes

Why the Best-Aligned AI Models Are the Easiest to Trick Into Producing Harm

Jun 09, 2026 23m Transcript

Source: Safety Paradox: How Enhanced Safety Awareness Leaves LLMs Vulnerable to Posterior Attack. Paper was published on…

How an AI Agent Rewrites Its Own Tools, Without an Answer Key

Jun 09, 2026 30m Transcript

Source: Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts. Paper was published on June…

How an Open AI System Verified 672 Hard Math Proofs for Under $300

Jun 09, 2026 25m Transcript

Source: Goedel-Architect: Streamlining Formal Theorem Proving with Blueprint Generation and Refinement. Paper was published on June…

When the Agent Says It's Done But Nothing Happened: Debugging the Harness, Not the Model

Jun 09, 2026 26m Transcript

Source: From Failed Trajectories to Reliable LLM Agents: Diagnosing and Repairing Harness Flaws. Paper was…

Beating Reinforcement Learning Without Ever Touching the Model's Weights

Jun 09, 2026 22m Transcript

Source: Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents. Paper was published on June 03, 2026. This…

Why Streaming Half a Reasoning Chain Beats Sending the Whole Thing

Jun 05, 2026 25m Transcript

Source: Streaming Communication in Multi-Agent Reasoning. Paper was published on June 03, 2026. This episode was AI-generated on June…

Teaching a Phone Agent to Reason Silently, And Keeping It Honest

Jun 05, 2026 24m Transcript

Source: MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models. Paper was published on June 03, 2026. This episode…

Agents That Rewrite Their Own Weights Instead of Just Taking Notes

Jun 05, 2026 26m Transcript

Source: Scaling Self-Evolving Agents via Parametric Memory. Paper was published on June 03, 2026. This episode was AI-generated on…

What If a Prompt Injection Never Left? Attacks That Wait in Agent Memory

Jun 05, 2026 26m Transcript

Source: What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems. Paper was…

When an AI Agent Cheats Without Being Told: Inside the Meta-Agent Challenge

Jun 05, 2026 21m Transcript

Source: The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development? Paper was published on June…

How a 4B Web Agent Beat Models 60x Its Size on 500 Demonstrations

Jun 04, 2026 24m Transcript

Source: OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents. Paper was published on June 01,…

An AI Got Caught Reading the Answer Key, And Why That Catch Matters

Jun 04, 2026 27m Transcript

Source: EvoTrainer: Co-Evolving LLM Policies and Training Harnesses for Autonomous Agentic Reinforcement Learning. Paper was…

How an Agent Got 44 Points Better by Mining Its Own Scratch Paper

Jun 04, 2026 27m Transcript

Source: Inducing Reasoning Primitives from Agent Traces. Paper was published on June 02, 2026. This episode was AI-generated on June 3,…

How a Market of Crippled AI Agents Outscored One Unrestricted Model

Jun 04, 2026 25m Transcript

Source: Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions. Paper was published on June 01, 2026. This…

The Reasoning Cliff: Why Thinking Longer Makes Models Worse at Exact Step-by-Step Tasks

Jun 04, 2026 31m Transcript

Source: The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary. Paper…

Giving Agents a Notebook Instead of New Weights: How ExpGraph Lets Frozen Models Learn

Jun 02, 2026 26m Transcript

Source: ExpGraph: Model-Agnostic Experience Learning with Graph-Structured Memory for LLM Agents. Paper was…

The Trojan Is Your Agent's Memory: Why Single-Step Defenses Miss Persistent Attacks

Jun 02, 2026 26m Transcript

Source: From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors. Paper was…

How Making a Research Agent Smarter Quietly Makes It Leak Your Secrets

Jun 02, 2026 25m Transcript

Source: MosaicLeaks: Privacy Risks in Querying-in-the-Open for Deep Research Agents. Paper was published on May 29, 2026. This…

AI Agents Tried to Invent a Post-Human Language, And Reinvented Cherokee

Jun 02, 2026 25m Transcript

Source: Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion. Paper was…

How to Catch an AI Attack That No Single Conversation Reveals

Jun 02, 2026 23m Transcript

Source: Stateful Online Monitoring Catches Distributed Agent Attacks. Paper was published on May 29, 2026. This episode was AI-generated on…

Frequently Asked Questions

How many episodes does AI Papers: A Deep Dive have?

AI Papers: A Deep Dive has published 119 episodes since May 2026, covering topics in Technology.

Is AI Papers: A Deep Dive still active?

AI Papers: A Deep Dive is currently highly active with new episodes hourly. Average episode length is 25m.

How do I contact AI Papers: A Deep Dive for sponsorship or guest appearances?

Sign up on Grep.FM to access contact details for AI Papers: A Deep Dive, including email and social media links.
