Recent Episodes
Why the Best-Aligned AI Models Are the Easiest to Trick Into Producing Harm
Source: Safety Paradox: How Enhanced Safety Awareness Leaves LLMs Vulnerable to Posterior Attack. Paper was published on…
How an AI Agent Rewrites Its Own Tools, Without an Answer Key
Source: Retrospective Harness Optimization: Improving LLM Agents via Self-Preference over Trajectory Rollouts. Paper was published on June…
How an Open AI System Verified 672 Hard Math Proofs for Under $300
Source: Goedel-Architect: Streamlining Formal Theorem Proving with Blueprint Generation and Refinement. Paper was published on June…
When the Agent Says It's Done But Nothing Happened: Debugging the Harness, Not the Model
Source: From Failed Trajectories to Reliable LLM Agents: Diagnosing and Repairing Harness Flaws. Paper was…
Beating Reinforcement Learning Without Ever Touching the Model's Weights
Source: Agentic Monte Carlo: Simulating Reinforcement Learning for Black-Box Agents. Paper was published on June 03, 2026. This…
Why Streaming Half a Reasoning Chain Beats Sending the Whole Thing
Source: Streaming Communication in Multi-Agent Reasoning. Paper was published on June 03, 2026. This episode was AI-generated on June…
Teaching a Phone Agent to Reason Silently, And Keeping It Honest
Source: MIRAGE: Mobile Agents with Implicit Reasoning and Generative World Models. Paper was published on June 03, 2026. This episode…
Agents That Rewrite Their Own Weights Instead of Just Taking Notes
Source: Scaling Self-Evolving Agents via Parametric Memory. Paper was published on June 03, 2026. This episode was AI-generated on…
What If a Prompt Injection Never Left? Attacks That Wait in Agent Memory
Source: What If Prompt Injection Never Left? Exploring Cross-Session Stored Prompt Injection in Agentic Systems. Paper was…
When an AI Agent Cheats Without Being Told: Inside the Meta-Agent Challenge
Source: The Meta-Agent Challenge: Are Current Agents Capable of Autonomous Agent Development? Paper was published on June…
How a 4B Web Agent Beat Models 60x Its Size on 500 Demonstrations
Source: OpenWebRL: Demystifying Online Multi-turn Reinforcement Learning for Visual Web Agents. Paper was published on June 01,…
An AI Got Caught Reading the Answer Key, And Why That Catch Matters
Source: EvoTrainer: Co-Evolving LLM Policies and Training Harnesses for Autonomous Agentic Reinforcement Learning. Paper was…
How an Agent Got 44 Points Better by Mining Its Own Scratch Paper
Source: Inducing Reasoning Primitives from Agent Traces. Paper was published on June 02, 2026. This episode was AI-generated on June 3,…
How a Market of Crippled AI Agents Outscored One Unrestricted Model
Source: Economy of Minds: Emerging Multi-Agent Intelligence with Economic Interactions. Paper was published on June 01, 2026. This…
The Reasoning Cliff: Why Thinking Longer Makes Models Worse at Exact Step-by-Step Tasks
Source: The Deterministic Horizon: When Extended Reasoning Fails and Tool Delegation Becomes Necessary. Paper…
Giving Agents a Notebook Instead of New Weights: How ExpGraph Lets Frozen Models Learn
Source: ExpGraph: Model-Agnostic Experience Learning with Graph-Structured Memory for LLM Agents. Paper was…
The Trojan Is Your Agent's Memory: Why Single-Step Defenses Miss Persistent Attacks
Source: From Prompt Injection to Persistent Control: Defending Agentic Harness Against Trojan Backdoors. Paper was…
How Making a Research Agent Smarter Quietly Makes It Leak Your Secrets
Source: MosaicLeaks: Privacy Risks in Querying-in-the-Open for Deep Research Agents. Paper was published on May 29, 2026. This…
AI Agents Tried to Invent a Post-Human Language, And Reinvented Cherokee
Source: Emergent Languages in Populations of Language Model Agents: From Token Efficiency to Oversight Evasion. Paper was…
How to Catch an AI Attack That No Single Conversation Reveals
Source: Stateful Online Monitoring Catches Distributed Agent Attacks. Paper was published on May 29, 2026. This episode was AI-generated on…
Frequently Asked Questions
AI Papers: A Deep Dive has published 119 episodes since May 2026, covering topics in Technology.
AI Papers: A Deep Dive is currently highly active, with new episodes published hourly. The average episode length is 25 minutes.