Failure-First Embodied AI
Recent Episodes
[Blog] Robot Dogs Are a Security Nightmare — And We Can Prove It
Eight CVEs. A wormable Bluetooth exploit. An encrypted backdoor sending data to Chinese servers. And police departments buying them anyway. A deep dive into the Unitree vulnerability landscape and…
[Daily Paper] When World Models Dream Wrong: Physical-Conditioned Adversarial Attacks against World Models
The first white-box adversarial attack on generative world models targets physical-condition channels to corrupt autonomous planning while maintaining perceptual fidelity. World models have emerged…
[Daily Paper] A Comparative Evaluation of AI Agent Security Guardrails
A systematic benchmark of four commercial AI agent guardrail systems reveals critical gaps in detecting indirect prompt injection and tool abuse across major cloud providers. The deployment of AI…
[Daily Paper] Implicit Jailbreak Attacks via Cross-Modal Information Concealment on Vision-Language Models
A steganography-based attack that hides malicious instructions inside images using least significant bit encoding, achieving 90%+ jailbreak success rates on GPT-4o and Gemini in under three…
[Daily Paper] VeriGuard: Enhancing LLM Agent Safety via Verified Code Generation
A dual-stage framework that provides formal safety guarantees for LLM-based agents through offline policy verification and lightweight runtime monitoring. VeriGuard addresses a fundamental question…
[Daily Paper] Low-Resource Languages Jailbreak GPT-4
Translating harmful queries into low-resource languages bypasses GPT-4's safety filters at high rates, exposing a systematic cross-lingual gap in LLM safety training. Safety alignment research has…
[Daily Paper] RedAgent: Red Teaming Large Language Models with Context-aware Autonomous Language Agent
A multi-agent system that models jailbreak strategies as reusable abstractions, enabling context-aware attacks that break most black-box LLMs in under five queries and uncovered 60 real-world…
[Daily Paper] LlamaFirewall: An Open Source Guardrail System for Building Secure AI Agents
LlamaFirewall provides a three-layer open-source defense framework protecting agentic LLM systems from prompt injection, goal misalignment, and insecure code generation at runtime. Safety alignment…
[Daily Paper] Towards Physically Realizable Adversarial Attacks in Embodied Vision Navigation
Adversarial patches on physical objects reduce navigation success rates by over 22% in embodied agents, using multi-view optimization and two-stage opacity tuning to remain effective and…
[Daily Paper] ARMOR: Aligning Secure and Safe Large Language Models via Meticulous Reasoning
ARMOR defends LLMs against jailbreak attacks by using inference-time reasoning to detect attack strategies, extract true intent, and apply policy-grounded safety analysis. Most jailbreak defences…
[Daily Paper] Vision-Language-Action Safety: Threats, Challenges, Evaluations, and Mechanisms
A comprehensive survey unifying VLA safety research across adversarial attacks, defenses, benchmarks, and six deployment domains. If you want to understand where the embodied AI safety field stands…
[Daily Paper] Refusal Falls off a Cliff: How Safety Alignment Fails in Reasoning Models
Mechanistic analysis of reasoning models discovers the 'refusal cliff'—models correctly identify harmful prompts during thinking but systematically suppress their refusal at the final output…
[Daily Paper] Using Large Language Models for Embodied Planning Introduces Systematic Safety Risks
DESPITE benchmark reveals that across 23 models, near-perfect planning ability does not ensure safety—the best planner still generates dangerous plans 28.3% of the time. One of the persistent…
[Daily Paper] CART: Context-Aware Terrain Adaptation using Temporal Sequence Selection for Legged Robots
CART introduces a context-aware terrain adaptation controller that fuses proprioceptive and exteroceptive sensing to enable legged robots to robustly walk on complex off-road terrain, evaluated…
[Daily Paper] Safe Unlearning: A Surprisingly Effective and Generalizable Solution to Defend Against Jailbreak Attacks
Directly removing harmful knowledge from LLMs via machine unlearning—with just 20 training examples—cuts jailbreak success rates more effectively than safety fine-tuning on 100k samples. Safety…
[Daily Paper] An Anatomy of Vision-Language-Action Models: From Modules to Milestones and Challenges
A structured survey that treats Safety as one of five foundational VLA challenges alongside Representation, Execution, Generalization, and Evaluation. When VLA models began scaling beyond research…
[Daily Paper] FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models
FailSafe introduces a scalable failure generation and recovery system that automatically creates diverse failure cases with executable recovery actions, boosting VLA manipulation success by up to…
[Daily Paper] C-ΔΘ: Circuit-Restricted Weight Arithmetic for Selective Refusal
C-ΔΘ uses mechanistic circuit analysis to localize refusal-causal computation and distill it into a sparse offline weight update, eliminating per-request inference-time safety hooks. Safety…
[Daily Paper] Attention-Guided Patch-Wise Sparse Adversarial Attacks on Vision-Language-Action Models
ADVLA exploits attention maps and Top-K masking to craft sparse, stealthy adversarial patches in VLA models' textual feature space, achieving high attack success rates while remaining nearly…
[Daily Paper] Symbolic Guardrails for Domain-Specific Agents: Stronger Safety and Security Guarantees Without Sacrificing Utility
A systematic study of 80 agent safety benchmarks shows that 74% of specifiable policies can be enforced by symbolic guardrails, providing formal safety guarantees that training-based methods…
Frequently Asked Questions
Failure-First Embodied AI has published 434 episodes since September 2025, covering topics in Education and Science.
Failure-First Embodied AI releases new episodes daily; its trend is currently listed as declining.
Similar Podcasts
Something You Should Know
Mike Carruthers | OmniCast Media
1,280 episodes
EconTalk
Russ Roberts
1,053 episodes
Personality Hacker Podcast
Joel Mark Witt & Antonia Dodge
761 episodes
You Are Not So Smart
You Are Not So Smart
338 episodes
The Psychology of your 20s
iHeartPodcasts
426 episodes
Therapist Uncensored Podcast
Sue Marriott LCSW, CGP & Ann Kelley PhD
300 episodes