Recent Episodes
Voice Intelligence at Scale: From Call of Duty to Fraud Detection with Modulate AI
Every day, billions of voice conversations happen across games, customer service calls, and financial transactions. Almost none of them are understood by machines. In this episode of Inference Time…
From GPU Scarcity to GPU Waste: Solving the Utilization Crisis
In this episode of Inference Time Tactics, Cooper and Byron sit down with Charlie and Anil from Rapt AI to tackle one of the industry's most expensive problems: GPU underutilization. With half a…
Lessons from the Leading Edge: What 420 AI Deployments Reveal About Enterprise Success
In this episode of Inference Time Tactics, Rob, Cooper, and Byron sit down with Shawn Rogers, CEO of BARC US, to unpack fresh data from 421 organizations actively deploying AI in production. Shawn…
The Thinking Algorithm Leaderboard: Why No Single Model Wins
In this episode of Inference Time Tactics, Cooper and Byron break down Neurometric's Thinking Algorithm Leaderboard and what it reveals about building production-ready AI agents. They share why…
Benchmarking Generalization: How AI Learns Beyond Training Data
In this episode of Inference Time Tactics, Rob and Cooper from Neurometric sit down with Yash Sharma, an AI researcher whose work is reshaping how we understand model generalization. Yash recently…
Solving the Cold Start Problem in AI Inference
In this episode of Inference Time Tactics, Rob, Cooper, and Byron sit down with Prashanth Velidandi, co-founder of InferX, to explore how serverless inference is tackling the AI “cold start problem.”…
From MIT Decoding Research to Today’s Inference Tradeoffs
Check out the latest episode of Inference Time Tactics. Our guest is Pawan Deshpande, founder, product leader, and angel investor in companies like Anthropic and Toast, with roles at Google, Scale AI…
Drag, Drop, and Deploy: Rethinking How We Build AI Systems
In this episode of Inference Time Tactics, Rob, Cooper, Byron, and Dave share product updates for Neurometric’s Inference Time Compute Studio and what they reveal about the shift from single models…
Beyond Vibe Testing: Smarter Eval for Agentic AI
In this episode of Inference Time Tactics, Rob, Cooper, and Byron explore Salesforce’s CRMArena-Pro benchmark and what it reveals about the limits of enterprise AI agents. They share why benchmark…
GPT-5, The $100B Gap, and The Economics of Inference
In this episode of Inference Time Tactics, Rob and Cooper unpack the launch of GPT-5 and what OpenAI’s new routing layer signals about the shifting AI landscape. They explore the tradeoffs of cost,…
When AI Overthinks: Lessons from the Illusion of Thinking Paper
In this episode of Inference Time Tactics, Rob, Cooper, and CTO Byron unpack Apple’s “Illusion of Thinking” paper—why it split the AI community, what it reveals about reasoning model limits, and how…
The Strategic Trade Offs Behind Inference Time Compute Decisions
In this episode of Inference Time Tactics, Rob and Cooper dig into the strategic trade-offs driving a major shift in AI: why some enterprises start with closed models like OpenAI or Anthropic, then…
Why Inference Time Compute Is the Future of AI
Welcome to the very first episode of Inference Time Tactics — the podcast for builders, researchers, and engineers pushing the limits of AI performance. In this kickoff conversation, hosts Rob May…
Frequently Asked Questions
Inference Time Tactics has published 13 episodes since August 2025, covering topics in Technology.
Inference Time Tactics is currently active, releasing new episodes monthly. The average episode length is 29 minutes.