Humans of Reliability

Rootly

Episodes 26
Avg. Duration 26m
Activity Highly Active
Since Jan 2025
Latest Episode Mar 2026

Publishing Details

Schedule
Monthly
Format
Episodic
Consistency
52%
Hosting
feeds.buzzsprout.com

About This Podcast

Behind every reliable software system, there are people working hard to keep it online. 

Humans of Reliability is a series that spotlights the engineers, leaders, and innovators at the heart of incident management and system reliability. Through candid conversations, we explore the challenges, lessons, and personal journeys of those navigating complex technical landscapes to ensure the systems we rely on run smoothly. 

From unforgettable incident stories to favorite tools, workflows, and hobbies, Humans of Reliability uncovers the human side of technology—offering insights and inspiration for anyone passionate about building and maintaining resilient systems.

https://rootly.com/humans-of-reliability

Recent Episodes

Burnout Doesn't Ask Permission: Recognizing, Recovering, and Rebuilding w/ Stephen Townsend

Mar 04, 2026 31m

Burnout doesn't announce itself. For Stephen Townsend, SRE team lead and host of the Slight Reliability podcast, it crept in over months of mounting pressure on a massive transformation program, and…

S2026E2 Code Is Cheap, Reliability Isn’t: Owning Production in the AI era w/ Swizec Teller

Feb 16, 2026 29m

Code has never been easier to write. With AI copilots and agentic coding tools, spinning up features feels almost effortless. But production systems don’t run on vibes, they run on reliability. In…

S2026E1 Democratizing Reliability: Empowering Non-Devs with Dileshni Jayasinghe (commonsku)

Jan 14, 2026 22m

Many companies don’t invest in incident management until something goes wrong. commonsku took a different path. In this episode of Humans of Reliability, Sylvain sits down with Dileshni Jayasinghe, VP…

S1E23 99%+ Accuracy on a Moving Target: Model Deprecation and Reliability with Tomás Hernando Koffman (Not Diamond)

Dec 22, 2025 30m

Shipping systems powered by LLMs would be hard enough if the models stayed the same. But in reality, they don’t. Models get updated and deprecated at a pace traditional software wouldn’t. All while…

S1E22 The Reality of GenAI in Production with Eduardo Ordax (AWS)

Dec 12, 2025 27m

GenAI demos are easy. Production is where everything breaks. In this episode, Eduardo Ordax, Principal GTM GenAI at AWS, breaks down what actually stops companies from shipping reliable AI systems,…

S1E21 It’s Never Different This Time: LLM Reliability Without the Hype with Julien Simon

Nov 19, 2025 30m

In this episode, Julien Simon, longtime voice in the open-source ML world, reminds us that even in the era of GenAI, reliability fundamentals haven’t changed. Julien breaks down why calling “the same…

S1E20 You Can’t Fix What You Don’t Measure: Observability in the Age of AI with Conor Bronsdon

Nov 05, 2025 31m

Only 50% of companies monitor their ML systems. Building observability for AI is not simple: it goes beyond 200 OK pings. In this episode, Sylvain Kalache sits down with Conor Bronsdon (Galileo) to…

S1E19 The End of “Good Code”? AI, Throughput, and Reliability with CircleCI CTO Rob Zuber

Sep 10, 2025 37m

Is “good code” still the right measure of engineering success in an AI-driven world? In this episode of Humans of Reliability, Rob Zuber, CircleCI CTO, joins Sylvain to explore how coding assistants…

S1E18 Frontline Reliability: Protecting User Journeys with SLOs with Shery Brauner (Razor, ex-Zalando)

Aug 20, 2025 31m

What does it really take to move from firefighting incidents to building reliability at scale? In this episode of Humans of Reliability, Shery Brauner (Razor, ex-Zalando) shares her unique journey…

S1E17 Balancing Reliability at the Crypto-Finance Frontier with Brian Shaw (Uphold)

Jul 03, 2025 13m

Sylvain Kalache sits down with Brian Shaw, Senior Engineering Leader at Uphold, to explore the reliability challenges that arise when operating at the intersection of traditional finance and crypto…

S1E16 Command Under Pressure: David Owczarek on Incident Leadership and Human-Centered Reliability

Jun 17, 2025 23m

Incident response is as much about people as it is about systems. In this episode, David Owczarek, a veteran engineering leader and seasoned incident commander, joins Sylvain Kalache to unpack the human…

S1E15 AI at the Frontlines of Healthcare Reliability with Ryan Lockard (CVS Health)

May 30, 2025 24m

AI is transforming reliability work—from reactive firefighting to proactive engineering. In this episode, Ryan Lockard, VP of Platform Engineering and AI Enablement at CVS Health, joins Sylvain…

S1E14 Trust Is the Product: Building Reliable Billing in the AI Era with Cosmo Wolfe (Metronome)

May 26, 2025 20m

In this episode, we sit down with Cosmo Wolfe, Head of Technology at Metronome, to unpack how reliability, trust, and architecture intersect in one of the most critical and overlooked parts of the AI…

S1E13 The Golden Path to Nowhere: When Platforms Undermine Reliability with Chase Roberts (Northflank)

May 14, 2025 27m

Internal platforms promise speed, consistency, and scale — but what happens when they become a distraction? In this episode, Chase Roberts, COO at Northflank, joins Sylvain Kalache to examine the…

S1E12 AI can boost developer productivity, if used right, with Justin Reock, Deputy CTO at DX

Apr 30, 2025 37m

In this episode of Humans of Reliability, we sit down with Justin Reock, Deputy CTO at DX, to unpack the real impact of generative AI on developer productivity. Drawing from early data in DX’s GenAI…

S1E11 Why Reliability in the AI Era Starts with the Network with Marino Wijay

Apr 17, 2025 27m

In this episode, we explore how networking has shaped reliability as we know it. Marino Wijay, cloud networking expert and Staff Solutions Architect at Kong, shares how his journey began not as an SRE,…

S1E10 Metrics That Matter: Measuring Developer Productivity in the AI Era

Apr 09, 2025 39m

In this episode of Humans of Reliability, Ryan McDonald is joined by Mark Quigley, Head of Platform Engineering at 90, for a conversation that cuts through the noise around developer productivity…

Are AI and Platforms Making SRE Obsolete? With Kaspar von Grünberg, Humanitec’s CEO

Mar 24, 2025 25m

Last year, over 89% of companies claimed to have adopted platform engineering. And, in the past month, LLMs have been disrupting how we think about software development. In this context, Kaspar asks…

S1E7 Scientific Incident Management with Dan Slimmon

Mar 14, 2025 37m

Dan Slimmon is an incident management veteran who's worked at Etsy, HashiCorp, and now leads consulting and training on pragmatic, non-bureaucratic incident response. In this episode, Dan shares his…

How AI broke serverless and what to do about it with Vercel’s Mariano Fernández Cocirio

Mar 06, 2025 13m

Mariano, Staff Product Manager at Vercel, explains why serverless architectures are hitting unexpected limits—they’re too fast. The industry has spent millions optimizing serverless for speed, but AI…

Frequently Asked Questions

How many episodes does Humans of Reliability have?

Humans of Reliability has published 26 episodes since January 2025, covering topics in Technology.

Is Humans of Reliability still active?

Humans of Reliability is currently highly active, with new episodes published monthly. The average episode length is 26 minutes.
