Data Engineering Podcast

Tobias Macey

Episodes 512
Avg. Duration 54m
Activity Highly Active
Apple Rating 4.6 (145)
Since Jan 2017
Latest Episode Jun 2026

Outreach Signals

Open to Sponsors, Features Guests

Publishing Details

Schedule Weekly
Format Episodic
Consistency 57%
Hosting serve.podhome.fm

About This Podcast

This show goes behind the scenes for the tools, techniques, and difficulties associated with the discipline of data engineering. Databases, workflows, automation, and data manipulation are just some of the topics that you will find here.

Podcasting 2.0 Features

chapters, episode, funding, images, medium, person, podping, podroll, remoteItem, soundbite

Recent Episodes

Text to Data Products: Kaarvi’s End-to-End AI for Ingestion, Quality, and Dashboards

Jun 08, 2026 52m

Summary In this episode Shravan Gunda, founder and CEO of Kaarvi AI, talks about building an AI-native, agent-driven data platform designed to eliminate the janitorial work that consumes most data…

Scaling Graph Analytics Without ETL: Inside PuppyGraph’s Architecture

Jun 01, 2026 54m Transcript

Summary In this episode Weimo Liu, co‑founder of PuppyGraph, talks about the engineering behind their “zero-copy” graph querying engine for lakehouse and database sources. He explores how PuppyGraph…

Maximizing GPU Utilization: Heterogeneous Pipelines with Ray and Kubernetes

May 06, 2026 58m Transcript

Summary In this episode Robert Nishihara, co-founder of Anyscale and co-creator of Ray, talks about maximizing hardware utilization for AI and data-intensive workloads. He explores Ray’s evolution…

The AI-First Data Engineer: 10–50x Productivity and What Changes Next

Apr 07, 2026 59m Transcript

Summary In this episode, I sit down with Gleb Mezhanskiy, CEO and co-founder of Datafold, to explore how agentic AI is reshaping data engineering. We unpack the leap from chat-assisted coding to…

Treat Metering Like Finance: Building Data Platforms for Consumption Economics

Mar 29, 2026 50m Transcript

Summary In this episode Himant Goyal, Senior Product Manager at Salesforce, talks about how data platform investments enable reliable, accurate metering for consumption-based business models. Himant…

Beyond the PDF: Rowan Cockett on Reproducible, Composable Science

Mar 22, 2026 42m Transcript

Summary In this episode Rowan Cockett, co-founder and CEO of CurveNote and co-founder of the Continuous Science Foundation, talks about building data systems that make scientific research…

Beyond Prompts: Practical Paths to Self‑Improving AI

Mar 16, 2026 1h 1m Transcript

Summary In this episode Raj Shukla, CTO of SymphonyAI, explores what it really takes to build self‑improving AI systems that work in production. Raj unpacks how agentic systems interact with…

Orion at Gravity: Trustworthy AI Analysts for the Enterprise

Mar 08, 2026 1h 5m Transcript

Summary In this episode of the Data Engineering Podcast, Lucas Thelosen and Drew Gilson, co-founders of Gravity, discuss their vision for agentic analytics in the enterprise, enabled by semantic…

From Models to Momentum: Uniting Architects and Engineers with ER/Studio

Mar 02, 2026 45m Transcript

Summary In this episode of the Data Engineering Podcast, Jamie Knowles (Product Director) and Ryan Hirsch (Product Marketing Manager) discuss the importance of enterprise data modeling with…

From Data Models to Mind Models: Designing AI Memory at Scale

Feb 22, 2026 57m Transcript

Summary In this episode of the Data Engineering Podcast, Vasilije "Vas" Markovich, founder of Cognee, discusses building agentic memory, a crucial aspect of artificial intelligence that enables…

Prompt Management, Tracing, and Evals: The New Table Stakes for GenAI Ops

Feb 15, 2026 50m Transcript

Summary In this episode of the Data Engineering Podcast, Aman Agarwal, creator of OpenLit, discusses the operational groundwork required to run LLM-powered applications reliably and cost-effectively.…

From Legacy to AI-Ready: How MongoDB AMP Accelerates Modernization

Feb 08, 2026 46m Transcript

Summary In this episode, Shilpa Kolhar, SVP of Product and Engineering at MongoDB, discusses using MongoDB as a unified foundation for AI-driven and agentic applications. She explains how the…

Branches, Diffs, and SQL: How Dolt Powers Agentic Workflows

Feb 01, 2026 56m Transcript

Summary In this episode Tim Sehn, founder and CEO of DoltHub, talks about Dolt - the world’s first version‑controlled SQL database - and why Git‑style semantics belong at the heart of data systems…

Logical First, Physical Second: A Pragmatic Path to Trusted Data

Jan 25, 2026 40m Transcript

Summary In this episode of the Data Engineering Podcast Jamie Knowles, Product Director for ER/Studio, talks about data architecture and its importance in driving business meaning. He discusses how…

Your Data, Your Lake: How Observe Uses Iceberg and Streaming ETL for Observability

Jan 18, 2026 1h 12m Transcript

Summary In this episode Jacob Leverich, cofounder and CTO of Observe, talks about applying lakehouse architectures to observability workloads. Jacob discusses Observe’s decision to leverage…

Semantic Operators Meet Dataframes: Building Context for Agents with FENIC

Jan 12, 2026 56m Transcript

Summary In this episode Kostas Pardalis talks about Fenic - an open-source, PySpark-inspired dataframe engine designed to bring LLM-powered semantics into reliable data engineering workflows. Kostas…

Beyond Dashboards: How Data Teams Earn a Seat at the Table

Jan 05, 2026 49m Transcript

Summary In this episode Goutham Budati talks about his Data–Perspective–Action framework and how it empowers data teams to become true business partners. Goutham traces his path from automating Excel…

Unfreezing The Data Lake: The Future-Proof File Format

Dec 29, 2025 59m Transcript

Summary In this episode PhD researcher Xinyu Zeng talks about F3, the “future-proof file format” designed to address today’s hardware realities and evolving workloads. He digs into the limitations of…

From Context to Semantics: How Metadata Powers Agentic AI

Dec 21, 2025 1h 6m Transcript

Summary In this episode Suresh Srinivas and Sriharsha Chintalapani explore how metadata platforms are evolving from human-centric catalogs into the foundational context layer for AI and agentic…

From Data Engineering to AI Engineering: Where the Lines Blur

Dec 14, 2025 26m Transcript

Summary In this solo episode of the Data Engineering Podcast, host Tobias Macey reflects on how AI has transformed the practice and pace of data engineering over time. Starting from its origins in…

Frequently Asked Questions

How many episodes does Data Engineering Podcast have?

Data Engineering Podcast has published 512 episodes since January 2017, covering topics in Education and Technology.

Is Data Engineering Podcast still active?

Data Engineering Podcast is currently highly active, releasing new episodes weekly. The average episode length is 54 minutes.

How do I contact Data Engineering Podcast for sponsorship or guest appearances?

Sign up on Grep.FM to access contact details for Data Engineering Podcast, including email and social media links.
