About This Podcast
Sample Space is a podcast about tools, thoughts and techniques from machine learning practitioners. We talk to toolmakers and practitioners about interesting problems in the real world to find out how great ideas in our field actually manifest.
Recent Episodes
S1E13 Time for some (extreme) distillation with Thomas van Dongen - founder of the Minish Lab
Word embeddings might feel like they are a little bit out of fashion. After all, we have attention mechanisms and transformer models now, right? Well, it turns out that if you apply distillation the…
S1E12 Imbalanced learn: regrets and onwards - with Guillaume Lemaitre, maintainer
Imbalanced learn is one of the most popular scikit-learn projects out there. It has support for resampling techniques which historically have always been used for imbalanced classification use-cases.…
S1E11 You want to be in control of your own Copilot - with Ty Dunn, co-founder at Continue.dev
There are many LLMs that you can use for programming these days. Some of them even go into your IDE like Cursor or GitHub Copilot. But what if you want to tweak these LLMs to do what you want?…
S1E10 What it is like to maintain the scikit-learn docs - with David Arturo Amor Quiroz, scikit-learn docs maintainer
Scikit-learn's documentation pages are celebrated. But not everyone is aware that the project actually has somebody on payroll to take care of it. In this episode we talk to Arturo about stories from…
S1E9 Sqlite can totally do embeddings now - with Alex Garcia, sqlite-vec maintainer
Vector databases are kind of everywhere these days. There is a big pool of VCs that are pouring money into the ecosystem too. But while all of that is happening, sqlite has also gotten support for…
S1E8 How to rethink the notebook - with Akshay Agrawal, co-creator of Marimo
Jupyter has been a great environment to explore computational ideas, but that doesn't mean that it can be the only environment for interactive coding in Python. It also comes with some downsides,…
S1E7 You are always dealing with many tables - with Madelon Hulsebos
When you are working on a data pipeline for ML ... you are never dealing with a single table. It always demands different tables for different reasons that all have to be mashed together in order to…
S1E6 How Narwhals has many end users ... that never use it directly with Marco Gorelli
When you pip install a package you will for sure end up using it later. But often you will also install a bunch of dependencies and it is very likely that you won't directly interact with all of…
S1E5 Pragmatic data science checklists with Peter Bull - cofounder Drivendata
A lot of things can go (and have gone) wrong when folks have tried to run data science projects. So how might we prevent that? Maybe what we need to do is to look at the medical profession and their…
S1E4 Model safety, that's a pickle! with Adrin Jalali - scikit-learn maintainer
Historically it's always been the case that you would use a pickle file to store a trained scikit-learn model on disk for deployment. Pickles make sense because they are so flexible, but they do…
S1E3 Moving Towards KDearestNeighbors with Leland McInnes - creator of UMAP
Leland McInnes is known for a lot of packages. There's UMAP, but also PyNNDescent and HDBSCAN. Recently he's also been working on tools to help visualise clusters of data and he's also cooking up…
S1E2 Talk like a DataFrame, run like SQL with Phillip Cloud - core-committer on Ibis
Ibis is a Python library that offers a single data-frame API, from Python, which can run your queries on many different backends. These include databases like Postgres, but also commercial vendors…
S1E1 Enhancing Jupyter with Widgets with Trevor Manz - creator of anywidget.
In this (first!) episode of Sample Space we talk to Trevor Manz, the creator of anywidget. It's a (neat!) tool to help you build more interactive notebooks by giving you tools to apply just enough…
S1 Introducing Sample Space
We're starting a new podcast!
Frequently Asked Questions
Sample Space has published 14 episodes since April 2024, covering topics in Technology.
Sample Space is currently dormant; new episodes previously appeared monthly. Average episode length is 1h 2m.