My Super Simple ML Workbench (That Covers ~80% of Classic ML)

What do I actually need my ML workbench to do, reliably, over and over?

For ~80% of classic ML work — tabular data, scikit-learn, baseline models — the answer is surprisingly short. The workbench only needs to do five things well:

Give me an isolated Python environment per project.
Install a small, stable set of core libraries.
Let me run notebooks and scripts in the same place.
Make it easy to reproduce the setup on another machine.
Stay fast and boring so I stop thinking about the tooling.

That last point matters more than it sounds. The best workbench is one you forget about.

The Failure Modes (Before Any Model Is Trained)

Most tooling pain happens before you write a single line of model code:

“It worked on my laptop” → different Python or library versions across machines.
“I broke an old project” → global installs or shared environments where one project’s upgrade poisons another.
“I don’t remember how I set this up” → no captured dependencies, just a vague memory of pip install commands.

These are all environment problems, not ML problems. And they all have the same fix: isolate per project, declare dependencies, and lock versions.

From these failure modes, three design constraints fall out:

One command to bootstrap a project (environment + dependencies).
One file that documents dependencies (pyproject.toml).
One place to run both notebooks and scripts.

Enter uv: One Tool for All of It

uv gives me exactly this. It replaces three separate layers of tooling with one coherent workflow:

What I used to need	What uv replaces it with
pyenv (Python versions)	`uv python install` manages Pythons
venv / conda (envs)	`uv sync` creates the project env (`.venv`)
pip / pip-tools (deps)	`uv add` + `uv.lock`

The mental model is simple:

uv init → create a project + pyproject.toml.
uv add → declare dependencies (and lock them).
uv sync → recreate the environment anywhere from uv.lock.
uv run → run commands inside that environment.

Why this matters:

The project folder is the unit of isolation. No global state to corrupt.
Dependencies are declared, not “remembered.” They live in pyproject.toml and are pinned in uv.lock.
Reproducibility becomes the default, not something you bolt on later.

If I can clone the repo on a new machine and run:
uv sync
uv run jupyter lab
…and be productive in under a minute, then the workbench is doing its job.
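The "one command to bootstrap a project" constraint can be approximated with a small wrapper script. This is only a sketch: the function names and the dry_run flag are my own, and the script defaults to printing the uv commands rather than running them, so it is safe to try without uv installed.

```python
import shutil
import subprocess

# The core stack from this post.
CORE_PACKAGES = ("numpy", "pandas", "scikit-learn", "matplotlib", "seaborn", "jupyterlab")


def bootstrap_commands(project_name, packages):
    """Return (command, working_directory) pairs that set up a new project."""
    return [
        (["uv", "init", project_name], "."),             # creates the folder + pyproject.toml
        (["uv", "add", *packages], project_name),        # declares, locks, and installs
    ]


def bootstrap(project_name, packages=CORE_PACKAGES, dry_run=True):
    """Print each command; run it as well when dry_run is False."""
    for command, cwd in bootstrap_commands(project_name, packages):
        print(f"[{cwd}] {' '.join(command)}")
        if not dry_run:
            subprocess.run(command, cwd=cwd, check=True)


if __name__ == "__main__":
    if shutil.which("uv") is None:
        print("uv not found on PATH; showing the commands only")
    bootstrap("my-ml-project", dry_run=True)
```

On a machine that already has the repository, none of this is needed: uv sync alone rebuilds the environment from uv.lock.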

My 80% Stack

Here is the full list of libraries I start with. It is deliberately small.

Core (always installed):

Library	Role
numpy	Arrays and numerics
pandas	DataFrames
scikit-learn	Classic ML algorithms
matplotlib	Plotting (low-level)
seaborn	Plotting (high-level)
jupyterlab	Notebooks

Python version: 3.10 or 3.11.

Optional (added per project):

xgboost or lightgbm — strong tabular baselines.
polars — faster DataFrame work (when pandas becomes the bottleneck).

Resist the temptation to install everything — you want a small, boring base that you trust.

Every library you add is a version you have to manage and a dependency that can break. Start minimal, add when the project demands it.

The Project Skeleton

my-ml-project/
  pyproject.toml      # dependencies & metadata (managed by uv)
  uv.lock             # exact locked versions
  data/
    raw/              # original data
    processed/        # cleaned / feature-ready data
  notebooks/
    01_exploration.ipynb
    02_first_model.ipynb
  src/
    __init__.py
    features.py       # feature engineering helpers
    models.py         # train/evaluate code
  reports/
    figures/
  README.md

Why this shape works for 80% of classic ML:

Quick experiments live in notebooks/.
Reusable logic (feature transforms, training loops) lives in src/.
Raw vs processed data stays cleanly separated — you never overwrite originals.
Everything is backed by one uv-managed environment with pyproject.toml + uv.lock.
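If you create this layout often, a short script can do it for you. The sketch below is one possible version; create_skeleton is a hypothetical helper, and it deliberately skips pyproject.toml and uv.lock because uv owns those files. Existing files and folders are left untouched.

```python
import tempfile
from pathlib import Path

# Folders and empty starter files from the skeleton above.
DIRECTORIES = ["data/raw", "data/processed", "notebooks", "src", "reports/figures"]
FILES = ["src/__init__.py", "src/features.py", "src/models.py", "README.md"]


def create_skeleton(root: Path) -> list[Path]:
    """Create the layout under root; return only the paths that were newly created."""
    created = []
    for relative in DIRECTORIES:
        directory = root / relative
        if not directory.exists():
            directory.mkdir(parents=True)
            created.append(directory)
    for relative in FILES:
        file_path = root / relative
        if not file_path.exists():
            file_path.touch()  # empty placeholder, never overwrites
            created.append(file_path)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory; point root at your project folder instead.
    with tempfile.TemporaryDirectory() as tmp:
        for path in create_skeleton(Path(tmp) / "my-ml-project"):
            print(path.relative_to(tmp))
```

Running it a second time on the same folder creates nothing, so it is safe to use on a project that uv init has already set up.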

Most tabular ML work is:

Load CSV / Parquet → pandas
Explore and visualise → pandas, seaborn, matplotlib
Train baseline models → scikit-learn
Evaluate and iterate → same stack
Save models → pickle / joblib

…and this all fits comfortably in this simple setup.
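As a rough illustration of that loop, here is a minimal baseline using only scikit-learn. The data is synthetic (make_classification stands in for a real CSV or Parquet file loaded with pandas), the choice of logistic regression is just an example baseline, and train_baseline is a hypothetical helper name.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler


def train_baseline(random_state: int = 0):
    """Fit a scaled logistic regression and return (model, accuracy, X_test)."""
    # Synthetic stand-in for a table loaded from data/raw/.
    X, y = make_classification(
        n_samples=500, n_features=10, n_informative=5, random_state=random_state
    )
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.25, random_state=random_state, stratify=y
    )
    # Keeping preprocessing inside the pipeline means it is saved with the model.
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_train, y_train)
    accuracy = accuracy_score(y_test, model.predict(X_test))
    return model, accuracy, X_test


if __name__ == "__main__":
    model, accuracy, X_test = train_baseline()
    print(f"held-out accuracy: {accuracy:.3f}")
    # Save and reload in memory; a project would write these bytes to a file.
    restored = pickle.loads(pickle.dumps(model))
    print("round-trip predictions match:",
          bool((restored.predict(X_test) == model.predict(X_test)).all()))
```

A pickled model is tied to the library versions that produced it, which is one more reason to keep uv.lock next to the saved file.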

The Workflow: uv + VS Code (Jupyter Extension)

This is the workflow I actually use day-to-day. The key idea:

uv owns the environment and dependencies.
VS Code (with the Jupyter extension) starts the notebook kernel from that environment.
Notebooks and scripts run in the same uv environment.
I do not manually run uv run jupyter lab; VS Code handles that.

uv is my single source of truth for the environment. VS Code is just a client that launches Jupyter in that environment.

Step-by-step

1. Create a new project:

uv init my-ml-project
cd my-ml-project

2. Add core dependencies:

uv add jupyterlab pandas scikit-learn matplotlib seaborn numpy

This does three things: updates pyproject.toml, generates/updates uv.lock, and creates or refreshes the virtual environment.

3. Open the folder in VS Code:

code .

Or open it via File → Open Folder in the GUI.

4. Create a notebooks/ folder and a first notebook:

Create notebooks/01_exploration.ipynb.

5. Select the uv environment as the Jupyter kernel:

In the notebook, use the VS Code kernel picker (top-right of the notebook UI) to select the Python interpreter from the uv environment. It is usually the one under this project folder (e.g., .venv/bin/python on macOS and Linux, or .venv\Scripts\python.exe on Windows).

VS Code will start Jupyter inside that environment. From this point, notebooks and scripts share the same dependencies.

6. Test imports:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.linear_model import LinearRegression

If this runs without errors, you are good to go.

7. Ongoing rules:

For new libraries: uv add <package> from the terminal, not pip install in a notebook cell, so the change is recorded in pyproject.toml and uv.lock.
Always select the same uv interpreter as the kernel for notebooks in this project.
Never let VS Code create its own separate environment for this project. uv is the authority.
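A quick way to confirm that a notebook really is running on the uv interpreter is to check it from inside the kernel. The cell below is a sketch under two assumptions: that uv placed the environment in .venv at the project root (its default), and that the project root is the nearest parent folder containing pyproject.toml. All three helper names are hypothetical.

```python
import sys
from importlib import metadata
from pathlib import Path

CORE_PACKAGES = ["numpy", "pandas", "scikit-learn", "matplotlib", "seaborn", "jupyterlab"]


def find_project_root(start: Path):
    """Walk upwards from start until a folder containing pyproject.toml is found."""
    for candidate in [start, *start.parents]:
        if (candidate / "pyproject.toml").is_file():
            return candidate
    return None


def uses_project_venv(project_root: Path, prefix: str = sys.prefix) -> bool:
    """True when the interpreter prefix is <project_root>/.venv."""
    return Path(prefix).resolve() == (project_root / ".venv").resolve()


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    root = find_project_root(Path.cwd().resolve())
    print("interpreter:", sys.executable)
    if root is None:
        print("no pyproject.toml found above the current directory")
    else:
        print("uses project .venv:", uses_project_venv(root))
    for name, version in installed_versions(CORE_PACKAGES).items():
        print(f"{name}: {version or 'not installed'}")
```

If the check reports False, the kernel picker is pointing at a different interpreter, and any package added with uv add will appear to be missing in the notebook.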

Workflow at a Glance

[Figure: workflow diagram]

This is the whole workbench. No Docker, no cloud notebooks, no conda channels to debug. Just uv, VS Code, and a handful of libraries you already know.

When you find yourself fighting the tooling instead of exploring the data, the setup is too complicated. This one stays out of the way — and that is the point.
