If you're building AI applications, you need Docker. Here's why: AI tools evolve fast, their dependencies conflict constantly, and you don't want to pollute your host system with Python environments for every new tool you try.
This guide covers the Docker workflows I actually use daily as an AI developer.
Why Docker for AI?
Before Docker, setting up an AI tool meant:
1. Create a Python venv
2. Install 47 dependencies
3. Realize it conflicts with another project's CUDA version
4. Spend an hour debugging
5. Give up
With Docker, every tool gets its own isolated environment. Clean. Fast. Reproducible.
Essential Docker Commands
# Run a container and remove it when done
docker run --rm -it ubuntu:22.04 bash
# Run with GPU access (requires the NVIDIA Container Toolkit on the host)
docker run --rm --gpus all nvidia/cuda:12.2.0-runtime-ubuntu22.04 nvidia-smi
# Mount local directories
docker run -v /path/on/host:/path/in/container my-image
# Run in background
docker run -d --name my-service -p 8080:80 nginx
Docker Compose for AI Stacks
Most AI tools are multi-service. Docker Compose lets you define the entire stack in one file:
version: '3.8'
services:
  app:
    build: .
    ports:
      - "8080:8080"
    depends_on:
      - redis
      - postgres
    environment:
      - REDIS_URL=redis://redis:6379
      - DB_URL=postgres://user:pass@postgres:5432/ai
  redis:
    image: redis:7-alpine
    restart: unless-stopped
  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: ai
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    volumes:
      - pgdata:/var/lib/postgresql/data
volumes:
  pgdata:
One `docker compose up -d` and your entire AI stack is running.
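Inside the `app` container, the service names `redis` and `postgres` resolve as hostnames on the Compose network, so the application only has to read the two environment variables. Here is a minimal sketch of that, assuming the app is written in Python. The `Endpoint`, `Settings`, and `load_settings` names are hypothetical and only illustrate the idea.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Endpoint:
    host: str
    port: int


@dataclass(frozen=True)
class Settings:
    redis: Endpoint
    database: Endpoint
    database_name: str


def _endpoint(url: str, default_port: int) -> Endpoint:
    """Extract host and port from a URL, falling back to a default port."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"URL has no host: {url!r}")
    return Endpoint(parts.hostname, parts.port or default_port)


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    """Read REDIS_URL and DB_URL (the variable names from the Compose file)."""
    env = os.environ if env is None else env
    redis_url = env["REDIS_URL"]  # KeyError here means the variable was not set
    db_url = env["DB_URL"]
    return Settings(
        redis=_endpoint(redis_url, 6379),
        database=_endpoint(db_url, 5432),
        database_name=urlsplit(db_url).path.lstrip("/"),
    )
```

Failing fast with a `KeyError` when a variable is missing tends to be easier to debug than a connection error that shows up minutes later.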
Real-World Example: Self-Hosted AI Agent Stack
Here's the stack I use daily:
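services:
  hermes-agent:
    container_name: hermes-agent
    build: ./hermes
    # Host networking: the agent reaches the other services via localhost:15432 and localhost:6379
    network_mode: host
    volumes:
      - ./hermes/config:/home/user/.hermes
  postgres:
    image: pgvector/pgvector:pg16
    container_name: hermes-db
    # Note: on first start with an empty volume, the postgres image requires POSTGRES_PASSWORD (not shown here)
    ports:
      - "15432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data
  redis:
    image: redis:7-alpine
    container_name: hermes-cache
    ports:
      - "6379:6379"
volumes:
  pgdata: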
This runs all services with a single command — no manual setup, no dependency hell.
Dockerfile Best Practices for AI
# 1. Use specific tags, not latest
FROM python:3.11-slim
WORKDIR /app
# 2. Install system deps first (they change rarely, so the layer stays cached)
RUN apt-get update && apt-get install -y --no-install-recommends \
        build-essential curl git && \
    rm -rf /var/lib/apt/lists/*
# 3. Install Python deps separately (leverage cache)
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
# 4. Copy app code last
COPY . .
# 5. Use non-root user
RUN useradd -m appuser
USER appuser
CMD ["python", "app.py"]
Common AI Docker Patterns
GPU Pass-Through
docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
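From inside a container started with `--gpus all`, it can help to check at startup that a GPU is actually visible before loading a model. The sketch below shells out to `nvidia-smi -L`, which prints one line per GPU. The `list_gpus` helper is a hypothetical name, and treating every failure as "no GPU" is an assumption that may not suit every setup.

```python
import shutil
import subprocess
from typing import List


def list_gpus(timeout: float = 10.0) -> List[str]:
    """Return one line per GPU reported by `nvidia-smi -L`, or [] if none are visible."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        # The binary is missing: the container was likely started without GPU access.
        return []
    try:
        result = subprocess.run(
            [exe, "-L"],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=False,
        )
    except (OSError, subprocess.TimeoutExpired):
        return []
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]
```

An empty list at startup is a cheap signal to fall back to CPU or exit with a clear message.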
Persistent Model Cache
services:
  llm:
    image: my-llm-server
    volumes:
      - model-cache:/root/.cache/huggingface
volumes:
  model-cache:
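The mount target only helps if it matches the directory the libraries actually write to. Hugging Face tooling defaults to `~/.cache/huggingface` and honors the `HF_HOME` environment variable, so a container running as a non-root user caches under that user's home, not `/root`. The sketch below approximates that lookup order. `hf_cache_root` is a hypothetical helper, not part of any Hugging Face library.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def hf_cache_root(
    env: Optional[Mapping[str, str]] = None,
    home: Optional[Path] = None,
) -> Path:
    """Approximate where Hugging Face libraries keep their cache.

    Order used here: HF_HOME, then XDG_CACHE_HOME/huggingface,
    then ~/.cache/huggingface.
    """
    env = os.environ if env is None else env
    home = Path.home() if home is None else home
    if env.get("HF_HOME"):
        return Path(env["HF_HOME"])
    if env.get("XDG_CACHE_HOME"):
        return Path(env["XDG_CACHE_HOME"]) / "huggingface"
    return home / ".cache" / "huggingface"
```

For example, `hf_cache_root({}, Path("/home/appuser"))` gives `/home/appuser/.cache/huggingface`, which is the path the volume would need to target for an image that runs as `appuser`. Setting `HF_HOME` explicitly in the Compose file avoids the guesswork.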
Health Checks
services:
  api:
    image: my-api
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
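The health check runs `curl` inside the container, so the image needs `curl` installed and the service needs to answer on `/health`. Here is a minimal sketch of such an endpoint using only the Python standard library. `HealthHandler` and `make_server` are hypothetical names, and a real API would more likely add this route to its existing web framework.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = b'{"status": "ok"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging so health probes don't flood the logs.
        pass


def make_server(port: int = 8080) -> ThreadingHTTPServer:
    """Bind on all interfaces; port 8080 matches the health check URL."""
    return ThreadingHTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling `make_server().serve_forever()` starts the server. `curl -f` exits non-zero on HTTP errors, so any response of 400 or above marks the probe as failed.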
Pitfalls I've Hit
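1. Docker Hub rate limits: Anonymous pulls are rate-limited. Authenticate with `docker login` or pull through a mirror such as `hub.rat.dev`.
2. Volume permission issues: The container writes files as root, so the host user may not be able to read or edit them. Fix with `user: "${UID}:${GID}"` in Compose, and make sure both variables are exported or defined in `.env`, since shells typically don't export them.
3. Network conflicts: Multiple stacks try to bind the same host ports. Use different host ports or custom networks.
4. Disk space: Old images pile up fast. Run `docker system prune -a` regularly, keeping in mind that it removes all unused images, not just dangling ones.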
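For the volume permission fix, Compose has to be able to resolve `${UID}` and `${GID}`. One option is a `.env` file next to the Compose file, which Compose reads for variable substitution. The sketch below writes one from the current user's IDs. `compose_env_lines` and `write_compose_env` are hypothetical helpers, the code is POSIX-only, and it overwrites any existing `.env`, so treat it as an illustration and not a drop-in tool.

```python
import os
from pathlib import Path


def compose_env_lines(uid: int, gid: int) -> str:
    """Render the two variables referenced by user: "${UID}:${GID}"."""
    return f"UID={uid}\nGID={gid}\n"


def write_compose_env(directory: Path = Path(".")) -> Path:
    """Write .env in the given directory (overwrites an existing file)."""
    target = directory / ".env"
    # os.getuid / os.getgid exist on POSIX systems only.
    target.write_text(compose_env_lines(os.getuid(), os.getgid()))
    return target
```

Exporting the variables in the shell before running `docker compose up` works too. The file just makes the setting stick across sessions.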
Verdict
Docker is non-negotiable for modern AI development. The initial learning curve pays for itself within days. Once you have your stack in Docker Compose, you can tear it down and rebuild it anywhere — your laptop, a cloud VM, or a friend's server.
Learn Docker Compose first. It's the most practical skill for AI developers.