Docker for AI Developers: A Practical Guide

If you're building AI applications, you need Docker. Here's why: AI tools evolve fast, their dependencies conflict constantly, and you don't want to pollute your host system with Python environments for every new tool you try.

This guide covers the Docker workflows I actually use daily as an AI developer.

Why Docker for AI?

Before Docker, setting up an AI tool meant:

1. Create a Python venv

2. Install 47 dependencies

3. Realize it conflicts with another project's CUDA version

4. Spend an hour debugging

5. Give up

With Docker, every tool gets its own isolated environment. Clean. Fast. Reproducible.

Essential Docker Commands


# Run a container and remove it when done
docker run --rm -it ubuntu:22.04 bash

# Run with GPU access (needs the NVIDIA Container Toolkit on the host)
docker run --rm --gpus all nvidia/cuda:12.2.0-runtime-ubuntu22.04 nvidia-smi

# Mount local directories
docker run -v /path/on/host:/path/in/container my-image

# Run in background
docker run -d --name my-service -p 8080:80 nginx

Docker Compose for AI Stacks

Most AI tools are multi-service. Docker Compose lets you define the entire stack in one file:


services:
  app:
    build: .
    ports:
      - "8080:8080"
    depends_on:
      - redis
      - postgres
    environment:
      - REDIS_URL=redis://redis:6379
      - DB_URL=postgres://user:pass@postgres:5432/ai

  redis:
    image: redis:7-alpine
    restart: unless-stopped

  postgres:
    image: pgvector/pgvector:pg16
    environment:
      POSTGRES_DB: ai
      POSTGRES_USER: user
      POSTGRES_PASSWORD: pass
    volumes:
      - pgdata:/var/lib/postgresql/data

volumes:
  pgdata:

One `docker compose up -d` and your entire AI stack is running.
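
One caveat: the short form of `depends_on` only orders container startup; it does not wait for Postgres to accept connections. A minimal sketch of a readiness poll, assuming the `postgres` service name, the `user` role, and the `ai` database from the Compose file above. `wait_for_postgres` is a hypothetical helper name, not part of Docker:

```shell
# Poll until Postgres inside the Compose stack accepts connections.
# Assumes the service name, user, and database from the Compose file above.
wait_for_postgres() {
    attempts="${1:-30}"   # maximum number of polls, one second apart
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # pg_isready ships with the postgres image and exits 0 once the server is accepting connections
        if docker compose exec -T postgres pg_isready -U user -d ai >/dev/null 2>&1; then
            echo "postgres is ready"
            return 0
        fi
        i=$((i + 1))
        sleep 1
    done
    echo "postgres did not become ready after ${attempts} attempts" >&2
    return 1
}
```

A typical call would be `docker compose up -d && wait_for_postgres 60` before running migrations or tests.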

Real-World Example: Self-Hosted AI Agent Stack

Here's the stack I use daily:


services:
  hermes-agent:
    container_name: hermes-agent
    build: ./hermes
    network_mode: host
    volumes:
      - ./hermes/config:/home/user/.hermes

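  postgres:
    image: pgvector/pgvector:pg16
    container_name: hermes-db
    environment:
      # The postgres image refuses to initialize without a password; set POSTGRES_PASSWORD in .env
      POSTGRES_PASSWORD: "${POSTGRES_PASSWORD:?set POSTGRES_PASSWORD in .env}"
    ports:
      - "15432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data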

  redis:
    image: redis:7-alpine
    container_name: hermes-cache
    ports:
      - "6379:6379"

volumes:
  pgdata:

This runs all services with a single command — no manual setup, no dependency hell.

Dockerfile Best Practices for AI


# 1. Use specific tags, not latest
FROM python:3.11-slim

# 2. Install system deps first (they change least often, so this layer stays cached)
RUN apt-get update && apt-get install -y \
    build-essential curl git && \
    rm -rf /var/lib/apt/lists/*

# 3. Install Python deps before copying the code (code edits won't invalidate this layer)
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# 4. Copy app code last
COPY . .

# 5. Use non-root user
RUN useradd -m appuser
USER appuser

CMD ["python", "app.py"]
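
To confirm that step 5 took effect, one option is to build the image and ask a throwaway container for its user ID. A minimal sketch; `check_non_root` and the `my-ai-app:0.1` tag are hypothetical names chosen for illustration:

```shell
# Build the image from the current directory, then verify it does not run as root.
check_non_root() {
    tag="${1:-my-ai-app:0.1}"   # hypothetical image tag
    docker build -t "$tag" . >/dev/null || return 1
    # Arguments after the image name replace CMD, so this runs `id -u` instead of app.py
    uid="$(docker run --rm "$tag" id -u)" || return 1
    if [ "$uid" = "0" ]; then
        echo "image $tag runs as root" >&2
        return 1
    fi
    echo "image $tag runs as uid $uid"
}
```

Run it from the directory that holds the Dockerfile, for example `check_non_root my-ai-app:0.1`.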

Common AI Docker Patterns

GPU Pass-Through


docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi

Persistent Model Cache


services:
  llm:
    image: my-llm-server
    volumes:
      - model-cache:/root/.cache/huggingface

volumes:
  model-cache:

Health Checks


services:
  api:
    image: my-api
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
      interval: 30s
      timeout: 10s
      retries: 3
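
Docker records the result of each health check on the container, and `docker inspect` can read it back. A minimal sketch of a script that blocks until a container reports healthy; `wait_healthy` is a hypothetical helper, and the three-second poll interval is an arbitrary choice:

```shell
# Wait until the named container's health check reports "healthy".
wait_healthy() {
    name="$1"               # container name or ID
    attempts="${2:-20}"     # maximum number of polls
    i=0
    while [ "$i" -lt "$attempts" ]; do
        # .State.Health.Status is "starting", "healthy", or "unhealthy"
        status="$(docker inspect --format '{{.State.Health.Status}}' "$name" 2>/dev/null)"
        case "$status" in
            healthy)
                echo "$name is healthy"
                return 0
                ;;
            unhealthy)
                echo "$name is unhealthy" >&2
                return 1
                ;;
        esac
        i=$((i + 1))
        sleep 3
    done
    echo "timed out waiting for $name" >&2
    return 1
}
```

With Compose, `docker compose ps -q api` prints the container ID for the `api` service, so a call might look like `wait_healthy "$(docker compose ps -q api)"`.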

Pitfalls I've Hit

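1. Docker Hub rate limits: Anonymous pulls are rate-limited. Use `docker login` or a mirror like `hub.rat.dev`.

2. Volume permission issues: Containers write as root by default, so the host user may not be able to read or modify the files. Fix with `user: "${UID}:${GID}"` in Compose, with both variables exported or set in `.env` (bash defines `UID` but doesn't export it, and doesn't define `GID` at all).

3. Network conflicts: Multiple stacks try to bind the same host ports. Use different host ports or custom networks.

4. Disk space: Old images pile up fast. Run `docker system prune -a` regularly, keeping in mind that it also deletes stopped containers and every image no container is using.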
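
For the volume permission fix in item 2, Compose reads a `.env` file in the project directory when it substitutes variables, so one approach is to record the host IDs there once. A minimal sketch; `write_uid_env` is a hypothetical helper name:

```shell
# Append the current user's UID and GID to a Compose .env file,
# so that user: "${UID}:${GID}" resolves to the host user.
write_uid_env() {
    env_file="${1:-.env}"
    # Skip if the file already defines UID, to avoid duplicate entries
    if [ -f "$env_file" ] && grep -q '^UID=' "$env_file"; then
        return 0
    fi
    printf 'UID=%s\nGID=%s\n' "$(id -u)" "$(id -g)" >> "$env_file"
}
```

Call `write_uid_env` in the directory that holds the Compose file, then start the stack as usual.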

Verdict

Docker is non-negotiable for modern AI development. The initial learning curve pays for itself within days. Once you have your stack in Docker Compose, you can tear it down and rebuild it anywhere — your laptop, a cloud VM, or a friend's server.

Learn Docker Compose first. It's the most practical skill for AI developers.