If you're building AI applications, you need Docker. Here's why: AI tools evolve fast, their dependencies conflict constantly, and you don't want to pollute your host system with Python environments for every new tool you try.

This guide covers the Docker workflows I actually use daily as an AI developer.

Why Docker for AI?

Before Docker, setting up an AI tool meant:

1. Create a Python venv
2. Install 47 dependencies
3. Realize it conflicts with another project's CUDA version
4. Spend an hour debugging
5. Give up

With Docker, every tool gets its own isolated environment. Clean. Fast. Reproducible.

Essential Docker Commands

    
    # Run a container and remove it when done
    docker run --rm -it ubuntu:22.04 bash
    
    # Run with GPU access
    docker run --gpus all nvidia/cuda:12.2.0-runtime-ubuntu22.04 nvidia-smi
    
    # Mount local directories
    docker run -v /path/on/host:/path/in/container my-image
    
    # Run in background
    docker run -d --name my-service -p 8080:80 nginx
    

Docker Compose for AI Stacks

Most AI tools are multi-service. Docker Compose lets you define the entire stack in one file:

    
    version: '3.8'
    services:
      app:
        build: .
        ports:
          - "8080:8080"
        depends_on:
          - redis
          - postgres
        environment:
          - REDIS_URL=redis://redis:6379
          - DB_URL=postgres://user:pass@postgres:5432/ai
    
      redis:
        image: redis:7-alpine
        restart: unless-stopped
    
      postgres:
        image: pgvector/pgvector:pg16
        environment:
          POSTGRES_DB: ai
          POSTGRES_USER: user
          POSTGRES_PASSWORD: pass
        volumes:
          - pgdata:/var/lib/postgresql/data
    
    volumes:
      pgdata:
    

One `docker compose up -d` and your entire AI stack is running.
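One caveat: the short `depends_on` form used above only controls start order. It doesn't wait for Redis or Postgres to accept connections. Here is a minimal sketch of how the app could handle that itself, using only the Python standard library. The helper names `endpoint_from_env` and `wait_for_port` are my own, not part of any library, and the defaults simply mirror the `REDIS_URL` and `DB_URL` values from the Compose file.

```python
import os
import socket
import time
from urllib.parse import urlsplit


def endpoint_from_env(env_var, default):
    """Return (host, port) parsed from a URL stored in an environment variable."""
    url = urlsplit(os.environ.get(env_var, default))
    return url.hostname, url.port


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Return True once a TCP connection succeeds, False if the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only proves the port is open,
            # not that the service has finished initializing.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False
```

At startup, `app.py` could call `wait_for_port(*endpoint_from_env("REDIS_URL", "redis://redis:6379"))` and exit with an error if it returns `False`. Inside the Compose network the hostnames `redis` and `postgres` resolve to the matching services, which is why the URLs use them.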

Real-World Example: Self-Hosted AI Agent Stack

Here's the stack I use daily:

    
    services:
      hermes-agent:
        container_name: hermes-agent
        build: ./hermes
        network_mode: host
        volumes:
          - ./hermes/config:/home/user/.hermes
    
      postgres:
        image: pgvector/pgvector:pg16
        container_name: hermes-db
        ports:
          - "15432:5432"
        volumes:
          - pgdata:/var/lib/postgresql/data
    
      redis:
        image: redis:7-alpine
        container_name: hermes-cache
        ports:
          - "6379:6379"
    
    volumes:
      pgdata:
    

This runs all services with a single command: no manual setup, no dependency hell.

Dockerfile Best Practices for AI

    
    # 1. Use specific tags, not latest
    FROM python:3.11-slim
    
    # 2. Install system deps first (cached better)
    RUN apt-get update && apt-get install -y \
        build-essential curl git && \
        rm -rf /var/lib/apt/lists/*
    
    # 3. Install Python deps separately (leverage cache).
    #    WORKDIR keeps the app out of the image root.
    WORKDIR /app
    COPY requirements.txt .
    RUN pip install --no-cache-dir -r requirements.txt
    
    # 4. Copy app code last
    COPY . .
    
    # 5. Use non-root user
    RUN useradd -m appuser
    USER appuser
    
    CMD ["python", "app.py"]
    

Common AI Docker Patterns

GPU Pass-Through

    
    docker run --rm --gpus all nvidia/cuda:12.2.0-base-ubuntu22.04 nvidia-smi
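
To confirm from inside your own container that the GPU actually came through, a small check like the sketch below can help. It assumes Python is available in the image (the plain `nvidia/cuda` images don't ship it) and that `nvidia-smi` is on the `PATH`. `visible_gpus` is a hypothetical helper name.

```python
import shutil
import subprocess


def visible_gpus():
    """Return the lines printed by `nvidia-smi -L`, or [] if no GPU is visible."""
    if shutil.which("nvidia-smi") is None:
        # The binary is missing, so the NVIDIA runtime was probably not attached.
        return []
    result = subprocess.run(
        ["nvidia-smi", "-L"], capture_output=True, text=True, check=False
    )
    if result.returncode != 0:
        return []
    return [line for line in result.stdout.splitlines() if line.strip()]
```

An empty list usually means the container was started without `--gpus all`, or the NVIDIA Container Toolkit isn't installed on the host.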
    

Persistent Model Cache

    
    services:
      llm:
        image: my-llm-server
        volumes:
          - model-cache:/root/.cache/huggingface
    
    volumes:
      model-cache:
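
The mount target only helps if it matches where the libraries actually write. `/root/.cache/huggingface` assumes the container runs as root; with a non-root user, the default moves to that user's home directory. Here is a rough sketch of the lookup, assuming the documented `HF_HOME` override and ignoring other variables the real library also consults (such as `XDG_CACHE_HOME`). `hf_cache_dir` is a hypothetical helper, not a Hugging Face API.

```python
import os
from pathlib import Path


def hf_cache_dir():
    """Best-effort guess of the Hugging Face cache root for the current user."""
    explicit = os.environ.get("HF_HOME")
    if explicit:
        # An explicit HF_HOME wins over the per-user default.
        return Path(explicit)
    return Path.home() / ".cache" / "huggingface"
```

If the path this returns inside the container differs from the volume's mount target, models get re-downloaded on every rebuild. Setting `HF_HOME` in the service's `environment` to the mounted path is one way to pin it.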
    

Health Checks

    
    services:
      api:
        image: my-api
        healthcheck:
          test: ["CMD", "curl", "-f", "http://localhost:8080/health"]
          interval: 30s
          timeout: 10s
          retries: 3
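
That check needs something answering on `/health`, and it needs `curl` installed in the image. A minimal sketch of such an endpoint using only the Python standard library follows. `HealthHandler` and `make_server` are illustrative names, and a real API would more likely add the route to its existing web framework.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


class HealthHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == "/health":
            body = b'{"status": "ok"}'
            self.send_response(200)
            self.send_header("Content-Type", "application/json")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            # curl -f treats any status >= 400 as a failure.
            self.send_error(404)

    def log_message(self, format, *args):
        # Silence per-request logging so health probes don't flood the logs.
        pass


def make_server(port=8080):
    """Build a server bound to all interfaces on the given port."""
    return HTTPServer(("0.0.0.0", port), HealthHandler)
```

Calling `make_server().serve_forever()` starts it on port 8080, matching the URL in the `healthcheck`. After three consecutive failures (`retries: 3`), Docker marks the container as `unhealthy`.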
    

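Pitfalls I've Hit

1. **Docker Hub rate limits.** Anonymous pulls are limited. Use `docker login` or a mirror like `hub.rat.dev`.
2. **Volume permission issues.** The container writes files as root, so your host user can't modify them. Fix with `user: "${UID}:${GID}"` in Compose, and define `UID` and `GID` in a `.env` file or export them first: Compose only sees exported variables, and most shells don't export these by default.
3. **Network conflicts.** Multiple stacks compete for the same host ports. Publish on different host ports, or keep services on a custom network and only publish the ports you need from the host.
4. **Disk space.** Old images pile up fast. Run `docker system prune -a` regularly, keeping in mind that `-a` removes every image not used by a container, not just dangling ones.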
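
For the port conflicts in particular, a quick pre-flight check before bringing a stack up can save a confusing error. This is a small sketch with a hypothetical helper name, `port_is_free`. It only tests whether the current user can bind the port right now, so privileged ports (below 1024) will report as taken when run without root.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if the current process can bind host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Already in use, or not permitted for this user.
            return False
    return True
```

For example, `port_is_free(6379)` returning `False` suggests another Redis, or another stack's Redis container, already holds that host port.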

Verdict

Docker is non-negotiable for modern AI development. The initial learning curve pays for itself within days. Once you have your stack in Docker Compose, you can tear it down and rebuild it anywhere: your laptop, a cloud VM, or a friend's server.

Learn Docker Compose first. It's the most practical skill for AI developers.