Setting Up a Python Development Environment with Docker Compose

John Doe 8 min read

This guide walks through a Python development environment built with Docker Compose. It defines four services (PostgreSQL, Jupyter Notebook, Streamlit, and pytest) that cover database storage, interactive analysis, dashboard hosting, and automated testing.


Project Structure

The directory structure for the project is organized as follows:

.
├── docker-compose.yml
├── dockerfiles/
│   ├── Dockerfile_postgres
│   ├── Dockerfile_notebook
│   ├── Dockerfile_streamlit
│   └── Dockerfile_pytest
├── src/
│   ├── notebooks/
│   └── streamlit_dashboard/
│       └── streamlit_app.py
├── test/
├── data/
│   ├── db/
│   └── sql/
├── requirements.txt
└── .env

Service Components

1. PostgreSQL Database

The PostgreSQL service provides a persistent database using the official PostgreSQL Alpine image. Below is the Dockerfile:

FROM postgres:15.7-alpine3.19

RUN apk add --no-cache vim less

ENV PAGER=less
ENV EDITOR=vim

Key configuration in docker-compose.yml:

database:
  build:
    dockerfile: dockerfiles/Dockerfile_postgres
    context: .
  volumes:
    - ./data/db:/var/lib/postgresql/data
    - ./data/sql:/dump-data
  environment:
    - POSTGRES_USER=postgres
    - POSTGRES_PASSWORD=postgres
    - POSTGRES_DB=dfs
  ports:
    - 5432:5432

2. Jupyter Notebook

The notebook service uses the Jupyter scipy-notebook base image, with additional Python packages installed for database interaction and project dependencies:

FROM jupyter/scipy-notebook

USER root
RUN apt-get update && apt-get install -y libpq-dev python3-dev gcc g++
USER jovyan

COPY ./requirements.txt .
RUN pip install psycopg2-binary
RUN pip install -r requirements.txt

Configuration in docker-compose.yml:

notebook:
  build:
    dockerfile: dockerfiles/Dockerfile_notebook
    context: .
  volumes:
    - ./src:/src
    - ./src/notebooks:/home/jovyan/work
  ports:
    - 18888:8888
  command: jupyter notebook

3. Streamlit Dashboard

The Streamlit service uses Python 3.11 as the base image and installs necessary packages for web application hosting:

FROM python:3.11

WORKDIR /app

COPY requirements.txt .
RUN pip install -r requirements.txt
RUN pip install streamlit streamlit_modal rich

# docker-compose.yml mounts the project at /src, so the source tree is /src/src
ENV PYTHONPATH=/src/src

Configuration in docker-compose.yml:

streamlit:
  build:
    dockerfile: dockerfiles/Dockerfile_streamlit
    context: .
  volumes:
    - ./:/src
    - ./requirements.txt:/src/src/requirements.txt
  ports:
    - 18501:8501
  command: streamlit run /src/src/streamlit_dashboard/streamlit_app.py

4. Testing Environment

The pytest service is configured to run automated tests within the project:

FROM python:3.11

WORKDIR /src

COPY requirements.txt .
RUN pip install -r requirements.txt
RUN pip install pytest

Configuration in docker-compose.yml:

pytest:
  profiles:
    - test
  build:
    context: .
    dockerfile: dockerfiles/Dockerfile_pytest
  volumes:
    - .:/src
  working_dir: /src/test
  entrypoint: ["pytest", "-s", "-o", "log_cli=true"]

Network Configuration

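The compose file declares an internal bridge network. A service joins it by listing internal-network under its own networks key; services that omit the key stay on the project's default network, where service names still resolve. The top-level declaration is: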

networks:
  internal-network:
    driver: bridge

Getting Started

Step 1: Create Necessary Directories

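Run the following command to create the required directories. It also creates the test directory that the pytest service uses as its working directory; the brace expansion requires bash or zsh:

mkdir -p src/notebooks src/streamlit_dashboard test data/{db,sql}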

Step 2: Create an .env File

Create the .env file, then add one KEY=value line for each variable you want Docker Compose to substitute into docker-compose.yml:

touch .env
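
On the Python side, the same values can be read from the process environment instead of being written into the code. The sketch below assumes the variables are actually passed into the container (for example through the service's environment or env_file key). POSTGRES_USER, POSTGRES_PASSWORD, and POSTGRES_DB come from the compose snippet above; POSTGRES_HOST and POSTGRES_PORT are hypothetical names introduced here for illustration, as is the DbSettings class.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DbSettings:
    host: str
    port: int
    name: str
    user: str
    password: str


def load_db_settings(env: Optional[Mapping[str, str]] = None) -> DbSettings:
    """Read database settings from `env` (defaults to os.environ).

    Fallback values mirror the example docker-compose.yml in this guide.
    """
    env = os.environ if env is None else env
    return DbSettings(
        host=env.get("POSTGRES_HOST", "database"),  # hypothetical variable
        port=int(env.get("POSTGRES_PORT", "5432")),  # hypothetical variable
        name=env.get("POSTGRES_DB", "dfs"),
        user=env.get("POSTGRES_USER", "postgres"),
        password=env.get("POSTGRES_PASSWORD", "postgres"),
    )
```

Passing a plain dictionary instead of os.environ makes the function easy to exercise in a test without touching the real environment.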

Step 3: Start the Services

Use Docker Compose to start all services in detached mode:

docker compose up -d

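Access Points:

  • Jupyter Notebook: http://localhost:18888
  • Streamlit dashboard: http://localhost:18501
  • PostgreSQL: localhost:5432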
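
Once the containers are up, the ports published in the compose snippets (18888 for Jupyter, 18501 for Streamlit, 5432 for PostgreSQL) can be probed from the host. The following is a rough sketch, not part of the original setup; the helper names are hypothetical. An open TCP port only shows that something is listening, not that the service behind it has finished starting.

```python
import socket

# Host-side ports taken from the compose snippets in this guide.
ACCESS_POINTS = {
    "jupyter": ("localhost", 18888),
    "streamlit": ("localhost", 18501),
    "postgres": ("localhost", 5432),
}


def is_port_open(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout`."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False


def check_access_points(points=ACCESS_POINTS) -> dict:
    """Map each service name to whether its port currently accepts connections."""
    return {name: is_port_open(host, port) for name, (host, port) in points.items()}


if __name__ == "__main__":
    for name, reachable in check_access_points().items():
        print(f"{name}: {'reachable' if reachable else 'not reachable'}")
```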

Step 4: Run Tests

To execute tests with pytest:

docker compose --profile test run pytest
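
The pytest service runs from /src/test, so it collects any test_*.py files placed in the project's test directory. As a minimal sketch of what such a file might look like, the example below uses a hypothetical file name (test/test_example.py) and a hypothetical function under test; in a real project the function would be imported from src rather than defined in the test file.

```python
# test/test_example.py (hypothetical file name)
import logging

logger = logging.getLogger(__name__)


def normalize_name(raw: str) -> str:
    """Collapse repeated whitespace and title-case the result."""
    return " ".join(raw.split()).title()


def test_normalize_name_collapses_whitespace():
    # With log_cli=true in the entrypoint, warning-level records are
    # printed live while the test runs.
    logger.warning("checking whitespace handling")
    assert normalize_name("  ada   lovelace ") == "Ada Lovelace"
```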

Best Practices

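  1. Secure Sensitive Information: Store credentials and other sensitive values in the .env file and reference them from docker-compose.yml with ${VARIABLE} substitution rather than hardcoding them as the example configuration above does.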
  2. Use Volume Mounts: Leverage volume mounts for persistent data and easy access during development.
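  3. Separate Dockerfiles per Service: Give each service its own Dockerfile so images rebuild independently. All services here share the project root as build context, so a .dockerignore that excludes data/db keeps that context small.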
  4. Pin Image Versions: Use specific version tags for base images to ensure reproducibility.
  5. Configure PYTHONPATH: Set PYTHONPATH for clean imports and better module organization.

Frequently Asked Questions

How do I connect to the PostgreSQL database from my Python code?

To connect to the PostgreSQL database from your Python code running in the Jupyter notebook or Streamlit app:

import psycopg2

# Connect to the database
conn = psycopg2.connect(
    host="database",  # Use the service name as the hostname
    database="dfs",
    user="postgres",
    password="postgres",
    port=5432
)

# Create a cursor
cur = conn.cursor()

# Execute a query
cur.execute("SELECT * FROM your_table")

# Fetch results
results = cur.fetchall()

# Close the connection
cur.close()
conn.close()

Note that you use host="database" because Docker Compose creates an internal network where services can reference each other by their service name.
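
The snippet above leaves the cursor and connection open if the query raises. A variation that closes both in every case is sketched below. The helper names fetch_all and query_database are hypothetical, and the connection values simply repeat the example configuration. fetch_all works with any DB-API connection; psycopg2 is imported inside query_database so the helper has no hard dependency on it.

```python
from contextlib import closing


def fetch_all(conn, query, params=None):
    """Run `query` on an open DB-API connection and return all rows.

    The cursor is closed even if execute() or fetchall() raises.
    """
    with closing(conn.cursor()) as cur:
        if params is None:
            cur.execute(query)
        else:
            # psycopg2 uses %s placeholders, e.g. "WHERE id = %s" with (42,)
            cur.execute(query, params)
        return cur.fetchall()


def query_database():
    """Connect to the compose `database` service and read your_table."""
    import psycopg2  # requires psycopg2-binary, installed in the notebook image

    connection = psycopg2.connect(
        host="database",  # the compose service name
        dbname="dfs",
        user="postgres",
        password="postgres",
        port=5432,
    )
    # closing() guarantees conn.close(); psycopg2's own `with conn:` block
    # only ends the transaction and leaves the connection open.
    with closing(connection) as conn:
        return fetch_all(conn, "SELECT * FROM your_table")
```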

How can I add more Python packages to my environment?

To add more Python packages:

  1. Add the package names to your requirements.txt file
  2. Rebuild the affected services:
    docker compose build notebook streamlit pytest
    
  3. Restart the services:
    docker compose up -d
    

For one-time package installations in Jupyter, you can also run %pip install package_name in a notebook cell. Packages installed this way are lost when the container is recreated.
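
To see whether a rebuild is needed at all, you can compare requirements.txt against what is installed in the running container. The sketch below uses hypothetical helper names and handles only simple lines such as name, name==1.0, or name>=2.0; option lines starting with a dash are skipped, and URL or VCS requirements are not parsed correctly.

```python
import re
from importlib import metadata


def parse_requirement_names(text: str) -> list[str]:
    """Extract distribution names from simple requirements.txt content."""
    names = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and options like -r
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def missing_packages(names) -> list[str]:
    """Return the names that have no installed distribution metadata."""
    missing = []
    for name in names:
        try:
            metadata.version(name)
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing
```

In a notebook cell, calling missing_packages(parse_requirement_names(open("requirements.txt").read())) from the directory that holds the file would list the packages that still need a rebuild or an install.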

Can I use this setup for production environments?

This setup is primarily designed for development. For production:

  1. Remove development-specific services like Jupyter Notebook
  2. Add proper security measures (secure passwords, SSL)
  3. Configure proper logging and monitoring
  4. Use Docker Swarm or Kubernetes for orchestration
  5. Set up proper backup strategies for the database
  6. Consider using managed services for critical components

How do I persist data between container restarts?

Data persistence is handled through bind mounts that map host directories into the containers:

  1. Database data is stored in ./data/db on the host, which is mounted at /var/lib/postgresql/data in the container
  2. Your code is mounted from your local directories to the containers
  3. For additional data that needs to persist, add more volume mounts in the docker-compose.yml file:
    volumes:
      - ./local/path:/container/path
    

What if I need to debug issues in a container?

To debug issues:

  1. Access container logs:

    docker compose logs service_name
    
  2. Get an interactive shell in a running container:

    docker compose exec service_name bash
    # or for Alpine-based images:
    docker compose exec service_name sh
    
  3. Check container status:

    docker compose ps
    

How can I expose my Streamlit app to others on my network?

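Docker publishes port 18501 on all host interfaces, and Streamlit listens on all addresses unless server.address is set, so the app is usually reachable from your network already. To make the binding explicit and confirm access: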

  1. Modify the Streamlit command in docker-compose.yml:

    command: streamlit run /src/src/streamlit_dashboard/streamlit_app.py --server.address=0.0.0.0
    
  2. If needed, configure your host firewall to allow connections on port 18501

  3. Others can access your app using your machine’s IP address: http://your-ip-address:18501


This setup offers a complete development environment with:

  • Persistent database management
  • Interactive notebook development
  • Streamlit-based web application hosting
  • Automated testing with pytest

By following this guide, you’ll have a scalable and maintainable Python development environment powered by Docker Compose.