In this lab, you will containerize a machine learning training pipeline and inference server using Docker. You will train a Wine classifier, serve it via Flask, and orchestrate both containers with Docker Compose. A key focus of this lab is Docker volume management — you will use both named volumes and bind mounts to share data between containers and persist artifacts on the host.
- Deliverable 1: Run the training script in a container and save the resulting model to a shared volume. Be able to explain why Docker is useful for reproducibility and portability in ML training scenarios.
- Deliverable 2: Containerize the inference service to serve predictions on a specific port and show the `./logs/predictions.log` file on your host to the TA. Explain what a Dockerfile is and how it helps containerize the inference service.
- Deliverable 3: Call the inference service health endpoint before and after destroying the named volume to demonstrate how model availability changes. Explain the difference between named volumes and bind mounts in Docker.
Install Docker by following the instructions for your operating system: https://docs.docker.com/get-docker/

Then verify your installation:

```shell
docker run hello-world
```
Create a Docker container for the training code. When launched, the container should train a Wine classifier and save the model file to a shared volume. You can use the partially completed Dockerfile and code in docker/training/.
- Create a `RandomForestClassifier` and train it on the Wine dataset
- Save the trained model using `joblib.dump()` to `/app/models/wine_model.pkl`
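The training script might look roughly like this (a sketch; the exact starter code and filenames in docker/training/ may differ, and the relative `models/` path assumes the Dockerfile sets `WORKDIR /app`, so that it resolves to `/app/models` in the container):

```python
import os

import joblib
from sklearn.datasets import load_wine
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Load the bundled Wine dataset and hold out a test split
X, y = load_wine(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train the classifier and report held-out accuracy
model = RandomForestClassifier(random_state=42)
model.fit(X_train, y_train)
print(f"Test accuracy: {accuracy_score(y_test, model.predict(X_test)):.3f}")

# With WORKDIR /app in the Dockerfile, this relative path is
# /app/models/wine_model.pkl inside the container (the volume mount point)
os.makedirs("models", exist_ok=True)
joblib.dump(model, "models/wine_model.pkl")
print("Model saved to models/wine_model.pkl")
```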
Open docker/training/Dockerfile. It is nearly complete. Fill in the TODO to set the command that runs the training script.
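The TODO is typically a single `CMD` instruction. Assuming the training script is named `train.py` (a hypothetical name; check the actual filename in docker/training/), it could look like:

```dockerfile
# Run the training script when the container starts
CMD ["python", "train.py"]
```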
Build the training image and run it, mounting a named volume for model storage:
```shell
docker build -t mlip-training -f docker/training/Dockerfile .
docker run --rm -v wine_model_storage:/app/models mlip-training
```

You should see output showing the test accuracy and a message that the model was saved.
What is a named volume? When you use `-v wine_model_storage:/app/models`, Docker creates a named volume called `wine_model_storage` that is managed by Docker. The data in this volume persists even after the container exits.
Create a Docker container that loads the trained model from the shared volume and serves predictions via a Flask API. The server also logs predictions to a bind-mounted directory so you can inspect them from the host.
Open docker/inference/server.py. You need to:
- Load the trained model from the shared volume
- Extract features from the incoming JSON request and run inference
- Log each prediction to a host-mounted log file (`/app/logs/predictions.log`)
The server includes a /health endpoint that reports whether the model file exists — this is useful for debugging volume issues.
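Put together, `server.py` could look roughly like the sketch below (the response fields and log format are illustrative assumptions; follow the TODOs in the starter code). The sketch only defines the app: in the real script you would start it with `app.run(host="0.0.0.0", port=8081)` under an `if __name__ == "__main__":` guard, using whatever port your Dockerfile exposes.

```python
import datetime
import os

import joblib
import numpy as np
from flask import Flask, jsonify, request

# Paths on the container side of the two mounts
MODEL_PATH = "/app/models/wine_model.pkl"   # named volume, shared with training
LOG_PATH = "/app/logs/predictions.log"      # bind mount, visible on the host

app = Flask(__name__)
# Load the model if training has already populated the volume
model = joblib.load(MODEL_PATH) if os.path.exists(MODEL_PATH) else None

@app.route("/health")
def health():
    # Useful for debugging volume issues: is the model file there?
    return jsonify({"model_loaded": os.path.exists(MODEL_PATH)})

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json(silent=True) or {}
    if "input" not in data:
        return jsonify({"error": "expected JSON body with an 'input' key"}), 400
    if model is None:
        return jsonify({"error": "model not available"}), 503
    features = np.asarray(data["input"], dtype=float).reshape(1, -1)
    prediction = int(model.predict(features)[0])
    # Append a timestamped entry to the bind-mounted log file
    with open(LOG_PATH, "a") as log:
        log.write(f"{datetime.datetime.now().isoformat()} "
                  f"input={data['input']} prediction={prediction}\n")
    return jsonify({"prediction": prediction})
```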
Create a new file docker/inference/Dockerfile from scratch. It should:

- Use `python:3.11-slim` as the base image
- Set the working directory to `/app`
- Copy and install dependencies from a requirements file
- Copy `server.py` to the working directory
- Expose port 8081
- Set the command to run `server.py`
HINT: Look at the training Dockerfile for reference. Note that the build context is the project root, so paths should be `docker/inference/...`.
You will also need to create docker/inference/requirements.txt with the necessary packages (Flask, scikit-learn, joblib, numpy).
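Putting the requirements together, the inference Dockerfile could look roughly like this (a sketch, assuming the build context is the project root as the hint above notes; adjust the exposed port if your server listens elsewhere):

```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies first so this layer is cached across code changes
COPY docker/inference/requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the server code into the working directory
COPY docker/inference/server.py .

EXPOSE 8081

CMD ["python", "server.py"]
```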
Create a local directory for logs, then run the inference container with both a named volume (for the model) and a bind mount (for logs):

```shell
mkdir -p ./logs
docker build -t mlip-inference -f docker/inference/Dockerfile .
docker run --rm -p 8081:8081 \
  -v wine_model_storage:/app/models \
  -v $(pwd)/logs:/app/logs \
  mlip-inference
```

Notice the two `-v` flags:

- `wine_model_storage:/app/models`: a named volume (Docker-managed, shared with training)
- `$(pwd)/logs:/app/logs`: a bind mount (maps your local `./logs/` directory into the container)
Check the health endpoint:

```shell
curl http://localhost:8081/health
```

Send a prediction request (13 Wine features):

```shell
curl -X POST http://localhost:8081/predict \
  -H 'Content-Type: application/json' \
  -d '{"input": [13.2, 1.78, 2.14, 11.2, 100, 2.65, 2.76, 0.26, 1.28, 4.38, 1.05, 3.40, 1050]}'
```

Test error handling with a bad request:

```shell
curl -X POST http://localhost:8081/predict \
  -H 'Content-Type: application/json' \
  -d '{"bad_key": [1,2,3]}'
```

After sending predictions, check your local `./logs/` directory; you should see a `predictions.log` file with timestamped entries. This is the bind mount in action: the container writes to `/app/logs/` and the file appears on your host filesystem.
Docker Compose allows you to define and manage multi-container applications without long command-line parameters. Complete the docker-compose.yml file to set up both services.
You need to fill in:

- Build context and Dockerfile path for each service
- Named volume `wine_model_storage` mounted to `/app/models` on both services (for sharing the model)
- Bind mount `./logs` mapped to `/app/logs` on the inference service (for prediction logs)
- Port mapping for the inference service
- Named volume definition in the `volumes:` section at the bottom
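One possible shape for the completed file (a sketch: the service names, container port, and `depends_on` condition here are assumptions, so match them to your own setup):

```yaml
services:
  training:
    build:
      context: .
      dockerfile: docker/training/Dockerfile
    volumes:
      - wine_model_storage:/app/models

  inference:
    build:
      context: .
      dockerfile: docker/inference/Dockerfile
    ports:
      - "8081:8081"   # host:container; use the port your server listens on
    volumes:
      - wine_model_storage:/app/models   # named volume shared with training
      - ./logs:/app/logs                 # bind mount for prediction logs
    depends_on:
      training:
        condition: service_completed_successfully  # wait for training to finish

volumes:
  wine_model_storage:
```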
Then run:

```shell
docker compose up --build
```

After both services start, test with the same curl commands from Step 2d. Verify that:

- The prediction endpoint returns a valid wine class
- The `./logs/predictions.log` file on your host is being written to
Shut it down:

```shell
docker compose down
```

This step demonstrates the differences in how Docker manages data persistence and host-container sharing.
Check where Docker physically stores your model on the host:
```shell
docker volume ls
docker volume inspect wine_model_storage
```

Observe the `Mountpoint` field; this is the Docker-managed path on your machine.
Verify that the trained model persists across training container destruction:
Start both containers to run training and inference:

```shell
docker compose up --build
```

Confirm the service is healthy: `curl http://localhost:8081/health`

Stop and remove the containers while retaining the named volume:

```shell
docker compose down
```

Restart only the inference container:

```shell
docker compose up inference
```
Run the health check again and confirm it still returns healthy, demonstrating that the model was successfully persisted in the named volume.
To fully reset the environment and delete the model:

```shell
docker compose down -v
```

The `-v` flag removes the named volume. Now start the inference container again, check the health endpoint once more, and discuss your results with the TA.
If you encounter issues:
- Check Docker daemon status
- Verify port availability (is port 8081 already in use?)
- Review service logs with `docker compose logs`
- Ensure the training service completes before the inference service starts
- Use `docker compose exec` to inspect container file systems
- If the model file is missing, check that your named volume is correctly mounted with `docker volume ls`
- If logs are not appearing on the host, verify your bind mount path