25 changes: 25 additions & 0 deletions Dockerfile
@@ -111,6 +111,27 @@ RUN curl -LsSf https://astral.sh/uv/install.sh | sh \

RUN /opt/vllm-env/bin/python -c "import vllm; print(vllm.__version__)" > /opt/vllm-env/version

# --- vLLM ROCm variant ---
FROM llamacpp AS vllm-rocm

ARG VLLM_VERSION=0.15.1

USER root

RUN apt update && apt install -y python3 python3-venv python3-dev curl ca-certificates build-essential && rm -rf /var/lib/apt/lists/*

RUN mkdir -p /opt/vllm-env && chown -R modelrunner:modelrunner /opt/vllm-env

USER modelrunner

# Install uv and vLLM with ROCm support
RUN curl -LsSf https://astral.sh/uv/install.sh | sh \
&& ~/.local/bin/uv venv --python /usr/bin/python3 --system-site-packages /opt/vllm-env \
&& ~/.local/bin/uv pip install --python /opt/vllm-env/bin/python \
vllm==${VLLM_VERSION} --extra-index-url https://wheels.vllm.ai/rocm/

RUN /opt/vllm-env/bin/python -c "import vllm; print(vllm.__version__)" > /opt/vllm-env/version
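To sanity-check this stage in isolation, the variant can be built and queried directly. This is a hedged sketch: the tag name `model-runner:vllm-rocm-test` is an arbitrary local tag, not one defined by this repository, and the `/opt/vllm-env/version` path is the one written by the `RUN` step above.

```shell
# Build only the ROCm vLLM stage (assumed local tag name).
docker build --target vllm-rocm \
  --build-arg VLLM_VERSION=0.15.1 \
  -t model-runner:vllm-rocm-test .

# Print the vLLM version baked into the venv at build time.
docker run --rm model-runner:vllm-rocm-test cat /opt/vllm-env/version
```

Because the version check runs in its own `RUN` step, a broken ROCm wheel install fails the build rather than surfacing at container start.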
Comment on lines +114 to +133
Contributor

medium
There's significant duplication between this new vllm-rocm stage and the existing vllm stage (lines 86-112). Many of the setup steps, such as installing system dependencies, creating directories, and installing uv, are identical.

To improve maintainability and reduce code duplication, consider creating a common base stage (e.g., vllm-base) from which both vllm and vllm-rocm can inherit. This base stage would handle the common setup, and the specific vllm and vllm-rocm stages would only contain the logic for installing their respective versions of the vllm package.

For example:

# --- vLLM base ---
FROM llamacpp AS vllm-base
USER root
RUN apt update && apt install -y python3 python3-venv python3-dev curl ca-certificates build-essential && rm -rf /var/lib/apt/lists/*
RUN mkdir -p /opt/vllm-env && chown -R modelrunner:modelrunner /opt/vllm-env
USER modelrunner
RUN curl -LsSf https://astral.sh/uv/install.sh | sh

# --- vLLM variant ---
FROM vllm-base AS vllm
# ... vllm specific install
...

# --- vLLM ROCm variant ---
FROM vllm-base AS vllm-rocm
# ... vllm-rocm specific install
...

This would make the Dockerfile cleaner and easier to manage as more variants are added.

Contributor

It's a fair point about the duplication here.


# --- SGLang variant ---
FROM llamacpp AS sglang

@@ -206,6 +227,10 @@ FROM vllm AS final-vllm
# Copy the built binary from builder
COPY --from=builder /app/model-runner /app/model-runner

FROM vllm-rocm AS final-vllm-rocm
# Copy the built binary from builder
COPY --from=builder /app/model-runner /app/model-runner

FROM sglang AS final-sglang
# Copy the built binary from builder-sglang (without vLLM)
COPY --from=builder-sglang /app/model-runner /app/model-runner
18 changes: 17 additions & 1 deletion Makefile
@@ -5,8 +5,10 @@ LLAMA_SERVER_VERSION := latest
LLAMA_SERVER_VARIANT := cpu
BASE_IMAGE := ubuntu:24.04
VLLM_BASE_IMAGE := nvidia/cuda:13.0.2-runtime-ubuntu24.04
VLLM_ROCM_BASE_IMAGE := rocm/pytorch:rocm6.3.4_ubuntu24.04_py3.12_pytorch_release_2.6.0
Contributor

vLLM tends to pull in its own version of everything when installed in the venv, so we might be able to use plain-old Ubuntu for all vLLM images. Llama.cpp is a bit different; we need to install the dependencies ourselves.

Contributor

llama.cpp and vLLM can be on different ROCm versions if need be. That's hard to avoid, and we are moving to one container image per backend model anyway.

DOCKER_IMAGE := docker/model-runner:latest
DOCKER_IMAGE_VLLM := docker/model-runner:latest-vllm-cuda
DOCKER_IMAGE_VLLM_ROCM := docker/model-runner:latest-vllm-rocm
DOCKER_IMAGE_SGLANG := docker/model-runner:latest-sglang
DOCKER_IMAGE_DIFFUSERS := docker/model-runner:latest-diffusers
DOCKER_TARGET ?= final-llamacpp
@@ -26,7 +28,7 @@ DOCKER_BUILD_ARGS := \
BUILD_DMR ?= 1

# Main targets
.PHONY: build run clean test integration-tests test-docker-ce-installation docker-build docker-build-multiplatform docker-run docker-build-vllm docker-run-vllm docker-build-sglang docker-run-sglang docker-run-impl help validate lint docker-build-diffusers docker-run-diffusers
.PHONY: build run clean test integration-tests test-docker-ce-installation docker-build docker-build-multiplatform docker-run docker-build-vllm docker-run-vllm docker-build-vllm-rocm docker-run-vllm-rocm docker-build-sglang docker-run-sglang docker-run-impl help validate lint docker-build-diffusers docker-run-diffusers
# Default target
.DEFAULT_GOAL := build

@@ -106,6 +108,18 @@ docker-build-vllm:
docker-run-vllm: docker-build-vllm
@$(MAKE) -s docker-run-impl DOCKER_IMAGE=$(DOCKER_IMAGE_VLLM)

# Build vLLM ROCm Docker image
docker-build-vllm-rocm:
@$(MAKE) docker-build \
DOCKER_TARGET=final-vllm-rocm \
DOCKER_IMAGE=$(DOCKER_IMAGE_VLLM_ROCM) \
LLAMA_SERVER_VARIANT=rocm \
Contributor

critical

This LLAMA_SERVER_VARIANT value will likely cause the build to fail. The Dockerfile uses this variable to construct the path to the llama.cpp binary inside the llama-server image. Based on the llamacpp/native/rocm.Dockerfile provided in the context, the path includes the ROCm version (e.g., ...rocm7.0.2...), but here you are setting the variant to just rocm.

This will result in a path mismatch and a COPY failure in the Dockerfile.

To fix this, the LLAMA_SERVER_VARIANT should include the ROCm version. Assuming the version is 7.0.2 as per rocm.Dockerfile, this line should be updated.

		LLAMA_SERVER_VARIANT=rocm7.0.2 \

BASE_IMAGE=$(VLLM_ROCM_BASE_IMAGE)

# Run vLLM ROCm Docker container with TCP port access and mounted model storage
docker-run-vllm-rocm: docker-build-vllm-rocm
@$(MAKE) -s docker-run-impl DOCKER_IMAGE=$(DOCKER_IMAGE_VLLM_ROCM)
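Once the targets above land, the ROCm variant follows the same workflow as the existing vLLM target. A minimal usage sketch, assuming the Makefile defaults shown in this diff (`docker-run-vllm-rocm` depends on `docker-build-vllm-rocm`, so the run target alone suffices):

```shell
# Build the ROCm vLLM image explicitly, or let the run target do it.
make docker-build-vllm-rocm

# Build (if needed) and run the container via the shared docker-run-impl.
make docker-run-vllm-rocm
```

Overrides such as `DOCKER_IMAGE_VLLM_ROCM=myrepo/model-runner:rocm` can be passed on the `make` command line in the usual way, since the variables are defined with `:=` at the top of the Makefile.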

# Build SGLang Docker image
docker-build-sglang:
@$(MAKE) docker-build \
@@ -160,6 +174,8 @@ help:
@echo " docker-run - Run in Docker container with TCP port access and mounted model storage"
@echo " docker-build-vllm - Build vLLM Docker image"
@echo " docker-run-vllm - Run vLLM Docker container"
@echo " docker-build-vllm-rocm - Build vLLM ROCm Docker image"
@echo " docker-run-vllm-rocm - Run vLLM ROCm Docker container"
@echo " docker-build-sglang - Build SGLang Docker image"
@echo " docker-run-sglang - Run SGLang Docker container"
@echo " docker-build-diffusers - Build Diffusers Docker image"