From 90c8b1aaae7585563f9b5c623188812b6ac1197b Mon Sep 17 00:00:00 2001
From: Eric Curtin
Date: Fri, 6 Feb 2026 15:11:39 +0000
Subject: [PATCH] Add Hugging Face support to Docker Model Runner docs

Docker Model Runner now supports pulling models from Hugging Face in
addition to Docker Hub and OCI-compliant registries. Documentation
updated to reflect this new capability across the manual pages.

Signed-off-by: Eric Curtin
---
 content/manuals/ai/model-runner/_index.md | 18 ++++++++++--------
 1 file changed, 10 insertions(+), 8 deletions(-)

diff --git a/content/manuals/ai/model-runner/_index.md b/content/manuals/ai/model-runner/_index.md
index f65a76613f7f..cf1099480995 100644
--- a/content/manuals/ai/model-runner/_index.md
+++ b/content/manuals/ai/model-runner/_index.md
@@ -17,8 +17,8 @@ aliases:
 
 Docker Model Runner (DMR) makes it easy to manage, run, and deploy AI models using Docker. Designed
 for developers, Docker Model Runner streamlines the process of pulling, running, and serving
-large language models (LLMs) and other AI models directly from Docker Hub or any
-OCI-compliant registry.
+large language models (LLMs) and other AI models directly from Docker Hub,
+any OCI-compliant registry, or [Hugging Face](https://huggingface.co/).
 
 With seamless integration into Docker Desktop and Docker Engine, you can
 serve models via OpenAI and Ollama-compatible APIs, package GGUF files as
@@ -32,7 +32,8 @@ with AI models locally.
 
 ## Key features
 
-- [Pull and push models to and from Docker Hub](https://hub.docker.com/u/ai)
+- [Pull and push models to and from Docker Hub or any OCI-compliant registry](https://hub.docker.com/u/ai)
+- [Pull models from Hugging Face](https://huggingface.co/)
 - Serve models on [OpenAI and Ollama-compatible APIs](api-reference.md) for easy integration with existing apps
 - Support for [llama.cpp, vLLM, and Diffusers inference engines](inference-engines.md) (vLLM and Diffusers on Linux with NVIDIA GPUs)
 - [Generate images from text prompts](inference-engines.md#diffusers) using Stable Diffusion models with the Diffusers backend
@@ -81,11 +82,12 @@ Docker Engine only:
 
 ## How Docker Model Runner works
 
-Models are pulled from Docker Hub the first time you use them and are stored
-locally. They load into memory only at runtime when a request is made, and
-unload when not in use to optimize resources. Because models can be large, the
-initial pull may take some time. After that, they're cached locally for faster
-access. You can interact with the model using
+Models are pulled from Docker Hub, an OCI-compliant registry, or
+[Hugging Face](https://huggingface.co/) the first time you use them and are
+stored locally. They load into memory only at runtime when a request is made,
+and unload when not in use to optimize resources. Because models can be large,
+the initial pull may take some time. After that, they're cached locally for
+faster access. You can interact with the model using
 [OpenAI and Ollama-compatible APIs](api-reference.md).
 
 ### Inference engines
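
For reviewers who want to exercise the behavior this patch documents, here is a minimal sketch of pulling a model from Hugging Face with the Docker Model Runner CLI. The `hf.co/` source prefix is how DMR addresses Hugging Face repositories, and the specific repository name below is an illustrative assumption, not taken from this patch:

```console
# Pull a GGUF model directly from Hugging Face
# (repository name is a hypothetical example)
$ docker model pull hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF

# The model is cached locally after the first pull; run it with a prompt
$ docker model run hf.co/bartowski/Llama-3.2-1B-Instruct-GGUF "Give me a fact about whales."
```

As the updated "How Docker Model Runner works" section notes, the model loads into memory only when a request is made and unloads when idle, so repeated runs reuse the local cache rather than re-pulling from Hugging Face.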