Docker chose OCI Artifacts for AI model packaging. This post covers the ModelPack standard and the technical reasoning behind that choice: a way to treat models like container images.
[01] Background — 4 Eras of Infrastructure Evolution
graph LR
HW["1. Hardware-centric"] --> VM["2. Virtual Machines"]
VM --> CT["3. Containers (Docker / K8s)"]
CT --> AI["4. AI Model-centric (now)"]
style HW fill:#f5f5f5,stroke:#616161
style VM fill:#e3f2fd,stroke:#1565c0
style CT fill:#fff3e0,stroke:#e65100
style AI fill:#e8f5e9,stroke:#2e7d32
In the current era, developers face these problems when deploying models:
| Problem | Description |
| --- | --- |
| Non-standard storage | Scattered across Hugging Face, S3, custom servers |
| Weight file formats | Mixed .bin, .safetensors, .gguf, etc. |
| Environment drift | Loose coupling between model, code, and metadata |
| Vendor lock-in | Distribution tied to specific platforms |
[02] ModelPack — “Docker for AI Models”
ModelPack is a vendor-neutral open-source standard defining AI model packaging conventions on top of OCI Artifacts.
2-1. Docker Container vs ModelPack
| Aspect | Docker Container | ModelPack |
| --- | --- | --- |
| Packaging target | Application + runtime | Model weights + metadata + inference code |
| Format | OCI Image | OCI Artifacts |
| Registry | Docker Hub, Harbor, etc. | Same registries reused |
| CLI | docker | modctl |
Key advantage — existing OCI registries (Docker Hub, Harbor, GitHub Packages) work as-is. No dedicated AI infrastructure needed.
2-2. Three Components of ModelPack
| Component | Role | Analogy |
| --- | --- | --- |
| Model Spec | Technical rules based on OCI image spec (manifest, config layer, data layer) | OCI Image Spec |
| Modelfile | Metadata and file mapping definition (NAME, ARCH, FAMILY, PARAMSIZE, FORMAT) | Dockerfile |
| modctl | CLI tool (build, push, pull, extract) | docker CLI |
2-3. Efficiency Design
ModelPack separates model weights, inference code, and metadata into distinct layers:

    Layer 1: model weights (large, rarely changes)
    Layer 2: inference code (small, changes often)
    Layer 3: metadata (config, license)

When code changes, you don’t re-transfer huge weight files — maximizing layer caching effects.
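
To make the caching effect concrete, here is a minimal sketch, not ModelPack's actual implementation. It assumes each layer is identified by its SHA-256 content digest, as OCI blobs are, and that a client downloads only the digests it does not already hold. The `layers_to_fetch` helper and the byte strings standing in for layer contents are hypothetical.

```python
import hashlib


def digest(blob: bytes) -> str:
    """Return an OCI-style content digest for a blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def layers_to_fetch(remote_layers: list[str], local_blobs: set[str]) -> list[str]:
    """Return the digests that are missing from the local blob cache."""
    return [d for d in remote_layers if d not in local_blobs]


# Stand-ins for real layer contents (hypothetical data).
weights = b"\x00" * 1024                 # large, rarely changes
code_v1 = b"print('infer v1')"           # small, changes often
code_v2 = b"print('infer v2')"
metadata = b'{"license": "apache-2.0"}'

v1 = [digest(weights), digest(code_v1), digest(metadata)]
v2 = [digest(weights), digest(code_v2), digest(metadata)]

# A client that already pulled v1 holds all three v1 blobs.
cache = set(v1)
missing = layers_to_fetch(v2, cache)

# Only the code layer changed, so only one blob has to be transferred.
print(len(missing))
```

Because the weights digest is unchanged between `v1` and `v2`, the weight blob never appears in `missing`, which is the behavior the layer split is designed to produce.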
[03] OCI Image vs OCI Artifacts — Why Artifacts?
Docker chose OCI Artifacts rather than the OCI Image format, and the reasoning behind that choice is the core of this post.
| Item | OCI Image | OCI Artifacts |
| --- | --- | --- |
| Purpose | Container execution | Arbitrary content storage/distribution |
| Layer format | TAR archive (filesystem) | Free-form (domain-specific) |
| Metadata | Container runtime info | Domain-specific (free JSON) |
| Media type | Fixed (image config, layer) | Custom definition allowed |
3-2. Four Reasons for Choosing Artifacts
graph TD
OCI["Choose OCI Artifacts"] --> R1["① Domain-specific metadata"]
OCI --> R2["② Performance optimization"]
OCI --> R3["③ Separation from inference engine"]
OCI --> R4["④ Clear intent expression"]
style OCI fill:#e3f2fd,stroke:#1565c0
① Domain-Specific Metadata
Model size, parameter count, and quantization info are defined in JSON. A client can download just this small metadata file first to compare and select models before pulling any weights.

    {
      "format": "gguf",
      "quantization": "Q4_K_M",
      "parameters": "7B",
      "architecture": "llama",
      "created": "2026-05-06T...",
      "digests": {...}
    }

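
The metadata-first selection described here can be sketched as follows. This is an illustration under assumptions: the config blobs are taken as already fetched, their fields mirror the example JSON, and the references in `CONFIGS` and the `pick_models` helper are made up for the example.

```python
import json

# Hypothetical config blobs shaped like the example JSON. In practice each
# would be a small file fetched from the registry before any weights.
CONFIGS = {
    "models/llama:7b-q4": '{"format": "gguf", "quantization": "Q4_K_M", "parameters": "7B", "architecture": "llama"}',
    "models/llama:7b-q8": '{"format": "gguf", "quantization": "Q8_0", "parameters": "7B", "architecture": "llama"}',
    "models/llama:70b-q4": '{"format": "gguf", "quantization": "Q4_K_M", "parameters": "70B", "architecture": "llama"}',
}


def params_in_billions(value: str) -> float:
    """Convert a string such as '7B' to 7.0."""
    return float(value.rstrip("Bb"))


def pick_models(configs: dict[str, str], max_params_b: float, quantization: str) -> list[str]:
    """Return the references whose metadata fits the given limits."""
    selected = []
    for ref, raw in configs.items():
        cfg = json.loads(raw)
        if cfg["quantization"] != quantization:
            continue
        if params_in_billions(cfg["parameters"]) > max_params_b:
            continue
        selected.append(ref)
    return selected


matches = pick_models(CONFIGS, max_params_b=13, quantization="Q4_K_M")
print(matches)
```

Only the matching reference would then be pulled in full, so the multi-gigabyte weight files of the rejected variants are never transferred.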
② Performance Optimization

| Optimization | Description |
| --- | --- |
| No compression | Models are high-entropy files, so compression provides almost no benefit |
| Single file format | Memory mapping (mmap) is possible, which speeds up inference startup |
| Deterministic blobs | The same model file always yields the same blob, so registries can deduplicate it |
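
The first two rows can be checked with a small experiment. This is a sketch under a stated assumption: random bytes stand in for quantized weights, which are not perfectly random in reality but are close to incompressible. The file written here is a temporary stand-in, not a real model.

```python
import mmap
import os
import tempfile
import zlib

# Random bytes stand in for model weights (assumption, see above).
weights = os.urandom(1_000_000)
text = b"the quick brown fox jumps over the lazy dog\n" * 20_000

weights_ratio = len(zlib.compress(weights)) / len(weights)
text_ratio = len(zlib.compress(text)) / len(text)
print(f"weights: {weights_ratio:.3f}, text: {text_ratio:.3f}")

# Because the blob is stored as one uncompressed file, it can be memory-mapped
# and sliced without reading the whole file into memory first.
with tempfile.NamedTemporaryFile(delete=False) as f:
    f.write(weights)
    path = f.name

try:
    with open(path, "rb") as f, mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
        chunk = mm[4096:4096 + 16]   # touches only the pages it needs
finally:
    os.remove(path)

print(chunk == weights[4096:4096 + 16])
```

The high-entropy data stays at roughly its original size after compression while the repetitive text shrinks dramatically, which is why spending CPU time on compressing weight layers buys almost nothing.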
③ Separation from Inference Engine
Models and inference engines (llama.cpp, vLLM, etc.) are deployed separately.

    Model (OCI Artifact)
      ↓ pull
    User environment
      ↓
    Engine optimized for the system (installed separately)

Benefits:
- User picks the engine matching their GPU/CPU
- No need to package the model with every possible engine combination
④ Clear Intent Expression
Custom media types state explicitly that the artifact is not an OCI Image:
- Attempts to run it as a container fail
- Makes clear it can’t run without an inference engine
- Prevents confusion and unexpected errors

    application/vnd.docker.ai.model.config.v0.1+json   ← model config
    application/vnd.docker.ai.gguf.v3                  ← GGUF model file
    application/vnd.docker.ai.license                  ← license file

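
A minimal sketch of how a tool could use these media types to refuse to treat a model as a container. The manifests are hand-written for illustration, with placeholder digests and sizes, and `is_container_image` and `run_as_container` are hypothetical helpers rather than any real runtime's API. The one outside fact assumed is that standard OCI images use `application/vnd.oci.image.config.v1+json` as their config media type.

```python
OCI_IMAGE_CONFIG = "application/vnd.oci.image.config.v1+json"
MODEL_CONFIG = "application/vnd.docker.ai.model.config.v0.1+json"

# Hand-written manifests for illustration; digests and sizes are placeholders.
model_manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {"mediaType": MODEL_CONFIG, "digest": "sha256:aaa", "size": 372},
    "layers": [
        {"mediaType": "application/vnd.docker.ai.gguf.v3", "digest": "sha256:bbb", "size": 4_000_000_000},
        {"mediaType": "application/vnd.docker.ai.license", "digest": "sha256:ccc", "size": 11_000},
    ],
}
image_manifest = {
    "schemaVersion": 2,
    "mediaType": "application/vnd.oci.image.manifest.v1+json",
    "config": {"mediaType": OCI_IMAGE_CONFIG, "digest": "sha256:ddd", "size": 1_500},
    "layers": [],
}


def is_container_image(manifest: dict) -> bool:
    """A manifest describes a runnable image only if its config says so."""
    return manifest["config"]["mediaType"] == OCI_IMAGE_CONFIG


def run_as_container(manifest: dict) -> str:
    """Pretend to start a container, rejecting anything that is not an image."""
    if not is_container_image(manifest):
        raise ValueError(f"not a container image: config is {manifest['config']['mediaType']}")
    return "started"


print(run_as_container(image_manifest))
try:
    run_as_container(model_manifest)
except ValueError as err:
    print(err)
```

The failure happens at the first metadata check, before any layer is downloaded or unpacked, so the user gets a clear error instead of a confusing runtime crash.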
4-2. Model Layer Characteristics
| Trait | Description |
| --- | --- |
| Not a filesystem layer | Plain file storage, not an unpacked filesystem |
| Identification | By media type, not filename |
| Usage | Docker Model Runner looks the model up in its model store |
| Item | Example |
| --- | --- |
| Format | gguf, safetensors, onnx |
| Quantization | Q4_K_M, Q8_0, FP16 |
| Parameters | 7B, 13B, 70B |
| Architecture | llama, mistral, qwen |
| Created timestamp | ISO 8601 |
| File digest | SHA256 for integrity |
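
Two of these ideas, finding the model by media type and checking its SHA256 digest, can be sketched together. This is a simplified illustration: the in-memory `store` dictionary is a hypothetical stand-in for a real on-disk model store, and `find_layer` and `verify` are names invented for the example.

```python
import hashlib

GGUF_MEDIA_TYPE = "application/vnd.docker.ai.gguf.v3"

# Stand-in blob store keyed by digest (hypothetical; a real store lives on disk).
blob = b"GGUF" + b"\x00" * 64
blob_digest = "sha256:" + hashlib.sha256(blob).hexdigest()
store = {blob_digest: blob}

layers = [
    {"mediaType": "application/vnd.docker.ai.license", "digest": "sha256:placeholder"},
    {"mediaType": GGUF_MEDIA_TYPE, "digest": blob_digest},
]


def find_layer(layers: list[dict], media_type: str) -> dict:
    """Pick a layer by media type; filenames play no role."""
    for layer in layers:
        if layer["mediaType"] == media_type:
            return layer
    raise LookupError(f"no layer with media type {media_type}")


def verify(data: bytes, expected: str) -> bool:
    """Check that data hashes to the expected 'sha256:<hex>' digest."""
    algo, _, hexdigest = expected.partition(":")
    return algo == "sha256" and hashlib.sha256(data).hexdigest() == hexdigest


layer = find_layer(layers, GGUF_MEDIA_TYPE)
ok = verify(store[layer["digest"]], layer["digest"])
tampered = verify(store[layer["digest"]] + b"x", layer["digest"])
print(ok, tampered)
```

A single appended byte is enough to make verification fail, so a corrupted or modified download is caught before the file reaches an inference engine.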
[05] Real Workflow
5-1. ModelPack — Using modctl

    # 1. Install modctl and prepare the model files
    modctl --version

    # 2. Write a Modelfile (defines metadata and file mapping)
    cat > Modelfile <<'EOF'
    NAME llama-3-8b
    ARCH llama
    FAMILY llama-3
    PARAMSIZE 8B
    FORMAT gguf
    CONFIG ./config.json
    MODEL ./llama-3-8b-q4.gguf
    CODE ./inference.py
    DOC ./README.md
    EOF

    # 3. Build as OCI layers
    modctl build -t harbor.example.com/models/llama-3-8b:q4 .

    # 4. Push to the remote registry
    modctl push harbor.example.com/models/llama-3-8b:q4

    # 5. Pull and extract in the consuming environment
    modctl pull harbor.example.com/models/llama-3-8b:q4
    modctl extract harbor.example.com/models/llama-3-8b:q4 --output ./model/

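
To show what the Modelfile carries, here is a rough sketch that parses the example above into metadata and a file mapping. It is not modctl's parser. It assumes one `DIRECTIVE value` pair per line with `#` comments, and that file directives may repeat; the real grammar may differ.

```python
MODELFILE = """\
NAME llama-3-8b
ARCH llama
FAMILY llama-3
PARAMSIZE 8B
FORMAT gguf
CONFIG ./config.json
MODEL ./llama-3-8b-q4.gguf
CODE ./inference.py
DOC ./README.md
"""

# Directives that map to files (assumed to be repeatable in this sketch).
FILE_DIRECTIVES = {"CONFIG", "MODEL", "CODE", "DOC"}


def parse_modelfile(text: str) -> dict:
    """Split a Modelfile into scalar metadata and per-directive file lists."""
    meta: dict[str, str] = {}
    files: dict[str, list[str]] = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        directive, _, value = line.partition(" ")
        value = value.strip()
        if not value:
            raise ValueError(f"line {lineno}: {directive} needs a value")
        if directive in FILE_DIRECTIVES:
            files.setdefault(directive, []).append(value)
        else:
            meta[directive] = value
    return {"meta": meta, "files": files}


parsed = parse_modelfile(MODELFILE)
print(parsed["meta"]["PARAMSIZE"], parsed["files"]["MODEL"])
```

The split mirrors the layer design: the file directives point at content that becomes separate layers, while the scalar directives end up as searchable metadata.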
5-2. Using Docker Model Runner
Docker also provides its own tool, Docker Model Runner.

    # Pull a model from Hugging Face; Docker Model Runner stores it as an OCI Artifact
    docker model pull hf.co/meta-llama/Llama-3-8B-Instruct
    # Run the LLM locally
    docker model run hf.co/meta-llama/Llama-3-8B-Instruct

[06] Enterprise Benefits
| Area | Benefit |
| --- | --- |
| Reuse DevOps infra | Existing Docker Hub, Artifactory, Harbor work as-is |
| Security | Registry Access Management (RAM) policy-based access control |
| Versioning | Use OCI tag system directly |
| Cloud-native | Deep integration with containerd, Kubernetes |
| First-class object | Treat models as first-class citizens in cloud-native environments |
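
As a small illustration of the versioning row, a model reference splits into the same registry, repository, and tag parts as an image reference. This is a deliberately simplified sketch: it ignores digest references (`@sha256:...`), requires an explicit registry host, and defaults a missing tag to `latest` as Docker does. `parse_reference` is a hypothetical helper.

```python
from typing import NamedTuple


class Reference(NamedTuple):
    registry: str
    repository: str
    tag: str


def parse_reference(ref: str) -> Reference:
    """Split '<registry>/<repository>[:tag]' into its parts."""
    registry, sep, rest = ref.partition("/")
    if not sep or not rest:
        raise ValueError("expected <registry>/<repository>[:tag]")
    repository, sep, tag = rest.rpartition(":")
    if not sep:
        # No tag given: fall back to the conventional default.
        repository, tag = rest, "latest"
    return Reference(registry, repository, tag)


ref = parse_reference("harbor.example.com/models/llama-3-8b:q4")
print(ref.registry, ref.repository, ref.tag)
```

Using the tag for the quantization variant, as in `:q4`, lets existing tag-based promotion and retention policies apply to models without any model-specific tooling.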
[07] Future Plans
Docker’s announced upcoming features:
| Feature | Description |
| --- | --- |
| Runtime config | Templates, context size, default parameters |
| LoRA adapters | Add fine-tuning per use case |
| Multimodal projectors | Support for vision-language models (VLM) |
| Model index | List of parameter/quantization variants |
| Deeper containerd integration | Manage models at the container runtime level |
| ModelPack interop | Improved compatibility with other standards |
[08] Summary
| Key point | Content |
| --- | --- |
| Goal | Standardize AI model distribution like Docker |
| Choice | OCI Artifacts (not OCI Image) |
| Reason | Domain-specific metadata, no compression, engine separation, clear intent |
| Infra | Reuse existing OCI registries (Docker Hub, Harbor) |
| CLI | ModelPack’s modctl or docker model |
| Format ID | Media type, not filename |
Just as container technology standardized software distribution, the OCI Artifacts-based ModelPack and Docker Model Runner aim to standardize AI model distribution. It is a practical approach that builds on existing cloud infrastructure while ensuring model consistency, portability, and performance.
References