3 files changed: +3 -3 lines changed
@@ -23,7 +23,7 @@ This quickstart guide is intended for engineers familiar with k8s and model serv
 1. **Install the Inference Extension CRDs:**

    ```sh
-   kubectl apply -k https://github.com/kubernetes-sigs/gateway-api-inference-extension/config/crd
+   kubectl apply -f https://github.com/kubernetes-sigs/gateway-api-inference-extension/releases/download/v0.1.0/manifests.yaml
    ```

 1. **Deploy InferenceModel**
@@ -71,7 +71,7 @@
 spec:
   containers:
   - name: inference-gateway-ext-proc
-    image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:main
+    image: us-central1-docker.pkg.dev/k8s-staging-images/gateway-api-inference-extension/epp:v0.1.0
     args:
     - -poolName
     - "vllm-llama2-7b-pool"
@@ -14,7 +14,7 @@
 spec:
   containers:
   - name: lora
-    image: "vllm/vllm-openai:latest"
+    image: "vllm/vllm-openai:0.7.1"
     imagePullPolicy: Always
     command: ["python3", "-m", "vllm.entrypoints.openai.api_server"]
     args: