add helm deploy for EC-RAG #2413
Open: Yongbozzz wants to merge 5 commits into opea-project:main from Yongbozzz:zyb/helm
Commits:
- 82dbc3e add helm deploy for EC-RAG (Yongbozzz)
- 99b4b06 [pre-commit.ci] auto fixes from pre-commit.com hooks (pre-commit-ci[bot])
- 1506349 update dir for CI test (Yongbozzz)
- d2aa12d Merge remote-tracking branch 'origin/zyb/helm' into zyb/helm (Yongbozzz)
- 537cc08 update dir for CI test (Yongbozzz)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| # Copyright (C) 2026 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| apiVersion: v2 | ||
| name: edgecraftrag | ||
| description: Helm chart for EdgeCraftRAG stack | ||
| type: application | ||
| version: 0.1.0 | ||
| appVersion: "25.11" |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| # EdgeCraft RAG Helm Chart | ||
|
|
||
| This document introduces the Helm chart for deploying EdgeCraft RAG (ecrag) on a Kubernetes cluster. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - A running Kubernetes cluster. | ||
| - Helm installed. | ||
| - Required Docker images available in your registry or locally. | ||
|
|
||
| ## Configuration | ||
|
|
||
| Before installing, you should configure the `edgecraftrag/values.yaml` file according to your environment. | ||
|
|
||
| ### Key Configurations | ||
|
|
||
| 1. **Images**: Set the registry and tag for `ecrag` and `vllm`. | ||
| ```yaml | ||
| image: | ||
| ecrag: | ||
| registry: <your-registry> | ||
| tag: <your-tag> | ||
| vllm: | ||
| registry: <your-registry> | ||
| tag: <your-tag> | ||
| ``` | ||
|
|
||
| 2. **Environment Variables**: Configure proxies and host IP. | ||
| ```yaml | ||
| env: | ||
| http_proxy: "http://proxy:port" | ||
| https_proxy: "http://proxy:port" | ||
| HOST_IP: "<node-ip>" | ||
| ``` | ||
|
|
||
| 3. **LLM Settings**: Adjust LLM model paths and parameters. | ||
| ```yaml | ||
| llm: | ||
| LLM_MODEL: "/path/to/model/inside/container" # Ensure this maps to paths.model | ||
| ``` | ||
|
|
||
| 4. **Persistent Paths**: Ensure the host paths exist for mounting. | ||
| ```yaml | ||
| paths: | ||
| model: /home/user/models | ||
| docs: /home/user/docs | ||
| ``` | ||
|
|
||
| ## Installation | ||
|
|
||
| To install the chart, run the following commands (using `edgecraftrag` as the release name): | ||
|
|
||
| ```bash | ||
| cd kubernetes/helm | ||
| helm install edgecraftrag ./ | ||
| ``` | ||
|
|
||
| If several clusters are available, install the chart against a specific one by passing its kubeconfig, for example: | ||
|
|
||
| ```bash | ||
| helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml | ||
| ``` | ||
|
|
||
| ## Verification | ||
|
|
||
| ### Accessing the Web UI | ||
|
|
||
| Once the service is running, you can access the UI via your browser. | ||
|
|
||
| 1. **Identify the Port**: | ||
| Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port. | ||
|
|
||
| 2. **Identify the IP**: | ||
| Use the IP address of the Kubernetes node where the deployment is running. | ||
| * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP. | ||
| * If running on a remote cluster, use that node's IP. | ||
|
|
||
| 3. **Open in Browser**: | ||
| Navigate to `http://<NodeIP>:<NodePort>` | ||
| > Example: `http://192.168.1.5:31234` | ||
|
|
||
| ## Uninstallation | ||
|
|
||
| To uninstall the `edgecraftrag` release: | ||
|
|
||
| ```bash | ||
| helm uninstall edgecraftrag | ||
| ``` | ||
|
|
||
| If several clusters are available, uninstall the chart from a specific one by passing its kubeconfig, for example: | ||
|
|
||
| ```bash | ||
| helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml | ||
| ``` | ||
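A minimal sketch of a pre-flight check for the README's "Persistent Paths" step, assuming the chart mounts the `paths.*` values as `hostPath` volumes, so the directories have to exist on the node that runs the pods. `check_host_paths` and `my-values.yaml` are hypothetical names introduced for illustration and are not part of the chart; the directories in the usage comment are the README's sample values.

```bash
# Hypothetical pre-flight helper (not part of the chart): report which of the
# given host directories are missing. Returns non-zero if any is missing.
check_host_paths() {
  local missing=0 dir
  for dir in "$@"; do
    if [ ! -d "$dir" ]; then
      echo "missing host path: $dir" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage on the target node, assuming the README's sample paths and a
# hypothetical override file my-values.yaml:
#   check_host_paths /home/user/models /home/user/docs \
#     && helm lint ./ -f my-values.yaml \
#     && helm install edgecraftrag ./ -f my-values.yaml
```

Keeping overrides in a separate file passed with `-f` leaves the chart's own `values.yaml` untouched, which may make upgrades easier to review.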
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,94 @@ | ||
| # EdgeCraft RAG Helm Chart | ||
|
|
||
| This document describes how to deploy EdgeCraft RAG (ecrag) on a Kubernetes cluster using the Helm chart. | ||
|
|
||
| ## Prerequisites | ||
|
|
||
| - A running Kubernetes cluster. | ||
| - Helm installed. | ||
| - The required Docker images available in your registry or locally. | ||
|
|
||
| ## Configuration | ||
|
|
||
| Before installing, configure the `edgecraftrag/values.yaml` file for your environment. | ||
|
|
||
| ### Key Configurations | ||
|
|
||
| 1. **Images**: Set the image registry and tag for `ecrag` and `vllm`. | ||
| ```yaml | ||
| image: | ||
| ecrag: | ||
| registry: <your-registry> | ||
| tag: <your-tag> | ||
| vllm: | ||
| registry: <your-registry> | ||
| tag: <your-tag> | ||
| ``` | ||
|
|
||
| 2. **Environment Variables**: Configure proxies and the host IP. | ||
| ```yaml | ||
| env: | ||
| http_proxy: "http://proxy:port" | ||
| https_proxy: "http://proxy:port" | ||
| HOST_IP: "<node-ip>" | ||
| ``` | ||
|
|
||
| 3. **LLM Settings**: Adjust the LLM model path and parameters. | ||
| ```yaml | ||
| llm: | ||
| LLM_MODEL: "/path/to/model/inside/container" # Ensure this path maps to paths.model | ||
| ``` | ||
|
|
||
| 4. **Persistent Paths**: Ensure the host paths to be mounted exist. | ||
| ```yaml | ||
| paths: | ||
| model: /home/user/models | ||
| docs: /home/user/docs | ||
| ``` | ||
|
|
||
| ## Installation | ||
|
|
||
| Use the following commands to install the chart (using `edgecraftrag` as the release name): | ||
|
|
||
| ```bash | ||
| cd kubernetes/helm | ||
| helm install edgecraftrag ./ | ||
| ``` | ||
|
|
||
| If several clusters are available, install the chart with a specific kubeconfig, for example: | ||
|
|
||
| ```bash | ||
| helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml | ||
| ``` | ||
|
|
||
| ## Verification | ||
|
|
||
| ### Accessing the Web UI | ||
|
|
||
| Once the service is running, you can access the UI from your browser. | ||
|
|
||
| 1. **Identify the port**: | ||
| Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port. | ||
|
|
||
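| 2. **Identify the IP**: | ||
| Use the IP address of the Kubernetes node where the deployment is running. | ||
| * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP. | ||
| * If running on a remote cluster, use that node's IP. | ||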
|
|
||
| 3. **Open in a browser**: | ||
| Navigate to `http://<NodeIP>:<NodePort>` | ||
| > Example: `http://192.168.1.5:31234` | ||
|
|
||
| ## Uninstallation | ||
|
|
||
| To uninstall/delete the deployed `edgecraftrag` release: | ||
|
|
||
| ```bash | ||
| helm uninstall edgecraftrag | ||
| ``` | ||
|
|
||
| If several clusters are available, uninstall the chart with a specific kubeconfig, for example: | ||
|
|
||
| ```bash | ||
| helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml | ||
| ``` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,38 @@ | ||
| # Copyright (C) 2026 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| apiVersion: v1 | ||
| kind: ConfigMap | ||
| metadata: | ||
| name: edgecraftrag-env | ||
| data: | ||
| # Common environment variables | ||
| no_proxy: "{{ .Values.env.no_proxy }}" | ||
| http_proxy: "{{ .Values.env.http_proxy }}" | ||
| https_proxy: "{{ .Values.env.https_proxy }}" | ||
| HOST_IP: "{{ .Values.env.HOST_IP }}" | ||
| ENABLE_BENCHMARK: "{{ .Values.env.ENABLE_BENCHMARK }}" | ||
| CHAT_HISTORY_ROUND: "{{ .Values.env.CHAT_HISTORY_ROUND }}" | ||
| METADATA_DATABASE_URL: "{{ .Values.env.METADATA_DATABASE_URL }}" | ||
| MEGA_SERVICE_PORT: "{{ .Values.ports.mega }}" | ||
| PIPELINE_SERVICE_HOST_IP: edgecraftrag-server | ||
| PIPELINE_SERVICE_PORT: "{{ .Values.ports.pipeline }}" | ||
| UI_SERVICE_PORT: "{{ .Values.ports.ui.port }}" | ||
| VLLM_SERVICE_PORT_B60: "{{ .Values.ports.vllm }}" | ||
|
|
||
| # llm-serving-xpu specific environment variables | ||
| LLM_MODEL: "{{ .Values.llm.LLM_MODEL }}" | ||
| DTYPE: "{{ .Values.llm.DTYPE }}" | ||
| ZE_AFFINITY_MASK: "{{ .Values.llm.ZE_AFFINITY_MASK }}" | ||
| ENFORCE_EAGER: "{{ .Values.llm.ENFORCE_EAGER }}" | ||
| TRUST_REMOTE_CODE: "{{ .Values.llm.TRUST_REMOTE_CODE }}" | ||
| DISABLE_SLIDING_WINDOW: "{{ .Values.llm.DISABLE_SLIDING_WINDOW }}" | ||
| GPU_MEMORY_UTIL: "{{ .Values.llm.GPU_MEMORY_UTIL }}" | ||
| NO_ENABLE_PREFIX_CACHING: "{{ .Values.llm.NO_ENABLE_PREFIX_CACHING }}" | ||
| MAX_NUM_BATCHED_TOKENS: "{{ .Values.llm.MAX_NUM_BATCHED_TOKENS }}" | ||
| MAX_MODEL_LEN: "{{ .Values.llm.MAX_MODEL_LEN }}" | ||
| DISABLE_LOG_REQUESTS: "{{ .Values.llm.DISABLE_LOG_REQUESTS }}" | ||
| BLOCK_SIZE: "{{ .Values.llm.BLOCK_SIZE }}" | ||
| QUANTIZATION: "{{ .Values.llm.QUANTIZATION }}" | ||
| TP: "{{ .Values.llm.TP }}" | ||
| DP: "{{ .Values.llm.DP }}" |
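The ConfigMap above wraps every value in quotes, so a key that is left unset in `values.yaml` will most likely render as an empty string rather than fail the install. As a hedged sketch, the filter below reads a rendered manifest on stdin and lists keys whose value is `""`. `find_empty_values` is a hypothetical helper, not part of the chart; `helm template` is the standard Helm command for rendering a chart locally.

```bash
# Hypothetical helper (not part of the chart): read a rendered manifest on
# stdin and print every key whose value is an empty quoted string,
# for example a line such as:   HOST_IP: ""
find_empty_values() {
  awk '/^[ \t]*[A-Za-z_][A-Za-z0-9_]*:[ \t]*""[ \t]*$/ {
         sub(/^[ \t]*/, "")   # drop indentation
         sub(/:.*/, "")       # keep only the key name
         print
       }'
}

# Example usage from the chart directory (needs Helm, no cluster required):
#   helm template edgecraftrag ./ | find_empty_values
```

An empty result does not prove the values are correct, only that none of them rendered as an empty string.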
EdgeCraftRAG/kubernetes/helm/templates/daemonset-edgecraftrag-server.yaml
61 changes: 61 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| # Copyright (C) 2026 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| apiVersion: apps/v1 | ||
| kind: DaemonSet | ||
| metadata: | ||
| name: edgecraftrag-server | ||
| spec: | ||
| selector: | ||
| matchLabels: | ||
| app: edgecraftrag-server | ||
| template: | ||
| metadata: | ||
| labels: | ||
| app: edgecraftrag-server | ||
| spec: | ||
| securityContext: | ||
| runAsUser: 1000 | ||
| runAsGroup: 1000 | ||
| supplementalGroups: | ||
| - {{ .Values.gpu.groups.video }} | ||
| - {{ .Values.gpu.groups.render }} | ||
| containers: | ||
| - name: edgecraftrag-server | ||
| image: "{{ .Values.image.ecrag.registry }}/edgecraftrag-server:{{ .Values.image.ecrag.tag }}" | ||
| imagePullPolicy: IfNotPresent | ||
| envFrom: | ||
| - configMapRef: | ||
| name: edgecraftrag-env | ||
| env: | ||
| - name: PIPELINE_SERVICE_HOST_IP | ||
| value: "0.0.0.0" | ||
| ports: | ||
| - containerPort: {{ .Values.ports.pipeline }} | ||
| volumeMounts: | ||
| - name: model-path | ||
| mountPath: /home/user/models | ||
| - name: docs-path | ||
| mountPath: /home/user/docs | ||
| - name: tmpfile-path | ||
| mountPath: /home/user/ui_cache | ||
| - name: prompt-path | ||
| mountPath: /templates/custom | ||
| - name: dri-device | ||
| mountPath: /dev/dri | ||
| volumes: | ||
| - name: model-path | ||
| hostPath: | ||
| path: "{{ .Values.paths.model }}" | ||
| - name: docs-path | ||
| hostPath: | ||
| path: "{{ .Values.paths.docs }}" | ||
| - name: tmpfile-path | ||
| hostPath: | ||
| path: "{{ .Values.paths.tmpfile }}" | ||
| - name: prompt-path | ||
| hostPath: | ||
| path: "{{ .Values.paths.prompt }}" | ||
| - name: dri-device | ||
| hostPath: | ||
| path: /dev/dri |
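The `supplementalGroups` entries above come from `gpu.groups.video` and `gpu.groups.render`, which need to be the numeric group IDs that own the `/dev/dri` devices on the node. A minimal sketch for looking them up, assuming the node has groups named `video` and `render` in `/etc/group` (common on Linux hosts with GPU drivers, but not guaranteed). `group_gid` is a hypothetical helper, not part of the chart.

```bash
# Hypothetical helper (not part of the chart): print the numeric GID of a
# group, reading /etc/group by default. Returns non-zero if the group is
# not found. The optional second argument selects another group file.
group_gid() {
  local name="$1" file="${2:-/etc/group}"
  # Each line is name:password:gid:members, so field 3 is the GID.
  awk -F: -v g="$name" '$1 == g { print $3; found = 1 } END { exit !found }' "$file"
}

# Example usage, run on the GPU node itself (the GIDs on your workstation
# may differ from the node's):
#   helm install edgecraftrag ./ \
#     --set gpu.groups.video="$(group_gid video)" \
#     --set gpu.groups.render="$(group_gid render)"
```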
EdgeCraftRAG/kubernetes/helm/templates/daemonset-llm-serving-xpu.yaml
61 changes: 61 additions & 0 deletions
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,61 @@ | ||
| # Copyright (C) 2026 Intel Corporation | ||
| # SPDX-License-Identifier: Apache-2.0 | ||
|
|
||
| apiVersion: apps/v1 | ||
| kind: DaemonSet | ||
| metadata: | ||
| name: llm-serving-xpu | ||
| spec: | ||
| selector: | ||
| matchLabels: | ||
| app: llm-serving-xpu | ||
| template: | ||
| metadata: | ||
| labels: | ||
| app: llm-serving-xpu | ||
| spec: | ||
| securityContext: | ||
| runAsUser: 1000 | ||
| runAsGroup: 1000 | ||
| supplementalGroups: | ||
| - {{ .Values.gpu.groups.video }} | ||
| - {{ .Values.gpu.groups.render }} | ||
| containers: | ||
| - name: llm-serving-xpu | ||
| image: "{{ .Values.image.vllm.registry }}/llm-scaler-vllm:{{ .Values.image.vllm.tag }}" | ||
| imagePullPolicy: IfNotPresent | ||
| command: | ||
| - "/bin/bash" | ||
| - "-c" | ||
| - "cd /workspace/vllm/models && source /opt/intel/oneapi/setvars.sh --force && \ | ||
| VLLM_OFFLOAD_WEIGHTS_BEFORE_QUANT=1 TORCH_LLM_ALLREDUCE=1 VLLM_USE_V1=1 \ | ||
| CCL_ZE_IPC_EXCHANGE=pidfd VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_WORKER_MULTIPROC_METHOD=spawn \ | ||
| python3 -m vllm.entrypoints.openai.api_server \ | ||
| --model $LLM_MODEL --dtype $DTYPE --enforce-eager --port $VLLM_SERVICE_PORT_B60 \ | ||
| --trust-remote-code --disable-sliding-window --gpu-memory-util $GPU_MEMORY_UTIL \ | ||
| --no-enable-prefix-caching --max-num-batched-tokens $MAX_NUM_BATCHED_TOKENS \ | ||
| --disable-log-requests --max-model-len $MAX_MODEL_LEN --block-size $BLOCK_SIZE \ | ||
| --quantization $QUANTIZATION -tp=$TP -dp=$DP" | ||
| envFrom: | ||
| - configMapRef: | ||
| name: edgecraftrag-env | ||
| ports: | ||
| - containerPort: {{ .Values.ports.vllm }} | ||
| securityContext: | ||
| privileged: true | ||
| volumeMounts: | ||
| - name: model-path | ||
| mountPath: /workspace/vllm/models | ||
| - name: dri-device | ||
| mountPath: /dev/dri | ||
| volumes: | ||
| - name: model-path | ||
| hostPath: | ||
| path: "{{ .Values.paths.model }}" | ||
| - name: dri-device | ||
| hostPath: | ||
| path: /dev/dri | ||
| tolerations: | ||
| - key: "gpu" | ||
| operator: "Exists" | ||
| effect: "NoSchedule" |
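The container above starts vLLM's OpenAI-compatible API server on `$VLLM_SERVICE_PORT_B60`, and model loading can take a while before the port answers. As a hedged sketch, the helper below polls an HTTP URL until it responds successfully. `wait_for_http` is a hypothetical helper, not part of the chart; `<pod-name>` and `<vllm-port>` are placeholders, and the sketch assumes `curl` is installed. `/v1/models` is the model-listing endpoint of vLLM's OpenAI-compatible server.

```bash
# Hypothetical helper (not part of the chart): poll a URL until it returns a
# successful HTTP status, or give up after a number of attempts.
#   $1 = URL, $2 = attempts (default 30), $3 = delay in seconds (default 2)
wait_for_http() {
  local url="$1" attempts="${2:-30}" delay="${3:-2}" i=1
  while [ "$i" -le "$attempts" ]; do
    if curl --silent --fail --max-time 5 --output /dev/null "$url"; then
      return 0
    fi
    sleep "$delay"
    i=$((i + 1))
  done
  return 1
}

# Example usage, assuming the pod label from the DaemonSet above:
#   kubectl get pods -l app=llm-serving-xpu
#   kubectl port-forward pod/<pod-name> <vllm-port>:<vllm-port> &
#   wait_for_http "http://localhost:<vllm-port>/v1/models" && echo "vLLM is up"
```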