diff --git a/EdgeCraftRAG/kubernetes/helm/Chart.yaml b/EdgeCraftRAG/kubernetes/helm/Chart.yaml
new file mode 100644
index 0000000000..ab8d0453f1
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/Chart.yaml
@@ -0,0 +1,9 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v2
+name: edgecraftrag
+description: Helm chart for EdgeCraftRAG stack
+type: application
+version: 0.1.0
+appVersion: "25.11"
diff --git a/EdgeCraftRAG/kubernetes/helm/README.md b/EdgeCraftRAG/kubernetes/helm/README.md
new file mode 100644
index 0000000000..2736571eae
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/README.md
@@ -0,0 +1,94 @@
+# EdgeCraft RAG Helm Chart
+
+This document introduces the Helm chart for deploying EdgeCraft RAG (ecrag) on a Kubernetes cluster.
+
+## Prerequisites
+
+- A running Kubernetes cluster.
+- Helm installed.
+- Required Docker images available in your registry or locally.
+
+## Configuration
+
+Before installing, configure the `edgecraftrag/values.yaml` file according to your environment.
+
+### Key Configurations
+
+1. **Images**: Set the registry and tag for `ecrag` and `vllm`.
+   ```yaml
+   image:
+     ecrag:
+       registry:
+       tag:
+     vllm:
+       registry:
+       tag:
+   ```
+
+2. **Environment Variables**: Configure proxies and host IP.
+   ```yaml
+   env:
+     http_proxy: "http://proxy:port"
+     https_proxy: "http://proxy:port"
+     HOST_IP: ""
+   ```
+
+3. **LLM Settings**: Adjust LLM model paths and parameters.
+   ```yaml
+   llm:
+     LLM_MODEL: "/path/to/model/inside/container" # Ensure this maps to paths.model
+   ```
+
+4. **Persistent Paths**: Ensure the host paths exist for mounting.
+   ```yaml
+   paths:
+     model: /home/user/models
+     docs: /home/user/docs
+   ```
+
+## Installation
+
+To install the chart, use the command below (with `edgecraftrag` as the example release name):
+
+```bash
+cd kubernetes/helm
+helm install edgecraftrag ./
+```
+
+If multiple clusters are available, install the chart with a specific kubeconfig, e.g.
:
+
+```bash
+helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml
+```
+
+## Verification
+
+### Accessing the Web UI
+
+Once the service is running, you can access the UI via your browser.
+
+1. **Identify the Port**:
+   Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port.
+
+2. **Identify the IP**:
+   Use the IP address of the Kubernetes node where the deployment is running.
+   * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP.
+   * If running on a remote cluster, use that node's IP.
+
+3. **Open in Browser**:
+   Navigate to `http://<node-ip>:<nodePort>`
+   > Example: `http://192.168.1.5:31234`
+
+## Uninstallation
+
+To uninstall/delete the `edgecraftrag` deployment:
+
+```bash
+helm uninstall edgecraftrag
+```
+
+If multiple clusters are available, uninstall the chart with a specific kubeconfig, e.g.:
+
+```bash
+helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
+```
diff --git a/EdgeCraftRAG/kubernetes/helm/README_zh.md b/EdgeCraftRAG/kubernetes/helm/README_zh.md
new file mode 100644
index 0000000000..cb697cc53c
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/README_zh.md
@@ -0,0 +1,94 @@
+# EdgeCraft RAG Helm Chart
+
+此文档将为您介绍如何使用Helm chart在Kubernetes集群上部署EdgeCraft RAG (ecrag)。
+
+## 前置条件
+
+- 您需要一个运行中的Kubernetes集群。
+- 您需要已经安装Helm。
+- 所需的Docker镜像已在您的镜像仓库或本地可用。
+
+## 配置
+
+安装前,请根据您的环境配置 `edgecraftrag/values.yaml` 文件。
+
+### 关键配置
+
+1. **镜像**:设置 `ecrag` 和 `vllm` 的镜像仓库和标签。
+   ```yaml
+   image:
+     ecrag:
+       registry:
+       tag:
+     vllm:
+       registry:
+       tag:
+   ```
+
+2. **环境变量**:配置代理和主机IP。
+   ```yaml
+   env:
+     http_proxy: "http://proxy:port"
+     https_proxy: "http://proxy:port"
+     HOST_IP: ""
+   ```
+
+3. **LLM设置**:调整LLM模型路径和参数。
+   ```yaml
+   llm:
+     LLM_MODEL: "/path/to/model/inside/container" # 确保此路径映射到 paths.model
+   ```
+
+4.
**持久化路径**:确保主机挂载路径存在。
+   ```yaml
+   paths:
+     model: /home/user/models
+     docs: /home/user/docs
+   ```
+
+## 安装
+
+请使用如下命令安装helm(以`edgecraftrag`作为发布名为例):
+
+```bash
+cd kubernetes/helm
+helm install edgecraftrag ./
+```
+
+如果有不同的集群可用,请使用指定的kube config安装chart,例如:
+
+```bash
+helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml
+```
+
+## 验证
+
+### 访问Web界面
+
+服务运行后,您可以通过浏览器访问UI。
+
+1. **确认端口**:
+   查看 `edgecraftrag/values.yaml` 文件中配置的 `nodePort`。这是外部访问端口。
+
+2. **确认IP**:
+   使用部署所运行的Kubernetes节点的IP地址。
+   * 如果在本地机器运行(如MicroK8s),使用 `localhost` 或您机器的局域网IP。
+   * 如果在远程集群运行,使用该节点的IP。
+
+3. **在浏览器中打开**:
+   访问 `http://<node-ip>:<nodePort>`
+   > 示例:`http://192.168.1.5:31234`
+
+## 卸载
+
+卸载/删除部署的`edgecraftrag`:
+
+```bash
+helm uninstall edgecraftrag
+```
+
+如果有不同的集群可用,请使用指定的kube config卸载chart,例如:
+
+```bash
+helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
+```
diff --git a/EdgeCraftRAG/kubernetes/helm/templates/configmap-env.yaml b/EdgeCraftRAG/kubernetes/helm/templates/configmap-env.yaml
new file mode 100644
index 0000000000..a873ca785f
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/configmap-env.yaml
@@ -0,0 +1,38 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: ConfigMap
+metadata:
+  name: edgecraftrag-env
+data:
+  # Common environment variables
+  no_proxy: "{{ .Values.env.no_proxy }}"
+  http_proxy: "{{ .Values.env.http_proxy }}"
+  https_proxy: "{{ .Values.env.https_proxy }}"
+  HOST_IP: "{{ .Values.env.HOST_IP }}"
+  ENABLE_BENCHMARK: "{{ .Values.env.ENABLE_BENCHMARK }}"
+  CHAT_HISTORY_ROUND: "{{ .Values.env.CHAT_HISTORY_ROUND }}"
+  METADATA_DATABASE_URL: "{{ .Values.env.METADATA_DATABASE_URL }}"
+  MEGA_SERVICE_PORT: "{{ .Values.ports.mega }}"
+  PIPELINE_SERVICE_HOST_IP: edgecraftrag-server
+  PIPELINE_SERVICE_PORT: "{{ .Values.ports.pipeline }}"
+  UI_SERVICE_PORT: "{{ .Values.ports.ui.port }}"
+  VLLM_SERVICE_PORT_B60: "{{ .Values.ports.vllm }}"
+
+  # llm-serving-xpu specific environment variables
+  
LLM_MODEL: "{{ .Values.llm.LLM_MODEL }}"
+  DTYPE: "{{ .Values.llm.DTYPE }}"
+  ZE_AFFINITY_MASK: "{{ .Values.llm.ZE_AFFINITY_MASK }}"
+  ENFORCE_EAGER: "{{ .Values.llm.ENFORCE_EAGER }}"
+  TRUST_REMOTE_CODE: "{{ .Values.llm.TRUST_REMOTE_CODE }}"
+  DISABLE_SLIDING_WINDOW: "{{ .Values.llm.DISABLE_SLIDING_WINDOW }}"
+  GPU_MEMORY_UTIL: "{{ .Values.llm.GPU_MEMORY_UTIL }}"
+  NO_ENABLE_PREFIX_CACHING: "{{ .Values.llm.NO_ENABLE_PREFIX_CACHING }}"
+  MAX_NUM_BATCHED_TOKENS: "{{ .Values.llm.MAX_NUM_BATCHED_TOKENS }}"
+  MAX_MODEL_LEN: "{{ .Values.llm.MAX_MODEL_LEN }}"
+  DISABLE_LOG_REQUESTS: "{{ .Values.llm.DISABLE_LOG_REQUESTS }}"
+  BLOCK_SIZE: "{{ .Values.llm.BLOCK_SIZE }}"
+  QUANTIZATION: "{{ .Values.llm.QUANTIZATION }}"
+  TP: "{{ .Values.llm.TP }}"
+  DP: "{{ .Values.llm.DP }}"
diff --git a/EdgeCraftRAG/kubernetes/helm/templates/daemonset-edgecraftrag-server.yaml b/EdgeCraftRAG/kubernetes/helm/templates/daemonset-edgecraftrag-server.yaml
new file mode 100644
index 0000000000..fa219977c8
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/daemonset-edgecraftrag-server.yaml
@@ -0,0 +1,61 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name: edgecraftrag-server
+spec:
+  selector:
+    matchLabels:
+      app: edgecraftrag-server
+  template:
+    metadata:
+      labels:
+        app: edgecraftrag-server
+    spec:
+      securityContext:
+        runAsUser: 1000
+        runAsGroup: 1000
+        supplementalGroups:
+          - {{ .Values.gpu.groups.video }}
+          - {{ .Values.gpu.groups.render }}
+      containers:
+        - name: edgecraftrag-server
+          image: "{{ .Values.image.ecrag.registry }}/edgecraftrag-server:{{ .Values.image.ecrag.tag }}"
+          imagePullPolicy: IfNotPresent
+          envFrom:
+            - configMapRef:
+                name: edgecraftrag-env
+          env:
+            - name: PIPELINE_SERVICE_HOST_IP
+              value: "0.0.0.0"
+          ports:
+            - containerPort: {{ .Values.ports.pipeline }}
+          volumeMounts:
+            - name: model-path
+              mountPath: /home/user/models
+            - name: docs-path
+              mountPath: /home/user/docs
+            - name: tmpfile-path
+              mountPath: /home/user/ui_cache
+            - name: prompt-path
+              mountPath: /templates/custom
+            - name: dri-device
+              mountPath: /dev/dri
+      volumes:
+        - name: model-path
+          hostPath:
+            path: "{{ .Values.paths.model }}"
+        - name: docs-path
+          hostPath:
+            path: "{{ .Values.paths.docs }}"
+        - name: tmpfile-path
+          hostPath:
+            path: "{{ .Values.paths.tmpfile }}"
+        - name: prompt-path
+          hostPath:
+            path: "{{ .Values.paths.prompt }}"
+        - name: dri-device
+          hostPath:
+            path: /dev/dri
diff --git a/EdgeCraftRAG/kubernetes/helm/templates/daemonset-llm-serving-xpu.yaml b/EdgeCraftRAG/kubernetes/helm/templates/daemonset-llm-serving-xpu.yaml
new file mode 100644
index 0000000000..2e3095cf7d
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/daemonset-llm-serving-xpu.yaml
@@ -0,0 +1,61 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: DaemonSet
+metadata:
+  name: llm-serving-xpu
+spec:
+  selector:
+    matchLabels:
+      app: llm-serving-xpu
+  template:
+    metadata:
+      labels:
+        app: llm-serving-xpu
+    spec:
+      securityContext:
+        runAsUser: 1000
+        runAsGroup: 1000
+        supplementalGroups:
+          - {{ .Values.gpu.groups.video }}
+          - {{ .Values.gpu.groups.render }}
+      containers:
+        - name: llm-serving-xpu
+          image: "{{ .Values.image.vllm.registry }}/llm-scaler-vllm:{{ .Values.image.vllm.tag }}"
+          imagePullPolicy: IfNotPresent
+          command:
+            - "/bin/bash"
+            - "-c"
+            - "cd /workspace/vllm/models && source /opt/intel/oneapi/setvars.sh --force && \
+              VLLM_OFFLOAD_WEIGHTS_BEFORE_QUANT=1 TORCH_LLM_ALLREDUCE=1 VLLM_USE_V1=1 \
+              CCL_ZE_IPC_EXCHANGE=pidfd VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_WORKER_MULTIPROC_METHOD=spawn \
+              python3 -m vllm.entrypoints.openai.api_server \
+              --model $LLM_MODEL --dtype $DTYPE --enforce-eager --port $VLLM_SERVICE_PORT_B60 \
+              --trust-remote-code --disable-sliding-window --gpu-memory-util $GPU_MEMORY_UTIL \
+              --no-enable-prefix-caching --max-num-batched-tokens $MAX_NUM_BATCHED_TOKENS \
+              --disable-log-requests --max-model-len $MAX_MODEL_LEN --block-size $BLOCK_SIZE \
+              --quantization $QUANTIZATION -tp=$TP -dp=$DP"
+          envFrom:
+            - configMapRef:
+                name: edgecraftrag-env
+          ports:
+            - containerPort: {{ .Values.ports.vllm }}
+          securityContext:
+            privileged: true
+          volumeMounts:
+            - name: model-path
+              mountPath: /workspace/vllm/models
+            - name: dri-device
+              mountPath: /dev/dri
+      volumes:
+        - name: model-path
+          hostPath:
+            path: "{{ .Values.paths.model }}"
+        - name: dri-device
+          hostPath:
+            path: /dev/dri
+      tolerations:
+        - key: "gpu"
+          operator: "Exists"
+          effect: "NoSchedule"
diff --git a/EdgeCraftRAG/kubernetes/helm/templates/deployment-ecrag.yaml b/EdgeCraftRAG/kubernetes/helm/templates/deployment-ecrag.yaml
new file mode 100644
index 0000000000..16679c49ca
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/deployment-ecrag.yaml
@@ -0,0 +1,48 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: ecrag
+spec:
+  replicas: {{ .Values.replica.ecrag }}
+  selector:
+    matchLabels:
+      app: ecrag
+  template:
+    metadata:
+      labels:
+        app: ecrag
+    spec:
+      containers:
+        - name: ecrag
+          image: "{{ .Values.image.ecrag.registry }}/edgecraftrag:{{ .Values.image.ecrag.tag }}"
+          imagePullPolicy: IfNotPresent
+          envFrom:
+            - configMapRef:
+                name: edgecraftrag-env
+          ports:
+            - containerPort: {{ .Values.ports.mega }}
+          volumeMounts:
+            - name: model-path
+              mountPath: /home/user/models
+            - name: docs-path
+              mountPath: /home/user/docs
+            - name: tmpfile-path
+              mountPath: /home/user/ui_cache
+            - name: prompt-path
+              mountPath: /templates/custom
+      volumes:
+        - name: model-path
+          hostPath:
+            path: "{{ .Values.paths.model }}"
+        - name: docs-path
+          hostPath:
+            path: "{{ .Values.paths.docs }}"
+        - name: tmpfile-path
+          hostPath:
+            path: "{{ .Values.paths.tmpfile }}"
+        - name: prompt-path
+          hostPath:
+            path: "{{ .Values.paths.prompt }}"
diff --git
a/EdgeCraftRAG/kubernetes/helm/templates/deployment-edgecraftrag-ui.yaml b/EdgeCraftRAG/kubernetes/helm/templates/deployment-edgecraftrag-ui.yaml
new file mode 100644
index 0000000000..6b720fc6b6
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/deployment-edgecraftrag-ui.yaml
@@ -0,0 +1,48 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: apps/v1
+kind: Deployment
+metadata:
+  name: edgecraftrag-ui
+spec:
+  replicas: {{ .Values.replica.ecrag_ui }}
+  selector:
+    matchLabels:
+      app: edgecraftrag-ui
+  template:
+    metadata:
+      labels:
+        app: edgecraftrag-ui
+    spec:
+      containers:
+        - name: edgecraftrag-ui
+          image: "{{ .Values.image.ecrag.registry }}/edgecraftrag-ui:{{ .Values.image.ecrag.tag }}"
+          imagePullPolicy: IfNotPresent
+          envFrom:
+            - configMapRef:
+                name: edgecraftrag-env
+          ports:
+            - containerPort: {{ .Values.ports.ui.port }}
+          volumeMounts:
+            - name: model-path
+              mountPath: /home/user/models
+            - name: docs-path
+              mountPath: /home/user/docs
+            - name: tmpfile-path
+              mountPath: /home/user/ui_cache
+            - name: prompt-path
+              mountPath: /templates/custom
+      volumes:
+        - name: model-path
+          hostPath:
+            path: "{{ .Values.paths.model }}"
+        - name: docs-path
+          hostPath:
+            path: "{{ .Values.paths.docs }}"
+        - name: tmpfile-path
+          hostPath:
+            path: "{{ .Values.paths.tmpfile }}"
+        - name: prompt-path
+          hostPath:
+            path: "{{ .Values.paths.prompt }}"
diff --git a/EdgeCraftRAG/kubernetes/helm/templates/service-ecrag.yaml b/EdgeCraftRAG/kubernetes/helm/templates/service-ecrag.yaml
new file mode 100644
index 0000000000..66edfce5b5
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/service-ecrag.yaml
@@ -0,0 +1,14 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: ecrag
+spec:
+  selector:
+    app: ecrag
+  ports:
+    - protocol: TCP
+      port: {{ .Values.ports.mega }}
+      targetPort: {{ .Values.ports.mega }}
diff --git
a/EdgeCraftRAG/kubernetes/helm/templates/service-edgecraftrag-server.yaml b/EdgeCraftRAG/kubernetes/helm/templates/service-edgecraftrag-server.yaml
new file mode 100644
index 0000000000..fba8a7f8f2
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/service-edgecraftrag-server.yaml
@@ -0,0 +1,14 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: edgecraftrag-server
+spec:
+  selector:
+    app: edgecraftrag-server
+  ports:
+    - protocol: TCP
+      port: {{ .Values.ports.pipeline }}
+      targetPort: {{ .Values.ports.pipeline }}
diff --git a/EdgeCraftRAG/kubernetes/helm/templates/service-edgecraftrag-ui.yaml b/EdgeCraftRAG/kubernetes/helm/templates/service-edgecraftrag-ui.yaml
new file mode 100644
index 0000000000..c953046423
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/service-edgecraftrag-ui.yaml
@@ -0,0 +1,16 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: edgecraftrag-ui
+spec:
+  type: NodePort
+  selector:
+    app: edgecraftrag-ui
+  ports:
+    - protocol: TCP
+      port: {{ .Values.ports.ui.port }}
+      targetPort: {{ .Values.ports.ui.port }}
+      nodePort: {{ .Values.ports.ui.nodePort }}
diff --git a/EdgeCraftRAG/kubernetes/helm/templates/service-llm-serving-xpu.yaml b/EdgeCraftRAG/kubernetes/helm/templates/service-llm-serving-xpu.yaml
new file mode 100644
index 0000000000..c5e240a87e
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/templates/service-llm-serving-xpu.yaml
@@ -0,0 +1,14 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+apiVersion: v1
+kind: Service
+metadata:
+  name: llm-serving-xpu
+spec:
+  selector:
+    app: llm-serving-xpu
+  ports:
+    - protocol: TCP
+      port: {{ .Values.ports.vllm }}
+      targetPort: {{ .Values.ports.vllm }}
diff --git a/EdgeCraftRAG/kubernetes/helm/values.yaml b/EdgeCraftRAG/kubernetes/helm/values.yaml
new file mode 100644
index
0000000000..b272f4b2e8
--- /dev/null
+++ b/EdgeCraftRAG/kubernetes/helm/values.yaml
@@ -0,0 +1,60 @@
+# Copyright (C) 2026 Intel Corporation
+# SPDX-License-Identifier: Apache-2.0
+
+image:
+  ecrag:
+    registry: opea
+    tag: latest
+  vllm:
+    registry: intel
+    tag: 1.1-preview
+
+replica:
+  ecrag: 1
+  ecrag_ui: 1
+
+env:
+  no_proxy: ""
+  http_proxy: ""
+  https_proxy: ""
+  HOST_IP: ""
+  ENABLE_BENCHMARK: false
+  CHAT_HISTORY_ROUND: 0
+  METADATA_DATABASE_URL: ""
+
+llm:
+  LLM_MODEL: ""
+  DTYPE: float16
+  ZE_AFFINITY_MASK: 0,1
+  ENFORCE_EAGER: 1
+  TRUST_REMOTE_CODE: 1
+  DISABLE_SLIDING_WINDOW: 1
+  GPU_MEMORY_UTIL: 0.9
+  NO_ENABLE_PREFIX_CACHING: 1
+  MAX_NUM_BATCHED_TOKENS: 8192
+  MAX_MODEL_LEN: 49152
+  DISABLE_LOG_REQUESTS: 1
+  BLOCK_SIZE: 64
+  QUANTIZATION: sym_int4
+  TP: 1
+  DP: 1
+
+
+ports:
+  pipeline: 16010
+  mega: 16011
+  ui:
+    port: 8082
+    nodePort: 30082
+  vllm: 8086
+
+paths:
+  model: /home/user/models
+  docs: /home/user/docs
+  tmpfile: /home/user/ui_cache
+  prompt: /templates/custom
+
+gpu:
+  groups:
+    video: 44
+    render: 991
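
The settings that the README lists under Key Configurations could also be collected in a small override file instead of being edited into `values.yaml` directly. The following is only a sketch: the file name `my-values.yaml`, the registry host, the node IP, and the model directory name are placeholders assumed for illustration, while the keys themselves are taken from the chart's `values.yaml` above.

```yaml
# my-values.yaml (hypothetical file name); only the keys being changed need to appear.
image:
  ecrag:
    registry: registry.example.com/opea   # assumed private registry, replace with your own
    tag: latest
env:
  HOST_IP: "192.168.1.5"                  # assumed node IP, replace with your own
llm:
  # The llm-serving-xpu template mounts paths.model at /workspace/vllm/models;
  # "my-model" is a hypothetical directory name under that mount.
  LLM_MODEL: /workspace/vllm/models/my-model
paths:
  model: /home/user/models                # host directory that must already exist
ports:
  ui:
    nodePort: 30082                       # chart default, shown for reference
```

Such a file would be passed at install time with Helm's `-f` flag, for example `helm install edgecraftrag ./ -f my-values.yaml`; values not listed in it keep the defaults from `values.yaml`.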