9 changes: 9 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/Chart.yaml
@@ -0,0 +1,9 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: edgecraftrag
description: Helm chart for EdgeCraftRAG stack
type: application
version: 0.1.0
appVersion: "25.11"
94 changes: 94 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/README.md
@@ -0,0 +1,94 @@
# EdgeCraft RAG Helm Chart

This document introduces the Helm chart for deploying EdgeCraft RAG (ecrag) on a Kubernetes cluster.

## Prerequisites

- A running Kubernetes cluster.
- Helm installed.
- Required Docker images available in your registry or locally.

## Configuration

Before installing, you should configure the `edgecraftrag/values.yaml` file according to your environment.

### Key Configurations

1. **Images**: Set the registry and tag for `ecrag` and `vllm`.
   ```yaml
   image:
     ecrag:
       registry: <your-registry>
       tag: <your-tag>
     vllm:
       registry: <your-registry>
       tag: <your-tag>
   ```

2. **Environment Variables**: Configure proxies and host IP.
   ```yaml
   env:
     http_proxy: "http://proxy:port"
     https_proxy: "http://proxy:port"
     HOST_IP: "<node-ip>"
   ```

3. **LLM Settings**: Adjust LLM model paths and parameters.
   ```yaml
   llm:
     LLM_MODEL: "/path/to/model/inside/container" # Ensure this maps to paths.model
   ```

4. **Persistent Paths**: Ensure the host paths exist for mounting.
   ```yaml
   paths:
     model: /home/user/models
     docs: /home/user/docs
   ```
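Putting the four settings together, a minimal `values.yaml` override could look like the sketch below. All registry names, IPs, and paths are placeholders for your environment, and the chart's templates also reference further keys (e.g. `ports`, `gpu.groups`), so treat this as an illustration rather than a complete file:

```yaml
image:
  ecrag:
    registry: registry.example.com/edgecraftrag   # placeholder
    tag: latest                                   # placeholder
  vllm:
    registry: registry.example.com/vllm           # placeholder
    tag: latest                                   # placeholder

env:
  http_proxy: "http://proxy.example.com:8080"     # placeholder
  https_proxy: "http://proxy.example.com:8080"    # placeholder
  HOST_IP: "192.168.1.5"                          # placeholder node IP

llm:
  LLM_MODEL: "/path/to/model/inside/container"    # must map to paths.model

paths:
  model: /home/user/models
  docs: /home/user/docs
```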

## Installation

To install the chart, use the following command (with `edgecraftrag` as the release name):

```bash
cd kubernetes/helm
helm install edgecraftrag ./
```

If multiple clusters are available, install the chart against a specific kubeconfig, e.g.:

```bash
helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml
```

## Verification

### Accessing the Web UI

Once the service is running, you can access the UI via your browser.

1. **Identify the Port**:
   Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port.

2. **Identify the IP**:
   Use the IP address of the Kubernetes node where the deployment is running.
   * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP.
   * If running on a remote cluster, use that node's IP.

3. **Open in Browser**:
   Navigate to `http://<NodeIP>:<NodePort>`
   > Example: `http://192.168.1.5:31234`
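Beyond opening the browser, the same check can be scripted. The snippet below is a minimal, self-contained Python sketch (not part of the chart); the URL in the commented call is a placeholder you would replace with your node IP and `nodePort`:

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns a 2xx/3xx HTTP response or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return 200 <= resp.status < 400
        except (urllib.error.URLError, OSError):
            # Connection refused / not ready yet; wait and retry.
            time.sleep(interval)
    return False


# Substitute your node IP and the nodePort from values.yaml, e.g.:
# wait_for_http("http://192.168.1.5:31234")
```

The helper retries on connection errors until the timeout expires, which is handy right after `helm install` while pods are still starting.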

## Uninstallation

To uninstall/delete the `edgecraftrag` deployment:

```bash
helm uninstall edgecraftrag
```

If multiple clusters are available, uninstall the chart against a specific kubeconfig, e.g.:

```bash
helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
```
94 changes: 94 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/README_zh.md
@@ -0,0 +1,94 @@
# EdgeCraft RAG Helm Chart

This document describes how to deploy EdgeCraft RAG (ecrag) on a Kubernetes cluster using this Helm chart.

## Prerequisites

- A running Kubernetes cluster.
- Helm installed.
- The required Docker images available in your registry or locally.

## Configuration

Before installing, configure the `edgecraftrag/values.yaml` file for your environment.

### Key Configurations

1. **Images**: Set the registry and tag for `ecrag` and `vllm`.
   ```yaml
   image:
     ecrag:
       registry: <your-registry>
       tag: <your-tag>
     vllm:
       registry: <your-registry>
       tag: <your-tag>
   ```

2. **Environment Variables**: Configure proxies and the host IP.
   ```yaml
   env:
     http_proxy: "http://proxy:port"
     https_proxy: "http://proxy:port"
     HOST_IP: "<node-ip>"
   ```

3. **LLM Settings**: Adjust the LLM model path and parameters.
   ```yaml
   llm:
     LLM_MODEL: "/path/to/model/inside/container" # Ensure this maps to paths.model
   ```

4. **Persistent Paths**: Ensure the host paths to be mounted exist.
   ```yaml
   paths:
     model: /home/user/models
     docs: /home/user/docs
   ```
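Putting the four settings together, a minimal `values.yaml` override could look like the sketch below. All registry names, IPs, and paths are placeholders for your environment, and the chart's templates also reference further keys (e.g. `ports`, `gpu.groups`), so treat this as an illustration rather than a complete file:

```yaml
image:
  ecrag:
    registry: registry.example.com/edgecraftrag   # placeholder
    tag: latest                                   # placeholder
  vllm:
    registry: registry.example.com/vllm           # placeholder
    tag: latest                                   # placeholder

env:
  http_proxy: "http://proxy.example.com:8080"     # placeholder
  https_proxy: "http://proxy.example.com:8080"    # placeholder
  HOST_IP: "192.168.1.5"                          # placeholder node IP

llm:
  LLM_MODEL: "/path/to/model/inside/container"    # must map to paths.model

paths:
  model: /home/user/models
  docs: /home/user/docs
```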

## Installation

Install the chart with the following command (using `edgecraftrag` as the release name):

```bash
cd kubernetes/helm
helm install edgecraftrag ./
```

If multiple clusters are available, install the chart against a specific kubeconfig, e.g.:

```bash
helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml
```

## Verification

### Accessing the Web UI

Once the service is running, you can access the UI in your browser.

1. **Identify the Port**:
   Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port.

2. **Identify the IP**:
   Use the IP address of the Kubernetes node where the deployment is running.
   * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP.
   * If running on a remote cluster, use that node's IP.

3. **Open in Browser**:
   Navigate to `http://<NodeIP>:<NodePort>`
   > Example: `http://192.168.1.5:31234`
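Beyond opening the browser, the same check can be scripted. The snippet below is a minimal, self-contained Python sketch (not part of the chart); the URL in the commented call is a placeholder you would replace with your node IP and `nodePort`:

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns a 2xx/3xx HTTP response or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return 200 <= resp.status < 400
        except (urllib.error.URLError, OSError):
            # Connection refused / not ready yet; wait and retry.
            time.sleep(interval)
    return False


# Substitute your node IP and the nodePort from values.yaml, e.g.:
# wait_for_http("http://192.168.1.5:31234")
```

The helper retries on connection errors until the timeout expires, which is handy right after `helm install` while pods are still starting.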

## Uninstallation

To uninstall/delete the `edgecraftrag` deployment:

```bash
helm uninstall edgecraftrag
```

If multiple clusters are available, uninstall the chart against a specific kubeconfig, e.g.:

```bash
helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
```
38 changes: 38 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/templates/configmap-env.yaml
@@ -0,0 +1,38 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: edgecraftrag-env
data:
  # Common environment variables
  no_proxy: "{{ .Values.env.no_proxy }}"
  http_proxy: "{{ .Values.env.http_proxy }}"
  https_proxy: "{{ .Values.env.https_proxy }}"
  HOST_IP: "{{ .Values.env.HOST_IP }}"
  ENABLE_BENCHMARK: "{{ .Values.env.ENABLE_BENCHMARK }}"
  CHAT_HISTORY_ROUND: "{{ .Values.env.CHAT_HISTORY_ROUND }}"
  METADATA_DATABASE_URL: "{{ .Values.env.METADATA_DATABASE_URL }}"
  MEGA_SERVICE_PORT: "{{ .Values.ports.mega }}"
  PIPELINE_SERVICE_HOST_IP: edgecraftrag-server
  PIPELINE_SERVICE_PORT: "{{ .Values.ports.pipeline }}"
  UI_SERVICE_PORT: "{{ .Values.ports.ui.port }}"
  VLLM_SERVICE_PORT_B60: "{{ .Values.ports.vllm }}"

  # llm-serving-xpu specific environment variables
  LLM_MODEL: "{{ .Values.llm.LLM_MODEL }}"
  DTYPE: "{{ .Values.llm.DTYPE }}"
  ZE_AFFINITY_MASK: "{{ .Values.llm.ZE_AFFINITY_MASK }}"
  ENFORCE_EAGER: "{{ .Values.llm.ENFORCE_EAGER }}"
  TRUST_REMOTE_CODE: "{{ .Values.llm.TRUST_REMOTE_CODE }}"
  DISABLE_SLIDING_WINDOW: "{{ .Values.llm.DISABLE_SLIDING_WINDOW }}"
  GPU_MEMORY_UTIL: "{{ .Values.llm.GPU_MEMORY_UTIL }}"
  NO_ENABLE_PREFIX_CACHING: "{{ .Values.llm.NO_ENABLE_PREFIX_CACHING }}"
  MAX_NUM_BATCHED_TOKENS: "{{ .Values.llm.MAX_NUM_BATCHED_TOKENS }}"
  MAX_MODEL_LEN: "{{ .Values.llm.MAX_MODEL_LEN }}"
  DISABLE_LOG_REQUESTS: "{{ .Values.llm.DISABLE_LOG_REQUESTS }}"
  BLOCK_SIZE: "{{ .Values.llm.BLOCK_SIZE }}"
  QUANTIZATION: "{{ .Values.llm.QUANTIZATION }}"
  TP: "{{ .Values.llm.TP }}"
  DP: "{{ .Values.llm.DP }}"
@@ -0,0 +1,61 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: edgecraftrag-server
spec:
  selector:
    matchLabels:
      app: edgecraftrag-server
  template:
    metadata:
      labels:
        app: edgecraftrag-server
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        supplementalGroups:
          - {{ .Values.gpu.groups.video }}
          - {{ .Values.gpu.groups.render }}
      containers:
        - name: edgecraftrag-server
          image: "{{ .Values.image.ecrag.registry }}/edgecraftrag-server:{{ .Values.image.ecrag.tag }}"
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: edgecraftrag-env
          env:
            - name: PIPELINE_SERVICE_HOST_IP
              value: "0.0.0.0"
          ports:
            - containerPort: {{ .Values.ports.pipeline }}
          volumeMounts:
            - name: model-path
              mountPath: /home/user/models
            - name: docs-path
              mountPath: /home/user/docs
            - name: tmpfile-path
              mountPath: /home/user/ui_cache
            - name: prompt-path
              mountPath: /templates/custom
            - name: dri-device
              mountPath: /dev/dri
      volumes:
        - name: model-path
          hostPath:
            path: "{{ .Values.paths.model }}"
        - name: docs-path
          hostPath:
            path: "{{ .Values.paths.docs }}"
        - name: tmpfile-path
          hostPath:
            path: "{{ .Values.paths.tmpfile }}"
        - name: prompt-path
          hostPath:
            path: "{{ .Values.paths.prompt }}"
        - name: dri-device
          hostPath:
            path: /dev/dri
@@ -0,0 +1,61 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: llm-serving-xpu
spec:
  selector:
    matchLabels:
      app: llm-serving-xpu
  template:
    metadata:
      labels:
        app: llm-serving-xpu
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        supplementalGroups:
          - {{ .Values.gpu.groups.video }}
          - {{ .Values.gpu.groups.render }}
      containers:
        - name: llm-serving-xpu
          image: "{{ .Values.image.vllm.registry }}/llm-scaler-vllm:{{ .Values.image.vllm.tag }}"
          imagePullPolicy: IfNotPresent
          command:
            - "/bin/bash"
            - "-c"
            - "cd /workspace/vllm/models && source /opt/intel/oneapi/setvars.sh --force && \
               VLLM_OFFLOAD_WEIGHTS_BEFORE_QUANT=1 TORCH_LLM_ALLREDUCE=1 VLLM_USE_V1=1 \
               CCL_ZE_IPC_EXCHANGE=pidfd VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_WORKER_MULTIPROC_METHOD=spawn \
               python3 -m vllm.entrypoints.openai.api_server \
               --model $LLM_MODEL --dtype $DTYPE --enforce-eager --port $VLLM_SERVICE_PORT_B60 \
               --trust-remote-code --disable-sliding-window --gpu-memory-util $GPU_MEMORY_UTIL \
               --no-enable-prefix-caching --max-num-batched-tokens $MAX_NUM_BATCHED_TOKENS \
               --disable-log-requests --max-model-len $MAX_MODEL_LEN --block-size $BLOCK_SIZE \
               --quantization $QUANTIZATION -tp=$TP -dp=$DP"
          envFrom:
            - configMapRef:
                name: edgecraftrag-env
          ports:
            - containerPort: {{ .Values.ports.vllm }}
          securityContext:
            privileged: true
          volumeMounts:
            - name: model-path
              mountPath: /workspace/vllm/models
            - name: dri-device
              mountPath: /dev/dri
      volumes:
        - name: model-path
          hostPath:
            path: "{{ .Values.paths.model }}"
        - name: dri-device
          hostPath:
            path: /dev/dri
      tolerations:
        - key: "gpu"
          operator: "Exists"
          effect: "NoSchedule"