9 changes: 9 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/Chart.yaml
@@ -0,0 +1,9 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v2
name: edgecraftrag
description: Helm chart for EdgeCraftRAG stack
type: application
version: 0.1.0
appVersion: "25.11"
94 changes: 94 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/README.md
@@ -0,0 +1,94 @@
# EdgeCraft RAG Helm Chart

This document introduces the Helm chart for deploying EdgeCraft RAG (ecrag) on a Kubernetes cluster.

## Prerequisites

- A running Kubernetes cluster.
- Helm installed.
- Required Docker images available in your registry or locally.

## Configuration

Before installing, you should configure the `edgecraftrag/values.yaml` file according to your environment.

### Key Configurations

1. **Images**: Set the registry and tag for `ecrag` and `vllm`.
   ```yaml
   image:
     ecrag:
       registry: <your-registry>
       tag: <your-tag>
     vllm:
       registry: <your-registry>
       tag: <your-tag>
   ```

2. **Environment Variables**: Configure proxies and host IP.
   ```yaml
   env:
     http_proxy: "http://proxy:port"
     https_proxy: "http://proxy:port"
     HOST_IP: "<node-ip>"
   ```

3. **LLM Settings**: Adjust LLM model paths and parameters.
   ```yaml
   llm:
     LLM_MODEL: "/path/to/model/inside/container" # Ensure this maps to paths.model
   ```

4. **Persistent Paths**: Ensure the host paths exist for mounting.
   ```yaml
   paths:
     model: /home/user/models
     docs: /home/user/docs
   ```
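Putting the four settings together, a minimal `values.yaml` override could look like the sketch below. All registry names, IPs, and paths are placeholders for your environment, and the chart's templates also reference further keys (e.g. `ports`, `gpu.groups`), so treat this as an illustration rather than a complete file:

```yaml
image:
  ecrag:
    registry: registry.example.com/edgecraftrag   # placeholder
    tag: latest                                   # placeholder
  vllm:
    registry: registry.example.com/vllm           # placeholder
    tag: latest                                   # placeholder

env:
  http_proxy: "http://proxy.example.com:8080"     # placeholder
  https_proxy: "http://proxy.example.com:8080"    # placeholder
  HOST_IP: "192.168.1.5"                          # placeholder node IP

llm:
  LLM_MODEL: "/path/to/model/inside/container"    # must map to paths.model

paths:
  model: /home/user/models
  docs: /home/user/docs
```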

## Installation

To install the chart, use the following command (with `edgecraftrag` as the release name):

```bash
cd kubernetes/helm
helm install edgecraftrag ./
```

If multiple clusters are available, install the chart against a specific kubeconfig, e.g.:

```bash
helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml
```

## Verification

### Accessing the Web UI

Once the service is running, you can access the UI via your browser.

1. **Identify the Port**:
   Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port.

2. **Identify the IP**:
   Use the IP address of the Kubernetes node where the deployment is running.
   * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP.
   * If running on a remote cluster, use that node's IP.

3. **Open in Browser**:
   Navigate to `http://<NodeIP>:<NodePort>`
   > Example: `http://192.168.1.5:31234`
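Beyond opening the browser, the same check can be scripted. The snippet below is a minimal, self-contained Python sketch (not part of the chart); the URL in the commented call is a placeholder you would replace with your node IP and `nodePort`:

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns a 2xx/3xx HTTP response or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return 200 <= resp.status < 400
        except (urllib.error.URLError, OSError):
            # Connection refused / not ready yet; wait and retry.
            time.sleep(interval)
    return False


# Substitute your node IP and the nodePort from values.yaml, e.g.:
# wait_for_http("http://192.168.1.5:31234")
```

The helper retries on connection errors until the timeout expires, which is handy right after `helm install` while pods are still starting.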

## Uninstallation

To uninstall/delete the `edgecraftrag` deployment:

```bash
helm uninstall edgecraftrag
```

If multiple clusters are available, uninstall the chart against a specific kubeconfig, e.g.:

```bash
helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
```
94 changes: 94 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/README_zh.md
@@ -0,0 +1,94 @@
# EdgeCraft RAG Helm Chart

This document describes how to deploy EdgeCraft RAG (ecrag) on a Kubernetes cluster using this Helm chart.

## Prerequisites

- A running Kubernetes cluster.
- Helm installed.
- The required Docker images available in your registry or locally.

## Configuration

Before installing, configure the `edgecraftrag/values.yaml` file for your environment.

### Key Configurations

1. **Images**: Set the registry and tag for `ecrag` and `vllm`.
   ```yaml
   image:
     ecrag:
       registry: <your-registry>
       tag: <your-tag>
     vllm:
       registry: <your-registry>
       tag: <your-tag>
   ```

2. **Environment Variables**: Configure proxies and the host IP.
   ```yaml
   env:
     http_proxy: "http://proxy:port"
     https_proxy: "http://proxy:port"
     HOST_IP: "<node-ip>"
   ```

3. **LLM Settings**: Adjust the LLM model path and parameters.
   ```yaml
   llm:
     LLM_MODEL: "/path/to/model/inside/container" # Ensure this maps to paths.model
   ```

4. **Persistent Paths**: Ensure the host paths to be mounted exist.
   ```yaml
   paths:
     model: /home/user/models
     docs: /home/user/docs
   ```
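Putting the four settings together, a minimal `values.yaml` override could look like the sketch below. All registry names, IPs, and paths are placeholders for your environment, and the chart's templates also reference further keys (e.g. `ports`, `gpu.groups`), so treat this as an illustration rather than a complete file:

```yaml
image:
  ecrag:
    registry: registry.example.com/edgecraftrag   # placeholder
    tag: latest                                   # placeholder
  vllm:
    registry: registry.example.com/vllm           # placeholder
    tag: latest                                   # placeholder

env:
  http_proxy: "http://proxy.example.com:8080"     # placeholder
  https_proxy: "http://proxy.example.com:8080"    # placeholder
  HOST_IP: "192.168.1.5"                          # placeholder node IP

llm:
  LLM_MODEL: "/path/to/model/inside/container"    # must map to paths.model

paths:
  model: /home/user/models
  docs: /home/user/docs
```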

## Installation

Install the chart with the following command (using `edgecraftrag` as the release name):

```bash
cd kubernetes/helm
helm install edgecraftrag ./
```

If multiple clusters are available, install the chart against a specific kubeconfig, e.g.:

```bash
helm install edgecraftrag ./ --kubeconfig /home/user/.kube/nas.yaml
```

## Verification

### Accessing the Web UI

Once the service is running, you can access the UI in your browser.

1. **Identify the Port**:
   Check the `nodePort` configured in the `edgecraftrag/values.yaml` file. This is the external access port.

2. **Identify the IP**:
   Use the IP address of the Kubernetes node where the deployment is running.
   * If running on your local machine (e.g., MicroK8s), use `localhost` or your machine's LAN IP.
   * If running on a remote cluster, use that node's IP.

3. **Open in Browser**:
   Navigate to `http://<NodeIP>:<NodePort>`
   > Example: `http://192.168.1.5:31234`
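Beyond opening the browser, the same check can be scripted. The snippet below is a minimal, self-contained Python sketch (not part of the chart); the URL in the commented call is a placeholder you would replace with your node IP and `nodePort`:

```python
import time
import urllib.error
import urllib.request


def wait_for_http(url: str, timeout: float = 60.0, interval: float = 2.0) -> bool:
    """Poll `url` until it returns a 2xx/3xx HTTP response or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as resp:
                return 200 <= resp.status < 400
        except (urllib.error.URLError, OSError):
            # Connection refused / not ready yet; wait and retry.
            time.sleep(interval)
    return False


# Substitute your node IP and the nodePort from values.yaml, e.g.:
# wait_for_http("http://192.168.1.5:31234")
```

The helper retries on connection errors until the timeout expires, which is handy right after `helm install` while pods are still starting.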

## Uninstallation

To uninstall/delete the `edgecraftrag` deployment:

```bash
helm uninstall edgecraftrag
```

If multiple clusters are available, uninstall the chart against a specific kubeconfig, e.g.:

```bash
helm uninstall edgecraftrag --kubeconfig /home/user/.kube/nas.yaml
```
38 changes: 38 additions & 0 deletions EdgeCraftRAG/kubernetes/helm/templates/configmap-env.yaml
@@ -0,0 +1,38 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: v1
kind: ConfigMap
metadata:
  name: edgecraftrag-env
data:
  # Common environment variables
  no_proxy: "{{ .Values.env.no_proxy }}"
  http_proxy: "{{ .Values.env.http_proxy }}"
  https_proxy: "{{ .Values.env.https_proxy }}"
  HOST_IP: "{{ .Values.env.HOST_IP }}"
  ENABLE_BENCHMARK: "{{ .Values.env.ENABLE_BENCHMARK }}"
  CHAT_HISTORY_ROUND: "{{ .Values.env.CHAT_HISTORY_ROUND }}"
  METADATA_DATABASE_URL: "{{ .Values.env.METADATA_DATABASE_URL }}"
  MEGA_SERVICE_PORT: "{{ .Values.ports.mega }}"
  PIPELINE_SERVICE_HOST_IP: edgecraftrag-server
  PIPELINE_SERVICE_PORT: "{{ .Values.ports.pipeline }}"
  UI_SERVICE_PORT: "{{ .Values.ports.ui.port }}"
  VLLM_SERVICE_PORT_B60: "{{ .Values.ports.vllm }}"

  # llm-serving-xpu specific environment variables
  LLM_MODEL: "{{ .Values.llm.LLM_MODEL }}"
  DTYPE: "{{ .Values.llm.DTYPE }}"
  ZE_AFFINITY_MASK: "{{ .Values.llm.ZE_AFFINITY_MASK }}"
  ENFORCE_EAGER: "{{ .Values.llm.ENFORCE_EAGER }}"
  TRUST_REMOTE_CODE: "{{ .Values.llm.TRUST_REMOTE_CODE }}"
  DISABLE_SLIDING_WINDOW: "{{ .Values.llm.DISABLE_SLIDING_WINDOW }}"
  GPU_MEMORY_UTIL: "{{ .Values.llm.GPU_MEMORY_UTIL }}"
  NO_ENABLE_PREFIX_CACHING: "{{ .Values.llm.NO_ENABLE_PREFIX_CACHING }}"
  MAX_NUM_BATCHED_TOKENS: "{{ .Values.llm.MAX_NUM_BATCHED_TOKENS }}"
  MAX_MODEL_LEN: "{{ .Values.llm.MAX_MODEL_LEN }}"
  DISABLE_LOG_REQUESTS: "{{ .Values.llm.DISABLE_LOG_REQUESTS }}"
  BLOCK_SIZE: "{{ .Values.llm.BLOCK_SIZE }}"
  QUANTIZATION: "{{ .Values.llm.QUANTIZATION }}"
  TP: "{{ .Values.llm.TP }}"
  DP: "{{ .Values.llm.DP }}"
@@ -0,0 +1,61 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: edgecraftrag-server
spec:
  selector:
    matchLabels:
      app: edgecraftrag-server
  template:
    metadata:
      labels:
        app: edgecraftrag-server
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        supplementalGroups:
          - {{ .Values.gpu.groups.video }}
          - {{ .Values.gpu.groups.render }}
      containers:
        - name: edgecraftrag-server
          image: "{{ .Values.image.ecrag.registry }}/edgecraftrag-server:{{ .Values.image.ecrag.tag }}"
          imagePullPolicy: IfNotPresent
          envFrom:
            - configMapRef:
                name: edgecraftrag-env
          env:
            - name: PIPELINE_SERVICE_HOST_IP
              value: "0.0.0.0"
          ports:
            - containerPort: {{ .Values.ports.pipeline }}
          volumeMounts:
            - name: model-path
              mountPath: /home/user/models
            - name: docs-path
              mountPath: /home/user/docs
            - name: tmpfile-path
              mountPath: /home/user/ui_cache
            - name: prompt-path
              mountPath: /templates/custom
            - name: dri-device
              mountPath: /dev/dri
      volumes:
        - name: model-path
          hostPath:
            path: "{{ .Values.paths.model }}"
        - name: docs-path
          hostPath:
            path: "{{ .Values.paths.docs }}"
        - name: tmpfile-path
          hostPath:
            path: "{{ .Values.paths.tmpfile }}"
        - name: prompt-path
          hostPath:
            path: "{{ .Values.paths.prompt }}"
        - name: dri-device
          hostPath:
            path: /dev/dri
@@ -0,0 +1,61 @@
# Copyright (C) 2026 Intel Corporation
# SPDX-License-Identifier: Apache-2.0

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: llm-serving-xpu
spec:
  selector:
    matchLabels:
      app: llm-serving-xpu
  template:
    metadata:
      labels:
        app: llm-serving-xpu
    spec:
      securityContext:
        runAsUser: 1000
        runAsGroup: 1000
        supplementalGroups:
          - {{ .Values.gpu.groups.video }}
          - {{ .Values.gpu.groups.render }}
      containers:
        - name: llm-serving-xpu
          image: "{{ .Values.image.vllm.registry }}/llm-scaler-vllm:{{ .Values.image.vllm.tag }}"
          imagePullPolicy: IfNotPresent
          command:
            - "/bin/bash"
            - "-c"
            - "cd /workspace/vllm/models && source /opt/intel/oneapi/setvars.sh --force && \
               VLLM_OFFLOAD_WEIGHTS_BEFORE_QUANT=1 TORCH_LLM_ALLREDUCE=1 VLLM_USE_V1=1 \
               CCL_ZE_IPC_EXCHANGE=pidfd VLLM_ALLOW_LONG_MAX_MODEL_LEN=1 VLLM_WORKER_MULTIPROC_METHOD=spawn \
               python3 -m vllm.entrypoints.openai.api_server \
               --model $LLM_MODEL --dtype $DTYPE --enforce-eager --port $VLLM_SERVICE_PORT_B60 \
               --trust-remote-code --disable-sliding-window --gpu-memory-util $GPU_MEMORY_UTIL \
               --no-enable-prefix-caching --max-num-batched-tokens $MAX_NUM_BATCHED_TOKENS \
               --disable-log-requests --max-model-len $MAX_MODEL_LEN --block-size $BLOCK_SIZE \
               --quantization $QUANTIZATION -tp=$TP -dp=$DP"
          envFrom:
            - configMapRef:
                name: edgecraftrag-env
          ports:
            - containerPort: {{ .Values.ports.vllm }}
          securityContext:
            privileged: true
          volumeMounts:
            - name: model-path
              mountPath: /workspace/vllm/models
            - name: dri-device
              mountPath: /dev/dri
      volumes:
        - name: model-path
          hostPath:
            path: "{{ .Values.paths.model }}"
        - name: dri-device
          hostPath:
            path: /dev/dri
      tolerations:
        - key: "gpu"
          operator: "Exists"
          effect: "NoSchedule"