
Conversation

are-ces (Contributor) commented Dec 18, 2025

Description

  • Added a small dummy RAG DB for e2e tests.
  • Created e2e tests for FAISS RAG (making sure the retrieved chunks come from the docs in the DB).
  • Added newly required deps (pyproject.toml).
  • Updated docs/providers; revoked support for shields, as they are unusable for now.
  • Updated configs in the examples folder; removed old unsupported configs (lls 0.2.x format).

Test:
Ask the agent "What is the title of the article from Paul?"; the expected answer is "great work".

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

NA

Related Tickets & Documents

  • Related Issue # LCORE-603
  • Closes # LCORE-603

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • Added many example run configurations for vLLM/RHEL/RHAIIS/RHOAI and consolidated provider/storage layouts; added FAISS e2e feature spec.
  • Bug Fixes

    • Improved RAG storage persistence by mounting host volume into relevant services.
  • Documentation

    • Updated provider support matrix: llama-guard marked unsupported; rag-runtime marked supported; clarified telemetry/storage config.
  • Chores

    • Added dependency: einops>=0.8.1.


- FAISS e2e test (for all inference providers)
- Disable shields in all configurations
- Dummy rag db
coderabbitai bot commented Dec 18, 2025

Walkthrough

Consolidates and restructures many example and test YAML configs to centralize providers and env-backed storage; adds multiple new vLLM/RHEL/RHOAI example configs; removes several legacy examples; adds an E2E FAISS feature and test entry; adds a RAG volume mount to docker-compose files and a new dependency.

Changes

• Docker Compose RAG Storage (docker-compose.yaml, docker-compose-library.yaml):
  Added bind mount ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z for the llama/lightspeed stack services (a sketch of the compose entry follows this list).
• Provider docs (docs/providers.md):
  Updated provider support table: llama-guard (Safety) marked unsupported; rag-runtime (Tool Runtime) marked supported.
• Large example config restructures (examples/run.yaml, examples/azure-run.yaml, examples/vertexai-run.yaml):
  Consolidated isolated provider blocks into a unified providers section, moved storage paths to env-backed ${env.SQLITE_STORE_DIR} defaults, introduced storage.backends/stores and vector_stores, and adjusted registered_resources/tool_groups.
• Removed example manifests (examples/gemini-run.yaml, examples/openai-faiss-run.yaml, examples/openai-pgvector-run.yaml, examples/vllm-granite-run.yaml, examples/vllm-llama-faiss-run.yaml, examples/vllm-qwen3-run.yaml):
  Deleted legacy example configuration files.
• New example manifests (examples/vllm-rhaiis-run.yaml, examples/vllm-rhelai-run.yaml, examples/vllm-rhoai-run.yaml):
  Added new vLLM/RHEL AI/RHOAI example configs with unified providers, storage backends, vector stores, and registered resources.
• Tests / E2E configs (tests/e2e-prow/rhoai/configs/run.yaml, tests/e2e/configs/run-azure.yaml, tests/e2e/configs/run-ci.yaml, tests/e2e/configs/run-rhaiis.yaml, tests/e2e/configs/run-rhelai.yaml, tests/e2e/configs/run-vertexai.yaml, tests/e2e/configs/*):
  Consolidated providers, switched SQLite paths to env-backed templates, adjusted namespaces/stores, changed the default embedding model to nomic-ai/nomic-embed-text-v1.5, removed llama-guard shields (annotated as disabled), and added/updated vector_store defaults.
• E2E tests and test list (tests/e2e/features/faiss.feature, tests/e2e/test_list.txt):
  Added a FAISS feature with two scenarios (vector_store registration + file_search query) and listed it in the test list.
• pyproject / dependency (pyproject.toml):
  Added dependency einops>=0.8.1.
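
A minimal sketch of what that compose entry looks like; the service name and surrounding keys are assumptions for illustration, only the volume line comes from the diff:

services:
  llama-stack:                    # service name assumed for illustration
    volumes:
      # host test fixture mounted at the container's RAG storage path;
      # the :Z suffix relabels the mount for SELinux
      - ./tests/e2e/rag:/opt/app-root/src/.llama/storage/rag:Z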

Sequence Diagram(s)

(omitted — changes are configuration and file additions/removals without new multi-component control flow requiring a sequence diagram)

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • radofuchs
  • asimurka
  • tisnik
🚥 Pre-merge checks: ✅ 3 passed
• Description Check: ✅ Passed. Check skipped: CodeRabbit's high-level summary is enabled.
• Title Check: ✅ Passed. The PR title accurately describes the main objective: adding full support for the Faiss VectorIO provider. The changes throughout the PR consistently reflect this goal, including new FAISS configurations, e2e tests, and updated example configs.
• Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; skipping the docstring coverage check.




coderabbitai bot left a comment

Actionable comments posted: 7

🧹 Nitpick comments (6)
examples/vllm-rhaiis-run.yaml (1)

33-37: Hardcoded port 8000 in URL.

The URL hardcodes port 8000 (http://${env.RHAIIS_URL}:8000/v1/), unlike the RHEL AI config, which uses ${env.RHEL_AI_PORT}. Consider using an environment variable for the port to maintain deployment flexibility:

url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/
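
For context, a minimal sketch of where this line would sit in the provider block; the surrounding keys follow the remote vLLM provider pattern quoted elsewhere in this review and are assumptions, not the file's exact content:

inference:
- provider_id: vllm
  provider_type: remote::vllm
  config:
    # port is env-driven, defaulting to 8000 to preserve current behavior
    url: http://${env.RHAIIS_URL}:${env.RHAIIS_PORT:=8000}/v1/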
tests/e2e/features/faiss.feature (1)

8-27: Consider adding a note about the port difference.

This scenario correctly uses @skip-in-library-mode since it directly queries the llama-stack service on port 8321. The second scenario (lines 29-39) omits the hostname/port override, so it will use defaults to test through lightspeed-stack. This asymmetry is intentional but worth a brief inline comment for maintainability.

tests/e2e/configs/run-azure.yaml (1)

34-36: Consider parameterizing the Azure API base URL.

The api_base is hardcoded to a specific test endpoint. For flexibility across environments, consider using an environment variable with a default:

api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
examples/azure-run.yaml (2)

34-36: Consider using an environment variable for the API base URL.

The api_base is hardcoded while api_key uses an environment variable. For consistency and deployment flexibility across environments, consider parameterizing the base URL:

   config: 
     api_key: ${env.AZURE_API_KEY}
-    api_base: https://ols-test.openai.azure.com/
+    api_base: ${env.AZURE_API_BASE:=https://ols-test.openai.azure.com/}
     api_version: 2024-02-15-preview

111-116: Inconsistent storage path structure between backends.

kv_default uses ~/.llama/storage/rag/kv_store.db (includes /rag/ subfolder) while sql_default uses ~/.llama/storage/sql_store.db (no subfolder). This asymmetry may cause confusion. Consider aligning the path structure or documenting the rationale.

run.yaml (1)

119-125: Same storage path inconsistency as azure-run.yaml.

The kv_default backend uses ~/.llama/storage/rag/kv_store.db while sql_default uses ~/.llama/storage/sql_store.db. The comment on line 120 explains that registry and RAG are unified in kv_default, which provides context. However, consider whether the /rag/ subdirectory name is still appropriate given it stores more than just RAG data.
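
To make the asymmetry concrete, a minimal sketch of the two storage backends in question, assembled from the paths quoted in this review (the sql_sqlite type name is an assumption by analogy with kv_sqlite):

storage:
  backends:
    kv_default:
      type: kv_sqlite
      # registry + RAG chunks live under the rag/ subfolder
      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
    sql_default:
      type: sql_sqlite            # backend type assumed for illustration
      # no rag/ subfolder here, hence the flagged inconsistency
      db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/sql_store.db}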

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 96888ca and 85c9528.

⛔ Files ignored due to path filters (2)
  • tests/e2e/rag/kv_store.db is excluded by !**/*.db
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (25)
  • docker-compose-library.yaml (1 hunks)
  • docker-compose.yaml (1 hunks)
  • docs/providers.md (2 hunks)
  • examples/azure-run.yaml (1 hunks)
  • examples/gemini-run.yaml (0 hunks)
  • examples/openai-faiss-run.yaml (0 hunks)
  • examples/openai-pgvector-run.yaml (0 hunks)
  • examples/run.yaml (1 hunks)
  • examples/vertexai-run.yaml (4 hunks)
  • examples/vllm-granite-run.yaml (0 hunks)
  • examples/vllm-llama-faiss-run.yaml (0 hunks)
  • examples/vllm-qwen3-run.yaml (0 hunks)
  • examples/vllm-rhaiis-run.yaml (1 hunks)
  • examples/vllm-rhelai-run.yaml (1 hunks)
  • examples/vllm-rhoai-run.yaml (1 hunks)
  • pyproject.toml (1 hunks)
  • run.yaml (3 hunks)
  • tests/e2e-prow/rhoai/configs/run.yaml (4 hunks)
  • tests/e2e/configs/run-azure.yaml (4 hunks)
  • tests/e2e/configs/run-ci.yaml (3 hunks)
  • tests/e2e/configs/run-rhaiis.yaml (4 hunks)
  • tests/e2e/configs/run-rhelai.yaml (4 hunks)
  • tests/e2e/configs/run-vertexai.yaml (4 hunks)
  • tests/e2e/features/faiss.feature (1 hunks)
  • tests/e2e/test_list.txt (1 hunks)
💤 Files with no reviewable changes (6)
  • examples/gemini-run.yaml
  • examples/vllm-granite-run.yaml
  • examples/vllm-llama-faiss-run.yaml
  • examples/vllm-qwen3-run.yaml
  • examples/openai-faiss-run.yaml
  • examples/openai-pgvector-run.yaml
🧰 Additional context used
📓 Path-based instructions (3)
tests/e2e/test_list.txt

📄 CodeRabbit inference engine (CLAUDE.md)

Maintain test list in tests/e2e/test_list.txt for end-to-end tests

Files:

  • tests/e2e/test_list.txt
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/faiss.feature
pyproject.toml

📄 CodeRabbit inference engine (CLAUDE.md)

pyproject.toml: Configure pylint with source-roots = "src"
Exclude src/auth/k8s.py from pyright type checking

Files:

  • pyproject.toml
🧠 Learnings (3)
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/test_list.txt : Maintain test list in `tests/e2e/test_list.txt` for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-11-24T16:58:04.410Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Applies to tests/e2e/features/**/*.feature : Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Applied to files:

  • tests/e2e/test_list.txt
📚 Learning: 2025-09-02T11:09:40.404Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/environment.py:87-95
Timestamp: 2025-09-02T11:09:40.404Z
Learning: In the lightspeed-stack e2e tests, noop authentication tests use the default lightspeed-stack.yaml configuration, while noop-with-token tests use the Authorized tag to trigger a config swap to the specialized noop-with-token configuration file.

Applied to files:

  • tests/e2e/test_list.txt
🪛 YAMLlint (1.37.1)
tests/e2e/configs/run-rhaiis.yaml

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

tests/e2e/configs/run-rhelai.yaml

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (30)
pyproject.toml (1)

55-55: Consider using a flexible version constraint for urllib3 or explain the exact version pin.

urllib3 2.6.1 is the latest non-vulnerable version, but 2.6.2 is available as a bug fix. The exact version pin (==2.6.1) is unnecessarily restrictive. If there's no specific compatibility reason for avoiding 2.6.2, consider using >=2.6.1,<2.7 instead. If an exact pin is necessary, document the reason in a comment above the dependency.

The einops dependency is appropriate: einops 0.8.1 is the latest version with no known vulnerabilities.

⛔ Skipped due to learnings
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2025-11-24T16:58:04.410Z
Learning: Always check `pyproject.toml` for existing dependencies and versions before adding new ones
docs/providers.md (2)

103-103: Documentation accurately reflects the disabled llama-guard shield.

The change from ✅ to ❌ for llama-guard aligns with the configuration changes across the example files where shields are disabled with the comment about "infinite loop issue with LLM calls".


160-160: rag-runtime is now correctly marked as supported.

The rag-runtime provider is properly documented as supported (✅), which aligns with its configuration in all the new example YAML files.

examples/vllm-rhelai-run.yaml (1)

1-151: New RHEL AI configuration file is well-structured.

The configuration follows the consolidated provider pattern with:

  • Environment-driven paths and credentials
  • FAISS vector store with sentence-transformers embeddings
  • Properly disabled shields with documented reasoning
  • Consistent storage backend configuration
examples/vertexai-run.yaml (2)

30-67: Provider configuration is well-organized.

The inference providers are properly configured with:

  • Google Vertex AI remote provider with project/location from environment variables
  • Sentence-transformers for embeddings
  • FAISS vector store with persistence

132-141: Empty models array may need attention.

The registered_resources.models array is empty while the Vertex AI provider specifies allowed_models: ["google/gemini-2.5-flash"]. Verify if a model registration is needed here, or if the allowed_models configuration in the provider is sufficient for Vertex AI.

examples/vllm-rhoai-run.yaml (2)

134-138: Verify the hardcoded model configuration and null provider_model_id.

Unlike other example configs that use environment variables (e.g., ${env.RHEL_AI_MODEL}), this file hardcodes model_id: meta-llama/Llama-3.2-1B-Instruct and sets provider_model_id: null.

Consider using environment variables for consistency:

- model_id: ${env.RHOAI_MODEL}
  provider_id: vllm
  model_type: llm
  provider_model_id: ${env.RHOAI_MODEL}

37-37: Different max_tokens value than other vLLM configs.

This config uses max_tokens: 1024 while vllm-rhelai-run.yaml and vllm-rhaiis-run.yaml use max_tokens: 2048. Verify if this is intentional for RHOAI deployments.

examples/vllm-rhaiis-run.yaml (1)

1-151: RHAIIS configuration follows the established pattern.

The configuration is well-structured and consistent with other example configurations in this PR.

tests/e2e/configs/run-rhelai.yaml (1)

1-27: Configuration header and store definitions look correct.

The version, image_name, APIs, and store configurations follow the established pattern with proper environment variable substitution.

tests/e2e-prow/rhoai/configs/run.yaml (1)

1-150: Configuration structure looks well-organized.

The consolidated provider configuration with environment-variable-backed paths is consistent. The explicit warnings about disabled shields (lines 49, 138) document the known issue clearly.

tests/e2e/configs/run-vertexai.yaml (2)

30-67: Provider configuration looks correct.

The Vertex AI inference provider is properly configured with project/location environment variables, and the FAISS vector_io provider follows the correct structure with persistence settings.


132-146: Registered resources and vector stores configuration is consistent.

Empty models list is appropriate here since allowed_models is specified in the provider configuration (line 36). The vector_stores defaults align with the FAISS provider setup.

tests/e2e/configs/run-ci.yaml (2)

53-74: Scoring and vector_io providers are well-configured.

The scoring providers include useful documentation comments (e.g., "Enable the RAG tool", "Define the storage backend for RAG"). The placeholder '********' for openai_api_key is appropriate for example configurations.


114-155: Telemetry and storage configuration additions look good.

The inline documentation (line 120, 153) explaining the storage backend purpose improves maintainability. The vector_stores configuration with FAISS defaults is consistent with the PR's objective.

examples/run.yaml (3)

1-6: Good documentation header.

The introductory comments clearly explain the configuration purpose and prerequisites (OpenAI API key, rag-content tool reference). This helps users understand how to use this example.


74-80: FAISS vector_io configuration is correct.

The persistence configuration with namespace: vector_io::faiss and backend: kv_default properly integrates with the storage backends defined below. This aligns with the PR's goal of fully supporting the FAISS VectorIO provider.


147-161: Registered resources and vector stores complete the FAISS integration.

The vector_stores section properly configures:

  • default_provider_id: faiss matching the vector_io provider
  • default_embedding_model using sentence-transformers with the nomic model

This provides a complete working example for FAISS-based RAG.

tests/e2e/test_list.txt (1)

1-1: LGTM!

The new faiss.feature is correctly added to the test list, following the coding guidelines for maintaining end-to-end tests in this file.

docker-compose-library.yaml (1)

16-16: LGTM!

The volume mount correctly exposes the test RAG database to the container, enabling FAISS-based e2e tests.

docker-compose.yaml (1)

14-14: LGTM!

Consistent with the corresponding mount in docker-compose-library.yaml, ensuring FAISS e2e tests work in both library and non-library modes.

tests/e2e/features/faiss.feature (1)

1-39: Feature file follows BDD conventions.

The Gherkin structure is correct and scenarios appropriately test FAISS vector store registration and retrieval. As per coding guidelines, this uses the behave framework with proper Given/When/Then steps.

tests/e2e/configs/run-azure.yaml (2)

48-52: Safety providers disabled with documented reason.

The commented-out safety configuration includes a clear warning about the infinite loop issue. This is good practice for documenting why a feature is disabled.


61-67: FAISS vector_io provider correctly configured.

The FAISS provider configuration with persistence to kv_default backend aligns with the storage configuration and the mounted RAG volume path.

examples/azure-run.yaml (3)

4-14: Missing telemetry API compared to run.yaml.

The run.yaml includes telemetry in the APIs list and has a telemetry: enabled: true section, but this Azure configuration omits both. If telemetry should be available in Azure deployments, add it for consistency.


48-52: Shields disabled with documented warning — acceptable for known issue.

The safety providers array is empty with a clear warning comment about the infinite loop issue. The corresponding shields: [] in registered_resources is consistent. This is appropriately documented.

Also applies to: 138-138


132-150: Registered resources and vector_stores configuration looks correct.

The configuration properly:

  • Registers the gpt-4o-mini model with the Azure provider
  • Sets up the RAG toolgroup with rag-runtime
  • Configures FAISS as the default vector store provider with nomic-embed-text-v1.5 as the embedding model
run.yaml (3)

31-39: Well-structured inference providers with environment-driven configuration.

The OpenAI provider correctly uses environment variables for both the API key and allowed models with sensible defaults. The inline sentence-transformers provider is appropriately configured for embedding support.


68-74: FAISS vector_io provider correctly configured for persistence.

The FAISS provider is properly set up with a persistence backend pointing to kv_default, which aligns with the storage configuration. This enables the RAG functionality as intended by the PR.


148-155: RAG toolgroup and vector_stores configuration complete.

The configuration properly:

  • Registers builtin::rag toolgroup with the rag-runtime provider
  • Sets FAISS as the default vector store provider
  • Configures nomic-ai/nomic-embed-text-v1.5 as the default embedding model via sentence-transformers

This completes the FAISS VectorIO provider setup required by the PR objectives.

Comment on lines +60 to +63
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: '********'

⚠️ Potential issue | 🟡 Minor

Placeholder API key should use an environment variable.

The braintrust provider has a hardcoded placeholder '********' for openai_api_key. This will fail at runtime if braintrust scoring is used. Either use an environment variable pattern consistent with other providers, or document that this provider requires manual configuration.

  - provider_id: braintrust
    provider_type: inline::braintrust
    config:
-     openai_api_key: '********'
+     openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY:=}
🤖 Prompt for AI Agents
In run.yaml around lines 60 to 63, the braintrust provider currently uses a
hardcoded placeholder openai_api_key '********' which will fail at runtime;
replace that value with an environment variable reference (follow the pattern
used by other providers in this repo — e.g. use a typed env var like
${BRAINTRUST_OPENAI_API_KEY} or ${OPENAI_API_KEY}) and update any README/config
docs to state this provider requires that environment variable to be set.

Comment on lines +114 to +115
telemetry:
  enabled: true
coderabbitai bot commented Dec 18, 2025

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Find run.yaml file
fd -n "run.yaml" --type f

Repository: lightspeed-core/lightspeed-stack

Length of output: 306


🏁 Script executed:

#!/bin/bash
# Find run.yaml and check its structure
fd -type f -name "*.yaml" | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 246


🏁 Script executed:

find . -name "run.yaml" -type f 2>/dev/null | head -5

Repository: lightspeed-core/lightspeed-stack

Length of output: 177


🏁 Script executed:

#!/bin/bash
# Search for run.yaml files
find . -type f -name "run.yaml" 2>/dev/null

# Also search for telemetry configuration patterns
rg -l "telemetry:" --type yaml 2>/dev/null | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 340


🏁 Script executed:

#!/bin/bash
# Locate run.yaml and examine structure around telemetry config
if [ -f "./run.yaml" ]; then
  echo "=== Found run.yaml ==="
  wc -l ./run.yaml
  echo "=== Lines 100-130 ==="
  sed -n '100,130p' ./run.yaml
  echo "=== Searching for providers section ==="
  grep -n "providers:" ./run.yaml || echo "No explicit providers section found"
elif [ -f "run.yaml" ]; then
  wc -l run.yaml
  sed -n '100,130p' run.yaml
  grep -n "providers:" run.yaml || echo "No explicit providers section found"
else
  find . -name "run.yaml" -type f 2>/dev/null -exec sh -c 'echo "=== Found: $1 ==="; wc -l "$1"; sed -n "100,130p" "$1"; grep -n "providers:" "$1" || echo "No providers section"' _ {} \;
fi

Repository: lightspeed-core/lightspeed-stack

Length of output: 1099


🏁 Script executed:

#!/bin/bash
# Read the providers section
sed -n '1,80p' ./run.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 2155


🏁 Script executed:

#!/bin/bash
# Check entire telemetry section and nearby context
sed -n '110,125p' ./run.yaml
# Search for any backend/exporter configurations
grep -i "backend\|exporter\|endpoint" ./run.yaml | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 1284


🏁 Script executed:

#!/bin/bash
# Check if telemetry is listed in apis and look for telemetry-related documentation
sed -n '1,20p' ./run.yaml
# Search for any telemetry configuration or documentation
rg -i "telemetry" --type yaml -B2 -A2 ./run.yaml | head -40

Repository: lightspeed-core/lightspeed-stack

Length of output: 504


🏁 Script executed:

#!/bin/bash
# Search for telemetry-related documentation, READMEs, or code comments
find . -type f \( -name "README*" -o -name "*.md" -o -name "telemetry*" \) | head -20
rg -i "telemetry.*provider|telemetry.*backend|telemetry.*config" --type yaml --type python -B1 -A1 | head -30

Repository: lightspeed-core/lightspeed-stack

Length of output: 617


🏁 Script executed:

#!/bin/bash
# Search for telemetry documentation
rg -i "telemetry" docs/ -l 2>/dev/null | head -10
# Also check config documentation
grep -i "telemetry" README.md docs/config.md 2>/dev/null

Repository: lightspeed-core/lightspeed-stack

Length of output: 319


🏁 Script executed:

#!/bin/bash
# Check providers documentation for telemetry config
grep -i "telemetry" docs/providers.md -B2 -A5 | head -40
# Also check deployment guide
grep -i "telemetry" docs/deployment_guide.md -B2 -A5 | head -40

Repository: lightspeed-core/lightspeed-stack

Length of output: 2149


🏁 Script executed:

#!/bin/bash
# Get more context from deployment_guide.md about telemetry configuration
sed -n '1,2000p' docs/deployment_guide.md | grep -i "telemetry" -B5 -A10 | head -60

Repository: lightspeed-core/lightspeed-stack

Length of output: 2874


🏁 Script executed:

#!/bin/bash
# Check other run.yaml examples to see how telemetry should be configured
echo "=== Checking examples/run.yaml ==="
grep -A5 "telemetry:" examples/run.yaml | head -20
echo ""
echo "=== Checking tests/configuration/run.yaml ==="
grep -A5 "telemetry:" tests/configuration/run.yaml | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 391


🏁 Script executed:

#!/bin/bash
# Check if there are any code comments or documentation about when telemetry provider config is required
rg -i "telemetry.*provider|telemetry.*config|enabled.*true" docs/ -B2 -A2 | head -40
# Also check the main config documentation
head -100 docs/config.md

Repository: lightspeed-core/lightspeed-stack

Length of output: 4089


Add telemetry provider configuration to match test and documentation patterns.

Telemetry is enabled but lacks an explicit provider configuration under the providers section, unlike other APIs (inference, safety, scoring, etc.). The test configuration at tests/configuration/run.yaml shows the proper pattern with providers.telemetry using the meta-reference provider. Either add explicit telemetry provider config or document why using a default backend is acceptable.

🤖 Prompt for AI Agents
In run.yaml around lines 114 to 115, telemetry is enabled but missing an
explicit providers.telemetry entry; add a providers.telemetry block mirroring
the pattern used by inference/safety/scoring (e.g., include a telemetry provider
key such as meta-reference with its required connection/config fields used in
tests) so telemetry has an explicit provider configuration, or update docs to
justify relying on a default backend—ensure the YAML structure follows the
existing providers.<provider> pattern used elsewhere.

are-ces (Contributor, PR author) replied:

provider config for telemetry is not supported in 0.3.x

coderabbitai bot replied:

@are-ces, thank you for the clarification! I've noted that telemetry provider configuration is not supported in version 0.3.x, which explains why the configuration is valid as-is with just the enabled: true flag.


✏️ Learnings added
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:03.056Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

Comment on lines +146 to +150
vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5

⚠️ Potential issue | 🟡 Minor

Vector stores configuration uses v1.5, though v2 is now available as a newer alternative.

The FAISS and sentence-transformers configuration is properly set. However, Nomic Embed Text V2 is now available for production workloads as a successor to v1.5, offering multilingual support if needed. Additionally, ensure that when embeddings are generated at runtime, the text prompt includes a task instruction prefix (e.g., search_document: or search_query: for RAG applications) to align with the model's requirements.

🤖 Prompt for AI Agents
In tests/e2e/configs/run-azure.yaml around lines 146 to 150, the vector store
config pins the embedding model to nomic-ai/nomic-embed-text-v1.5; update
model_id to the v2 release (e.g., nomic-ai/nomic-embed-text-v2) to use the newer
multilingual production model and adjust any downstream compatibility settings
if required, and ensure that runtime embedding calls prepend the appropriate
task instruction prefix (for example "search_document:" or "search_query:") to
the text passed to the embedding API so the model receives the required task
context for RAG/search use cases.
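
If the upgrade were adopted, the change would be confined to the model_id line; the v2 identifier below is an assumption and should be checked against the published model name:

vector_stores:
  default_provider_id: faiss
  default_embedding_model:
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v2-moe   # hypothetical v2 id, verify before use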

Comment on lines 87 to +105 (new file content):

datasetio:
- config:
    kvstore:
      namespace: huggingface_datasetio
      backend: kv_default
  provider_id: huggingface
  provider_type: remote::huggingface
  config:
    kvstore:
      namespace: localfs_datasetio
      backend: kv_default
  provider_id: localfs
  provider_type: inline::localfs
  config:
    kvstore:
      namespace: eval_store
      backend: kv_default
  provider_id: meta-reference
  provider_type: inline::meta-reference
scoring:
- provider_id: basic
  provider_type: inline::basic
  config: {}
tool_runtime:
- provider_id: rag-runtime
  provider_type: inline::rag-runtime
  config: {}
- provider_id: model-context-protocol
  provider_type: remote::model-context-protocol
  config: {}
batches:
- provider_id: reference
  provider_type: inline::reference
  config:
    kvstore:
      namespace: batches
      backend: kv_default

⚠️ Potential issue | 🔴 Critical

Critical: Missing list item markers cause duplicate key errors.

The datasetio and eval provider entries are missing the - list item markers, causing YAML duplicate key errors. This will break configuration parsing at runtime.

🔎 Apply this diff to fix the YAML structure:
   datasetio:
-  - config:
+  - config:
       kvstore:
         namespace: huggingface_datasetio
         backend: kv_default
     provider_id: huggingface
     provider_type: remote::huggingface
+  - config:
+      kvstore:
+        namespace: localfs_datasetio
+        backend: kv_default
+    provider_id: localfs
+    provider_type: inline::localfs
+  eval:
+  - config:
+      kvstore:
+        namespace: eval_store
+        backend: kv_default
+    provider_id: meta-reference
+    provider_type: inline::meta-reference
-    config:
-      kvstore:
-        namespace: localfs_datasetio
-        backend: kv_default
-    provider_id: localfs
-    provider_type: inline::localfs
-    config:
-      kvstore:
-        namespace: eval_store
-        backend: kv_default
-    provider_id: meta-reference
-    provider_type: inline::meta-reference

Compare with the correctly structured examples/run.yaml (lines 99-118) which uses proper list syntax with - markers for each provider entry.

🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

🤖 Prompt for AI Agents
In tests/e2e/configs/run-rhaiis.yaml around lines 87 to 105, the datasetio (and
similarly eval) provider entries are written as repeated keys instead of YAML
list items, causing duplicate key parse errors; fix by converting each provider
block into a proper list entry: add a leading '-' before each provider object
(e.g., before each config/provider_id/provider_type group), ensure consistent
indentation for the list items and their child keys to match examples/run.yaml
(lines ~99-118), and validate the file parses as YAML after the change.

Comment on lines 87 to +105 (new file content):

datasetio:
- config:
    kvstore:
      namespace: huggingface_datasetio
      backend: kv_default
  provider_id: huggingface
  provider_type: remote::huggingface
  config:
    kvstore:
      namespace: localfs_datasetio
      backend: kv_default
  provider_id: localfs
  provider_type: inline::localfs
  config:
    kvstore:
      namespace: eval_store
      backend: kv_default
  provider_id: meta-reference
  provider_type: inline::meta-reference
scoring:
- provider_id: basic
  provider_type: inline::basic
  config: {}
tool_runtime:
- provider_id: rag-runtime
  provider_type: inline::rag-runtime
  config: {}
- provider_id: model-context-protocol
  provider_type: remote::model-context-protocol
  config: {}
batches:
- provider_id: reference
  provider_type: inline::reference
  config:
    kvstore:
      namespace: batches
      backend: kv_default

⚠️ Potential issue | 🔴 Critical

Critical: YAML structure is malformed with duplicate keys.

The datasetio and eval sections have broken YAML structure due to missing list item markers (-). Static analysis correctly identified duplicate keys (config, provider_id, provider_type) which indicates the second datasetio provider and the eval provider are incorrectly merged.

🔎 Apply this diff to fix the YAML structure:
   datasetio:
-  - config:
-    kvstore:
-      namespace: huggingface_datasetio
-      backend: kv_default
-    provider_id: huggingface
-    provider_type: remote::huggingface
+  - config:
+      kvstore:
+        namespace: huggingface_datasetio
+        backend: kv_default
+    provider_id: huggingface
+    provider_type: remote::huggingface
+  - config:
+      kvstore:
+        namespace: localfs_datasetio
+        backend: kv_default
+    provider_id: localfs
+    provider_type: inline::localfs
+  eval:
+  - config:
+      kvstore:
+        namespace: eval_store
+        backend: kv_default
+    provider_id: meta-reference
+    provider_type: inline::meta-reference
-    config:
-      kvstore:
-        namespace: localfs_datasetio
-        backend: kv_default
-    provider_id: localfs
-    provider_type: inline::localfs
-    config:
-      kvstore:
-        namespace: eval_store
-        backend: kv_default
-    provider_id: meta-reference
-    provider_type: inline::meta-reference

Compare with the correct structure in examples/vllm-rhelai-run.yaml (lines 87-106) for reference.

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
datasetio:
- config:
    kvstore:
      namespace: huggingface_datasetio
      backend: kv_default
  provider_id: huggingface
  provider_type: remote::huggingface
- config:
    kvstore:
      namespace: localfs_datasetio
      backend: kv_default
  provider_id: localfs
  provider_type: inline::localfs
eval:
- config:
    kvstore:
      namespace: eval_store
      backend: kv_default
  provider_id: meta-reference
  provider_type: inline::meta-reference
🧰 Tools
🪛 YAMLlint (1.37.1)

[error] 94-94: duplication of key "config" in mapping

(key-duplicates)


[error] 98-98: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 99-99: duplication of key "provider_type" in mapping

(key-duplicates)


[error] 100-100: duplication of key "config" in mapping

(key-duplicates)


[error] 104-104: duplication of key "provider_id" in mapping

(key-duplicates)


[error] 105-105: duplication of key "provider_type" in mapping

(key-duplicates)

🤖 Prompt for AI Agents
In tests/e2e/configs/run-rhelai.yaml around lines 87 to 105, the YAML list items
under datasetio (and the eval section) are missing leading '-' markers, causing
duplicate mapping keys (config, provider_id, provider_type) to be merged
incorrectly; fix by converting each provider block into a proper list entry
(prepend '-' to each provider object under datasetio and ensure eval is a
separate mapping, not merged), align indentation so each entry contains its own
config/kvstore block, provider_id, and provider_type, and verify the final
structure matches examples/vllm-rhelai-run.yaml lines 87-106.

Comment on lines +31 to +32
And I set the Authorization header to Bearer eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.eyJzdWIiOiIxMjM0NTY3ODkwIiwibmFtZSI6Ikpva
When I use "query" to ask question with authorization header

⚠️ Potential issue | 🟡 Minor

🧩 Analysis chain

🏁 Script executed:

cat -n tests/e2e/features/faiss.feature | sed -n '30,35p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 542


Complete the JWT token with all three segments.

A well-formed JWT consists of three concatenated Base64url-encoded strings, separated by dots. The token on line 31 is missing the signature segment and the payload appears truncated. Provide a complete, valid JWT token with all three parts (header.payload.signature).

🤖 Prompt for AI Agents
In tests/e2e/features/faiss.feature around lines 31 to 32, the Authorization
Bearer token is incomplete (missing the third signature segment and has a
truncated payload); replace that string with a well-formed JWT consisting of
three Base64url-encoded segments separated by dots (header.payload.signature).
Ensure the token used in the test is complete and valid for your test
environment (either a real signed token or a locally generated test token with
matching server validation), and update the line to include the full token
string.

are-ces force-pushed the faiss-support-final branch from 160b644 to ca83927 on December 18, 2025 at 10:32
coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @tests/e2e/configs/run-ci.yaml:
- Around line 60-63: The Braintrust provider config currently hardcodes
openai_api_key: '********' which will fail at runtime; update the provider
config for provider_type inline::braintrust to read the API key from an
environment variable (e.g., using the config expression or secret interpolation
used elsewhere in CI configs) instead of the literal '********', ensuring the
provider_id/provider_type block uses the env var name so the runtime picks up
the real key.
- Around line 160-164: registered_resources has malformed YAML: move the model
entries currently under the top-level list into a nested models: mapping and
ensure shields: [] remains a key with a models: list beneath it; specifically,
under registered_resources->shields keep "shields: []" as an empty sequence or
mapping and create a "models:" key containing entries with "provider_id:" and
"model_id:" (e.g., add an entry for provider_id: sentence-transformers and
model_id: nomic-ai/nomic-embed-text-v1.5) so the default_embedding_model
referenced in vector_stores is registered; validate indentation and YAML
structure so the models are children of the models: key, not sibling keys at the
same level as shields.
🧹 Nitpick comments (2)
tests/e2e/configs/run-ci.yaml (2)

48-52: Track the shields/safety infinite loop issue.

Safety providers and shields are disabled due to an "infinite loop issue with LLM calls". While this may be acceptable for E2E testing, ensure this known issue is tracked and prioritized for resolution before production deployment.

Is there an open issue tracking the shields infinite loop problem? If not, would you like me to help create one?


120-120: Consider breaking the long comment into multiple lines.

The comment explaining the unified storage backend is informative but spans over 100 characters, which impacts readability.

♻️ Suggested refactor
-    kv_default: # Define the storage backend type for RAG, in this case registry and RAG are unified i.e. information on registered resources (e.g. models, vector_stores) are saved together with the RAG chunks
+    kv_default:
+      # Define the storage backend type for RAG. In this case, registry and RAG are unified,
+      # i.e., information on registered resources (e.g., models, vector_stores) are saved
+      # together with the RAG chunks.
       type: kv_sqlite
       db_path: ${env.SQLITE_STORE_DIR:=~/.llama/storage/rag/kv_store.db}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between ca83927 and c80a78d.

⛔ Files ignored due to path filters (1)
  • uv.lock is excluded by !**/*.lock
📒 Files selected for processing (1)
  • tests/e2e/configs/run-ci.yaml
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e/configs/run-ci.yaml
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: server mode / vertexai
  • GitHub Check: E2E: server mode / azure
  • GitHub Check: E2E: library mode / azure
  • GitHub Check: E2E: library mode / vertexai
🔇 Additional comments (2)
tests/e2e/configs/run-ci.yaml (2)

31-36: LGTM: Inference provider properly configured.

The OpenAI inference provider configuration correctly uses environment variables for the API key and model selection, with a sensible default for E2E testing.


68-74: LGTM: FAISS vector_io provider properly configured.

The FAISS provider configuration correctly references the kv_default backend for persistence, which aligns with the PR objective to fully support FAISS.

Comment on lines +60 to +63
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: '********'

⚠️ Potential issue | 🟠 Major

Use environment variable for the Braintrust API key.

The hardcoded placeholder openai_api_key: '********' will cause the Braintrust scoring provider to fail at runtime. Replace it with an environment variable reference.

🔧 Proposed fix
   - provider_id: braintrust
     provider_type: inline::braintrust
     config:
-      openai_api_key: '********'
+      openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY}
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
- provider_id: braintrust
  provider_type: inline::braintrust
  config:
    openai_api_key: ${env.BRAINTRUST_OPENAI_API_KEY}
🤖 Prompt for AI Agents
In @tests/e2e/configs/run-ci.yaml around lines 60 - 63, The Braintrust provider
config currently hardcodes openai_api_key: '********' which will fail at
runtime; update the provider config for provider_type inline::braintrust to read
the API key from an environment variable (e.g., using the config expression or
secret interpolation used elsewhere in CI configs) instead of the literal
'********', ensuring the provider_id/provider_type block uses the env var name
so the runtime picks up the real key.

Comment on lines +160 to +164
vector_stores:
  default_provider_id: faiss
  default_embedding_model: # Define the default embedding model for RAG
    provider_id: sentence-transformers
    model_id: nomic-ai/nomic-embed-text-v1.5

⚠️ Potential issue | 🔴 Critical

🧩 Analysis chain

🏁 Script executed:

# First, let's look at the full context of the YAML file
git ls-files "tests/e2e/configs/run-ci.yaml"
wc -l tests/e2e/configs/run-ci.yaml

Repository: lightspeed-core/lightspeed-stack

Length of output: 139


🏁 Script executed:

# View the relevant sections of the YAML file
head -n 170 tests/e2e/configs/run-ci.yaml | tail -n 30

Repository: lightspeed-core/lightspeed-stack

Length of output: 1028


🏁 Script executed:

# Search for how embedding models are initialized/used in the codebase
rg -n "default_embedding_model|embedding_model|nomic-embed" --type yaml -C3

Repository: lightspeed-core/lightspeed-stack

Length of output: 6354


🏁 Script executed:

# Look for how registered_resources is used in the codebase
rg -n "registered_resources" --type python -A2 -B2 | head -50

Repository: lightspeed-core/lightspeed-stack

Length of output: 106


🏁 Script executed:

# Check if there's any model loading/initialization logic
rg -n "sentence-transformers|embedding" --type python -C2 | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 106


🏁 Script executed:

# Search for how registered_resources is used in Python code
rg -n "registered_resources|default_embedding_model" --type-list

Repository: lightspeed-core/lightspeed-stack

Length of output: 465


🏁 Script executed:

# Look for Python files and understand model initialization
fd -e py -e py.bak | head -20

Repository: lightspeed-core/lightspeed-stack

Length of output: 633


🏁 Script executed:

# Search for how embedding models are loaded in the source code
rg -n "embedding_model|default_embedding" -C3 | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 7278


🏁 Script executed:

# Look at the structure of registered_resources vs default_embedding_model
rg -n "registered_resources|vector_stores" tests/e2e/configs/run-ci.yaml -B2 -A20

Repository: lightspeed-core/lightspeed-stack

Length of output: 1859


🏁 Script executed:

# Check if there are tests that validate this configuration
fd -e py -path "*/test*" | xargs rg -l "run-ci.yaml|embedding_model" 2>/dev/null | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 601


🏁 Script executed:

# Search for how the default_embedding_model is used
rg -n "default_embedding_model" -A5 -B5

Repository: lightspeed-core/lightspeed-stack

Length of output: 8310


🏁 Script executed:

# Check the config.py file to understand model registration
cat -n src/models/config.py | grep -A10 -B10 "DEFAULT_EMBEDDING_MODEL\|embedding_model"

Repository: lightspeed-core/lightspeed-stack

Length of output: 825


🏁 Script executed:

# Search for where registered_resources models are actually referenced
rg -n "registered_resources" -A3 -B3 | head -80

Repository: lightspeed-core/lightspeed-stack

Length of output: 4043


🏁 Script executed:

# Look for vector_stores configuration handling
rg -n "vector_stores" -A5 -B2

Repository: lightspeed-core/lightspeed-stack

Length of output: 37045


🏁 Script executed:

# Check if there's any validation that registered models must match used models
rg -n "sentence-transformers/all-mpnet-base-v2\|nomic-ai" -C3

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Look for configuration validation or initialization code
rg -n "default_embedding_model|vector_stores" src/ -A5 -B2 | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 3088


🏁 Script executed:

# Search for how registered_resources models are validated
rg -n "registered_resources.*model\|validate.*model" src/ -C3 | head -80

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Check if there's a schema or validator that checks embedding models
fd -e py | xargs grep -l "embedding.*model\|registered.*resources" 2>/dev/null | head -10

Repository: lightspeed-core/lightspeed-stack

Length of output: 285


🏁 Script executed:

# Look for environment setup or initialization that loads embedding models
rg -n "embedding.*model\|sentence-transformers\|nomic" src/ -C2 | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Check the constants file
cat -n src/constants.py | head -100

Repository: lightspeed-core/lightspeed-stack

Length of output: 3722


🏁 Script executed:

# Look at the complete registered_resources structure in run-ci.yaml
cat -n tests/e2e/configs/run-ci.yaml | sed -n '141,165p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 1047


🏁 Script executed:

# Search for how embedded models are actually resolved or loaded
rg -n "sentence-transformers\|embedding.*model\|default_embedding" src/ -A3 -B1

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Check if there are any tests that show the relationship between registered models and default_embedding_model
fd test.*yaml | xargs grep -l "embedding_model\|registered_resources" 2>/dev/null

Repository: lightspeed-core/lightspeed-stack

Length of output: 60


🏁 Script executed:

# Look for any documentation or README files about configuration
fd -e md -e rst | xargs grep -l "registered_resources\|embedding.*model" 2>/dev/null | head -5

Repository: lightspeed-core/lightspeed-stack

Length of output: 169


🏁 Script executed:

# Check if there's a configuration schema or validation that relates these
rg -n "registered_resources" tests/unit/ -A10 -B2 | head -80

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Look for comments in YAML files that explain registered_resources vs vector_stores
rg -n "registered_resources\|vector_stores" examples/run.yaml -B5 -A10

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Check if the nomic model is pre-downloaded or if it's dynamically loaded
rg -n "nomic\|sentence-transformers" tests/ -C2 | head -80

Repository: lightspeed-core/lightspeed-stack

Length of output: 58


🏁 Script executed:

# Look at the full structure of another config file to compare patterns
cat -n examples/run.yaml | sed -n '141,165p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 856


🏁 Script executed:

# Check the exact YAML structure in run-ci.yaml around registered_resources
cat -n tests/e2e/configs/run-ci.yaml | sed -n '138,165p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 1132


🏁 Script executed:

# Compare with examples/run.yaml to see the correct structure
cat -n examples/run.yaml | sed -n '138,170p'

Repository: lightspeed-core/lightspeed-stack

Length of output: 959


🏁 Script executed:

# Validate the YAML syntax of both files
python3 << 'EOF'
import yaml

# Check run-ci.yaml
try:
    with open('tests/e2e/configs/run-ci.yaml', 'r') as f:
        data_ci = yaml.safe_load(f)
    print("run-ci.yaml: Valid YAML")
    print(f"registered_resources keys: {list(data_ci.get('registered_resources', {}).keys())}")
    # Check if there are models defined
    if 'registered_resources' in data_ci:
        rr = data_ci['registered_resources']
        # Look for model entries
        if isinstance(rr, dict):
            print(f"  - Has 'models' key: {'models' in rr}")
            # Check for list items at root level
            for key in rr:
                if isinstance(rr[key], list) and rr[key] and isinstance(rr[key][0], dict) and 'model_id' in rr[key][0]:
                    print(f"  - Models under key '{key}': {len(rr[key])} items")
except Exception as e:
    print(f"run-ci.yaml: Error - {e}")

print()

# Check examples/run.yaml
try:
    with open('examples/run.yaml', 'r') as f:
        data_ex = yaml.safe_load(f)
    print("examples/run.yaml: Valid YAML")
    print(f"registered_resources keys: {list(data_ex.get('registered_resources', {}).keys())}")
    if 'registered_resources' in data_ex:
        rr = data_ex['registered_resources']
        if isinstance(rr, dict):
            print(f"  - Has 'models' key: {'models' in rr}")
            if 'models' in rr:
                print(f"  - Models: {len(rr['models'])} items")
except Exception as e:
    print(f"examples/run.yaml: Error - {e}")

print()
print("=== Checking vector_stores section ===")
if 'vector_stores' in data_ci:
    print("run-ci.yaml vector_stores:")
    print(f"  - default_embedding_model model_id: {data_ci['vector_stores'].get('default_embedding_model', {}).get('model_id')}")
if 'vector_stores' in data_ex:
    print("examples/run.yaml vector_stores:")
    print(f"  - default_embedding_model model_id: {data_ex['vector_stores'].get('default_embedding_model', {}).get('model_id')}")
EOF

Repository: lightspeed-core/lightspeed-stack

Length of output: 671


🏁 Script executed:

# Show the exact indentation issue
cat -n tests/e2e/configs/run-ci.yaml | sed -n '141,154p' | cat -A

Repository: lightspeed-core/lightspeed-stack

Length of output: 671


🏁 Script executed:

# Compare with the valid structure
cat -n examples/run.yaml | sed -n '147,153p' | cat -A

Repository: lightspeed-core/lightspeed-stack

Length of output: 324


🏁 Script executed:

# Verify the syntax error in detail
python3 << 'EOF'
import yaml

print("=== Checking run-ci.yaml YAML validity ===")
try:
    with open('tests/e2e/configs/run-ci.yaml', 'r') as f:
        yaml.safe_load(f)
    print("✓ Valid YAML")
except yaml.YAMLError as e:
    print(f"✗ YAML Syntax Error:")
    print(f"  Line: {e.problem_mark.line + 1}")
    print(f"  Column: {e.problem_mark.column + 1}")
    print(f"  Problem: {e.problem}")
    
print("\n=== Checking examples/run.yaml YAML validity ===")
try:
    with open('examples/run.yaml', 'r') as f:
        yaml.safe_load(f)
    print("✓ Valid YAML")
except yaml.YAMLError as e:
    print(f"✗ YAML Syntax Error: {e.problem}")
EOF

Repository: lightspeed-core/lightspeed-stack

Length of output: 273


Fix critical YAML syntax error in registered_resources section.

The registered_resources section has a malformed YAML structure. On line 142, shields: [] is defined as a dictionary key, but line 143 immediately starts a list with - model_id: at the same indentation level. This is invalid YAML and will cause the configuration file to fail parsing.

Models must be under a models: key (as shown in examples/run.yaml). Additionally, the default embedding model (nomic-ai/nomic-embed-text-v1.5) is not registered in the models list.

Required fix
registered_resources:
+  models:
+  - model_id: sentence-transformers/all-mpnet-base-v2
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: sentence-transformers/all-mpnet-base-v2
+    metadata:
+      embedding_dimension: 768
+  - model_id: nomic-ai/nomic-embed-text-v1.5
+    model_type: embedding
+    provider_id: sentence-transformers
+    provider_model_id: nomic-ai/nomic-embed-text-v1.5
+    metadata:
+      embedding_dimension: 768
+  - model_id: gpt-4o-mini
+    provider_id: openai
+    model_type: llm
+    provider_model_id: gpt-4o-mini
   shields: [] # WARNING: Shields disabled due to infinite loop issue with LLM calls
-  - model_id: sentence-transformers/all-mpnet-base-v2
-    model_type: embedding
-    provider_id: sentence-transformers
-    provider_model_id: sentence-transformers/all-mpnet-base-v2
-    metadata:
-      embedding_dimension: 768
-  - model_id: gpt-4o-mini
-    provider_id: openai
-    model_type: llm
-    provider_model_id: gpt-4o-mini
   vector_dbs: []
   datasets: []
   scoring_fns: []
   benchmarks: []

Committable suggestion skipped: line range outside the PR's diff.

🤖 Prompt for AI Agents
In @tests/e2e/configs/run-ci.yaml around lines 160 - 164, registered_resources
has malformed YAML: move the model entries currently under the top-level list
into a nested models: mapping and ensure shields: [] remains a key with a
models: list beneath it; specifically, under registered_resources->shields keep
"shields: []" as an empty sequence or mapping and create a "models:" key
containing entries with "provider_id:" and "model_id:" (e.g., add an entry for
provider_id: sentence-transformers and model_id: nomic-ai/nomic-embed-text-v1.5)
so the default_embedding_model referenced in vector_stores is registered;
validate indentation and YAML structure so the models are children of the
models: key, not sibling keys at the same level as shields.

tisnik (Contributor) left a comment

I'm not sure if the deletion of "old" examples should be part of this PR. Otherwise looks OK, thank you very much.

And the REST API service prefix is /v1.

@skip-in-library-mode
Scenario: Verify vector store is registered

This scenario tests llama-stack behavior. It was useful for debugging and could be kept as a single step to make sure the vector_store exists, but the entire test should not be there.
