Conversation

@are-ces
Contributor

@are-ces are-ces commented Dec 22, 2025

Description

  • Added e2e tests for the WatsonX provider
  • Updated documentation
  • Updated e2e tests for the issue described below
  • Updated query.py for the issue described below

When calling query without explicitly specifying a model or provider, query.py in LCS checks whether the model is registered. In this case, LCS fails to find the WatsonX model because WatsonX model identifiers are not built in the expected `<provider_id>/<model_id>` format. WatsonX is one of the few providers that registers a model_id containing a `/`, for example `meta-llama/llama-4-maverick-17b-128e-instruct-fp8`. In llama-stack this is stored as the model.identifier, while LCS expects it to be `watsonx/meta-llama/llama-4-maverick-17b-128e-instruct-fp8`.
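The mismatch can be sketched as follows (a minimal illustration with hardcoded values; the variable names are mine, not actual LCS code):

```python
# Illustrative sketch of the identifier mismatch described above.
provider_id = "watsonx"
model_id = "meta-llama/llama-4-maverick-17b-128e-instruct-fp8"

# LCS expects registered models to use the "<provider_id>/<model_id>" format:
expected_identifier = f"{provider_id}/{model_id}"

# ...but llama-stack stores the WatsonX model under the bare model_id,
# which already contains "/" characters, so the lookup never matches:
stored_identifier = model_id

print(expected_identifier)  # watsonx/meta-llama/llama-4-maverick-17b-128e-instruct-fp8
print(expected_identifier == stored_identifier)  # False
```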

As a workaround, I propose using environment variables to override the model and provider used in end-to-end tests (E2E_DEFAULT_MODEL_OVERRIDE and E2E_DEFAULT_PROVIDER_OVERRIDE). Additionally, the e2e tests should explicitly specify the model and provider instead of sending only the query field.

We can keep the existing tests in query.feature and streaming.feature that verify calling the endpoint without specifying a provider or model. These tests will continue to fail for WatsonX until the issue is fixed upstream.
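The override logic might look roughly like this (a sketch; only the two environment variable names come from this PR, while the function name and fallback mechanics are assumptions):

```python
import os


def resolve_default_llm(detected_model: str, detected_provider: str) -> tuple[str, str]:
    """Return the (model, provider) pair to use in e2e tests.

    Prefers the override environment variables when both are set;
    otherwise falls back to whatever service detection found.
    """
    model_override = os.getenv("E2E_DEFAULT_MODEL_OVERRIDE")
    provider_override = os.getenv("E2E_DEFAULT_PROVIDER_OVERRIDE")
    if model_override and provider_override:
        return model_override, provider_override
    return detected_model, detected_provider
```

With this shape, setting only one of the two variables silently falls back to service detection.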

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)
NA

Related Tickets & Documents

  • Related Issue # LCORE-332
  • Closes # LCORE-332

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • New Features

    • WatsonX provider added (configurable endpoint, project ID, API key) and improved model/provider selection.
    • Agent-to-Agent (A2A) protocol support with agent card, streaming A2A endpoints, and persistent A2A state options.
  • Documentation

    • New A2A protocol docs, OpenAPI /v1/infer doc, and Database structure/navigation pages; provider tables updated for WatsonX.
  • Tests

    • Expanded end-to-end and CI workflows to cover WatsonX and A2A scenarios.


@coderabbitai
Contributor

coderabbitai bot commented Dec 22, 2025

Warning

Rate limit exceeded

@are-ces has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 45 seconds before requesting another review.

⌛ How to resolve this issue?

After the wait time has elapsed, a review can be triggered using the @coderabbitai review command as a PR comment. Alternatively, push new commits to this PR.

We recommend that you space out your commits to avoid hitting the rate limit.


📥 Commits

Reviewing files that changed from the base of the PR and between cf81fea and 60de883.

📒 Files selected for processing (13)
  • .github/workflows/e2e_tests_providers.yaml
  • README.md
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/watsonx-run.yaml
  • src/app/endpoints/query.py
  • tests/e2e/configs/run-watsonx.yaml
  • tests/e2e/features/conversations.feature
  • tests/e2e/features/environment.py
  • tests/e2e/features/query.feature
  • tests/e2e/features/steps/feedback.py
  • tests/e2e/features/streaming_query.feature

Walkthrough

Adds A2A (Agent-to-Agent) protocol support, A2A persistent state storage backends, pre-LLM shield moderation, the WatsonX provider with docs and config, many OpenAPI/A2A schema and docs updates, extensive test additions (E2E and unit), CI workflow additions for multi-environment E2E runs, and assorted typing and docstring cleanups across the codebase.

Changes

Cohort / File(s) Summary
A2A Endpoints & Runtime
src/app/endpoints/a2a.py, src/app/routers.py, examples/lightspeed-stack-a2a-state-*.yaml, lightspeed-stack.yaml, docs/a2a_protocol.md, docs/config.md, docs/config.html, docs/openapi.md, docs/openapi.json
New A2A endpoint implementation, agent-card serving, A2A JSON-RPC handler (streaming & non‑streaming), health endpoint, router inclusion, new examples and config fields (service.base_url, customization.agent_card_*), and OpenAPI/A2A schema/docs additions.
A2A Storage Backends & Factory
src/a2a_storage/*, src/a2a_storage/...
New A2A storage package with abstract A2AContextStore, in-memory/SQLite/Postgres implementations, and A2AStorageFactory to create/initialize task/context stores and manage engines.
Shield Moderation & Related Utilities
src/utils/shields.py, src/utils/types.py, src/app/endpoints/streaming_query_v2.py, src/app/endpoints/query_v2.py, src/app/endpoints/streaming_query.py
Introduced pre-LLM shield moderation flow (run_shield_moderation, append_turn_to_conversation), ShieldModerationResult type, violation streaming, and integration into query_v2/streaming_query_v2; removed prior guardrail attachment pattern.
Token Usage / Quota Integration
src/utils/quota.py, src/app/endpoints/query.py, src/app/endpoints/streaming_query_v2.py, tests related
Extended consume_tokens to accept token_usage_history, model_id, provider_id and updated call sites to pass token usage history and identifiers.
Providers / WatsonX Additions
README.md, docs/providers.md, docker-compose.yaml, docker-compose-library.yaml, examples/watsonx-run.yaml, tests/e2e/configs/run-watsonx.yaml
Added WatsonX entries to docs/tables, new example/test run YAMLs for watsonx, and introduced WATSONX_* env vars plus E2E_OPENAI_MODEL default changes (gpt-4-turbo → gpt-4o-mini).
E2E Tests & Behave Steps
tests/e2e/features/*.feature, tests/e2e/features/environment.py, tests/e2e/features/steps/*.py, tests/e2e/configs/run-*.yaml, .github/workflows/e2e_tests_providers.yaml
Added model/provider placeholders to request payloads, added fallback/override logic for default model/provider, new watsonx run config, and new provider-aware GitHub Actions E2E workflow matrix.
OpenAPI / API Schema & Docs
docs/openapi.json, docs/openapi.md, src/app/endpoints/README.md
Large OpenAPI additions including /v1/infer and many A2A/rlsapi schemas, plus endpoint README updates.
DB Schema Documentation
docs/DB/*
Added SchemaSpy-generated static DB documentation pages (index, tables/columns/constraints/relationships/orphans/anomalies/routines).
Cache & Postgres Schema Namespace
src/cache/postgres_cache.py, tests/unit/cache/test_postgres_cache.py
Add namespace-aware schema handling for Postgres cache (namespace validation, search_path, CREATE SCHEMA) and tests for namespace validation.
Main App Lifecycle / Cleanup
src/app/main.py
Calls A2AStorageFactory.cleanup() on shutdown to dispose engines/resources.
Config & Models Changes
src/models/config.py, src/models/responses.py, src/models/requests.py, src/models/rlsapi/responses.py
Added A2AStateConfiguration, new ServiceConfiguration base_url and Customization agent_card fields; multiple typing changes switching union-with-None to Optional[...] and new enum Action values for A2A.
Tests: Unit & Integration Additions/Updates
tests/unit/**, tests/integration/**
Many new unit tests for A2A storage, A2A endpoints, shields; updates to numerous tests to integrate moderation, model/provider changes, typing adjustments, and extended docstrings.
CI / Tekton / Build Files
.github/workflows/e2e_tests.yaml, .github/workflows/e2e_tests_providers.yaml, .tekton/*.yaml, pyproject.toml, requirements-build.txt, Makefile, Containerfile
New/updated E2E workflows (providers matrix), removed non-ci matrix branches, Tekton task bundle/digest updates and expanded prefetch input, dependency pins and build requirements file, small Makefile and Containerfile tweaks.
Misc Typing & Docstring Cleanup
many src/* and tests/*
Widespread docstring expansions, migration from `X | None`-style unions to `Optional[X]` annotations
Ignored Files
.gitignore
Added requirements.*.backup ignore pattern.

Sequence Diagram(s)

```mermaid
sequenceDiagram
    participant Client
    participant A2A_API as A2A Endpoint (/a2a)
    participant Executor as A2AAgentExecutor
    participant Llama as Llama Stack (Responses API)
    participant ContextStore as A2AContextStore
    participant TaskStore as TaskStore
    Client->>A2A_API: POST /a2a JSON-RPC (streaming or buffered)
    A2A_API->>Executor: validate request, select model/provider, authenticate
    Executor->>ContextStore: resolve context_id -> conversation_id
    Executor->>TaskStore: create Task, emit initial TaskStatus
    Executor->>Llama: call Responses API (streaming)
    Llama-->>Executor: stream chunks/events
    Executor->>TaskStore: update task status/events
    Executor-->>Client: stream A2A events (TaskStatusUpdate/ArtifactUpdate)
    Executor->>ContextStore: persist context->conversation mapping (if needed)
    Note over Executor,TaskStore: on completion aggregate final TaskResult and store result
```

Estimated code review effort

🎯 5 (Critical) | ⏱️ ~120 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
  • Description Check — ✅ Passed: Check skipped because CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: The title clearly identifies the primary change: adding WatsonX LLM provider support. It is concise, specific, and directly related to the main objective of the changeset.
  • Docstring Coverage — ✅ Passed: Docstring coverage is 98.47%, which is sufficient; the required threshold is 80.00%.




@are-ces
Contributor Author

are-ces commented Dec 22, 2025

PS: Please ignore issues with the VertexAI provider; they are independent of these changes. The project has been migrated by the cloud team, and I will need to talk to them to re-enable it.

@are-ces are-ces requested review from radofuchs and tisnik and removed request for radofuchs December 22, 2025 09:22
Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

🧹 Nitpick comments (6)
.github/workflows/e2e_tests.yaml (1)

263-268: Document the watsonx override as temporary technical debt.

The override E2E_DEFAULT_MODEL_OVERRIDE=watsonx/watsonx/meta-llama/llama-3-3-70b-instruct duplicates the provider name, which is a workaround for WatsonX's model identifier format containing /. Per the PR description, this is intentional until an upstream fix is applied.

Consider tracking this as technical debt with a TODO comment or issue reference to ensure it's revisited when the upstream fix is available.

🔎 Suggested improvement
 # watsonx has a different convention than "<provider>/<model>"
+# TODO(LCORE-XXX): Remove this workaround once Llama Stack properly handles WatsonX model identifiers with embedded slashes
 - name: Set watsonx test overrides
   if: matrix.environment == 'watsonx'
   run: |
     echo "E2E_DEFAULT_MODEL_OVERRIDE=watsonx/watsonx/meta-llama/llama-3-3-70b-instruct" >> $GITHUB_ENV
     echo "E2E_DEFAULT_PROVIDER_OVERRIDE=watsonx" >> $GITHUB_ENV
tests/e2e/features/environment.py (1)

61-86: Consider validating partial override configuration.

The current logic requires both E2E_DEFAULT_MODEL_OVERRIDE and E2E_DEFAULT_PROVIDER_OVERRIDE to be set (line 65). If only one is provided, both are silently ignored and the code falls back to service detection. This could be confusing for users who expect a partial override to work.

Consider adding a warning when only one override is set, or raising an error if they should be used together.

🔎 Proposed validation for partial overrides
 # Check for environment variable overrides first
 model_override = os.getenv("E2E_DEFAULT_MODEL_OVERRIDE")
 provider_override = os.getenv("E2E_DEFAULT_PROVIDER_OVERRIDE")
 
+# Validate that overrides are provided together
+if bool(model_override) != bool(provider_override):
+    print(
+        "⚠ Warning: Both E2E_DEFAULT_MODEL_OVERRIDE and E2E_DEFAULT_PROVIDER_OVERRIDE "
+        "must be set together. Falling back to service detection."
+    )
+
 if model_override and provider_override:
     context.default_model = model_override
     context.default_provider = provider_override
     print(
         f"Using override LLM: {context.default_model} (provider: {context.default_provider})"
     )
examples/watsonx-run.yaml (2)

65-65: Clarify the hardcoded asterisks for openai_api_key.

The openai_api_key is set to a literal string of asterisks '********'. If the braintrust scoring provider is intended to be functional, this will fail authentication. If braintrust is disabled or not used in this example, consider removing this provider entry or adding a comment explaining it's a placeholder.


50-50: Track the workaround for disabled safety shields.

Safety shields are disabled with a warning comment about infinite loop issues with LLM calls (lines 50 and 149). This is a significant security/safety feature being disabled.

Consider creating a tracking issue for this known limitation so it can be re-enabled once the upstream issue is resolved.

Would you like me to help create a GitHub issue to track this limitation?

Also applies to: 149-149

tests/e2e/configs/run-watsonx.yaml (2)

65-65: Clarify the hardcoded asterisks for openai_api_key.

The openai_api_key is set to a literal string of asterisks '********'. If the braintrust scoring provider is needed for E2E tests, this will fail authentication. If braintrust is not used in these tests, consider removing this provider entry or documenting that it's intentionally disabled.


50-50: Document the workaround for disabled safety shields in E2E tests.

Safety shields are disabled with a warning comment about infinite loop issues with LLM calls (lines 50 and 149). This affects the test coverage for safety features.

Consider adding a comment or documentation about which safety-related test scenarios are skipped due to this limitation.

Also applies to: 149-149

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between cd795b3 and 9dbf66b.

📒 Files selected for processing (12)
  • .github/workflows/e2e_tests.yaml
  • README.md
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/watsonx-run.yaml
  • src/app/endpoints/query.py
  • tests/e2e/configs/run-watsonx.yaml
  • tests/e2e/features/conversations.feature
  • tests/e2e/features/environment.py
  • tests/e2e/features/query.feature
  • tests/e2e/features/streaming_query.feature
🧰 Additional context used
📓 Path-based instructions (5)
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/conversations.feature
  • tests/e2e/features/streaming_query.feature
  • tests/e2e/features/query.feature
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Use logger = logging.getLogger(__name__) pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) and Optional[Type] or Type | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use async def for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: Configuration for config classes, Error/Exception for exceptions, Resolver for strategy patterns, Interface for abstract base classes
Abstract classes must use ABC with @abstractmethod decorators
Include complete type annotations for all class attributes in Python classes
Use import logging and module logger pattern with standard log levels: debug, info, warning, error

Files:

  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/query.py
src/**/{client,app/endpoints/**}.py

📄 CodeRabbit inference engine (CLAUDE.md)

Handle APIConnectionError from Llama Stack in integration code

Files:

  • src/app/endpoints/query.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use pytest-mock with AsyncMock objects for mocking in tests

Files:

  • tests/e2e/features/environment.py
🧠 Learnings (1)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • examples/watsonx-run.yaml
  • tests/e2e/configs/run-watsonx.yaml
🧬 Code graph analysis (2)
src/app/endpoints/query.py (2)
tests/unit/app/endpoints/test_streaming_query.py (1)
  • ToolExecutionStep (207-221)
src/models/responses.py (1)
  • QueryResponse (341-452)
tests/e2e/features/environment.py (1)
src/models/responses.py (1)
  • model_override (1497-1516)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (10)
  • GitHub Check: unit_tests (3.13)
  • GitHub Check: Pyright
  • GitHub Check: Pylinter
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / vertexai
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: server mode / vertexai
  • GitHub Check: E2E: library mode / azure
  • GitHub Check: E2E: server mode / azure
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (11)
docs/providers.md (1)

58-58: LGTM!

The provider change from ibm_watsonx_ai to litellm correctly reflects the implementation approach for WatsonX support, with the supported status appropriately maintained.

README.md (1)

125-125: LGTM!

The WatsonX provider documentation additions are properly formatted and include appropriate references to setup documentation and example configurations.

Also applies to: 181-181

.github/workflows/e2e_tests.yaml (1)

13-13: LGTM!

The watsonx environment additions to the test matrix and the credential configuration for both server and library modes are properly structured and consistent with other provider configurations.

Also applies to: 203-204, 231-232

docker-compose.yaml (1)

35-38: LGTM!

The WatsonX environment variables follow the same pattern as other provider configurations, with appropriate default empty values.

tests/e2e/features/conversations.feature (1)

14-14: LGTM!

The payload extensions to include explicit model and provider fields align with the PR's goal of supporting provider-specific testing. The placeholder substitution pattern is consistent across all scenarios.

Also applies to: 31-31, 53-53, 100-100, 138-138, 152-152, 190-190

docker-compose-library.yaml (1)

37-40: LGTM!

The WatsonX environment variables are consistently configured for library mode, mirroring the server mode configuration in docker-compose.yaml.

tests/e2e/features/query.feature (1)

13-13: LGTM!

The test payload extensions appropriately add explicit model and provider fields to positive test scenarios while preserving error-condition tests that validate missing parameters. This aligns with the PR's goal of supporting explicit provider/model specification.

Also applies to: 25-25, 37-37, 54-54, 72-72

src/app/endpoints/query.py (1)

542-551: Provider filtering provides sufficient protection against identifier collisions.

The matching logic at line 543-544 uses an AND condition between the identifier check and provider_id check: m.identifier in (llama_stack_model_id, model_id) and m.provider_id == provider_id. This ensures that only models from the specified provider are matched, preventing unintended collisions across providers even when multiple providers have models with the same plain identifier (e.g., "custom-model"). While the logic accepts both provider_id/model_id format and plain model_id, the strict provider_id filter isolates the search to a single provider, eliminating the collision risk.
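Taking the reviewer's description at face value, the relaxed matching can be sketched with mock model records (the Model dataclass here is illustrative, not the llama-stack type):

```python
from dataclasses import dataclass


@dataclass
class Model:
    """Minimal stand-in for a registered llama-stack model record."""
    identifier: str
    provider_id: str


def model_is_registered(models: list[Model], model_id: str, provider_id: str) -> bool:
    """Match either "<provider>/<model>" or the bare model_id,
    but only within the requested provider."""
    llama_stack_model_id = f"{provider_id}/{model_id}"
    return any(
        m.identifier in (llama_stack_model_id, model_id)
        and m.provider_id == provider_id
        for m in models
    )


models = [
    Model("meta-llama/llama-3-3-70b-instruct", "watsonx"),  # bare id containing "/"
    Model("openai/gpt-4o-mini", "openai"),                  # "<provider>/<model>" form
]
print(model_is_registered(models, "meta-llama/llama-3-3-70b-instruct", "watsonx"))  # True
print(model_is_registered(models, "gpt-4o-mini", "openai"))   # True
print(model_is_registered(models, "gpt-4o-mini", "watsonx"))  # False: provider filter
```

The provider_id check is what keeps the relaxed identifier matching from colliding across providers.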

examples/watsonx-run.yaml (1)

144-148: Verify the provider_model_id format aligns with the upstream fix.

The provider_model_id on line 148 uses the format watsonx/meta-llama/llama-3-3-70b-instruct, which includes the provider prefix. According to the PR description, this is a workaround because WatsonX model identifiers contain slashes that cause parsing issues.

Ensure that when the upstream fix is applied, this configuration format remains compatible or that migration guidance is provided.

tests/e2e/configs/run-watsonx.yaml (1)

144-148: Verify the provider_model_id format aligns with the upstream fix.

The provider_model_id on line 148 uses the format watsonx/meta-llama/llama-3-3-70b-instruct, which includes the provider prefix. According to the PR description, this is a workaround because WatsonX model identifiers contain slashes that cause parsing issues.

Ensure that when the upstream fix is applied, this E2E test configuration format remains compatible or is updated accordingly.

tests/e2e/features/streaming_query.feature (1)

14-14: Placeholder substitution is properly configured.

The {MODEL} and {PROVIDER} placeholders are correctly substituted at runtime. The before_all hook in environment.py initializes context.default_model and context.default_provider, and the replace_placeholders() function is consistently called by all step implementations (in llm_query_response.py and common_http.py) before processing JSON payloads. The mechanism is working as intended.
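As a rough sketch of that substitution step (a hypothetical helper; the real logic lives in environment.py and the step modules):

```python
def replace_placeholders(payload: str, model: str, provider: str) -> str:
    """Fill the {MODEL} and {PROVIDER} placeholders used in the feature files."""
    return payload.replace("{MODEL}", model).replace("{PROVIDER}", provider)


payload = '{"query": "Hello", "model": "{MODEL}", "provider": "{PROVIDER}"}'
print(replace_placeholders(payload, "gpt-4o-mini", "openai"))
# {"query": "Hello", "model": "gpt-4o-mini", "provider": "openai"}
```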

Contributor

@tisnik tisnik left a comment


LGTM

Contributor

@tisnik tisnik left a comment


This PR looks ok. Added just two comments, probably minor ones.

```diff
 # TODO: Create sepparate validation of provider
 if not any(
-    m.identifier == llama_stack_model_id and m.provider_id == provider_id
+    m.identifier in (llama_stack_model_id, model_id)
+    and m.provider_id == provider_id
```
Contributor


the whole model/provider handling is a bit wacky, but it is OT: not a problem for this PR

Contributor Author


Yep I agree

Contributor

@radofuchs radofuchs left a comment


I have two issues here:

  1. We need to keep one query test that exercises the default behavior without a specified model.
  2. The feedback tests also have a query call that requires the model to be set.

Contributor Author

@are-ces are-ces left a comment


Reviewed @tisnik comments

@are-ces
Contributor Author

are-ces commented Jan 7, 2026

I have two issues here,

1. we need to keep one query test to test the default behavior without the specified model

2. the feedback also has a query call that requires the model to be set

@radofuchs The streaming query and query endpoints already have tests in place for a missing provider and model. Do you mean that the conversations test should also have such a scenario?

Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @.github/workflows/e2e_tests.yaml:
- Around line 263-268: The E2E_DEFAULT_MODEL_OVERRIDE value currently includes a
duplicated "watsonx/" prefix that doesn't match the provider_model_id in
tests/e2e/configs/run-watsonx.yaml; update the override for the watsonx matrix
case so E2E_DEFAULT_MODEL_OVERRIDE equals the same identifier used in
run-watsonx.yaml (provider_model_id: watsonx/meta-llama/llama-3-3-70b-instruct)
— i.e., remove the extra leading "watsonx/" from the echo that sets
E2E_DEFAULT_MODEL_OVERRIDE in the Set watsonx test overrides step so the
environment var and config match exactly.
🧹 Nitpick comments (1)
tests/e2e/configs/run-watsonx.yaml (1)

62-65: Consider removing unused Braintrust scoring provider.

The Braintrust scoring provider with a masked OpenAI API key may not be relevant to WatsonX-specific e2e tests. If this provider isn't exercised in the WatsonX test scenarios, consider removing it to keep the configuration focused.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 9dbf66b and 1d03c1a.

📒 Files selected for processing (12)
  • .github/workflows/e2e_tests.yaml
  • README.md
  • docker-compose-library.yaml
  • docker-compose.yaml
  • docs/providers.md
  • examples/watsonx-run.yaml
  • src/app/endpoints/query.py
  • tests/e2e/configs/run-watsonx.yaml
  • tests/e2e/features/conversations.feature
  • tests/e2e/features/environment.py
  • tests/e2e/features/query.feature
  • tests/e2e/features/streaming_query.feature
🚧 Files skipped from review as they are similar to previous changes (3)
  • tests/e2e/features/query.feature
  • docs/providers.md
  • docker-compose-library.yaml
🧰 Additional context used
📓 Path-based instructions (5)
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use pytest-mock with AsyncMock objects for mocking in tests

Files:

  • tests/e2e/features/environment.py
src/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

src/**/*.py: Use absolute imports for internal modules in LCS project (e.g., from auth import get_auth_dependency)
All modules must start with descriptive docstrings explaining their purpose
Use logger = logging.getLogger(__name__) pattern for module logging
All functions must include complete type annotations for parameters and return types, using modern syntax (str | int) and Optional[Type] or Type | None
All functions must have docstrings with brief descriptions following Google Python docstring conventions
Function names must use snake_case with descriptive, action-oriented names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead of modifying input parameters
Use async def for I/O operations and external API calls
All classes must include descriptive docstrings explaining their purpose following Google Python docstring conventions
Class names must use PascalCase with descriptive names and standard suffixes: Configuration for config classes, Error/Exception for exceptions, Resolver for strategy patterns, Interface for abstract base classes
Abstract classes must use ABC with @abstractmethod decorators
Include complete type annotations for all class attributes in Python classes
Use import logging and module logger pattern with standard log levels: debug, info, warning, error

Files:

  • src/app/endpoints/query.py
src/app/endpoints/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Use FastAPI HTTPException with appropriate status codes for API endpoint error handling

Files:

  • src/app/endpoints/query.py
src/**/{client,app/endpoints/**}.py

📄 CodeRabbit inference engine (CLAUDE.md)

Handle APIConnectionError from Llama Stack in integration code

Files:

  • src/app/endpoints/query.py
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/streaming_query.feature
  • tests/e2e/features/conversations.feature
🧠 Learnings (1)
📚 Learning: 2025-12-18T10:21:09.038Z
Learnt from: are-ces
Repo: lightspeed-core/lightspeed-stack PR: 935
File: run.yaml:114-115
Timestamp: 2025-12-18T10:21:09.038Z
Learning: In Llama Stack version 0.3.x, telemetry provider configuration is not supported under the `providers` section in run.yaml configuration files. Telemetry can be enabled with just `telemetry.enabled: true` without requiring an explicit provider block.

Applied to files:

  • tests/e2e/configs/run-watsonx.yaml
  • examples/watsonx-run.yaml
🧬 Code graph analysis (2)
tests/e2e/features/environment.py (1)
src/models/responses.py (1)
  • model_override (1497-1516)
src/app/endpoints/query.py (2)
tests/unit/app/endpoints/test_streaming_query.py (1)
  • ToolExecutionStep (207-221)
src/models/responses.py (1)
  • QueryResponse (341-452)
🪛 markdownlint-cli2 (0.18.1)
README.md

125-125: Bare URL used

(MD034, no-bare-urls)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (3)
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: E2E: server mode / ci
🔇 Additional comments (17)
README.md (2)

125-125: LGTM - Documentation follows existing table format.

The bare URL flagged by markdownlint is consistent with all other provider URLs in this table (lines 122-127). No change needed for consistency.


181-181: WatsonX provider entry looks correct.

The new table row follows the established format and correctly references the example configuration file.

docker-compose.yaml (1)

35-38: LGTM - WatsonX environment variables follow established patterns.

The new environment variables are correctly structured with empty defaults, consistent with other provider configurations (e.g., VertexAI on lines 32-34).

src/app/endpoints/query.py (2)

23-23: Import consolidation looks good.

The ToolExecutionStep import is correctly moved to follow the other llama_stack_client.types.alpha imports.


542-545: Logic change correctly addresses WatsonX model identifier format.

The updated validation now accepts models registered with either:

  • Full format: provider_id/model_id (e.g., watsonx/meta-llama/llama-3-3-70b-instruct)
  • Plain model_id that may itself contain / (e.g., meta-llama/llama-3-3-70b-instruct)

The provider_id check on line 544 ensures we don't accidentally match a model from a different provider, so this relaxed matching is safe.
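The relaxed matching described above can be sketched roughly as follows (function and field names here are hypothetical illustrations, not the actual query.py code):

```python
def model_is_registered(
    models: list[dict], provider_id: str, model_id: str
) -> bool:
    """Check whether the requested model is served by the given provider.

    Accepts identifiers stored either as "<provider_id>/<model_id>" or as
    the plain model_id, which may itself contain "/" (as with WatsonX).
    """
    full_id = f"{provider_id}/{model_id}"
    for model in models:
        # The provider_id guard prevents a plain model_id registered by a
        # different provider from matching by accident.
        if model["provider_id"] == provider_id and model["identifier"] in (
            full_id,
            model_id,
        ):
            return True
    return False
```

With this shape, a WatsonX model stored under the plain identifier meta-llama/llama-3-3-70b-instruct matches when the caller asks for provider watsonx, but the same identifier under a different provider does not.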

examples/watsonx-run.yaml (1)

1-161: Well-structured WatsonX example configuration.

The configuration follows established patterns from other example files. A few notes:

  1. The safety provider is disabled with appropriate warning comments (lines 50, 149)
  2. Placeholder credentials (key-not-set, project-not-set, ********) are suitable for example files
  3. Per learnings, telemetry configuration (line 117) correctly uses enabled: true without a provider block for Llama Stack 0.3.x
tests/e2e/features/conversations.feature (1)

14-14: Test payload updates correctly include model/provider fields.

The addition of "model": "{MODEL}", "provider": "{PROVIDER}" to query payloads enables explicit model/provider routing for WatsonX tests where the default lookup would fail due to the / character in model identifiers. The placeholder substitution pattern is appropriate for Gherkin feature files.

tests/e2e/features/streaming_query.feature (1)

14-14: Streaming query tests appropriately extended with model/provider fields.

All scenarios that should specify model/provider now include them, while the validation test scenarios (lines 69-95) correctly test edge cases for missing fields.

.github/workflows/e2e_tests.yaml (2)

13-13: WatsonX environment correctly added to test matrix.

The matrix expansion enables WatsonX E2E tests to run in both server and library modes.


203-204: WatsonX credentials properly sourced from secrets.

Credentials are correctly passed via secrets.WATSONX_PROJECT_ID and secrets.WATSONX_API_KEY, following the established pattern for other provider credentials in this workflow.

tests/e2e/features/environment.py (3)

25-26: LGTM!

The fallback constants are well-named and provide sensible defaults for development environments when model detection fails.


64-74: LGTM!

The override logic correctly requires both environment variables to be set (non-empty) before applying overrides. The truthiness check on line 69 properly handles empty strings, and the log message clearly indicates when overrides are active.


76-90: LGTM!

The three-tier fallback strategy (override → service detection → hardcoded fallback) is well-implemented. The service detection function has proper error handling, and the fallback scenario is clearly logged with a warning indicator.
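The three-tier resolution could be sketched like this (the constant values and the detection callback are assumptions for illustration, not the actual environment.py code):

```python
import logging
import os

logger = logging.getLogger(__name__)

# Hypothetical fallbacks; environment.py defines its own constants.
FALLBACK_MODEL = "fallback-model"
FALLBACK_PROVIDER = "fallback-provider"


def resolve_default_model(detect_from_service):
    """Resolve (model, provider) via override -> service detection -> fallback."""
    model = os.environ.get("E2E_DEFAULT_MODEL_OVERRIDE", "")
    provider = os.environ.get("E2E_DEFAULT_PROVIDER_OVERRIDE", "")
    if model and provider:  # both must be non-empty for the override to apply
        logger.info("Using model/provider override: %s/%s", provider, model)
        return model, provider
    try:
        detected = detect_from_service()  # e.g. query the models endpoint
        if detected:
            return detected
    except Exception as exc:  # service unreachable, malformed response, ...
        logger.warning("Model detection failed: %s", exc)
    logger.warning("Falling back to hardcoded defaults")
    return FALLBACK_MODEL, FALLBACK_PROVIDER
```

The truthiness check on both variables is what makes an empty string behave the same as an unset variable, matching the behavior the review describes.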

tests/e2e/configs/run-watsonx.yaml (4)

35-38: Verify the timeout value is intentional.

The WatsonX provider is configured with a 1200-second (20-minute) timeout, which is unusually high for API calls. Confirm this is intentional for WatsonX-specific requirements (e.g., model loading, cold starts) rather than a placeholder value.


50-50: Note: Safety shields intentionally disabled.

The configuration disables safety shields with a documented warning about infinite loop issues. This aligns with the PR's focus on WatsonX support, with the understanding that shield issues will be addressed separately.


144-148: LGTM!

The registered model demonstrates the WatsonX provider_model_id format (watsonx/meta-llama/llama-3-3-70b-instruct) that contains multiple slashes, which aligns with the PR's documented workaround for the model lookup issue. The custom model_id provides a stable reference for testing.
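Parsing such multi-slash identifiers is only safe when exactly the first slash is treated as the provider separator; a minimal sketch, assuming the <provider_id>/<model_id> layout:

```python
def split_identifier(identifier: str) -> tuple[str, str]:
    """Split "watsonx/meta-llama/llama-3-3-70b-instruct" into provider and model.

    Only the first "/" separates provider from model; the model id itself
    may contain further slashes, as WatsonX identifiers do.
    """
    provider, _, model = identifier.partition("/")
    return provider, model
```

A naive identifier.split("/") would instead scatter the model id across several fragments, which is essentially the lookup failure this PR works around.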


116-117: LGTM!

The telemetry configuration correctly uses enabled: true without an explicit provider block, which aligns with the supported format for Llama Stack 0.3.x. Based on learnings, this is the proper configuration approach.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
README.md (1)

126-126: Documentation addition looks good!

The WatsonX provider entry is correctly formatted and consistent with the existing table structure.

Note: The static analysis tool flags this line for using a bare URL (MD034). While technically valid, this is a pre-existing pattern across all rows in this table (lines 123-128). If desired, the entire table could be updated to use proper Markdown link syntax [text](URL) in a separate refactoring effort.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d03c1a and 43edee7.

📒 Files selected for processing (6)
  • .github/workflows/e2e_tests.yaml
  • README.md
  • docker-compose-library.yaml
  • docker-compose.yaml
  • src/app/endpoints/query.py
  • tests/e2e/features/environment.py
🚧 Files skipped from review as they are similar to previous changes (4)
  • .github/workflows/e2e_tests.yaml
  • tests/e2e/features/environment.py
  • docker-compose.yaml
  • src/app/endpoints/query.py
🧰 Additional context used
🪛 markdownlint-cli2 (0.18.1)
README.md

126-126: Bare URL used

(MD034, no-bare-urls)

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
🔇 Additional comments (2)
docker-compose-library.yaml (1)

37-40: LGTM!

The WatsonX environment variables follow the established pattern used by other providers in this file (empty-string defaults, consistent naming convention). The three variables (WATSONX_BASE_URL, WATSONX_PROJECT_ID, WATSONX_API_KEY) cover the standard configuration required for WatsonX authentication.

README.md (1)

182-182: LGTM!

The WatsonX entry in the LLM compatibility table is correctly formatted with all required fields: provider name, model identifier, tool calling support, provider type, and example configuration reference.

Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @.github/workflows/e2e_tests_providers.yaml:
- Around lines 268-274: In the "Set watsonx test overrides" workflow step (the block guarded by if: matrix.environment == 'watsonx'), replace the provider_model_id path value with the registered model_id by setting E2E_DEFAULT_MODEL_OVERRIDE=custom-watsonx-model (keeping E2E_DEFAULT_PROVIDER_OVERRIDE=watsonx). This sets context.default_model to the registered model name rather than a provider_model_id such as watsonx/watsonx/..., which causes the lookup failure.

In @tests/e2e/features/steps/feedback.py:
- Line 105: The payload dictionary assignment in feedback.py exceeds Black's 88-character line limit; reformat the literal across multiple lines (one key per line: "query", "system_prompt", "model", "provider") so the line is wrapped and Black passes, keeping the same keys and values (continue using context.default_model and context.default_provider as before).
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 43edee7 and cf81fea.

📒 Files selected for processing (4)
  • .github/workflows/e2e_tests_providers.yaml
  • tests/e2e/features/query.feature
  • tests/e2e/features/steps/feedback.py
  • tests/e2e/features/streaming_query.feature
🚧 Files skipped from review as they are similar to previous changes (1)
  • tests/e2e/features/query.feature
🧰 Additional context used
📓 Path-based instructions (4)
tests/e2e/features/**/*.feature

📄 CodeRabbit inference engine (CLAUDE.md)

Use behave (BDD) framework with Gherkin feature files for end-to-end tests

Files:

  • tests/e2e/features/streaming_query.feature
**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

**/*.py: Use absolute imports for internal modules: from authentication import get_auth_dependency
Use FastAPI dependencies: from fastapi import APIRouter, HTTPException, Request, status, Depends
Use Llama Stack imports: from llama_stack_client import AsyncLlamaStackClient
Check constants.py for shared constants before defining new ones
All modules start with descriptive docstrings explaining purpose
Use logger = logging.getLogger(__name__) pattern for module logging
Type aliases defined at module level for clarity
All functions require docstrings with brief descriptions
Complete type annotations for function parameters and return types, using typing_extensions.Self for model validators
Use union types with modern syntax: str | int instead of Union[str, int]
Use Optional[Type] for optional parameters
Use snake_case with descriptive, action-oriented function names (get_, validate_, check_)
Avoid in-place parameter modification anti-patterns; return new data structures instead
Use async def for I/O operations and external API calls
Handle APIConnectionError from Llama Stack in error handling
Use logger.debug() for detailed diagnostic information
Use logger.info() for general information about program execution
Use logger.warning() for unexpected events or potential problems
Use logger.error() for serious problems that prevented function execution
All classes require descriptive docstrings explaining purpose
Use PascalCase for class names with descriptive names and standard suffixes: Configuration, Error/Exception, Resolver, Interface
Use ABC for abstract base classes with @abstractmethod decorators
Complete type annotations for all class attributes
Follow Google Python docstring conventions (https://google.github.io/styleguide/pyguide.html) with sections: Args, Returns, Raises, Attributes
Run uv run make format to auto-format code with black and ruff before completion
Run uv run make verify to run all linters (black, pyl...

Files:

  • tests/e2e/features/steps/feedback.py
tests/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

tests/**/*.py: Use pytest for all unit and integration tests, not unittest
Use pytest-mock for AsyncMock objects in tests
Use MOCK_AUTH = ("mock_user_id", "mock_username", False, "mock_token") pattern for authentication mocks in tests

Files:

  • tests/e2e/features/steps/feedback.py
tests/e2e/features/steps/**/*.py

📄 CodeRabbit inference engine (CLAUDE.md)

Define E2E test steps in tests/e2e/features/steps/ directory

Files:

  • tests/e2e/features/steps/feedback.py
🧠 Learnings (2)
📚 Learning: 2025-09-02T11:14:17.117Z
Learnt from: radofuchs
Repo: lightspeed-core/lightspeed-stack PR: 485
File: tests/e2e/features/steps/common_http.py:244-255
Timestamp: 2025-09-02T11:14:17.117Z
Learning: The POST step in tests/e2e/features/steps/common_http.py (`access_rest_api_endpoint_post`) is intentionally designed as a general-purpose HTTP POST method, not specifically for REST API endpoints, so it should not include context.api_prefix in the URL construction.

Applied to files:

  • tests/e2e/features/steps/feedback.py
📚 Learning: 2026-01-11T16:30:41.784Z
Learnt from: CR
Repo: lightspeed-core/lightspeed-stack PR: 0
File: CLAUDE.md:0-0
Timestamp: 2026-01-11T16:30:41.784Z
Learning: Applies to tests/e2e/features/steps/**/*.py : Define E2E test steps in `tests/e2e/features/steps/` directory

Applied to files:

  • tests/e2e/features/steps/feedback.py
🧬 Code graph analysis (1)
tests/e2e/features/steps/feedback.py (2)
tests/e2e/features/steps/common_http.py (1)
  • access_rest_api_endpoint_get (247-258)
tests/e2e/utils/utils.py (2)
  • restart_container (220-242)
  • switch_config (148-173)
🪛 GitHub Actions: Black
tests/e2e/features/steps/feedback.py

[error] 1-1: Black formatting check failed. 1 file would be reformatted. Run 'black' to format the code.

⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
  • GitHub Check: build-pr
  • GitHub Check: Konflux kflux-prd-rh02 / lightspeed-stack-on-pull-request
  • GitHub Check: E2E: server mode / ci
  • GitHub Check: E2E: library mode / ci
🔇 Additional comments (8)
tests/e2e/features/steps/feedback.py (2)

3-11: LGTM - imports are properly structured.

The imports follow absolute import patterns as required. The json import at line 3 is used in set_feedback and access_feedback_post_endpoint, and the utility imports on line 11 are used in configure_invalid_feedback_storage_path.


119-123: LGTM - utility functions properly integrated.

The step correctly uses switch_config and restart_container utilities to configure an invalid storage path and restart the container.

tests/e2e/features/streaming_query.feature (3)

14-14: LGTM - payload extended with model and provider.

The payload correctly includes model and provider fields using the {MODEL} and {PROVIDER} placeholders, aligning with the PR objective to explicitly specify these values for WatsonX compatibility.


77-85: LGTM - scenario validates default model/provider fallback.

This scenario correctly tests the endpoint behavior when model and provider are omitted, verifying that the service can fall back to defaults. Per the PR description, this scenario may fail for WatsonX due to the model identifier format issue until an upstream fix is available.


136-136: LGTM - unauthenticated scenario includes model/provider.

Consistent with other scenarios, the payload now includes model and provider fields. This ensures the 401 response is specifically for authentication failure, not for missing model/provider validation.

.github/workflows/e2e_tests_providers.yaml (3)

16-16: LGTM!

The matrix addition follows the existing pattern and correctly includes watsonx alongside the other providers.


206-208: LGTM!

WatsonX credentials are correctly sourced from GitHub secrets, following the established pattern for other providers.


235-237: LGTM!

Consistent with the Server Mode additions and follows the file's existing pattern of duplicating provider credentials across mode-specific env blocks.

@are-ces are-ces force-pushed the watsonx-support branch 6 times, most recently from 5b7c6c1 to eb0d3be on January 13, 2026 at 10:28
@are-ces
Contributor Author

are-ces commented Jan 13, 2026

@radofuchs I have fixed the two things you mentioned, PTAL

Contributor

@radofuchs radofuchs left a comment

LGTM

@radofuchs radofuchs merged commit efdc365 into lightspeed-core:main Jan 13, 2026
19 of 23 checks passed
3 participants