
Conversation

Contributor

@asimurka asimurka commented Jan 19, 2026

Description

This PR adds support for storing RAG chunks in the v2 streaming query endpoint. RAG chunks are now parsed on the fly, as each File Search Tool chunk is emitted.

Type of change

  • Refactor
  • New feature
  • Bug fix
  • CVE fix
  • Optimization
  • Documentation Update
  • Configuration Update
  • Bump-up service version
  • Bump-up dependent library
  • Bump-up library or tool used for development (does not change the final image)
  • CI configuration change
  • Konflux configuration change
  • Unit tests improvement
  • Integration tests improvement
  • End to end tests improvement

Tools used to create PR

Identify any AI code assistants used in this PR (for transparency and review context)

  • Assisted-by: Cursor

Related Tickets & Documents

Checklist before requesting a review

  • I have performed a self-review of my code.
  • PR has passed all pre-merge test jobs.
  • If it is a core feature, I have added thorough tests.

Testing

  • Please provide detailed steps to perform tests related to this code change.
  • How were the fix/results from this change verified? Please provide relevant screenshots or results.

Summary by CodeRabbit

  • Refactor

    • Improved RAG chunk extraction during query processing for more efficient data handling.
  • Chores

    • Code cleanup and formatting refinements across multiple files for better maintainability.


Contributor

coderabbitai bot commented Jan 19, 2026

Walkthrough

The PR refactors RAG chunk extraction in query processing by introducing a new helper function to extract RAG chunks from file search results and threading this data through tool call processing in both standard and streaming response endpoints. Additionally, it performs minor code cleanup including whitespace normalization and removal of unused test constants.

Changes

  • RAG Chunk Extraction Refactoring (src/app/endpoints/query_v2.py, src/app/endpoints/streaming_query_v2.py): Introduced an extract_rag_chunks_from_file_search_item() helper to convert file_search_call results into RAGChunk entries. Updated _build_tool_call_summary() to accept and process a rag_chunks parameter during tool call processing instead of in a separate post-processing step. Updated streaming_query_v2.py to thread rag_chunks through the Responses API integration and pass serialized chunks to cleanup_after_streaming().
  • Utility and Endpoint Updates (src/utils/endpoints.py): Updated the cleanup_after_streaming() docstring to clarify rag_chunks parameter behavior.
  • Logger Removal (src/utils/mcp_headers.py): Removed the module-level logger initialization, but the module still references logger.error(); this is a potential bug requiring verification.
  • Code Cleanup and Formatting (dev-tools/mcp-mock-server/server.py, src/client.py, src/quota/quota_limiter.py, tests/e2e/features/steps/llm_query_response.py, tests/unit/app/endpoints/test_a2a.py, tests/unit/app/endpoints/test_streaming_query.py, tests/unit/cache/test_postgres_cache.py, tests/unit/models/rlsapi/test_requests.py, tests/unit/models/rlsapi/test_responses.py, tests/unit/test_configuration.py): Removed blank lines and the unused CONVERSATION_ID_2 test constant, and reformatted YAML string literals in tests; no functional behavior changes.
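The logger-removal finding above can be illustrated with a minimal, hypothetical sketch (the parser function and header format are invented for illustration; only the logging pattern matters): if the conventional module-level logger initialization is removed while logger.error() calls remain, the error path raises NameError at runtime instead of logging.

```python
import logging

# Conventional module-level logger; the review notes this initialization was
# removed from src/utils/mcp_headers.py while logger.error() calls remained.
logger = logging.getLogger(__name__)

def parse_header_pair(raw: str) -> dict[str, str]:
    """Hypothetical parser: split 'key=value'; log and return {} if malformed."""
    key, sep, value = raw.partition("=")
    if not sep:
        # Without the module-level logger above, this call would raise
        # NameError at runtime instead of logging the problem.
        logger.error("malformed header pair: %r", raw)
        return {}
    return {key: value}
```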

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

Suggested labels

ok-to-test

Suggested reviewers

  • tisnik
  • are-ces
  • asamal4
🚥 Pre-merge checks: ✅ 3 of 3 passed

  • Title check: ✅ Passed. The title directly describes the main change (RAG chunk parsing improvement for streaming query), which aligns with the PR's primary objective of adding RAG chunk support to the streaming query flow.
  • Docstring Coverage: ✅ Passed. No functions found in the changed files to evaluate docstring coverage; check skipped.
  • Description Check: ✅ Passed. Check skipped because CodeRabbit's high-level summary is enabled.


Contributor

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 0

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)
src/utils/endpoints.py (1)

797-799: Bug: summary.rag_chunks is not synchronized with the rag_chunks parameter.

create_referenced_documents_with_metadata uses summary.rag_chunks, but in streaming_query_v2.py, summary.rag_chunks is initialized as an empty list and never updated. The actual RAG chunks are collected in a separate rag_chunks variable and passed via the rag_chunks parameter.

This means referenced_documents will always be empty for streaming responses, even when RAG chunks were retrieved.

🐛 Proposed fix: Use the rag_chunks parameter when available
+    # Use provided rag_chunks if available, otherwise fall back to summary.rag_chunks
+    chunks_for_documents = (
+        [RAGChunk(**chunk) for chunk in rag_chunks]
+        if rag_chunks
+        else summary.rag_chunks
+    )
     referenced_documents = create_referenced_documents_with_metadata(
-        summary, metadata_map
+        TurnSummary(
+            llm_response=summary.llm_response,
+            tool_calls=summary.tool_calls,
+            tool_results=summary.tool_results,
+            rag_chunks=chunks_for_documents,
+        ),
+        metadata_map,
     )

Or alternatively, add the import and create a simpler helper that takes rag_chunks directly.

src/app/endpoints/streaming_query_v2.py (1)

311-330: Missing synchronization: summary.rag_chunks never updated with collected chunks.

The local rag_chunks list is populated during streaming, but summary.rag_chunks remains empty. Since cleanup_after_streaming uses summary.rag_chunks for building referenced_documents (via create_referenced_documents_with_metadata), the referenced documents will be incorrect.

🐛 Proposed fix: Sync rag_chunks to summary before cleanup
+        # Sync collected RAG chunks to summary for cleanup processing
+        summary.rag_chunks = rag_chunks
+
        # Perform cleanup tasks (database and cache operations)
         await cleanup_after_streaming(
             user_id=context.user_id,
             conversation_id=conv_id,
             model_id=context.model_id,
             provider_id=context.provider_id,
             llama_stack_model_id=context.llama_stack_model_id,
             query_request=context.query_request,
             summary=summary,
             metadata_map=context.metadata_map,
             started_at=context.started_at,
             client=context.client,
             config=configuration,
             skip_userid_check=context.skip_userid_check,
             get_topic_summary_func=get_topic_summary,
             is_transcripts_enabled_func=is_transcripts_enabled,
             store_transcript_func=store_transcript,
             persist_user_conversation_details_func=persist_user_conversation_details,
             rag_chunks=[rag_chunk.model_dump() for rag_chunk in rag_chunks],
         )
🧹 Nitpick comments (1)
src/app/endpoints/query_v2.py (1)

484-499: Consider returning the new chunks instead of mutating the input list.

The function mutates rag_chunks in place. While the docstring documents this behavior, the coding guidelines recommend avoiding in-place parameter modification. Consider returning the extracted chunks and letting the caller extend the list.

♻️ Optional refactor to return new data structure
 def extract_rag_chunks_from_file_search_item(
     item: OpenAIResponseOutputMessageFileSearchToolCall,
-    rag_chunks: list[RAGChunk],
-) -> None:
-    """Extract RAG chunks from a file search tool call item and append to rag_chunks.
+) -> list[RAGChunk]:
+    """Extract RAG chunks from a file search tool call item.
 
     Args:
         item: The file search tool call item.
-        rag_chunks: List to append extracted RAG chunks to.
+
+    Returns:
+        List of extracted RAG chunks.
     """
+    rag_chunks: list[RAGChunk] = []
     if item.results is not None:
         for result in item.results:
             rag_chunk = RAGChunk(
                 content=result.text, source="file_search", score=result.score
             )
             rag_chunks.append(rag_chunk)
+    return rag_chunks

Then at the call sites:

-        extract_rag_chunks_from_file_search_item(item, rag_chunks)
+        rag_chunks.extend(extract_rag_chunks_from_file_search_item(item))

Contributor

are-ces commented Jan 19, 2026

LGTM

Contributor

@tisnik tisnik left a comment


LGTM

@tisnik tisnik merged commit 448cb60 into lightspeed-core:main Jan 19, 2026
20 of 22 checks passed