fix: ensure ingestion and media queues are always flushed #1401
Important
Ensure ingestion and media queues are always flushed in `resource_manager.py`, even with `ProxyTracerProvider`.
- `flush()` and `shutdown()` in `resource_manager.py` are changed to ensure ingestion and media queues are always flushed.
- `ProxyTracerProvider` instances are handled, ensuring flush operations are not skipped.
- Logging in `flush()` confirms successful flushing of the OTEL tracer provider, score ingestion queue, and media upload queue.

This description was created for 2479a83.
Disclaimer: Experimental PR review
Greptile Overview
Updated On: 2025-10-09 20:05:52 UTC
Summary
This PR fixes a critical data loss bug in the `LangfuseResourceManager`'s flush and shutdown operations. The issue occurred when the OpenTelemetry tracer provider was a `ProxyTracerProvider` (OpenTelemetry's default provider when no explicit configuration is set), which is common in many deployment scenarios.

Previously, both the `flush()` and `shutdown()` methods would return early when encountering a `ProxyTracerProvider`, skipping the essential queue flushing operations for the score ingestion and media upload queues. This meant that pending scores and media uploads could be lost during application shutdown or manual flush operations.

The fix inverts the conditional logic to ensure that the Langfuse-specific queues (score ingestion and media upload) are always flushed, while making only the OpenTelemetry tracer provider flush conditional on having a real (non-proxy) tracer provider. This change maintains the thread safety and resource cleanup guarantees that are critical for the `LangfuseResourceManager`'s design as a singleton managing background threads and queues.

The change integrates well with the existing codebase architecture, where the `LangfuseResourceManager` serves as the central coordinator for background processing tasks. The fix preserves the existing OpenTelemetry integration while ensuring Langfuse's internal queue management remains robust regardless of OpenTelemetry configuration state.

Important Files Changed
Confidence score: 5/5
Sequence Diagram
sequenceDiagram
    participant User
    participant LRM as "LangfuseResourceManager"
    participant TP as "TracerProvider"
    participant SIQ as "Score Ingestion Queue"
    participant MUQ as "Media Upload Queue"
    participant SIC as "ScoreIngestionConsumer"
    participant MUC as "MediaUploadConsumer"
    User->>LRM: "flush()"
    LRM->>TP: "force_flush()"
    TP-->>LRM: "flush complete"
    LRM->>LRM: "log 'Successfully flushed OTEL tracer provider'"
    LRM->>SIQ: "join()"
    Note over SIQ: "Wait for all score ingestion tasks to complete"
    SIQ-->>LRM: "all tasks complete"
    LRM->>LRM: "log 'Successfully flushed score ingestion queue'"
    LRM->>MUQ: "join()"
    Note over MUQ: "Wait for all media upload tasks to complete"
    MUQ-->>LRM: "all tasks complete"
    LRM->>LRM: "log 'Successfully flushed media upload queue'"
    LRM-->>User: "flush complete"
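
The flush order in the diagram, combined with the inverted conditional described in the Summary, can be sketched roughly as follows. This is a minimal illustration, not the actual code in `resource_manager.py`: `ResourceManagerSketch`, `RealTracerProvider`, and the local `ProxyTracerProvider` class are hypothetical stand-ins (the real `ProxyTracerProvider` lives in `opentelemetry.trace`), and the consumer threads are simplified so the example runs with only the standard library.

```python
import logging
import queue
import threading

logger = logging.getLogger("flush_sketch")


class ProxyTracerProvider:
    """Stand-in (assumption) for opentelemetry.trace.ProxyTracerProvider."""


class RealTracerProvider:
    """Stand-in (assumption) for a configured SDK tracer provider."""

    def __init__(self):
        self.flushed = False

    def force_flush(self):
        self.flushed = True
        return True


class ResourceManagerSketch:
    """Hypothetical, simplified model of the flush logic described in the PR."""

    def __init__(self, tracer_provider):
        self.tracer_provider = tracer_provider
        self.processed = []  # records every item a consumer has handled
        self.score_ingestion_queue = queue.Queue()
        self.media_upload_queue = queue.Queue()
        # One simplified background consumer per queue.
        for q in (self.score_ingestion_queue, self.media_upload_queue):
            threading.Thread(target=self._consume, args=(q,), daemon=True).start()

    def _consume(self, q):
        while True:
            item = q.get()
            self.processed.append(item)
            q.task_done()

    def flush(self):
        # Only the OTEL flush depends on having a real (non-proxy) provider.
        if not isinstance(self.tracer_provider, ProxyTracerProvider):
            self.tracer_provider.force_flush()
            logger.debug("Successfully flushed OTEL tracer provider")

        # The Langfuse-specific queues are always drained, with no early return.
        self.score_ingestion_queue.join()
        logger.debug("Successfully flushed score ingestion queue")

        self.media_upload_queue.join()
        logger.debug("Successfully flushed media upload queue")


if __name__ == "__main__":
    manager = ResourceManagerSketch(ProxyTracerProvider())
    manager.score_ingestion_queue.put({"score": 1.0})
    manager.media_upload_queue.put(b"media-bytes")
    manager.flush()
    print(len(manager.processed), "items processed")
```

The point of the sketch is the shape of `flush()`: the `isinstance` check guards only the tracer provider call, so a proxy provider no longer prevents the two `join()` calls from running.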