Skip to content

Conversation

@hassiebp
Copy link
Contributor

@hassiebp hassiebp commented Nov 11, 2025

Important

Move evaluator execution outside the main execution span in _process_experiment_item() to improve error handling and separation of concerns.

  • Behavior:
    • Move evaluator execution outside the main execution span in _process_experiment_item() in client.py.
    • Evaluators are now run after the main task execution, improving error handling and separation of concerns.
  • Error Handling:
    • Errors during task execution update the span with error details and re-raise the exception.
    • Errors during evaluator execution are logged but do not affect the main task execution.
  • Misc:
    • Minor restructuring of try-except blocks for clarity and separation of task and evaluator logic.

This description was created by Ellipsis for 65a9523. You can customize this summary. It will automatically update as commits are pushed.

Disclaimer: Experimental PR review

Greptile Overview

Greptile Summary

Refactored _process_experiment_item to move evaluator execution outside the main try-block, separating evaluation logic from the primary task execution within the span.

  • The evaluators now run after the main task completes successfully, but still within the span context manager
  • If the main task throws an exception, evaluators are skipped entirely (exception is raised immediately)
  • All variables (output, input_data, trace_id, etc.) remain accessible since they're initialized before the try-block ends
  • The span object remains valid throughout since the code is still within the with statement scope

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk
  • The change is a clean code reorganization that correctly moves evaluator logic outside the try-block while maintaining all necessary variable scopes. The refactoring properly handles error cases by skipping evaluators when exceptions occur, and all references remain valid within the context manager scope.
  • No files require special attention

Important Files Changed

File Analysis

Filename Score Overview
langfuse/_client/client.py 5/5 Moved evaluator execution from inside try-block to after it, separating evaluation from main task execution while keeping them in the span context. Clean refactor with no logical issues.

Sequence Diagram

sequenceDiagram
    participant Client
    participant Span as Experiment Item Span
    participant Task as Main Task
    participant Evaluators
    
    Client->>Span: start_as_current_span("experiment-item-run")
    activate Span
    
    Note over Span: Try Block Start
    Span->>Span: Initialize variables (input_data, expected_output, etc.)
    Span->>Task: await _run_task(task, item)
    Task-->>Span: output
    Span->>Span: update(input, output, metadata)
    Note over Span: Try Block End
    
    alt Exception in try block
        Span->>Span: update(level=ERROR, status_message)
        Span-->>Client: raise exception
        Note over Evaluators: Evaluators NOT executed
    else Success
        Note over Span,Evaluators: NEW: Evaluators run AFTER try block
        loop For each evaluator
            Span->>Evaluators: await _run_evaluator()
            Evaluators-->>Span: eval_results
            Span->>Span: create_score()
        end
        Span-->>Client: return ExperimentItemResult
    end
    
    deactivate Span
Loading

@hassiebp hassiebp enabled auto-merge (squash) November 11, 2025 09:48
Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, no comments

Edit Code Review Agent Settings | Greptile

@hassiebp hassiebp merged commit 9228e7d into main Nov 11, 2025
12 checks passed
@hassiebp hassiebp deleted the evals-out-of-exp branch November 11, 2025 09:54
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants