@hassiebp (Contributor) commented on Nov 12, 2025

Important

Adds run_batched_evaluation to Langfuse client for large-scale evaluation with error handling, retry logic, and resume capability, along with comprehensive tests.

  • Behavior:
    • Adds run_batched_evaluation() to Langfuse client in client.py for large-scale evaluation of traces and observations.
    • Implements error handling, retry logic, and resume capability.
  • Modules:
    • New batch_evaluation.py module for batch evaluation logic.
  • Testing:
    • Adds tests/test_batch_evaluation.py with 40+ test cases.
  • Misc:
    • Moves import statements to the top in client.py to adhere to style guide.

This description was created by Ellipsis for 931bdd2.


Disclaimer: Experimental PR review

Greptile Overview

Greptile Summary

This PR adds run_batched_evaluation to enable large-scale evaluation of traces, observations, and sessions. The implementation includes mapper functions, evaluators, composite evaluators, comprehensive error handling, retry logic, and resume capability.
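For readers unfamiliar with the mapper / evaluator / composite evaluator split, here is a minimal, self-contained sketch of the idea. It does not use the Langfuse SDK: `Inputs`, `Evaluation`, `mapper`, `non_empty`, `short_enough`, `composite`, and `evaluate_item` are all hypothetical names chosen for illustration, and the real signatures in batch_evaluation.py may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Inputs:
    """Hypothetical stand-in for the fields an evaluator receives."""
    input: str
    output: str


@dataclass
class Evaluation:
    """Hypothetical named numeric score."""
    name: str
    value: float


def mapper(item: Dict[str, str]) -> Inputs:
    """Turn a raw trace-like dict into the fields the evaluators need."""
    return Inputs(input=item["input"], output=item["output"])


def non_empty(inputs: Inputs) -> Evaluation:
    """Score 1.0 when the output contains non-whitespace text."""
    return Evaluation("non_empty", 1.0 if inputs.output.strip() else 0.0)


def short_enough(inputs: Inputs) -> Evaluation:
    """Score 1.0 when the output is at most 20 characters long."""
    return Evaluation("short_enough", 1.0 if len(inputs.output) <= 20 else 0.0)


def composite(evaluations: List[Evaluation]) -> Evaluation:
    """Combine the per-evaluator scores into one value (here: the mean)."""
    mean = sum(e.value for e in evaluations) / len(evaluations)
    return Evaluation("composite", mean)


def evaluate_item(
    item: Dict[str, str],
    evaluators: List[Callable[[Inputs], Evaluation]],
) -> List[Evaluation]:
    """Map the item, run every evaluator, then append the composite score."""
    inputs = mapper(item)
    evaluations = [evaluator(inputs) for evaluator in evaluators]
    evaluations.append(composite(evaluations))
    return evaluations
```

Calling `evaluate_item({"input": "q", "output": "answer"}, [non_empty, short_enough])` returns one `Evaluation` per evaluator plus the composite. In the PR each of these would presumably become a score attached to the evaluated item, but that step is left out of the sketch.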

Key Changes

  • Added new batch_evaluation.py module with core implementation
  • Added run_batched_evaluation() method to Langfuse client
  • Exported new types (EvaluatorInputs, MapperFunction, CompositeEvaluatorFunction, EvaluatorStats, BatchEvaluationResumeToken, BatchEvaluationResult) to public API
  • Added a comprehensive test suite (tests/test_batch_evaluation.py) with 40+ test cases

Issues Found

  • Import statement inside method violates style guide (should be at module top)

Confidence Score: 4/5

  • Safe to merge with minor style improvement
  • Well-architected implementation with comprehensive error handling, retry logic, and extensive test coverage. Only minor style violation found (inline import). Code is production-ready with proper protocols, type hints, and documentation.
  • langfuse/_client/client.py needs import moved to top per style guide

Important Files Changed

File Analysis

| Filename | Score | Overview |
| --- | --- | --- |
| langfuse/_client/client.py | 4/5 | Added run_batched_evaluation method with comprehensive docs; import should be moved to top per style guide |
| langfuse/batch_evaluation.py | 5/5 | New module implementing batch evaluation with proper error handling, retry logic, and resume capability |

Sequence Diagram

sequenceDiagram
    participant User
    participant Langfuse as Langfuse Client
    participant Runner as BatchEvaluationRunner
    participant API as Langfuse API
    participant Mapper
    participant Evaluator
    
    User->>Langfuse: run_batched_evaluation(scope, mapper, evaluators, ...)
    Langfuse->>Runner: create BatchEvaluationRunner
    Langfuse->>Runner: run_async(...)
    
    loop For each batch (pagination)
        Runner->>API: fetch_batch_with_retry(scope, filter, page)
        API-->>Runner: items batch
        
        loop For each item in batch (concurrent)
            Runner->>Mapper: map(item)
            Mapper-->>Runner: EvaluatorInputs
            
            loop For each evaluator
                Runner->>Evaluator: evaluate(input, output, ...)
                Evaluator-->>Runner: Evaluation(s)
                Runner->>Langfuse: create_score(trace_id/obs_id/session_id)
            end
            
            opt If composite_evaluator
                Runner->>Evaluator: composite_evaluator(item, evaluations)
                Evaluator-->>Runner: composite Evaluation
                Runner->>Langfuse: create_score(...)
            end
        end
    end
    
    Runner->>Langfuse: flush()
    Runner-->>Langfuse: BatchEvaluationResult
    Langfuse-->>User: BatchEvaluationResult
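
The control flow in the diagram (paginated fetch with retry, concurrent per-item evaluation, and a resume point on failure) can be approximated with plain asyncio. This is a sketch under stated assumptions, not the PR's implementation: `ResumeToken`, `RunResult`, `fetch_with_retry`, and `run` are hypothetical names, the retry count, backoff, and concurrency limit are arbitrary, and the real runner may track resume state differently.

```python
import asyncio
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, List, Optional


@dataclass
class ResumeToken:
    """Hypothetical resume marker: the page to restart from."""
    next_page: int


@dataclass
class RunResult:
    processed: int
    resume_token: Optional[ResumeToken]


async def fetch_with_retry(
    fetch: Callable[[int], Awaitable[List[Any]]],
    page: int,
    max_retries: int = 3,
    base_delay: float = 0.0,
) -> List[Any]:
    """Call fetch(page), retrying with exponential backoff on any exception."""
    for attempt in range(max_retries + 1):
        try:
            return await fetch(page)
        except Exception:
            if attempt == max_retries:
                raise  # retries exhausted; let the caller decide what to do
            await asyncio.sleep(base_delay * (2 ** attempt))
    return []  # unreachable; keeps type checkers satisfied


async def run(
    fetch: Callable[[int], Awaitable[List[Any]]],
    evaluate: Callable[[Any], Awaitable[None]],
    start_page: int = 1,
    max_concurrency: int = 4,
) -> RunResult:
    """Walk pages from start_page until an empty page or a persistent failure."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(item: Any) -> None:
        # Bound how many items are evaluated at the same time.
        async with semaphore:
            await evaluate(item)

    page, processed = start_page, 0
    while True:
        try:
            items = await fetch_with_retry(fetch, page)
        except Exception:
            # Hand back a token so a later run can restart at this page.
            return RunResult(processed, ResumeToken(next_page=page))
        if not items:
            return RunResult(processed, None)
        await asyncio.gather(*(guarded(item) for item in items))
        processed += len(items)
        page += 1
```

A caller would pass `result.resume_token.next_page` back in as `start_page` to continue after a failed run. Scoring and the final flush shown in the diagram are omitted here.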

Context used:

  • Rule from dashboard - Move imports to the top of the module instead of placing them within functions or methods. (source)
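
To make the rule concrete, here is a generic illustration of the two patterns. It is not taken from client.py; `dump_top` and `dump_inline` are hypothetical functions, and `json` merely stands in for whichever module the flagged import referred to.

```python
# Preferred by the rule: import once, at the top of the module.
import json


def dump_top(payload: dict) -> str:
    """Serialize using the module-level import."""
    return json.dumps(payload, sort_keys=True)


def dump_inline(payload: dict) -> str:
    """Same behavior, but with the import hidden inside the function body."""
    import json as _json  # inline import: the pattern the rule discourages

    return _json.dumps(payload, sort_keys=True)
```

Both functions behave identically; the difference is that the module-level form makes dependencies visible at a glance and surfaces import errors when the module loads rather than when the function is first called.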

@greptile-apps (bot, Contributor) left a comment

4 files reviewed, no comments

@hassiebp merged commit 4c57007 into main on Nov 14, 2025
12 checks passed
@hassiebp deleted the add-batch-evals branch on Nov 14, 2025, 12:14