@hassiebp hassiebp commented Dec 1, 2025

Important

Add input_schema and expected_output_schema parameters to create_dataset() in client.py for JSON Schema validation of dataset items.

  • Behavior:
    • Add input_schema and expected_output_schema optional parameters to create_dataset() in client.py for JSON Schema validation of dataset items.
    • Update create_dataset() to include these schemas in CreateDatasetRequest.
  • Tests:
    • Modify test_create_dataset_item() and test_get_dataset_runs() in test_datasets.py to use dictionary inputs instead of JSON strings.
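
A dictionary and a JSON-encoded string are different JSON types, which matters once a schema that expects an object is attached to a dataset. The sketch below illustrates that difference with a deliberately tiny stand-in check. It is not a full JSON Schema validator and not the validation Langfuse performs; the schema, the helper `matches_object_schema`, and the sample item are all hypothetical.

```python
import json

# Hypothetical input schema; the PR does not prescribe any particular schema.
input_schema = {
    "type": "object",
    "properties": {"question": {"type": "string"}},
    "required": ["question"],
}


def matches_object_schema(value, schema):
    """Stand-in check for object schemas only.

    Covers 'required' keys and string-typed properties. This is an
    illustration, not a real JSON Schema implementation.
    """
    if not isinstance(value, dict):
        return False  # a JSON string is a string, not an object
    for key in schema.get("required", []):
        if key not in value:
            return False
    for key, sub_schema in schema.get("properties", {}).items():
        if key in value and sub_schema.get("type") == "string":
            if not isinstance(value[key], str):
                return False
    return True


dict_input = {"question": "What is the capital of France?"}
string_input = json.dumps(dict_input)  # same data, but encoded as a string

dict_ok = matches_object_schema(dict_input, input_schema)
string_ok = matches_object_schema(string_input, input_schema)
print(dict_ok, string_ok)
```

In this sketch the dictionary satisfies the object schema while the JSON string does not, even though both carry the same data.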

This description was created by Ellipsis for 030b82b.


Disclaimer: Experimental PR review

Greptile Overview

Greptile Summary

This PR extends the create_dataset() method with two new optional parameters: input_schema and expected_output_schema. These parameters allow users to specify JSON Schemas that will be used to validate dataset items when they are created.

  • Added input_schema parameter to validate dataset item inputs
  • Added expected_output_schema parameter to validate dataset item expected outputs
  • Updated docstrings to document the new parameters

The implementation follows existing codebase patterns, using camelCase aliases when constructing the CreateDatasetRequest Pydantic model, consistent with how other aliased fields (like sourceTraceId, sourceObservationId) are handled elsewhere in the codebase.
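
The naming relationship between the snake_case Python parameters and their camelCase aliases can be sketched as follows. The helper `to_camel` is hypothetical and only illustrates the convention; however the SDK's models declare their aliases, they do not depend on this function. The names `inputSchema` and `expectedOutputSchema` are inferred from that convention rather than quoted from the PR.

```python
def to_camel(snake_name: str) -> str:
    """Convert a snake_case name to camelCase (illustrative helper)."""
    head, *rest = snake_name.split("_")
    return head + "".join(part.capitalize() for part in rest)


# Python-side parameter names and the aliases the convention implies.
for python_name in (
    "input_schema",
    "expected_output_schema",
    "source_trace_id",
    "source_observation_id",
):
    print(python_name, "->", to_camel(python_name))
```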

Confidence Score: 5/5

  • This PR is safe to merge with minimal risk - it adds optional parameters with sensible defaults that don't affect existing functionality.
  • Score reflects straightforward feature addition with no breaking changes. The implementation follows established codebase patterns, uses properly typed optional parameters, and the underlying API already supports these fields.
  • No files require special attention.

Important Files Changed

File Analysis

| Filename | Score | Overview |
|---|---|---|
| langfuse/_client/client.py | 5/5 | Added input_schema and expected_output_schema optional parameters to create_dataset() method, following existing codebase patterns for schema validation on dataset items. |

Sequence Diagram

sequenceDiagram
    participant User
    participant Langfuse Client
    participant CreateDatasetRequest
    participant Langfuse API

    User->>Langfuse Client: create_dataset(name, input_schema, expected_output_schema)
    Langfuse Client->>CreateDatasetRequest: Create request body with schemas
    Langfuse Client->>Langfuse API: POST /datasets
    Langfuse API-->>Langfuse Client: Dataset (with validation schemas)
    Langfuse Client-->>User: Dataset object
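
The flow in the diagram can be approximated with a stand-in client function and an in-memory fake API. This is a sketch under stated assumptions, not the SDK's implementation: `create_dataset` here is a local function that only mirrors the parameter names from the PR, `fake_api` and the `post` callable are hypothetical, the camelCase body keys are inferred from the alias convention, and omitting unset schemas from the body is a choice made for this example.

```python
from typing import Any, Callable, Dict, Optional

JsonDict = Dict[str, Any]


def create_dataset(
    post: Callable[[str, JsonDict], JsonDict],
    name: str,
    input_schema: Optional[JsonDict] = None,
    expected_output_schema: Optional[JsonDict] = None,
) -> JsonDict:
    """Stand-in for the client method: build the body, POST it, return the dataset."""
    body: JsonDict = {"name": name}
    # Assumed camelCase keys; schemas left as None are omitted in this sketch.
    if input_schema is not None:
        body["inputSchema"] = input_schema
    if expected_output_schema is not None:
        body["expectedOutputSchema"] = expected_output_schema
    return post("/datasets", body)


def fake_api(path: str, body: JsonDict) -> JsonDict:
    """Hypothetical in-memory API that echoes the dataset back with an id."""
    assert path == "/datasets"
    return {"id": "dataset-1", **body}


dataset = create_dataset(
    fake_api,
    name="capital-cities",
    input_schema={"type": "object"},
    expected_output_schema={"type": "string"},
)
legacy_dataset = create_dataset(fake_api, name="no-schemas")
print(dataset)
print(legacy_dataset)
```

Because both schema parameters default to `None`, a call that passes only `name` produces the same request body as before, which matches the review's point that existing callers are unaffected.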

@greptile-apps greptile-apps bot left a comment

1 file reviewed, no comments

@hassiebp hassiebp merged commit e8b355e into main Dec 1, 2025
11 checks passed
@hassiebp hassiebp deleted the add-schema-dataset branch December 1, 2025 12:48

Development

Successfully merging this pull request may close these issues.

Python SDK (v3.10.1) create_dataset missing input_schema / expected_output_schema although shown in docs