@christso christso commented Jan 8, 2026

Summary

Add support for JSONL (JSON Lines) format as an alternative to YAML for evaluation datasets.

Why

Enables large-scale evaluation workflows following industry standards (DeepEval, LangWatch, Hugging Face, OpenAI):

  • Memory efficiency: Line-by-line processing for datasets with thousands of cases
  • Git-friendly diffs: Clear line-based changes vs nested YAML
  • Programmatic generation: Easy append operations
  • Tool compatibility: Works with standard JSONL tools (jq, grep)
  • Industry alignment: Follows established ML/AI framework patterns
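The append claim above can be sketched in a few lines. This is a minimal illustration, not code from the proposal; the file name and case fields (`id`, `input`, `expected`) are hypothetical.

```python
import json

# Hypothetical eval case; the field names are illustrative, not from the spec.
new_case = {"id": "case-042", "input": "What is 2 + 2?", "expected": "4"}

# Appending one case is a single write in "a" mode --
# no need to parse or rewrite the rest of the file.
with open("dataset.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(new_case) + "\n")
```

Because each case is one line, the same file also works directly with `jq` and `grep`.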

Design

  • Pure JSONL: One eval case per line (no embedded metadata)
  • Sidecar YAML: Optional companion file for metadata and defaults
  • Override precedence: Per-line fields override sidecar defaults
  • Backward compatible: Existing YAML files unchanged
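The override precedence and line-by-line parsing described above can be sketched as follows. This is an illustrative sketch, not the actual implementation; the field names (`evaluator`, `timeout_s`) are hypothetical, and the sidecar defaults are shown as a plain dict rather than a parsed YAML file to keep the example self-contained.

```python
import json

# Sidecar defaults -- in the proposal these live in an optional companion
# YAML file; a plain dict stands in here to keep the sketch self-contained.
defaults = {"evaluator": "exact-match", "timeout_s": 30}

def load_cases(path, defaults):
    """Parse JSONL line by line; per-line fields override sidecar defaults."""
    cases = []
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                case = json.loads(line)
            except json.JSONDecodeError as exc:
                # Line-oriented parsing makes error messages precise.
                raise ValueError(f"{path}:{lineno}: invalid JSON: {exc}") from exc
            cases.append({**defaults, **case})  # per-line fields win
    return cases

# Two hypothetical cases: the second overrides the default evaluator.
with open("cases.jsonl", "w", encoding="utf-8") as f:
    f.write('{"input": "2+2", "expected": "4"}\n')
    f.write('{"input": "name a prime", "evaluator": "llm-judge"}\n')

cases = load_cases("cases.jsonl", defaults)
```

The dict-merge order (`{**defaults, **case}`) is what gives per-line fields precedence over the sidecar defaults.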

Files

  • `proposal.md`: Complete motivation, examples, alternatives considered
  • `design.md`: Architecture decisions and rationale
  • `tasks.md`: 6-phase implementation checklist
  • `specs/jsonl-dataset-format/spec.md`: 8 requirements, 27 scenarios

Validation

✅ Passed `openspec validate --strict`

Next Steps

This is a proposal only; no implementation yet. Awaiting review and approval before proceeding with the implementation phases.

@christso christso marked this pull request as draft January 8, 2026 23:59
@christso christso marked this pull request as ready for review January 9, 2026 00:56
- Replace jsonl-format example with basic-jsonl (mirrors basic example)
- Add file reference examples in JSONL format
- Update eval-builder skill with JSONL format documentation
