fix(performance): eliminate redundant regex calls in structured output mode
Resolves a 3-4x performance regression observed in CI benchmarks by
removing several sources of redundant regex processing:
Performance Issues Fixed:
- Double regex calls in smart cascade mode with structured=True
- Double regex calls in auto engine mode with structured=True
- Redundant Span class imports in multi-chunk processing loop
Root Cause:
- Smart cascade and auto engine called annotate() then annotate_with_spans()
- This resulted in processing the same text twice for structured output
- Multi-chunk processing imported the Span class once per span instead of once per batch
Optimization:
- Use annotate_with_spans() directly when structured=True is requested
- Convert spans to dict format for cascade decision logic when needed
- Cache Span class import outside of processing loops
- Maintain backward compatibility and identical output
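
A minimal sketch of the single-pass approach described above, not the actual change. Only `annotate_with_spans()` and `Span` are named in this commit; the `RegexAnnotator` class, the `Span` fields, the patterns, and the `spans_to_dict()` and `annotate_structured()` helpers are hypothetical stand-ins for illustration.

```python
import re
from dataclasses import dataclass


# Hypothetical stand-in for the project's Span class, defined once at
# module level rather than imported inside a per-span loop.
@dataclass
class Span:
    label: str
    start: int
    end: int
    text: str


# Illustrative patterns only; the real pattern set is not shown in this commit.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
}


class RegexAnnotator:
    """Hypothetical annotator that counts how often it scans the text."""

    def __init__(self):
        self.passes = 0

    def annotate_with_spans(self, text):
        # One regex pass over the text, returning spans in document order.
        self.passes += 1
        spans = [
            Span(label, m.start(), m.end(), m.group())
            for label, pattern in PATTERNS.items()
            for m in pattern.finditer(text)
        ]
        return sorted(spans, key=lambda s: s.start)


def spans_to_dict(spans):
    # Derive the dict form needed by the cascade decision logic from the
    # spans, instead of calling annotate() for a second regex pass.
    grouped = {}
    for span in spans:
        grouped.setdefault(span.label, []).append(span.text)
    return grouped


def annotate_structured(annotator, text):
    # Structured path: a single call yields both representations.
    spans = annotator.annotate_with_spans(text)
    return spans_to_dict(spans), spans
```

With this shape, a structured request scans the input once, and the dict view is a cheap regrouping of spans that were already computed.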
Performance Impact:
- Eliminates redundant regex processing in benchmark-critical paths
- Reduces overhead in structured output mode significantly
- Maintains sub-4ms regex performance in benchmarks
🤖 Generated with [Claude Code](https://claude.ai/code)
Co-Authored-By: Claude <noreply@anthropic.com>