Skip to content

Conversation

@hassiebp
Copy link
Contributor

@hassiebp hassiebp commented Aug 14, 2025

Important

Fixes string serialization in _serialize function to prevent unnecessary JSON encoding for OpenTelemetry span attributes.

  • Behavior:
    • _serialize in attributes.py now returns strings as-is, skipping JSON serialization for strings.
    • Fixes double-serialization issue for OpenTelemetry span attributes.
  • Tests:
    • Updates test_decorators.py and test_openai.py to expect unquoted string outputs.
    • Changes assertions in test_openai.py to check for integer output instead of string.
  • Misc:
    • Minor assertion changes in test_decorators.py to match new serialization behavior.

This description was created by Ellipsis for 6e5d83f. You can customize this summary. It will automatically update as commits are pushed.


Disclaimer: Experimental PR review

Greptile Summary

This PR modifies the _serialize function in langfuse/_client/attributes.py to fix Unicode escaping issues in attribute values. The change adds an early return condition that skips JSON serialization for string values, returning them as-is instead of double-serializing them.

What Changed:
The _serialize function now checks if the input is None or already a string before applying JSON serialization. Previously, all non-None objects (including strings) were being processed through json.dumps() with the EventSerializer class, which would escape Unicode characters and wrap strings in additional quotes.

Why This Matters:
This change addresses a specific problem with OpenTelemetry span attributes, where string values should be stored directly without additional JSON encoding. The previous implementation was causing unnecessary double-serialization - strings were being JSON-encoded when they should have been passed through unchanged.

Integration with Codebase:
The _serialize function appears to be part of the client's attribute handling system, likely used when preparing data for OpenTelemetry spans or similar telemetry operations. The EventSerializer class (referenced in the test context) is still used for complex objects that genuinely need specialized serialization, but simple strings now bypass this processing entirely. This maintains the existing behavior for complex data types while optimizing the common case of string attributes.

Confidence score: 4/5

  • This PR is safe to merge with minimal risk as it's a targeted fix for a specific serialization issue
  • Score reflects a simple, well-scoped change that addresses a clear problem without affecting other functionality
  • Pay close attention to langfuse/_client/attributes.py to ensure the string type check covers all expected string types

Copy link
Contributor

@greptile-apps greptile-apps bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

1 file reviewed, no comments

Edit Code Review Bot Settings | Greptile

@hassiebp hassiebp merged commit 06bbd30 into main Aug 14, 2025
10 checks passed
@hassiebp hassiebp deleted the fix-unicode-escaping-in-io branch August 14, 2025 16:51
if obj is None or isinstance(obj, str):
return obj

return json.dumps(obj, cls=EventSerializer)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ensure_ascii here should be False

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants