Conversation

@snimu
Contributor

@snimu snimu commented Dec 23, 2025

Description

Sub-LLMs can receive more than plain strings from the RLM. This PR adds a normalization step for these messages so that sub-LLM calls work in all valid cases, which they previously did not.

Type of Change

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • Documentation update
  • Test improvement

Testing

  • All existing tests pass when running uv run pytest locally.
  • New tests have been added to cover the changes

Checklist

  • My code follows the style guidelines of this project as outlined in AGENTS.md
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • My changes generate no new warnings
  • Any dependent changes have been merged and published

Additional Notes


Note

Improves robustness of sub-LLM chat calls by sanitizing message content formats.

  • Adds _normalize_message_content() in verifiers/envs/experimental/rlm_env.py to coerce content into API-accepted forms (extract nested {role, content}, wrap {type: ...} content-part objects, fallback wrap unknown dicts)
  • Uses normalized messages in _call_sub_llm_api() instead of raw messages to avoid malformed payload errors
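
The normalization rules listed above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the function name `normalize_message_content` and its signature are hypothetical, and the fallback for unknown dicts (serializing them into a single text part) is an assumption, since the summary only says such dicts are wrapped.

```python
import json
from typing import Any


def normalize_message_content(messages: list[dict[str, Any]]) -> list[dict[str, Any]]:
    """Return copies of the messages with dict-valued 'content' coerced (hypothetical sketch)."""
    normalized = []
    for msg in messages:
        msg_copy = dict(msg)  # shallow copy so the caller's message is untouched
        content = msg_copy.get("content")
        if isinstance(content, dict):
            if "role" in content and "content" in content:
                # Nested message dict: keep only the inner content.
                msg_copy["content"] = content["content"]
            elif "type" in content:
                # Single content-part object: wrap it in a list of parts.
                msg_copy["content"] = [content]
            else:
                # Unknown dict (assumed fallback): serialize into one text part.
                msg_copy["content"] = [{"type": "text", "text": json.dumps(content)}]
        normalized.append(msg_copy)
    return normalized
```

Strings, lists, and `None` pass through unchanged; only bare dicts are rewritten, and the input messages are never mutated.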

Written by Cursor Bugbot for commit c5b18a8. This will update automatically on new commits.

# Check if content is a nested message dict (has 'role' and 'content' keys)
# This happens when model passes message dicts to llm_batch instead of strings
if "role" in content and "content" in content:
    msg_copy["content"] = content["content"]

Nested content extraction skips further normalization checks

When extracting inner content from a nested message dict (one with both role and content keys), the extracted content["content"] is assigned directly without further normalization. If the inner content is itself a malformed dict (e.g., a content part object with type key, or another nested message), it won't be wrapped in an array or recursively normalized. This means the final content could still be an invalid bare dict, violating the stated invariant that the API expects content to be a string, array of objects, or None.
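
One way the concern above could be addressed is to normalize the extracted content again instead of assigning it directly. The sketch below is hypothetical and not part of the PR: `normalize_content`, the `max_depth` parameter, the text-part fallback, and the `str()` coercion of other types are all assumptions made for illustration.

```python
import json
from typing import Any


def normalize_content(content: Any, max_depth: int = 4) -> Any:
    """Coerce content into a string, a list of parts, or None (hypothetical sketch)."""
    if content is None or isinstance(content, (str, list)):
        return content
    if isinstance(content, dict):
        if "role" in content and "content" in content:
            # Nested message: unwrap it and normalize the inner content too.
            if max_depth <= 0:
                raise ValueError("message content is nested too deeply")
            return normalize_content(content["content"], max_depth - 1)
        if "type" in content:
            # Single content-part object: wrap it in a list of parts.
            return [content]
        # Unknown dict (assumed fallback): serialize into one text part.
        return [{"type": "text", "text": json.dumps(content)}]
    # Any other type (assumed behavior): fall back to its string form.
    return str(content)
```

The depth limit keeps the recursion bounded, so pathologically nested input still fails loudly rather than being silently accepted.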

Contributor Author

Seems overly defensive; I've never seen that happen. Also, maybe the models just shouldn't nest too deep, and failing on this edge case is fine.

@snimu snimu requested a review from willccbb December 23, 2025 18:54
@willccbb willccbb merged commit 1a428e7 into main Jan 3, 2026
6 checks passed

3 participants