@ammar-agent
Collaborator

Summary

This addresses the known Gemini token exhaustion bug, where the model gets stuck in a loop emitting variations of "I am done. I am done. I am done..." until it exhausts all output tokens.

The Bug

This is a well-documented issue with Gemini models.

The gemini-cli implements loop detection to catch this, and now we do too.

The Fix

  • Adds RepetitionDetector class that monitors streaming text for repetitive patterns using a sliding window approach
  • Integrates detector into StreamManager for Gemini models only (no overhead for other providers)
  • Automatically aborts the stream when 10+ repetitions of the same phrase (8-50 chars) are detected within a 2000 char window
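
To make the sliding-window idea concrete, here is a minimal sketch of what such a detector could look like. Only the class name, the `addText` method name, and the thresholds (10 repetitions, 8-50 chars, 2000 char window) come from this PR description; the options object, the phrase-splitting rule (sentence punctuation and newlines), and the internals are assumptions for illustration and may differ from the actual implementation.

```typescript
// Hypothetical sketch: thresholds are taken from the PR description,
// everything else (option names, phrase splitting) is assumed.
interface RepetitionOptions {
  windowSize: number;   // chars of recent text to keep
  minPhraseLen: number; // shortest phrase that counts
  maxPhraseLen: number; // longest phrase that counts
  threshold: number;    // repetitions that trigger detection
}

const DEFAULT_OPTIONS: RepetitionOptions = {
  windowSize: 2000,
  minPhraseLen: 8,
  maxPhraseLen: 50,
  threshold: 10,
};

class RepetitionDetector {
  private buffer = "";
  private readonly opts: RepetitionOptions;

  constructor(opts: Partial<RepetitionOptions> = {}) {
    this.opts = { ...DEFAULT_OPTIONS, ...opts };
  }

  /** Appends a streamed chunk; returns true once a loop is detected. */
  addText(chunk: string): boolean {
    // Keep only the most recent `windowSize` characters.
    this.buffer = (this.buffer + chunk).slice(-this.opts.windowSize);
    return this.hasRepetition();
  }

  private hasRepetition(): boolean {
    const { minPhraseLen, maxPhraseLen, threshold } = this.opts;
    // Assumed rule: phrases end at sentence punctuation or a newline.
    const pieces = this.buffer.split(/[.!?\n]+/);
    pieces.pop(); // the trailing piece may be an unfinished phrase
    const counts = new Map<string, number>();
    for (const piece of pieces) {
      const phrase = piece.trim().replace(/\s+/g, " ");
      if (phrase.length < minPhraseLen || phrase.length > maxPhraseLen) continue;
      const seen = (counts.get(phrase) ?? 0) + 1;
      if (seen >= threshold) return true;
      counts.set(phrase, seen);
    }
    return false;
  }
}
```

Recounting the whole window on every chunk is quadratic in the worst case, but with a 2000 char cap the cost per chunk stays small, which keeps the sketch simple. Dropping the trailing piece means a phrase split across two `addText` calls is only counted once it is complete.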

Testing

  • Added comprehensive unit tests for the RepetitionDetector class covering:
    • Period-separated repetition (I am done. I am done.)
    • Newline-separated repetition
    • Chunked input across multiple addText calls
    • The exact bug pattern from the user report
    • The gemini-cli loop pattern

Generated with mux


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines 706 to 708
// Abort via the controller - the stream loop will detect this and exit
streamInfo.abortController.abort();
break;

P1: Emit abort event when repetition detector stops stream

When the repetition detector fires, it simply aborts the controller and breaks out of the text-delta handler. However, processStreamWithCleanup only emits stream-end when the abort signal is false, and there is no stream-abort emission for this path. That means a Gemini loop termination produces no completion event at all, so AIService never commits the partial and front-end listeners wait indefinitely for a terminal update. Consider emitting stream-abort (or an error) when the detector cancels the stream.
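
One possible shape for that suggestion is sketched below, under stated assumptions: `StreamState`, `onTextDelta`, `finishStream`, the `emit` callback, and the payload fields are hypothetical names invented for this example, not the actual StreamManager API. Only the event names stream-end and stream-abort and the `abortController.abort()` call come from this thread. The point is that every exit path emits exactly one terminal event.

```typescript
// Hypothetical sketch; these types are stand-ins, not the real StreamManager API.
type TerminalEvent =
  | { type: "stream-end"; text: string }
  | { type: "stream-abort"; reason: string; partial: string };

// Structural subset of AbortController, so a real one can be passed in.
interface AbortLike {
  readonly signal: { readonly aborted: boolean };
  abort(): void;
}

interface LoopDetector {
  addText(chunk: string): boolean;
}

interface StreamState {
  text: string;
  terminated: boolean; // true once a terminal event has been emitted
  abortController: AbortLike;
  detector: LoopDetector;
  emit: (event: TerminalEvent) => void;
}

/** Handles one text delta; returns false when the caller should stop reading. */
function onTextDelta(state: StreamState, delta: string): boolean {
  state.text += delta;
  if (!state.detector.addText(delta)) return true;
  // Abort the underlying request AND tell listeners why the stream stopped.
  state.abortController.abort();
  state.terminated = true;
  state.emit({ type: "stream-abort", reason: "repetition-detected", partial: state.text });
  return false;
}

/** Called once after the read loop exits; emits a terminal event if none was sent. */
function finishStream(state: StreamState): void {
  if (state.terminated) return;
  state.terminated = true;
  if (state.abortController.signal.aborted) {
    state.emit({ type: "stream-abort", reason: "aborted", partial: state.text });
  } else {
    state.emit({ type: "stream-end", text: state.text });
  }
}
```

Carrying the partial text in the abort payload is one way to let a consumer commit what was received before the loop started; whether the real code would do this through the event or a separate path is not stated here.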

…austion

This addresses the known Gemini token exhaustion bug where the model gets
stuck in a loop emitting variations like 'I am done. I am done. I am done...'
until it exhausts all output tokens.

The fix:
- Adds RepetitionDetector class that monitors streaming text for repetitive
  patterns using a sliding window approach
- Integrates detector into StreamManager for Gemini models only
- Automatically aborts the stream when 10+ repetitions of the same phrase
  (8-50 chars) are detected within a 2000 char window

This is the same approach used by gemini-cli for loop detection.

See: google-gemini/gemini-cli#13322

_Generated with `mux`_
@ammar-agent ammar-agent force-pushed the fix-gemini-token-exhaustion-bug branch from 0a03884 to c8366c6 Compare December 1, 2025 19:53
@ammario
Member

ammario commented Dec 2, 2025

Closed for gross

@ammario ammario closed this Dec 2, 2025