Description
When an LLM response stream includes a chunk with an empty reasoning_content, the Dify chat UI ends up rendering multiple <think> blocks. As a result, multiple Thoughts (x.x s) entries are displayed.
Screenshots: (screenshot omitted)
Environment
- Dify 1.11.2 self-hosted (Docker)
- Dify Plugin SDK 0.7.1
- OpenAI-API-Compatible plugin 0.0.30
- Model: openai/gpt-oss-20b (served via vLLM)
- Access path: LiteLLM endpoint registered to the OpenAI-API-Compatible plugin
Details
When sending a streaming request directly to the LiteLLM endpoint, the stream may include a chunk with an empty reasoning_content:
$ curl -H "Content-Type: application/json" -H "Authorization: Bearer $API_KEY" http://localhost:4000/v1/chat/completions -d '{"model":"openai/gpt-oss-20b","stream":true,"messages":[{"role":"user","content":"Please explain Euler-Lagrange equation."}]}'
...
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"","reasoning":"","role":"assistant"}}]}
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"We","reasoning":"We"}}]}
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":" need","reasoning":" need"}}]}
...
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"t","reasoning":"t"}}]}
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"(","reasoning":"("}}]}
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"","reasoning":""}}]}
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"∂","reasoning":"∂"}}]}
...
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":"##"}}]}
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"content":" The"}}]}
...
It looks like the Dify Plugin SDK closes the current <think> block when it encounters a chunk like the following:
data: {"id":"chatcmpl-a5b6f10276aae84e","created":1767852482,"model":"my-openai/gpt-oss-20b","object":"chat.completion.chunk","choices":[{"index":0,"delta":{"reasoning_content":"","reasoning":""}}]}
However, an empty reasoning_content chunk does not indicate the end of reasoning. Treating it as a boundary causes the SDK/UI to start a new <think> block for subsequent reasoning tokens, which results in multiple Thoughts (x.x s) entries.
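The suspected behavior can be sketched as follows. This is a hypothetical minimal reproduction, not the actual Dify Plugin SDK code: a renderer that treats any falsy reasoning_content (including the empty string "") as the end of reasoning will emit a second <think> block as soon as the next reasoning token arrives.

```python
def render_stream(deltas):
    """Render streamed deltas into chat-UI text (hypothetical sketch).

    Each delta is a dict that may contain "reasoning_content" and/or
    "content". The buggy boundary check closes the current <think>
    block whenever reasoning_content is falsy -- including "".
    """
    out = []
    in_think = False
    for delta in deltas:
        reasoning = delta.get("reasoning_content")
        if reasoning:  # non-empty reasoning token
            if not in_think:
                out.append("<think>")
                in_think = True
            out.append(reasoning)
        else:  # BUG: "" takes this branch too, closing the block mid-reasoning
            if in_think:
                out.append("</think>")
                in_think = False
            if delta.get("content"):
                out.append(delta["content"])
    if in_think:
        out.append("</think>")
    return "".join(out)

deltas = [
    {"reasoning_content": "We"},
    {"reasoning_content": ""},   # empty chunk mid-stream, as in the curl output above
    {"reasoning_content": " need"},
    {"content": "## The"},
]
print(render_stream(deltas))
# -> <think>We</think><think> need</think>## The  (two <think> blocks)
```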
Expected behavior
- An empty reasoning_content (e.g. "delta":{"reasoning_content":""}) should not close the current <think> block.
- The <think> block should be closed only when reasoning_content is absent or null.
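The expected boundary check could be expressed as the following sketch (a suggested predicate, not the SDK's actual code): only an absent or null reasoning_content ends the reasoning phase, while "" is ignored.

```python
def should_close_reasoning(delta):
    """Return True only when the delta signals the end of reasoning:
    reasoning_content is absent or null, not merely the empty string."""
    return delta.get("reasoning_content") is None

assert not should_close_reasoning({"reasoning_content": ""})    # empty string: keep <think> open
assert should_close_reasoning({"content": "##"})                # key absent: close <think>
assert should_close_reasoning({"reasoning_content": None})      # explicit null: close <think>
```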