fix!: Enhance response API support to not fail with tool calling #3385
Conversation
Force-pushed from ab48a50 to 7e080db
hi @luis5tb, would you mind elaborating a little on the exact issue you observed and how your PR addresses it?
Force-pushed from 751c019 to b7e3351
ashwinb left a comment:
as @jaideepr97 noted, the PR summary is not quite sufficient to justify the substantial changes, particularly to the API shape.
Thanks! Yes, I'll update the description with more information. The TL;DR is that when using the GPT-OSS model (with remote-vllm) I'm hitting a lot of issues related to the conversion/generation of messages between the vLLM, OpenAI, and LlamaStack formats. Regarding the API changes:
The other changes mostly handle cases that probably were not considered originally and may stem from vLLM's final implementation and the objects it returns. All of them are about ensuring the message format can be parsed by the different components (vLLM and LlamaStack). The underlying problem may be that LlamaStack translates Responses API calls to vLLM into chat_completion API calls, which makes the message parsing difficult.
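To illustrate the kind of translation involved, here is a hypothetical sketch of reshaping a Responses-style function-call item into a chat_completion assistant message. The function name and dict shapes are illustrative of the two public API formats, not the actual llama-stack implementation:

```python
def response_item_to_chat_message(item: dict) -> dict:
    """Convert a Responses API output item into an OpenAI
    chat-completions assistant message (illustrative sketch)."""
    if item.get("type") == "function_call":
        # A Responses 'function_call' item becomes an assistant
        # message carrying a tool_calls entry.
        return {
            "role": "assistant",
            "content": None,
            "tool_calls": [
                {
                    "id": item.get("call_id", ""),
                    "type": "function",
                    "function": {
                        "name": item.get("name", ""),
                        # arguments are a JSON string in both APIs
                        "arguments": item.get("arguments", "{}"),
                    },
                }
            ],
        }
    # Anything else maps to a plain assistant text message.
    return {"role": "assistant", "content": item.get("content", "")}
```

The mismatch the PR works around shows up when items like MCP tool calls or tool listings appear in the input history and no such mapping exists for them yet.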
Force-pushed from b7e3351 to 3233b5f
Force-pushed from 3233b5f to 042d118
Force-pushed from 042d118 to ab8beaf
@luis5tb Are you still planning to update the PR description?
Yes! I was testing which pieces are no longer needed due to changes in vLLM, and what else is needed due to the new changes there too. I'll update later today or first thing tomorrow morning.
Force-pushed from 64e53bf to d0b562f
Force-pushed from d0b562f to 4c41608
@jwm4 @ashwinb @jaideepr97 Let me know if the current description is enough or whether more/different details are needed.
Force-pushed from d33043f to 93d5894
Force-pushed from 93d5894 to 597dc2d
still seems to be failing integration tests though
@luis5tb since you're changing some datatypes you will need to update the api conformance test as well
Integration test fixed. Regarding the API conformance tests, I'm not entirely sure how to update/fix those. Can you give me some hints? I thought it was about running the script that regenerates the OpenAPI spec (run_openapi_generator.sh). Note this change only adds new supported input types without removing the existing ones, so there is no break in the API: what was accepted before is still accepted. The only difference is that a couple of additional types are accepted too.
Force-pushed from 597dc2d to a8508ad
I guess it is technically a breaking change in the spec since you're removing fields, even though you're replacing them with the parent object. The test seems to be directly comparing removals and additions, though I'm not sure why it also errors out on new additions.
@jaideepr97 these types of changes are ok, just prefix the title of the PR with
Thanks! I updated the PR with that prefix + breaking change information.
Force-pushed from 88a5b13 to 4389018
done
Force-pushed from 4389018 to 1296779
Force-pushed from 1296779 to 37a9d4f
ashwinb left a comment:
lgtm
What does this PR do?
Introduces two main fixes to enhance the stability of the Responses API when dealing with tool-calling responses and structured outputs.
Changes Made
It added OpenAIResponseOutputMessageMCPCall and ListTools to OpenAIResponseInput, but feat(responses)!: improve responses + conversations implementations #3810 got merged and did the same in a different way. Still, this PR does it in a way that keeps OpenAIResponseOutput and the allowed objects in OpenAIResponseInput in sync.
Add protection in case self.ctx.response_format does not have a type attribute.
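A minimal sketch of such a guard, assuming response_format may be None or an object without a `type` attribute (the helper name is hypothetical, not the actual llama-stack code):

```python
def resolve_response_format_type(response_format):
    """Return the declared response-format type, or None when the
    attribute is absent, instead of raising AttributeError."""
    # getattr with a default tolerates both None and objects that
    # simply do not define `type`.
    fmt_type = getattr(response_format, "type", None)
    if fmt_type is None:
        return None  # caller falls back to plain text generation
    return fmt_type
```

The point is that the structured-output path is only taken when the attribute is actually present.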
BREAKING CHANGE: OpenAIResponseInput now uses the OpenAIResponseOutput union type.
This is semantically equivalent: all previously accepted types are still supported
via the OpenAIResponseOutput union. The change improves type consistency and maintainability.
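A toy illustration of why reusing the output union keeps compatibility: the input union is defined in terms of the output union, so every item a response can emit remains accepted as input on the next turn. These are stand-in dataclasses, not the real llama-stack definitions:

```python
from dataclasses import dataclass
from typing import Union, get_args


@dataclass
class OpenAIResponseMessage:
    content: str


@dataclass
class OpenAIResponseOutputMessageMCPCall:
    server_label: str
    name: str


# Output union: everything the Responses API can emit.
OpenAIResponseOutput = Union[
    OpenAIResponseMessage,
    OpenAIResponseOutputMessageMCPCall,
]

# Input now reuses the output union, keeping the two in sync:
# adding a new output type automatically makes it valid input.
OpenAIResponseInput = OpenAIResponseOutput
```

With this shape there is nothing to forget when a new output item type is introduced, which is the sync property the PR description refers to.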