Commit 957c882
Include text/event-stream header only when stream=True
Summary:
We want to use the Accept header to negotiate the response content type.
Sending this header on every request causes the server to return SSE chunks even when the stream=True parameter is not set:
```
llama-stack-client inference chat-completion --message="Hello there"
{"event":{"event_type":"start","delta":"Hello"}}
{"event":{"event_type":"progress","delta":"!"}}
{"event":{"event_type":"progress","delta":" How"}}
{"event":{"event_type":"progress","delta":" are"}}
{"event":{"event_type":"progress","delta":" you"}}
{"event":{"event_type":"progress","delta":" today"}}
```
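The fix is to include the `Accept: text/event-stream` header only when the request is made with `stream=True`. As a minimal, self-contained sketch of that idea (the function name and exact header set below are illustrative, not the actual client code):
```
# Illustration of the intended behavior: advertise SSE support only when the
# caller asks for a streamed response.
def build_request_headers(stream: bool = False) -> dict[str, str]:
    headers = {"Content-Type": "application/json"}
    if stream:
        # The Accept header drives content negotiation: the server responds
        # with text/event-stream chunks only when we ask for them.
        headers["Accept"] = "text/event-stream"
    else:
        headers["Accept"] = "application/json"
    return headers


print(build_request_headers(stream=False))  # Accept: application/json
print(build_request_headers(stream=True))   # Accept: text/event-stream
```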
Test Plan:
```
pip install .
llama-stack-client configure --endpoint={endpoint} --api-key={api-key}
llama-stack-client inference chat-completion --message="Hello there"
ChatCompletionResponse(completion_message=CompletionMessage(content='Hello! How can I assist you today?', role='assistant', stop_reason='end_of_turn', tool_calls=[]), logprobs=None)
```
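Streaming can also be double-checked from the Python client. This is a hedged sketch: the `base_url`, `model` argument, and message shape are assumptions and may differ across llama-stack-client versions; the response attributes mirror the output shown above.
```
from llama_stack_client import LlamaStackClient

client = LlamaStackClient(base_url="http://localhost:5000")  # assumed endpoint

# Non-streaming: should return a single ChatCompletionResponse, since the
# text/event-stream Accept header is no longer sent by default.
response = client.inference.chat_completion(
    model="Llama3.1-8B-Instruct",  # illustrative model id
    messages=[{"role": "user", "content": "Hello there"}],
)
print(response.completion_message.content)

# Streaming: with stream=True the header is included and SSE chunks come back.
for chunk in client.inference.chat_completion(
    model="Llama3.1-8B-Instruct",
    messages=[{"role": "user", "content": "Hello there"}],
    stream=True,
):
    print(chunk.event.delta, end="")
```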
1 file changed: +8 −4 lines (one line replaced by two at four locations in the file, around original lines 216, 367, 626, and 777).