Although this call issues a streaming request, the internal implementation uses a blocking socket. If many LLM requests are initiated concurrently, performance degrades badly. Is there a way to make the request over a non-blocking socket, e.g. with aiohttp?
```python
# Assumed dify_plugin imports for this snippet (paths per the plugin SDK):
from dify_plugin.entities.model.llm import LLMModelConfig
from dify_plugin.entities.model.message import SystemPromptMessage

self.session.model.llm.invoke(
    model_config=LLMModelConfig(
        provider=llm_info.get('provider'),
        model=llm_info.get('model'),
        mode=llm_info.get('mode'),
        completion_params=llm_info.get('completion_params')
    ),
    prompt_messages=[
        SystemPromptMessage(content=user_prompt),
    ],
    stream=True
)
```
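Until the SDK itself goes async, one common mitigation is to off-load each blocking `invoke()` call to a worker thread so concurrent requests do not serialize on the blocking socket. A minimal sketch, assuming the caller runs inside an asyncio event loop; `invoke_llm` is a hypothetical stand-in for the call above:

```python
import asyncio
import time

def invoke_llm(user_prompt: str) -> str:
    # Stand-in for the blocking self.session.model.llm.invoke(...) call above;
    # time.sleep simulates the socket blocking while the model streams.
    time.sleep(1.0)
    return f"response to: {user_prompt}"

async def run_many(prompts: list[str]) -> list[str]:
    # asyncio.to_thread runs each blocking call on the default thread pool,
    # so N requests take ~1s of wall-clock time instead of ~N seconds.
    return await asyncio.gather(
        *(asyncio.to_thread(invoke_llm, p) for p in prompts)
    )

if __name__ == "__main__":
    print(asyncio.run(run_many(["a", "b", "c"])))
```

This only hides the blocking I/O behind threads; it does not make the underlying socket non-blocking.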
I found that httpx.Client() is used internally; could httpx.AsyncClient be used instead?
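For comparison, a minimal sketch of what a non-blocking streaming request with httpx.AsyncClient can look like; the endpoint URL and JSON payload here are placeholders for illustration, not Dify's actual internals:

```python
import asyncio
import httpx

async def stream_completion(prompt: str) -> None:
    # Hypothetical endpoint and payload; the point is that AsyncClient plus
    # aiter_lines() yields streamed chunks without blocking the event loop.
    async with httpx.AsyncClient(timeout=None) as client:
        async with client.stream(
            "POST",
            "https://api.example.com/v1/chat/completions",
            json={"prompt": prompt, "stream": True},
        ) as response:
            async for line in response.aiter_lines():
                print(line)

asyncio.run(stream_completion("hello"))
```

With AsyncClient, many concurrent streams can share a single event loop instead of tying up one blocked thread per request.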