[Feature] Allow CONCURRENT requests and Multiple Instances management; Add API authentication; and configuration improvements #225
Conversation
CodePothunter
commented
Mar 7, 2025
- Implement OpenAI-compatible API key authentication
- Add configuration options for GPU instances, concurrency, and request handling
- Update README with authentication instructions
- Modify configuration and routing to support optional API key verification
- Enhance system information and debug endpoints to expose authentication status
This looks great, will take a look through today.
- Modify audio chunk concatenation to handle float32 audio data
- Add explicit conversion from float32 to int16 using amplitude scaling
- Remove unnecessary dtype specification in np.concatenate
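The float32-to-int16 conversion described above amounts to clipping to the unit range and then amplitude scaling; a minimal sketch (the function name is illustrative, not the PR's code):

```python
import numpy as np

def to_int16(chunks: list[np.ndarray]) -> np.ndarray:
    """Concatenate float32 chunks, then scale [-1.0, 1.0] to the int16 range."""
    audio = np.concatenate(chunks)           # dtype follows the inputs
    audio = np.clip(audio, -1.0, 1.0)        # guard against overshoot
    return (audio * 32767).astype(np.int16)  # explicit amplitude scaling

pcm = to_int16([np.zeros(2, dtype=np.float32),
                np.array([0.5, -1.0], dtype=np.float32)])
```

Without the explicit scaling step, a plain `astype(np.int16)` on float32 samples in [-1, 1] would truncate nearly everything to 0, which matches the kind of silent-output bug this commit addresses.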
- Create GPU-specific startup script
- Set environment variables for GPU and project configuration
- Use uv to install GPU extras and run the FastAPI server
Is there a reason you deleted start-gpu and not start-cpu as well?

That was a mistake; I've added it back in my new commit.
fireblade2534
left a comment
As far as I can tell, this PR breaks streaming. (I tested by running Test.py.) It produced an empty WAV file (outputstream.wav).
The reason is that when stream=True, the audio conversion functions are never actually called.

It has been solved in the most recent commit. The reason was that when stream=True, the audio conversion functions were not being called, unlike in the non-stream mode.
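The fix described above boils down to running the same per-chunk conversion in the streaming path that the non-stream path already uses. A hedged sketch of that pattern (the names here are illustrative, not the PR's actual code):

```python
import asyncio
from typing import AsyncIterator, Callable

async def stream_converted(raw_chunks: AsyncIterator[bytes],
                           convert: Callable[[bytes], bytes]) -> AsyncIterator[bytes]:
    # Previously the convert step was skipped when stream=True, so the
    # client received chunks it could not decode (hence the empty WAV).
    async for chunk in raw_chunks:
        yield convert(chunk)
```

The key point is that conversion happens inside the chunk loop, so streamed and non-streamed responses go through identical encoding.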
When I run the Docker container using the config in: it doesn't seem to respect env vars, whether I put them in a .env file or add them to the docker-compose file.

Sorry, I did not consider the Docker-related issues.
Also, when running it in a Docker container (I haven't tested this outside of one), running Test.py twice to generate four queries causes the container to exit with code 139. (Note that I am using the GPU container.)
…easing audio container

Refactor StreamingAudioWriter to improve audio encoding reliability:
- Restructure audio encoding logic for better error handling
- Create a new method `_create_container()` to manage container creation
- Improve handling of different audio formats and encoding scenarios
- Add error logging for audio chunk encoding failures
- Simplify container and stream management in the write_chunk method
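The `_create_container()` refactor reads like a lazy-initialization pattern: the container is built on first write so that format errors surface per chunk. A sketch of that pattern, using the stdlib wave module as a stand-in for the real encoder (the class body is illustrative, not the PR's code):

```python
import io
import wave

class StreamingAudioWriter:
    """Sketch: create the output container lazily on the first chunk."""

    def __init__(self, sample_rate: int = 24000, channels: int = 1):
        self.sample_rate = sample_rate
        self.channels = channels
        self.buffer = None
        self.container = None

    def _create_container(self):
        # Built once, on first write, keeping write_chunk free of setup logic
        self.buffer = io.BytesIO()
        self.container = wave.open(self.buffer, "wb")
        self.container.setnchannels(self.channels)
        self.container.setsampwidth(2)  # 16-bit PCM
        self.container.setframerate(self.sample_rate)

    def write_chunk(self, pcm_bytes: bytes):
        if self.container is None:
            self._create_container()
        try:
            self.container.writeframes(pcm_bytes)
        except Exception as exc:
            # Error logging for chunk encoding failures, per the commit message
            print(f"failed to encode audio chunk: {exc}")
            raise

    def close(self) -> bytes:
        if self.container is None:
            return b""
        self.container.close()  # finalizes the WAV header
        return self.buffer.getvalue()
```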
This PR currently breaks or removes the following features:
- The webui does not actually stream or receive any audio
- All text normalization that was in there is no longer being called
- There is no option to change the speed, as it is not being passed into the generation system
- _process_chunk is in tts_service but it never gets called
- Captioned speech is broken because no timestamps are ever requested
- Streaming is broken as only the first chunk of text is returned
- Trimming audio is always disabled, even though it makes sense to do for chunks that contain speech
- smart_split is never called, so I'm not really sure how it is supposed to split text in a sensible way
- process_text_chunk is never called

Honestly, this PR feels unfinished and untested.
- Update InstancePool to accept and process the speed parameter
- Modify TTSService to pass speed to the instance pool
- Update Test.py with the new port and authentication
- Adjust start-gpu.sh to use port 50888
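Threading `speed` from the service into the pool, as the commit describes, can be sketched as below; the `InstancePool.generate` signature and the instance interface are assumptions based on the commit message, not the repository's actual code:

```python
import asyncio

class InstancePool:
    """Sketch: checks out an instance and forwards generation kwargs to it."""

    def __init__(self, instances):
        self._queue = asyncio.Queue()
        for inst in instances:
            self._queue.put_nowait(inst)

    async def generate(self, text: str, voice: str, speed: float = 1.0):
        inst = await self._queue.get()  # wait for a free instance
        try:
            # speed is now forwarded instead of being silently dropped
            return await inst.generate(text, voice=voice, speed=speed)
        finally:
            self._queue.put_nowait(inst)  # return instance to the pool
```

Accepting the parameter at the pool boundary addresses the earlier review point that speed was never passed into the generation system.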
Why did you change the GPU port to 50888?