The following items must be checked before submission
Type of problem
Model inference and deployment
Operating system
Linux
Detailed description of the problem
Ubuntu system
Deployed with docker-compose
Image: api-llm:vllm
When the LLM and the embedding model are deployed at the same time,
the following error is raised:
RuntimeError: Cannot re-initialize CUDA in forked subprocess. To use CUDA with multiprocessing, you must use the 'spawn' start method
Deploying the LLM alone works without any problem.
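The error message itself points at the fix direction: CUDA cannot be re-initialized in a process created with the default fork start method, so worker processes have to be started with 'spawn'. Below is a minimal sketch of what that looks like in plain PyTorch code, assuming the embedding workers are launched from Python in a parent process that has already touched CUDA; `embed_worker` is a hypothetical function for illustration and is not part of the api-llm:vllm image.

```python
# Minimal sketch (assumption: worker processes are created from Python code
# whose parent process has already initialized CUDA, which is exactly the
# situation the RuntimeError complains about).
import torch
import torch.multiprocessing as mp


def embed_worker(rank: int) -> None:
    # Each spawned process initializes its own CUDA context from scratch.
    device = torch.device(f"cuda:{rank}" if torch.cuda.is_available() else "cpu")
    print(f"worker {rank} running on {device}")


if __name__ == "__main__":
    # 'spawn' starts fresh interpreter processes instead of forking, so no
    # CUDA state is inherited from the parent and re-initialization is safe.
    mp.set_start_method("spawn", force=True)
    procs = [mp.Process(target=embed_worker, args=(i,)) for i in range(2)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
```

Whether the deployment exposes this as a configuration switch (for example an environment variable in the docker-compose file) is not clear from the image alone; the sketch only shows the start-method change the traceback asks for.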
Runtime logs or screenshots
