feat(scripts): add MLX-LM server script for fast Apple Silicon inference
- Add run-mlx-lm.sh script that starts an OpenAI-compatible MLX LLM server
- Default model: mlx-community/Qwen3-4B-4bit (configurable via MLX_MODEL env var)
- Runs on port 8080 (configurable via MLX_PORT env var)
- Only works on Apple Silicon (M1/M2/M3/M4)
- Add MLX-LLM pane to start-all-services.sh Zellij layout
- Update README with MLX-LM documentation
 |**[Ollama](https://ollama.ai/)**| Local LLM for text processing | ✅ Yes, with default model |
+|**[MLX-LM](https://github.com/ml-explore/mlx-lm)**| Fast LLM on Apple Silicon | ⚙️ Optional, via `uvx`|
 |**[Wyoming Faster Whisper](https://github.com/rhasspy/wyoming-faster-whisper)**| Speech-to-text | ✅ Yes, via `uvx`|
 |**[Wyoming Piper](https://github.com/rhasspy/wyoming-piper)**| Text-to-speech | ✅ Yes, via `uvx`|
 |**[Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI)**| Premium TTS (optional) | ⚙️ Can be added later |
@@ -318,10 +319,13 @@ You can also use other OpenAI-compatible local servers:
 | Server | Purpose | Setup Required |
 |---------|---------|----------------|
+|**[MLX-LM](https://github.com/ml-explore/mlx-lm)**| Fast LLM inference on Apple Silicon |`./scripts/run-mlx-lm.sh` or use `--openai-base-url http://localhost:10500/v1`|
 |**llama.cpp**| Local LLM inference | Use `--openai-base-url http://localhost:8080/v1`|
 |**vLLM**| High-performance LLM serving | Use `--openai-base-url` with server endpoint |
 |**Ollama**| Default local LLM | Already configured as default |
+
+> **Apple Silicon Users**: MLX-LM provides significantly faster inference than Ollama on M1/M2/M3/M4 Macs. Start it with `./scripts/run-mlx-lm.sh` and use `--llm-provider openai --openai-base-url http://localhost:10500/v1` to connect.
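Once the server is up, any OpenAI-compatible client can talk to it. A quick smoke test with `curl`, assuming port 10500 as in the table above and the standard OpenAI REST routes (`/v1/models`, `/v1/chat/completions`), might look like:

```shell
# List the loaded model(s) to confirm the server is responding.
curl -s http://localhost:10500/v1/models

# Send a minimal chat completion request (model name per the default above).
curl -s http://localhost:10500/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mlx-community/Qwen3-4B-4bit", "messages": [{"role": "user", "content": "Hello"}]}'
```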
 ## Usage

 This package provides multiple command-line tools, each designed for a specific purpose.