
Conversation

@basnijholt (Owner)

Summary

  • Add run-mlx-lm.sh script that starts an OpenAI-compatible MLX LLM server on port 8080 (sketched below)
  • Default model: mlx-community/Qwen3-4B-4bit (configurable via MLX_MODEL env var)
  • Works only on Apple Silicon (M1/M2/M3/M4); provides significantly faster inference than Ollama
  • Add MLX-LLM pane to start-all-services.sh Zellij layout
  • Update README with MLX-LM documentation
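
For context, here is a minimal sketch of what the script likely boils down to, assuming it wraps the mlx_lm.server entry point from the mlx-lm package; the Apple Silicon check and the exact invocation below are assumptions, not the actual script:

#!/usr/bin/env bash
# Hypothetical sketch of scripts/run-mlx-lm.sh, assuming it wraps mlx_lm.server.
set -euo pipefail

# MLX only runs on Apple Silicon, so bail out early elsewhere (assumed check).
if [[ "$(uname -sm)" != "Darwin arm64" ]]; then
  echo "run-mlx-lm.sh requires an Apple Silicon Mac (M1/M2/M3/M4)" >&2
  exit 1
fi

# Model and port default to the values above; both are overridable via env vars.
MLX_MODEL="${MLX_MODEL:-mlx-community/Qwen3-4B-4bit}"
MLX_PORT="${MLX_PORT:-8080}"

# mlx_lm.server exposes an OpenAI-compatible API on the chosen port.
exec mlx_lm.server --model "$MLX_MODEL" --port "$MLX_PORT"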

Usage

# Start the server
./scripts/run-mlx-lm.sh

# Or with a custom model
MLX_MODEL=mlx-community/Qwen3-8B-4bit ./scripts/run-mlx-lm.sh

# Use with agent-cli
agent-cli transcribe --llm --llm-provider openai --openai-base-url http://localhost:8080/v1
agent-cli autocorrect --llm-provider openai --openai-base-url http://localhost:8080/v1
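
The listening port can also be changed via the MLX_PORT env var (see the commit message below); the agent-cli base URL just needs to follow it:

# Or on a different port
MLX_PORT=8081 ./scripts/run-mlx-lm.sh
agent-cli autocorrect --llm-provider openai --openai-base-url http://localhost:8081/v1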

Test plan

  • Run ./scripts/run-mlx-lm.sh on an Apple Silicon Mac
  • Verify the server starts and responds to requests (see the curl check below)
  • Test with agent-cli transcribe --llm --llm-provider openai --openai-base-url http://localhost:8080/v1
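
A quick way to cover the second item, assuming the server exposes the standard OpenAI chat-completions route (which mlx_lm.server implements):

# Smoke test: should return a JSON chat completion from the local server
curl -s http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "mlx-community/Qwen3-4B-4bit", "messages": [{"role": "user", "content": "Say hello"}]}'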

@basnijholt force-pushed the mlx branch 3 times, most recently from 18779ff to 1ce6b6d on December 5, 2025 at 05:32
- Add run-mlx-lm.sh script that starts an OpenAI-compatible MLX LLM server
- Default model: mlx-community/Qwen3-4B-4bit (configurable via MLX_MODEL env var)
- Runs on port 8080 (configurable via MLX_PORT env var)
- Only works on Apple Silicon (M1/M2/M3/M4)
- Add MLX-LLM pane to start-all-services.sh Zellij layout
- Update README with MLX-LM documentation