
vllm-mlx

OpenAI- and Anthropic-compatible server for Apple Silicon. Run LLMs and vision-language models (Llama, Qwen-VL, LLaVA) with continuous batching, MCP tool calling, and multimodal support. Native MLX backend, 400+ tok/s. Works with Claude Code.
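Because the server exposes an OpenAI-compatible API, standard chat-completion requests should work against it. A minimal sketch of such a request body; the local endpoint URL and the model identifier below are assumptions for illustration, not confirmed by this listing:

```python
import json

# Assumed default local endpoint for a vllm-mlx server (not confirmed here).
BASE_URL = "http://localhost:8000/v1"

# Standard OpenAI-style chat completion payload; the model id is a
# hypothetical MLX-community model name used only as an example.
payload = {
    "model": "mlx-community/Llama-3.2-3B-Instruct-4bit",
    "messages": [{"role": "user", "content": "Hello!"}],
    "stream": False,
}

# Serialize for POSTing to f"{BASE_URL}/chat/completions" with any HTTP client.
body = json.dumps(payload)
```

Any OpenAI SDK or plain HTTP client can send this body to the server's `/chat/completions` route once the server is running.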

Stars: 419
Forks: 50
Created On: February 24, 2026

Install this Skill

Add this capability to your agent instantly using the CLI.

$ npx @agent/tresor install vllm-mlx