Qwen 3.5 Local Installation & Setup Guide — From Ollama to vLLM
Step-by-step guide to running Qwen 3.5 locally. From 5-minute Ollama setup to production vLLM servers, plus optimal model size selection per GPU.

In the previous post, we compared Qwen 3.5 and DeepSeek V3.2. Now let's get Qwen 3.5 running locally on your machine, step by step.
This guide covers everything from a 5-minute Ollama setup to a production-grade vLLM API server, including how to pick the right model size for your GPU.
1. Which Size Should You Pick?
Qwen 3.5 comes in 8 sizes. Matching the right model to your GPU is step one.
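Before committing to a download, it helps to sanity-check whether a given size even fits in your GPU's VRAM. A common back-of-the-envelope heuristic is parameter count times bytes per parameter at your quantization level, plus some overhead for the KV cache and activations. The sketch below uses an assumed 20% overhead factor; it is a rough estimate, not a Qwen-specific figure, so treat the output as a lower bound and check actual usage with `nvidia-smi`.

```python
def estimated_vram_gb(params_billion: float, bits: int = 4, overhead: float = 1.2) -> float:
    """Rough VRAM estimate for loading an LLM.

    params_billion: model size in billions of parameters (e.g. 7 for a 7B model)
    bits: quantization level (4 for Q4, 8 for Q8, 16 for FP16/BF16)
    overhead: assumed multiplier for KV cache and activations (1.2 is a guess)
    """
    bytes_per_param = bits / 8
    return params_billion * bytes_per_param * overhead

# Example: a 7B model at 4-bit quantization
print(f"{estimated_vram_gb(7):.1f} GB")   # ~4.2 GB, fits an 8 GB card
# The same model at FP16 needs far more
print(f"{estimated_vram_gb(7, bits=16):.1f} GB")  # ~16.8 GB
```

If the estimate exceeds your card's VRAM, either drop to a smaller size or a lower-bit quantization before pulling the model.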