Qwen: Qwen3.5-Flash

qwen/qwen3.5-flash-02-23

The Qwen3.5 native vision-language Flash models are built on a hybrid architecture that integrates a linear attention mechanism with a sparse mixture-of-experts model, achieving higher inference efficiency. Compared to the...

specs & pricing
Type: text
Provider: qwen
Model ID: qwen/qwen3.5-flash-02-23
Capabilities: vision, tools, reasoning
Context window: 1M tokens
Self-hostable: Yes, runs on your own GPU
Input price: $0.071 / 1M tokens
Output price: $0.29 / 1M tokens

The cloud price is billed from prepaid credits when a request fails over to the cloud. Open-weight models run free on GPUs you own; the gateway routes to your nodes first.
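
As a rough illustration of the cloud pricing arithmetic, the sketch below estimates what a single failed-over request would cost at the listed rates ($0.071 per 1M input tokens, $0.29 per 1M output tokens). The function name and the token counts are hypothetical and chosen for illustration only; actual billing may round or meter differently.

```python
from decimal import Decimal

# Listed cloud rates for qwen/qwen3.5-flash-02-23, in USD per 1M tokens.
INPUT_PRICE_PER_M = Decimal("0.071")
OUTPUT_PRICE_PER_M = Decimal("0.29")
TOKENS_PER_M = Decimal(1_000_000)


def cloud_cost_usd(input_tokens: int, output_tokens: int) -> Decimal:
    """Estimate the cloud cost of one request (hypothetical helper).

    Assumes a simple linear per-token price with no rounding or minimum
    charge; requests served by your own GPU nodes are not billed.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = (
        Decimal(input_tokens) * INPUT_PRICE_PER_M
        + Decimal(output_tokens) * OUTPUT_PRICE_PER_M
    )
    return total / TOKENS_PER_M


if __name__ == "__main__":
    # Example: a 200,000-token prompt with a 5,000-token reply.
    print(cloud_cost_usd(200_000, 5_000))
```

Decimal is used instead of float so that small per-token amounts add up without binary rounding drift when many requests are summed.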