cloudflare

zai-org/glm-4.7-flash

cloudflare/zai-org/glm-4.7-flash

GLM-4.7-Flash is a fast, efficient multilingual text generation model with a 131,072-token context window. It is optimized for dialogue, instruction following, and multi-turn tool calling across 100+ languages.

specs & pricing

Type: text
Provider: cloudflare
Model ID: cloudflare/zai-org/glm-4.7-flash
Context window: 131K tokens
Self-hostable: Yes, runs on your own GPU
Input price: $0.067 / 1M tokens
Output price: $0.44 / 1M tokens

The cloud price is billed from prepaid credits when a request fails over to the cloud. Open-weight models run free on GPUs you own; the gateway routes to your nodes first.
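
As a rough illustration of the pricing arithmetic, the sketch below estimates what a single request would cost if it fails over to the cloud. The two per-million-token prices come from the specs on this page; the function name `cloud_cost_usd` and the `served_by_cloud` flag are hypothetical and not part of any gateway API. It assumes billing is strictly linear in token count, with no rounding or minimum charge.

```python
# Prices listed on this page, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.067
OUTPUT_PRICE_PER_M = 0.44


def cloud_cost_usd(input_tokens: int, output_tokens: int,
                   served_by_cloud: bool = True) -> float:
    """Estimate the cost of one request in USD.

    Hypothetical helper: assumes linear per-token billing and that a
    request served by your own GPU node costs nothing.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    if not served_by_cloud:
        # Request stayed on self-hosted hardware, so no cloud charge.
        return 0.0
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # A 10,000-token prompt with a 2,000-token reply that failed over to the cloud.
    print(f"{cloud_cost_usd(10_000, 2_000):.5f} USD")
```

At these rates, output tokens cost roughly 6.6 times as much as input tokens, so long generations dominate the bill more than long prompts do.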