google

Google: Gemini 3.1 Flash Lite

google/gemini-3.1-flash-lite

Gemini 3.1 Flash Lite is Google’s GA high-efficiency multimodal model optimized for low-latency, high-volume workloads. It supports text, image, video, audio, and PDF inputs, and is designed for lightweight agentic...

specs & pricing
Type: text
Provider: google
Model ID: google/gemini-3.1-flash-lite
Capabilities: vision, audio-in, tools, reasoning
Context window: 1.0M tokens
Input price: $0.28 / 1M tokens
Output price: $1.65 / 1M tokens

The cloud price is billed from prepaid credits when a request fails over to the cloud. Open-weight models run free on GPUs you own; the gateway routes to your nodes first.
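
The listed per-token prices can be turned into a rough per-request estimate. The sketch below is illustrative only: the function name `estimate_cost_usd` and the token counts are hypothetical, and it assumes a request is billed purely at the listed input and output rates, with no other fees, discounts, or tiering.

```python
# Listed prices for google/gemini-3.1-flash-lite, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.28
OUTPUT_PRICE_PER_M = 1.65


def estimate_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimate the cloud cost of one request from its token counts.

    Hypothetical helper: assumes billing is linear in tokens at the
    listed rates, which the listing does not state explicitly.
    """
    if input_tokens < 0 or output_tokens < 0:
        raise ValueError("token counts must be non-negative")
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


if __name__ == "__main__":
    # Example: 10,000 input tokens and 2,000 output tokens.
    print(f"${estimate_cost_usd(10_000, 2_000):.4f}")  # prints $0.0061
```

Under these assumptions, output tokens cost roughly six times as much as input tokens, so long generations account for most of the cost of a request.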