nvidia

NVIDIA: Nemotron 3 Ultra

nvidia/nemotron-3-ultra-550b-a55b

NVIDIA Nemotron 3 Ultra is an open frontier-reasoning and orchestration model with 55B active parameters out of 550B total. Built on a hybrid Transformer-Mamba mixture-of-experts (MoE) architecture, it...

specs & pricing
Type: text
Provider: nvidia
Model ID: nvidia/nemotron-3-ultra-550b-a55b
Capabilities: tools, reasoning
Context window: 1M tokens
Self-hostable: Yes, runs on your own GPU
Input price: $0.55 / 1M tokens
Output price: $2.75 / 1M tokens

Cloud price is billed from prepaid credits when a request fails over to the cloud. Open-weight models run free on GPUs you own — the gateway routes to your nodes first.
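
As a rough illustration of the billing rule above, the following sketch estimates what a single request might cost. The two prices are taken from the specs on this page; the function names and the `served_locally` flag are hypothetical and are not part of any gateway API. It also assumes billing is strictly linear in token counts, with no minimum charge, rounding, or extra fees, which this page does not confirm.

```python
# Prices from the specs above, in USD per 1M tokens.
INPUT_PRICE_PER_M = 0.55
OUTPUT_PRICE_PER_M = 2.75


def cloud_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Estimated charge for one request served by the cloud.

    Assumes purely linear per-token billing (an assumption, not a
    documented guarantee).
    """
    total = input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    return total / 1_000_000


def request_cost_usd(input_tokens: int, output_tokens: int, served_locally: bool) -> float:
    """Hypothetical helper: requests served on your own GPUs cost nothing;
    only requests that fail over to the cloud draw on prepaid credits."""
    if served_locally:
        return 0.0
    return cloud_cost_usd(input_tokens, output_tokens)


if __name__ == "__main__":
    # A long-context request: 200k input tokens, 4k output tokens.
    print(f"local:    ${request_cost_usd(200_000, 4_000, served_locally=True):.4f}")
    print(f"failover: ${request_cost_usd(200_000, 4_000, served_locally=False):.4f}")
```

Under these assumptions, output tokens cost five times as much as input tokens, so for failover traffic the length of the response tends to matter more than the length of the prompt unless the prompt is very long.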
