This paper presents Jet-Nemotron, a family of hybrid-architecture language models that achieves accuracy comparable to or higher than leading full-attention models while significantly improving generation throughput. Jet-Nemotron is developed using Post Neural Architecture Search (PostNAS), a novel neural architecture exploration pipeline that enables efficient model design. PostNAS starts from a pre-trained full-attention model, freezes its MLP weights, and efficiently explores attention block designs. The pipeline comprises four main components: (1) full-attention layer placement and elimination, (2) linear attention block selection, (3) novel attention block design, and (4) hardware-aware hyperparameter search. The Jet-Nemotron-2B model achieves accuracy comparable to or higher than Qwen3, Qwen2.5, Gemma3, and Llama3.2, while delivering up to a 53.6x speedup in generation throughput and a 6.1x speedup in prefilling. It also achieves higher accuracy on MMLU and MMLU-Pro than state-of-the-art MoE full-attention models such as DeepSeek-V3-Small and Moonlight.
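To make the PostNAS setup concrete, the following is a minimal PyTorch sketch of the core idea: freeze the MLP weights of a pre-trained full-attention model and swap its attention blocks for candidate designs at selected layers. All names here (`FullAttention`, `SimpleLinearAttention`, `Block`, `freeze_mlps`) and the toy linear-attention variant are illustrative assumptions, not the paper's actual blocks, search space, or training procedure.

```python
import torch
import torch.nn as nn


class FullAttention(nn.Module):
    """Stand-in for a full (softmax) attention block."""
    def __init__(self, dim, heads=4):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, x):
        out, _ = self.attn(x, x, x, need_weights=False)
        return out


class SimpleLinearAttention(nn.Module):
    """Toy kernelized linear-attention stand-in (O(T) in sequence length)."""
    def __init__(self, dim):
        super().__init__()
        self.q = nn.Linear(dim, dim)
        self.k = nn.Linear(dim, dim)
        self.v = nn.Linear(dim, dim)

    def forward(self, x):
        q = torch.nn.functional.elu(self.q(x)) + 1  # positive feature map
        k = torch.nn.functional.elu(self.k(x)) + 1
        v = self.v(x)
        kv = torch.einsum("btd,bte->bde", k, v)               # aggregate keys/values once
        z = 1.0 / (torch.einsum("btd,bd->bt", q, k.sum(dim=1)) + 1e-6)
        return torch.einsum("btd,bde,bt->bte", q, kv, z)


class Block(nn.Module):
    """Transformer block with a swappable attention module and a frozen-by-PostNAS MLP."""
    def __init__(self, dim, attn):
        super().__init__()
        self.attn = attn
        self.mlp = nn.Sequential(nn.Linear(dim, 4 * dim), nn.GELU(), nn.Linear(4 * dim, dim))

    def forward(self, x):
        x = x + self.attn(x)
        return x + self.mlp(x)


def freeze_mlps(blocks):
    """PostNAS-style setup: keep pre-trained MLP weights fixed, train only attention."""
    for block in blocks:
        for p in block.mlp.parameters():
            p.requires_grad = False


# Assemble a tiny "pre-trained" model, then explore one candidate placement:
# keep full attention only in layer 0 and linearize the remaining layers.
dim, depth = 64, 4
model = nn.ModuleList([Block(dim, FullAttention(dim)) for _ in range(depth)])
freeze_mlps(model)
for i in range(1, depth):
    model[i].attn = SimpleLinearAttention(dim)

x = torch.randn(2, 16, dim)
for block in model:
    x = block(x)
print(x.shape)  # torch.Size([2, 16, 64])
```

In the actual pipeline, many such candidate placements and attention block designs are evaluated, with hardware-aware hyperparameter search guiding the final configuration; this sketch only illustrates the freeze-MLP, swap-attention structure.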