Yan is a foundational framework for interactive video generation, covering the entire pipeline from simulation and generation to editing. It consists of three core modules. For AAA-level simulation, we design a highly compressed, low-latency 3D-VAE together with a KV-cache-based shift-window denoising inference process, achieving real-time 1080P/60FPS interactive simulation. For multimodal generation, we infuse game-specific knowledge into an open-domain multimodal video diffusion model (VDM) and then introduce a hierarchical autoregressive captioning method that transforms the VDM into a frame-by-frame, action-controlled, real-time, infinite interactive video generator. Even when text and visual prompts come from different domains, the model generalizes strongly, allowing styles and mechanisms from different domains to be flexibly mixed and composed according to user prompts. For multi-granularity editing, we propose a hybrid model that explicitly separates interaction-mechanism simulation from visual rendering, enabling text-driven, interactive editing of video content at multiple levels of granularity. By integrating these modules, Yan advances interactive video generation from an isolated capability toward a comprehensive, AI-driven interactive generation paradigm, paving the way for the next generation of creative tools, media, and entertainment.
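
To give a rough sense of the shift-window idea behind the real-time simulation module, the toy Python sketch below maintains a sliding window of noisy latent frames, denoises the window at each step while conditioning on a cache of already-finalized frames, and shifts the front frame out once it is fully denoised. All names, shapes, and the denoise_step stub are illustrative assumptions, not Yan's actual implementation.

```python
# Toy sketch of a KV-cache-based shift-window denoising loop (illustration only).
# All shapes, names (LATENT_DIM, WINDOW, denoise_step, ...), and the toy update
# rule are assumptions made for this example, not Yan's actual implementation.
import numpy as np

LATENT_DIM = 64   # assumed size of one frame's latent after the 3D-VAE encoder
WINDOW = 4        # assumed number of in-flight (still noisy) latent frames

def denoise_step(window, kv_cache, action):
    """Stand-in for one denoiser pass over the window, conditioned on cached
    context (already-finalized frames) and the current user action."""
    context = kv_cache[-1] if kv_cache else np.zeros(LATENT_DIM)
    # Toy update: nudge every in-flight frame toward a mix of context and action.
    return [0.9 * frame + 0.1 * (context + action) for frame in window]

def generate(actions):
    kv_cache = []                                                  # finalized frames whose keys/values would be cached
    window = [np.random.randn(LATENT_DIM) for _ in range(WINDOW)]  # noisy latents; the front frame is the least noisy
    outputs = []
    for action in actions:
        # One denoising pass per emitted frame: frames nearer the front of the
        # window have accumulated more passes, so the front frame is ready.
        window = denoise_step(window, kv_cache, action)
        finished = window.pop(0)       # shift the fully denoised frame out ...
        outputs.append(finished)
        kv_cache.append(finished)      # ... and cache it so later steps reuse it instead of recomputing
        window.append(np.random.randn(LATENT_DIM))  # shift a fresh noisy latent in
    return outputs

frames = generate([np.random.randn(LATENT_DIM) for _ in range(16)])
print(len(frames), frames[0].shape)   # 16 frames, each a (LATENT_DIM,) latent
```

In a real system, each emitted latent frame would then be decoded by the low-latency 3D-VAE decoder into a 1080P output frame, which is what makes the per-step cost low enough for 60FPS interaction.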