Wireless ray tracing (RT) is emerging as a key tool for 3D wireless channel modeling, but existing online learning methods for RT have two limitations. First, they struggle to accurately model next-generation (Beyond 5G, B5G) network signals, which are sensitive to environmental changes at high frequencies. Second, they require real-time environmental supervision, which is costly and incompatible with GPU-based processing. To overcome these limitations, we propose SANDWICH (Scene-Aware Neural Decision Wireless Channel Raytracing Hierarchy), a novel method that recasts ray path generation as a sequential decision-making problem and leverages generative models to jointly learn the optical, physical, and signal characteristics of each environment. SANDWICH is a fully differentiable offline approach that can be trained entirely on GPUs and outperforms existing online learning methods.