This paper proposes ConfSMoE, an improved Sparse Mixture-of-Experts (SMoE) architecture that effectively addresses modal omission, a challenge that frequently arises in real-world multimodal learning. Whereas conventional SMoE is vulnerable to modal omission, suffering performance degradation and poor generalization, ConfSMoE handles missing modalities through a two-stage imputation module. Through theoretical analysis and experimental evidence, we elucidate the phenomenon of expert collapse. To address it, we propose a novel expert gating mechanism that detaches the routing score from the conventional softmax distribution, grounding it instead in task confidence with respect to the ground-truth signal. This mechanism mitigates expert collapse without an additional load-balancing loss function. We comprehensively evaluate the proposed method's resistance to modal omission and the impact of the gating mechanism across four real-world datasets and three experimental setups.
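To make the gating idea concrete, the sketch below illustrates one plausible reading of confidence-guided expert routing: during training, each expert is weighted by the probability it assigns to the ground-truth label rather than by a free-running softmax gate. This is a minimal sketch under stated assumptions, not the paper's implementation; the class name `ConfidenceGatedMoE`, the ground-truth-probability confidence definition, and the max-probability fallback at inference are all hypothetical.

```python
from typing import Optional

import torch
import torch.nn as nn
import torch.nn.functional as F


class ConfidenceGatedMoE(nn.Module):
    """Minimal sketch of confidence-guided expert gating (illustrative only).

    Instead of routing with a learned softmax gate, which can collapse onto
    a few experts, each expert is weighted by the confidence it assigns to
    the ground-truth class during training.
    """

    def __init__(self, d_model: int, n_experts: int, n_classes: int, top_k: int = 2):
        super().__init__()
        self.experts = nn.ModuleList(
            nn.Sequential(
                nn.Linear(d_model, d_model),
                nn.GELU(),
                nn.Linear(d_model, n_classes),
            )
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x: torch.Tensor, y: Optional[torch.Tensor] = None):
        # Per-expert class logits: (batch, n_experts, n_classes).
        logits = torch.stack([expert(x) for expert in self.experts], dim=1)
        probs = logits.softmax(dim=-1)

        if y is not None:
            # Training: confidence = probability each expert assigns to the
            # ground-truth label -> (batch, n_experts).
            idx = y.view(-1, 1, 1).expand(-1, probs.size(1), 1)
            conf = probs.gather(-1, idx).squeeze(-1)
        else:
            # Inference (no label): fall back to each expert's own maximum
            # class probability as a confidence proxy (an assumption here).
            conf = probs.max(dim=-1).values

        # Route to the top-k most confident experts and renormalize their
        # confidences into mixture weights; no auxiliary load-balancing loss
        # is used, since weights track ground-truth confidence.
        top_conf, top_idx = conf.topk(self.top_k, dim=-1)
        weights = top_conf / top_conf.sum(dim=-1, keepdim=True).clamp_min(1e-9)
        top_logits = logits.gather(
            1, top_idx.unsqueeze(-1).expand(-1, -1, logits.size(-1))
        )
        out = (top_logits * weights.unsqueeze(-1)).sum(dim=1)
        return out, conf


if __name__ == "__main__":
    # Tiny smoke test with random data.
    model = ConfidenceGatedMoE(d_model=64, n_experts=4, n_classes=10)
    x = torch.randn(8, 64)
    y = torch.randint(0, 10, (8,))
    out, conf = model(x, y)
    loss = F.cross_entropy(out, y)
    print(out.shape, conf.shape, loss.item())
```

One design point this sketch highlights: because the routing weight is tied to each expert's confidence on the supervision signal rather than to a separate gating network, no expert can monopolize traffic without actually being accurate, which is one intuition for why an extra load-balancing loss becomes unnecessary.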