This paper addresses the hallucination problem in Video Multimodal Large Language Models (Video-MLLMs), focusing on Semantic Aggregation Hallucination (SAH), which arises in long videos. Unlike prior work, which has oversimplified the causes of hallucination by studying only short videos, this paper defines SAH as a failure of the complex semantic aggregation required by long videos and introduces ELV-Halluc, a new benchmark built for this purpose. Using ELV-Halluc, we confirm the existence of SAH, show that it correlates with semantic complexity and rapid semantic changes, and experimentally verify that positional-encoding strategies and Direct Preference Optimization (DPO) are effective for mitigating SAH. Training on 8,000 adversarial data pairs improves model performance and yields a 27.7% reduction in the SAH rate.
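The adversarial-pair training described above follows the standard DPO objective: the model is rewarded for assigning higher likelihood to the faithful caption than to its hallucinated counterpart, relative to a frozen reference model. A minimal sketch of the per-pair loss is shown below; the function name and scalar log-probability inputs are illustrative assumptions, not the paper's actual implementation.

```python
import math

def dpo_loss(policy_chosen_logp: float, policy_rejected_logp: float,
             ref_chosen_logp: float, ref_rejected_logp: float,
             beta: float = 0.1) -> float:
    """DPO loss for one adversarial pair (illustrative sketch).

    'chosen' is the faithful caption, 'rejected' the hallucinated one;
    each argument is the summed log-probability of that caption under
    the policy or the frozen reference model.
    """
    # Implicit rewards: how much the policy has moved away from the
    # reference model on each caption.
    chosen_reward = beta * (policy_chosen_logp - ref_chosen_logp)
    rejected_reward = beta * (policy_rejected_logp - ref_rejected_logp)
    # -log sigmoid(margin): small when the policy prefers the faithful caption.
    margin = chosen_reward - rejected_reward
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

With equal log-probabilities on both captions the loss is log 2, and it decreases as the policy shifts probability mass toward the faithful caption, which is the mechanism driving the reported SAH reduction.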