In this paper, we explore the mechanism behind the position generalization of large language models (LLMs), that is, their ability to comprehend meaning despite changes in textual position and to generalize to texts longer than those seen during training. By analyzing how LLMs handle positional relevance, we find that, despite the complexity of their self-attention mechanisms, they process attention logits approximately as an arithmetic sum of positional relevance and semantic significance. In particular, we identify and theoretically prove specific patterns in intermediate features, showing that this position generalization is a learned behavior. As a result, we provide a computational explanation of, and criteria for, the positional flexibility of LLMs, and present pioneering work linking position generalization to their internal mechanisms.
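As a rough, non-authoritative sketch of the claimed decomposition (the symbols $\mathbf{q}_i$, $\mathbf{k}_j$, $f_{\mathrm{sem}}$, and $g_{\mathrm{pos}}$ are illustrative notation introduced here, not taken from the paper), the attention logit between query position $i$ and key position $j$ would behave approximately as
\[
\mathrm{logit}_{ij} \;\approx\; \underbrace{f_{\mathrm{sem}}(\mathbf{q}_i, \mathbf{k}_j)}_{\text{semantic significance}} \;+\; \underbrace{g_{\mathrm{pos}}(i - j)}_{\text{positional relevance}},
\]
where, under this reading, the positional term depends only on the relative offset and is disentangled from token content, which would in principle allow it to extrapolate to offsets longer than those encountered during training.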