Considering the future direction of AI systems, we hypothesize that agents may come to self-improve across every aspect of their own design. We formalize this through a five-axis decomposition and a decision hierarchy, separating incentives from learning behavior so that each axis can be analyzed in isolation. Our main result identifies a structural conflict we term the utility-learning tension: utility-driven modifications that improve immediate or expected performance can erode the statistical preconditions for reliable learning and generalization. We show that distribution-independent guarantees are preserved only when the model family reachable under a policy is uniformly capacity-bounded; when capacity can grow without bound, utility-rational self-improvement can render learnable tasks unlearnable. Under general assumptions, the axes collapse onto the same capacity criterion, yielding a single boundary for safe self-modification. We corroborate the theory with numerical experiments across several axes, comparing a destructive utility-maximizing policy against two gated policies that preserve learnability.
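To make the gating idea concrete, the following is a minimal, hypothetical Python sketch rather than the paper's actual experimental setup: it contrasts a utility-only acceptance rule with a capacity-gated rule for self-modification proposals. The names `CAPACITY_BOUND`, `propose`, and `run` are illustrative, and all numeric values are arbitrary assumptions.

```python
# Hypothetical sketch: a self-modification loop in which an agent repeatedly
# proposes changes to its model family. Each proposal trades a small utility
# gain against added capacity (a stand-in for hypothesis-class size).
# The "destructive" policy accepts any utility-improving change; the "gated"
# policy additionally requires capacity to stay under a fixed bound, i.e. the
# uniform capacity bound that the learnability guarantee needs.
import random

CAPACITY_BOUND = 50  # assumed uniform capacity bound (illustrative value)

def propose():
    """Propose a modification: a utility gain that grows with extra capacity."""
    extra = random.randint(1, 5)           # capacity the change would add
    gain = 0.1 * extra + random.random()   # utility-rational: more capacity helps
    return gain, extra

def run(policy, steps=100, seed=0):
    random.seed(seed)
    capacity, utility = 10, 0.0
    for _ in range(steps):
        gain, extra = propose()
        if policy == "destructive":
            accept = gain > 0                                   # utility-only rule
        else:  # "gated"
            accept = gain > 0 and capacity + extra <= CAPACITY_BOUND
        if accept:
            utility += gain
            capacity += extra
    return capacity, utility

for policy in ("destructive", "gated"):
    cap, util = run(policy)
    print(f"{policy:12s} final capacity = {cap:4d}, utility = {util:.1f}")
```

Under these assumptions, the destructive policy's capacity grows roughly linearly with the number of accepted steps, while the gated policy's capacity stays at or below `CAPACITY_BOUND`, mirroring the boundary described above.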