In this paper, we show that a Transformer evaluated at fixed depth is limited in expressive power to the circuit class TC⁰, and we propose to overcome this limitation by strengthening the encoder itself rather than by relying on autoregression. Whereas existing autoregressive approaches (next-token prediction, chain-of-thought reasoning) depend on a feedback loop that decodes intermediate states into tokens and re-encodes them, the proposed SELF-Transformer iteratively refines its attention weights to a fixed point within each encoder layer, scaling test-time computation with input difficulty. Instead of producing the alignment matrix that mixes the input sequence in a single pass, it updates that alignment matrix internally over successive iterations, as sketched below. With no additional parameters and only a modest increase in test-time computation, this input-adaptive alignment yields accuracy improvements of up to 20% on encoder-style benchmarks. The SELF-Transformer thus recovers much of the expressive power of recurrent reasoning while retaining the simplicity of a pure encoder architecture.
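To make the mechanism concrete, the following is a minimal sketch, not the authors' exact formulation, of fixed-point attention refinement inside a single encoder layer; the class name `FixedPointSelfAttention` and the hyperparameters `tol` and `max_iters` are illustrative assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class FixedPointSelfAttention(nn.Module):
    """Illustrative sketch: refine the alignment matrix until it stops changing."""

    def __init__(self, d_model: int, tol: float = 1e-4, max_iters: int = 12):
        super().__init__()
        self.q = nn.Linear(d_model, d_model)
        self.k = nn.Linear(d_model, d_model)
        self.v = nn.Linear(d_model, d_model)
        self.scale = d_model ** -0.5
        self.tol = tol              # convergence threshold on the alignment matrix
        self.max_iters = max_iters  # hard cap on test-time iterations

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model)
        h = x
        prev_align = None
        for _ in range(self.max_iters):
            # Recompute the alignment (attention) matrix from the current state.
            scores = self.q(h) @ self.k(h).transpose(-2, -1) * self.scale
            align = F.softmax(scores, dim=-1)   # (batch, seq_len, seq_len)
            h = align @ self.v(h) + x           # mix values, keep a residual to the input
            # Stop once the alignment matrix has (approximately) reached a fixed point,
            # so easy inputs exit early and hard inputs use more test-time compute.
            if prev_align is not None and (align - prev_align).abs().max() < self.tol:
                break
            prev_align = align
        return h
```

Under these assumptions, the number of iterations, and hence test-time computation, varies per input: the loop terminates as soon as successive alignment matrices agree within `tol`, and otherwise falls back to the `max_iters` cap.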