This paper observes that state-of-the-art Vision Transformers (ViTs) are not designed to exploit natural geometric symmetries of images, such as 90-degree rotations and reflections, and argues that the lack of efficient implementations is the main reason. To address this, we introduce Octic Vision Transformers (octic ViTs), which capture these symmetries through equivariance under the octic group. In contrast to the computational overhead of conventional equivariant models, octic linear layers achieve a 5.33x reduction in FLOPs and up to an 8x reduction in memory compared to regular linear layers. We study two new ViT families built from octic blocks and train the octic ViTs on ImageNet-1K using supervised (DeiT-III) and unsupervised (DINOv2) learning, matching baseline accuracy while obtaining significant efficiency improvements.
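To illustrate where the stated savings can come from, the following is a minimal PyTorch sketch (not the paper's implementation; the `OcticLinear` name, channel layout, and interface are assumptions) of an octic-equivariant linear layer that mixes channels only within each isotypic component of the octic group, with the parameter and FLOP counts that reproduce the 8x and 5.33x figures.

```python
import torch
import torch.nn as nn


class OcticLinear(nn.Module):
    """Hypothetical sketch of an octic- (D4-) equivariant linear layer.

    Assumes the features are already decomposed into the five isotypic
    components of the octic group: four 1-dimensional irreps (A1, A2, B1, B2),
    each with multiplicity m, and one 2-dimensional irrep (E) with
    multiplicity 2m, so the total width is 8m. By Schur's lemma, an
    equivariant map mixes channels only within each isotypic component and
    shares its weights across the two coordinates of E.
    """

    def __init__(self, m: int):
        super().__init__()
        self.m = m
        # One m x m mixing matrix per 1-dimensional irrep.
        self.w_1d = nn.ParameterList(
            [nn.Parameter(torch.randn(m, m) / m**0.5) for _ in range(4)]
        )
        # A single 2m x 2m mixing matrix for the 2-dimensional irrep E,
        # applied identically to both of its coordinates.
        self.w_2d = nn.Parameter(torch.randn(2 * m, 2 * m) / (2 * m) ** 0.5)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (..., 8m), laid out as [A1 | A2 | B1 | B2 | E_x | E_y].
        m = self.m
        parts = [
            x[..., i * m:(i + 1) * m] @ self.w_1d[i].T for i in range(4)
        ]
        # Reshape the E block into its two coordinates and apply the shared map.
        e = x[..., 4 * m:].reshape(*x.shape[:-1], 2, 2 * m)
        parts.append((e @ self.w_2d.T).reshape(*x.shape[:-1], 4 * m))
        return torch.cat(parts, dim=-1)


# Counting, for a total width n = 8m:
#   dense linear:   (8m)^2            = 64 m^2 parameters and mult-adds
#   octic params:   4*m^2 + (2m)^2    =  8 m^2  -> 8x fewer parameters/memory
#   octic FLOPs:    4*m^2 + 2*(2m)^2  = 12 m^2  -> 64/12 ≈ 5.33x fewer FLOPs
```

The savings follow directly from the block structure: the full matrix over 8m channels is replaced by four m x m blocks and one shared 2m x 2m block, so both the parameter and the compute counts shrink by constant factors independent of the width.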