To overcome the limitations of existing biometric systems, this paper proposes POC-ViT, a lightweight vision transformer that uses dual biometric features from the forehead and eye area, which are unaffected by face masks or hygiene issues. POC-ViT captures the interdependent structural patterns of the two biometric features through a phase-only mutual attention mechanism. Because this mutual attention is computed from phase correlation, it is robust to variations in resolution, intensity, and illumination, and the model's lightweight design makes it suitable for deployment on edge devices. Experiments on the FSVP-PBP database, which includes 350 subjects, show that POC-ViT achieves a classification accuracy of 98.8%, outperforming state-of-the-art methods.
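
To give some intuition for why a phase-based similarity measure can tolerate intensity changes, the following minimal sketch computes classical phase-only correlation between two 1-D signals. It is not the paper's mutual attention mechanism: the function names (`dft`, `phase_only_correlation`), the toy signal, and the 1-D setting are all assumptions chosen for illustration, and a naive DFT is used only to keep the example dependency-free.

```python
import cmath

def dft(x, inverse=False):
    """Naive O(n^2) discrete Fourier transform (illustrative helper)."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[t] * cmath.exp(sign * 2j * cmath.pi * k * t / n) for t in range(n))
        out.append(s / n if inverse else s)
    return out

def phase_only_correlation(f, g, eps=1e-12):
    """Correlate f and g using only the phase of their cross-spectrum."""
    F, G = dft(f), dft(g)
    cross = [a * b.conjugate() for a, b in zip(F, G)]
    # Dividing by the magnitude discards amplitude (intensity) information.
    normalized = [c / max(abs(c), eps) for c in cross]
    return [v.real for v in dft(normalized, inverse=True)]

# Toy data (assumed): a short pattern and a copy that is circularly
# shifted by 3 samples and twice as bright.
signal = [0.0, 1.0, 3.0, 2.0, 0.5, 0.0, 0.0, 0.0]
shift = 3
shifted = [2.0 * signal[(i - shift) % len(signal)] for i in range(len(signal))]

poc = phase_only_correlation(shifted, signal)
peak = max(range(len(poc)), key=poc.__getitem__)
print(peak, round(poc[peak], 6))
```

In this toy case the correlation peak sits at the shift position and its height does not depend on the brightness factor, because the magnitude normalization removes any global intensity scaling. How POC-ViT turns such a correlation into attention weights between forehead and eye-area features is specified in the paper itself, not in this sketch.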