This paper presents Federated Cross-Training (FedCT), a novel method that improves cross-training strategies in federated learning. To address the limitations of existing cross-training approaches, such as mismatched optimization objectives and feature-space heterogeneity caused by differences in client data distributions, FedCT leverages knowledge distillation from both local and global perspectives. Specifically, it consists of three modules: a consistency-aware knowledge propagation module, a multi-perspective knowledge-guided representation learning module, and a mixup-based feature augmentation module, which respectively preserve local knowledge, maintain consistency between local and global knowledge, and increase feature-space diversity, thereby improving performance. Experiments on four datasets show that FedCT outperforms existing state-of-the-art methods.
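To make the dual-perspective distillation and the mixup-based feature augmentation described above concrete, the following is a minimal PyTorch-style sketch of one client update. It is an illustrative assumption, not the paper's exact formulation: the function names, the loss weights `lambda_local` and `lambda_global`, the temperature `T`, and the assumption that local and global teacher logits are supplied as precomputed tensors are all hypothetical.

```python
import torch
import torch.nn.functional as F

def kd_loss(student_logits, teacher_logits, T=2.0):
    """Soft-label knowledge distillation: KL divergence between
    temperature-scaled class distributions (assumed standard form)."""
    p_teacher = F.softmax(teacher_logits / T, dim=-1)
    log_p_student = F.log_softmax(student_logits / T, dim=-1)
    return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * (T * T)

def mixup_features(feats, labels, alpha=0.2):
    """Mixup-based feature augmentation: convexly combine feature vectors
    (and pair up their labels) within a batch to diversify the feature space."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(feats.size(0))
    mixed = lam * feats + (1.0 - lam) * feats[perm]
    return mixed, labels, labels[perm], lam

def local_step(encoder, head, batch, local_teacher_logits, global_teacher_logits,
               lambda_local=0.5, lambda_global=0.5):
    """One hypothetical client update combining a supervised loss on
    mixup-augmented features with local- and global-perspective distillation."""
    x, y = batch
    feats = encoder(x)  # shared feature extractor

    # Supervised term on mixup-augmented features (feature-space diversity).
    mixed, y_a, y_b, lam = mixup_features(feats, y)
    mixed_logits = head(mixed)
    ce = lam * F.cross_entropy(mixed_logits, y_a) \
        + (1.0 - lam) * F.cross_entropy(mixed_logits, y_b)

    # Dual-perspective distillation on the original features: the local teacher
    # preserves client-specific knowledge, while the global teacher keeps the
    # representation consistent with the federation-wide view.
    logits = head(feats)
    return (ce
            + lambda_local * kd_loss(logits, local_teacher_logits)
            + lambda_global * kd_loss(logits, global_teacher_logits))
```

In a full pipeline, the local teacher logits would plausibly come from the client's own model before cross-training exchange and the global teacher logits from the aggregated server-side model; here both are treated as given inputs to keep the sketch self-contained.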