This paper addresses the problem of balancing energy efficiency and performance in AI/ML models, focusing on DeepRX, a deep learning receiver based on a fully convolutional ResNet architecture. We evaluate the energy consumption of DeepRX using metrics such as FLOPs/Watt and FLOPs/clock, and verify the consistency between the estimated energy usage and the actual energy usage, which is influenced by memory access patterns. We further compare the energy dynamics of training and inference. Our main contribution is the application of knowledge distillation (KD) to train a small DeepRX student model that emulates the performance of the teacher model while reducing energy consumption. We experiment with various student model sizes, the optimal teacher model size, and KD hyperparameters. We measure performance by comparing the bit error rate (BER) of the distilled model against that of a model trained from scratch across signal-to-interference-plus-noise ratio (SINR) values. The distilled model exhibits a lower error floor across SINR levels, highlighting the effectiveness of KD in achieving energy-efficient AI solutions.
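For illustration, the sketch below shows one common way a KD objective for a student receiver can be formed: a hard-label loss on the transmitted bits combined with a temperature-softened soft-target loss against the teacher's outputs. This is a minimal sketch, not the paper's implementation; the function and variable names (`kd_loss`, `student_llrs`, `teacher_llrs`, `alpha`, `temperature`) are hypothetical, and it assumes both models output per-bit log-likelihood ratios (LLRs).

```python
# Hypothetical sketch of a KD loss for a DeepRX-style student model.
# Assumes student and teacher both output per-bit LLRs (logits).
import torch
import torch.nn.functional as F

def kd_loss(student_llrs, teacher_llrs, bits, alpha=0.5, temperature=2.0):
    """Combine a hard-label BCE term with a soft-target term distilled from the teacher."""
    # Hard-label loss against the transmitted bits (ground truth).
    hard = F.binary_cross_entropy_with_logits(student_llrs, bits)
    # Soft-target loss: match the teacher's temperature-softened bit probabilities.
    soft_teacher = torch.sigmoid(teacher_llrs / temperature)
    soft = F.binary_cross_entropy_with_logits(student_llrs / temperature, soft_teacher)
    # Weighted sum; the soft term is scaled by T^2, as is standard in KD.
    return alpha * hard + (1.0 - alpha) * (temperature ** 2) * soft

# Illustrative use inside a training loop (teacher frozen, student trainable):
# with torch.no_grad():
#     teacher_llrs = teacher(rx_symbols)
# loss = kd_loss(student(rx_symbols), teacher_llrs, bits)
# loss.backward(); optimizer.step()
```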