This paper presents a multifaceted approach to diagnosing wrist lesions, a common finding in pediatric fracture patients. To address the scarcity of medical image data, we fuse X-ray images with patient metadata and frame the problem as a fine-grained recognition task, using weights pre-trained on a fine-grained dataset rather than on a general dataset such as ImageNet. This is the first study to apply metadata integration to wrist lesion recognition; it demonstrates a 2% improvement in diagnostic accuracy on a small, customized dataset and an improvement of over 10% on a large-scale fracture dataset.