GalaxAlign is a novel multimodal approach to galaxy morphology analysis. Existing methods suffer from either high cost or low accuracy; to overcome this, GalaxAlign takes its inspiration from the way citizen scientists identify galaxies with the help of text descriptions and schematic symbols. It uses a trimodal alignment framework that aligns three types of data (schematic symbols, text labels, and galaxy images) during fine-tuning. This makes fine-tuning effective without expensive pretraining and improves performance on galaxy classification and similarity retrieval tasks.
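To make the idea of trimodal alignment more concrete, the following is a minimal sketch of one way three modalities could be pulled into a shared embedding space. The text above does not state which objective GalaxAlign uses, so this sketch assumes a CLIP-style symmetric contrastive (InfoNCE) loss averaged over the three modality pairs. The function names, the temperature of 0.07, and the equal weighting of the pairs are all hypothetical choices made for illustration, and the encoders that would produce the embeddings are omitted.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits):
    """Row-wise log-softmax, shifted by the row maximum for numerical stability."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def pairwise_contrastive_loss(a, b, temperature=0.07):
    """Symmetric InfoNCE loss between two batches of embeddings.

    Row i of `a` and row i of `b` are assumed to describe the same galaxy
    class (a positive pair); every other row in the batch is a negative.
    """
    a, b = l2_normalize(a), l2_normalize(b)
    logits = a @ b.T / temperature          # (batch, batch) similarity matrix
    idx = np.arange(len(a))                 # positives lie on the diagonal
    loss_ab = -log_softmax(logits)[idx, idx].mean()
    loss_ba = -log_softmax(logits.T)[idx, idx].mean()
    return 0.5 * (loss_ab + loss_ba)

def trimodal_alignment_loss(image_emb, symbol_emb, text_emb, temperature=0.07):
    """Average the pairwise loss over the three modality pairs.

    Assumption: equal weights for image-symbol, image-text, and symbol-text.
    """
    return (
        pairwise_contrastive_loss(image_emb, symbol_emb, temperature)
        + pairwise_contrastive_loss(image_emb, text_emb, temperature)
        + pairwise_contrastive_loss(symbol_emb, text_emb, temperature)
    ) / 3.0
```

Under these assumptions, the loss is close to zero when the three embeddings of each galaxy class coincide and differ clearly between classes, and it grows when one modality is mismatched with the others. In a fine-tuning setup, the embeddings would come from the image, symbol, and text encoders being trained, and the loss would be minimized with a gradient-based optimizer.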