This paper highlights that while deep neural networks have proven successful as representational models of human behavior on visual tasks, they learn in ways that differ fundamentally from human learning and fall short of robust generalization. A key discrepancy is that human conceptual knowledge is organized hierarchically, from fine-grained distinctions to broad categories, whereas model representations fail to capture all of these levels of abstraction. To address this, we train a teacher model to mimic human similarity judgments, then fine-tune the representations of a pretrained, state-of-the-art vision model to transfer this human-aligned structure. The resulting human-aligned model approximates human behavior and uncertainty more closely across a variety of similarity tasks, and shows improved generalization and distributional robustness on standard machine learning tasks. We conclude that infusing neural networks with human knowledge yields representations that are more consistent with human judgment and more effective in practice, a step toward more robust, interpretable, and human-aligned artificial intelligence systems.
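To make the two-stage pipeline concrete, the sketch below shows one plausible PyTorch realization: a teacher head fitted to human triplet (odd-one-out) judgments on top of frozen embeddings, followed by a distillation loss that pulls a student model's similarity structure toward the teacher's. The module names, dimensions, and specific loss forms are illustrative assumptions for exposition, not the paper's exact recipe.

```python
# Minimal sketch of the alignment pipeline (assumed losses and shapes,
# not the paper's exact implementation).
import torch
import torch.nn as nn
import torch.nn.functional as F

class Teacher(nn.Module):
    """Hypothetical teacher: an affine map on frozen image embeddings,
    trained to reproduce human triplet odd-one-out judgments."""
    def __init__(self, dim: int):
        super().__init__()
        self.proj = nn.Linear(dim, dim)

    def forward(self, z: torch.Tensor) -> torch.Tensor:
        return self.proj(z)

def triplet_odd_one_out_loss(za, zb, zc):
    """Stage 1 (teacher fitting). Treat (za, zb) as the pair humans
    judged most similar in each triplet; maximize the softmax
    probability that its similarity beats the two alternatives."""
    sims = torch.stack([(za * zb).sum(-1),   # human-chosen pair
                        (za * zc).sum(-1),
                        (zb * zc).sum(-1)], dim=-1)
    target = torch.zeros(za.size(0), dtype=torch.long, device=za.device)
    return F.cross_entropy(sims, target)

def similarity_distillation_loss(student_z, teacher_z):
    """Stage 2 (student fine-tuning). Match the student's in-batch
    cosine-similarity matrix to the teacher's (MSE chosen here for
    simplicity; other divergences would also work)."""
    s = F.normalize(student_z, dim=-1)
    t = F.normalize(teacher_z, dim=-1)
    return F.mse_loss(s @ s.T, (t @ t.T).detach())
```

A design note on this sketch: keeping the teacher to a lightweight head over frozen embeddings lets scarce human judgments be fitted without overfitting the backbone, while the second-stage similarity matching propagates the human-aligned structure into the full vision model, consistent with the teacher-then-fine-tune procedure described above.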