Existing AI-based approaches to malware analysis focus on data representation (images, sequences) and largely ignore the perspective of expert analysts. To address this limitation, we propose a preprocessing method centered on expert knowledge that improves both the semantic analysis of malware and the interpretability of results. Specifically, the method generates a JSON report for each Portable Executable (PE) file. The report gathers features extracted through static and dynamic analysis and integrates knowledge from packer signature detection, MITRE ATT&CK, and the Malware Behavior Catalog (MBC). The goal is to produce a semantic representation of a binary file that malware analysts can understand and that enhances the explainability of AI models for malware analysis. Using this preprocessing, we trained a large language model for malware classification and achieved a weighted average F1-score of 0.94 on a complex dataset that reflects market reality.
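
To make the idea of a per-file JSON report concrete, the following is a minimal sketch of how such a report could be assembled. It is not the schema used in this work: the function name `build_pe_report`, every field name, and the sample values are hypothetical and chosen only for illustration. The sketch assumes that the static and dynamic features, packer matches, ATT&CK technique identifiers, and MBC behavior identifiers have already been produced by upstream tools.

```python
from __future__ import annotations

import json
from typing import Any


def build_pe_report(
    sha256: str,
    static_features: dict[str, Any],
    dynamic_features: dict[str, Any],
    packers: list[str],
    attack_techniques: list[str],
    mbc_behaviors: list[str],
) -> str:
    """Merge analysis results for one PE file into a single JSON document.

    All keys below are hypothetical; the paper's actual schema is not
    reproduced here.
    """
    report = {
        "sample": {"sha256": sha256, "format": "PE"},
        "static": static_features,      # e.g. imports, sections
        "dynamic": dynamic_features,    # e.g. API calls seen in a sandbox
        "knowledge": {
            "packers": sorted(set(packers)),
            "mitre_attack": sorted(set(attack_techniques)),
            "mbc": sorted(set(mbc_behaviors)),
        },
    }
    # Stable key order keeps the text deterministic, which is convenient
    # when the report is later fed to a language model as input.
    return json.dumps(report, indent=2, sort_keys=True)


# Illustrative usage with made-up values.
example = build_pe_report(
    sha256="0" * 64,
    static_features={"imports": ["kernel32.dll!VirtualAllocEx"]},
    dynamic_features={"api_calls": ["WriteProcessMemory"]},
    packers=["UPX", "UPX"],
    attack_techniques=["T1055"],
    mbc_behaviors=["B0001"],
)
print(example)
```

Grouping the expert-knowledge fields under one key separates raw observations from analyst-oriented labels, which is one way to keep the report readable for a human analyst.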