This paper proposes a multilayer fused graph neural network (MLFGNN) to improve the accuracy of molecular property prediction, a task essential to drug discovery and related fields. Conventional graph neural networks (GNNs) have difficulty capturing local and global molecular structure simultaneously; to address this, we combine a graph attention network with a novel graph transformer to jointly model local and global dependencies. We further incorporate molecular fingerprints as a complementary modality and introduce an attention interaction mechanism that adaptively fuses information across representations. Extensive experiments on benchmark datasets show that the proposed approach outperforms state-of-the-art methods on both classification and regression tasks. Interpretability analysis indicates that the model captures task-relevant chemical patterns. Together, these results suggest the utility of multilayer and multimodal fusion for molecular representation learning.