In this paper, we propose a cross-domain multi-level reasoning (CdMT) framework to improve zero-shot fine-grained traffic sign recognition (TSR) in the wild. Existing methods struggle in cross-country TSR scenarios in particular, because traffic signs differ from country to country; CdMT addresses this challenge by leveraging the multi-level reasoning capability of large-scale multi-modal models (LMMs). We design a multi-level reasoning process for LMMs built on three kinds of explanations: context, feature, and discriminative. Context explanations, enhanced through centroid prompt optimization, enable LMMs to localize signs accurately in complex road images and to filter out irrelevant responses. Feature explanations, derived from context learning with template traffic signs, bridge the gap between domains and improve fine-grained TSR. Discriminative explanations enhance the multi-modal reasoning capability of LMMs by distinguishing subtle differences between similar signs. CdMT does not depend on training data and needs only simple, uniform instructions to achieve cross-country TSR. Through extensive experiments on three benchmark datasets and two real-world datasets, we demonstrate that CdMT outperforms state-of-the-art methods on all five, scoring 0.93 on GTSRB, 0.89 on BTSD, 0.97 on TT-100K, 0.89 on Sapporo, and 0.85 on Yokohama.