This paper highlights the need for real-time information on wildfire situations in Canada and leverages social media data to overcome the limitations of existing data sources. Specifically, we present WildFireCan-MMD, a multimodal (text and image) dataset of wildfire-related social media posts, a resource that has so far been lacking in the Canadian context. The dataset labels recent Canadian wildfire-related posts from X with 12 key themes. We compare a zero-shot Vision-Language Model (VLM), a custom-trained model, and a baseline classifier, and show that, when labeled data is available, the custom-trained model (84.48% F-score) outperforms both the zero-shot model and the baseline classifier. Furthermore, we propose a method for identifying wildfire trends from large-scale, unlabeled datasets, underscoring the importance of region-specific datasets.