Monitoring urban tree dynamics is important for supporting urban greening policies and reducing risks to power infrastructure. We study tree point extraction from multispectral LiDAR (MS-LiDAR) point clouds using deep learning (DL) models. Because complex urban environments and tree diversity limit conventional airborne LiDAR, we use MS-LiDAR, which captures both 3D spatial and spectral data. We evaluate three state-of-the-art models: Superpoint Transformer (SPT), Point Transformer V3 (PTv3), and Point Transformer V1 (PTv1). SPT achieves a mean intersection over union (mIoU) of 85.28% and shows the best time efficiency and accuracy of the three. Furthermore, adding the pseudo-normalized difference vegetation index (pNDVI) to the spatial information yields the highest detection accuracy, reducing the error rate by 10.61 percentage points relative to spatial information alone. These results demonstrate the potential of MS-LiDAR and DL to improve tree extraction and inventory.
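As an illustration of the two quantities named above, the following minimal sketch computes a per-point pNDVI feature and the mIoU of a point-wise labelling. It is not the implementation used in this study. The choice of channels (near-infrared and green intensities), the stabilizing constant `eps`, and the function names `pndvi` and `mean_iou` are assumptions made for the example, since the abstract does not specify them.

```python
import numpy as np


def pndvi(nir, green, eps=1e-8):
    """Per-point pseudo-NDVI from two LiDAR intensity channels.

    Assumption: the index is formed from near-infrared and green
    intensities; the source does not state which channels are used.
    """
    nir = np.asarray(nir, dtype=np.float64)
    green = np.asarray(green, dtype=np.float64)
    # eps avoids division by zero where both intensities are zero
    return (nir - green) / (nir + green + eps)


def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union across classes present in the data."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    ious = []
    for c in range(num_classes):
        intersection = np.sum((y_true == c) & (y_pred == c))
        union = np.sum((y_true == c) | (y_pred == c))
        if union > 0:  # skip classes absent from both labels and predictions
            ious.append(intersection / union)
    return float(np.mean(ious)) if ious else float("nan")


if __name__ == "__main__":
    # Hypothetical toy data: 4 points with xyz coordinates and two intensities
    xyz = np.array([[0.0, 0.0, 1.0], [1.0, 0.0, 5.0], [0.0, 1.0, 6.0], [1.0, 1.0, 0.5]])
    nir = np.array([0.2, 0.8, 0.7, 0.3])
    green = np.array([0.2, 0.2, 0.1, 0.3])

    # Append pNDVI as a fourth per-point feature next to the coordinates
    features = np.column_stack([xyz, pndvi(nir, green)])
    print(features.shape)

    labels = np.array([0, 1, 1, 0])       # 1 = tree, 0 = non-tree
    predictions = np.array([0, 1, 0, 0])
    print(mean_iou(labels, predictions, num_classes=2))
```

In this sketch the spectral index is simply concatenated to the coordinates as one extra input channel, which is one straightforward way to combine spectral and spatial information before passing points to a segmentation model.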