In this paper, we propose a ConvNeXt-based deep learning model to improve the accuracy of rock size classification. The proposed model, CNSCA, extends the basic ConvNeXt structure with self-attention and channel-attention mechanisms. The self-attention mechanism captures long-range spatial dependencies, while the channel-attention mechanism emphasizes information-rich feature channels; together, they allow the model to capture both fine-grained local patterns and broad contextual relationships. We evaluate the model on a rock size classification dataset and compare it with three strong baseline models. Our results demonstrate that incorporating these attention mechanisms significantly improves the model's performance on fine-grained classification tasks involving natural textures such as rocks.
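
To make the two attention mechanisms concrete, the following is a minimal NumPy sketch of how they could act on a single feature map. It is an illustration under stated assumptions, not the CNSCA implementation: the squeeze-and-excitation form of channel attention, the single-head scaled dot-product self-attention, the ordering of the two steps, the residual connection, and all names (`channel_attention`, `self_attention`, `w1`, `w2`, `wq`, `wk`, `wv`) and sizes are hypothetical choices made here for clarity.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def channel_attention(feat, w1, w2):
    """Assumed squeeze-and-excitation style gate. feat has shape (C, H, W)."""
    squeezed = feat.mean(axis=(1, 2))              # global average pool -> (C,)
    hidden = np.maximum(w1 @ squeezed, 0.0)        # bottleneck + ReLU -> (C // r,)
    gates = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))   # sigmoid gate per channel -> (C,)
    return feat * gates[:, None, None], gates      # rescale each channel


def self_attention(feat, wq, wk, wv):
    """Assumed single-head self-attention over spatial positions. feat: (C, H, W)."""
    c, h, w = feat.shape
    tokens = feat.reshape(c, h * w).T              # one token per position -> (N, C)
    q, k, v = tokens @ wq, tokens @ wk, tokens @ wv
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))  # pairwise weights -> (N, N)
    mixed = attn @ v                               # aggregate over all positions -> (N, C)
    return feat + mixed.T.reshape(c, h, w), attn   # residual connection (assumption)


# Toy input standing in for a ConvNeXt stage output (random values, illustrative sizes).
rng = np.random.default_rng(0)
C, H, W, r, d = 8, 4, 4, 2, 4
feat = rng.standard_normal((C, H, W))
w1 = 0.1 * rng.standard_normal((C // r, C))
w2 = 0.1 * rng.standard_normal((C, C // r))
wq = 0.1 * rng.standard_normal((C, d))
wk = 0.1 * rng.standard_normal((C, d))
wv = 0.1 * rng.standard_normal((C, C))

reweighted, gates = channel_attention(feat, w1, w2)
out, attn = self_attention(reweighted, wq, wk, wv)
print(out.shape, attn.shape)
```

In this sketch the channel gate rescales whole feature channels, while each row of `attn` mixes information from every spatial position. This mirrors the division of labor described above between emphasizing informative channels and capturing long-range spatial dependencies.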