This paper addresses the challenge of simultaneously achieving robustness and resource efficiency, two highly desirable properties in modern machine learning models. We show that high learning rates help achieve both robustness against spurious correlations and network compactness. Specifically, high learning rates yield desirable representational properties, such as invariant feature utilization, class separability, and activation sparsity. Across a variety of spurious-correlation datasets, models, and optimizers, high learning rates consistently achieve these properties, whereas other hyperparameter settings and regularization methods do not do so reliably. Furthermore, we present strong evidence that the success of high learning rates on standard classification tasks is related to their ability to address hidden or rare spurious correlations in the training dataset. Our investigation into the underlying mechanism highlights the importance of confident error predictions on bias-conflicting samples at high learning rates.