This paper studies how to simultaneously achieve two highly desirable properties in modern machine learning models: robustness and resource efficiency. We show that high learning rates help attain both robustness against spurious correlations and network compactness. Specifically, high learning rates yield desirable representational properties, such as invariant feature utilization, class separability, and activation sparsity, and they produce these properties more consistently than other hyperparameter settings and regularization methods. Beyond demonstrating the positive effects of high learning rates across a variety of spurious correlation datasets, models, and optimizers, we provide strong evidence that the success of high learning rates on standard classification tasks is likely due to their effectiveness in combating hidden or rare spurious correlations in the training data.