In this paper, we present a method for improving the generalization and robustness of disease classification by learning identifiable causal representations from medical images. Specifically, we introduce an end-to-end framework that learns identifiable representations by grouping observations in chest X-ray images and strengthening invariance to race, gender, and image view. Experimental results show that the causal representations learned through grouping improve generalization and robustness across a range of classification tasks.