This page organizes papers related to artificial intelligence published around the world. This page is summarized using Google Gemini and is operated on a non-profit basis. The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.
Highly Imbalanced Regression with Tabular Data in SEP and Other Applications
Created by
Haebom
Authors
Josias K. Moukpe, Philip K. Chan, Ming Zhang
Outline
This paper addresses highly imbalanced regression on tabular data, where the imbalance ratio exceeds 1,000. Accurately estimating the target values of rare instances is crucial in applications such as forecasting the intensity of rare, hazardous solar energetic particle (SEP) events. The authors identify three shortcomings of common practice: the conventional MSE loss does not account for the correlation between predicted and actual values, typical inverse importance functions are restricted to convex functions, and uniform sampling can produce mini-batches that contain no rare instances. To address these, the paper proposes CISIR, which combines a correlation component, monotonically decreasing involution (MDI) importance, and stratified sampling. Experimental results on five datasets indicate that CISIR achieves lower error and higher correlation than other recent methods, and that adding the correlation component to other state-of-the-art methods can improve their performance. Finally, MDI importance outperforms the other importance functions evaluated. The source code is available at https://github.com/Machine-Earning/CISIR.
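To make two of the points above concrete, here is a minimal NumPy sketch. It is not taken from the CISIR repository. It illustrates (a) how often a uniformly sampled mini-batch misses rare instances, (b) a generic loss that adds a Pearson-correlation term to MSE, and (c) a simple stratified batch sampler. The function names, the weight `lam`, and the equal-frequency stratification scheme are all assumptions made for illustration, and the paper's actual loss, MDI importance function, and sampler may differ. As a rough check on (a): if rare instances make up about 1 in 1,001 samples, a uniform batch of 32 contains no rare instance roughly 97% of the time.

```python
import numpy as np


def prob_batch_without_rare(rare_fraction, batch_size):
    """Chance that a uniformly drawn batch has no rare instance.

    Assumes independent draws (a large dataset or sampling with replacement).
    """
    return (1.0 - rare_fraction) ** batch_size


def mse_plus_correlation_loss(y_true, y_pred, lam=0.5, eps=1e-8):
    """MSE plus lam * (1 - Pearson r).

    `lam` is a hypothetical weight, not a value from the paper.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mse = np.mean((y_true - y_pred) ** 2)
    yt = y_true - y_true.mean()
    yp = y_pred - y_pred.mean()
    # eps avoids division by zero when either vector is constant.
    r = np.sum(yt * yp) / (np.sqrt(np.sum(yt ** 2) * np.sum(yp ** 2)) + eps)
    return mse + lam * (1.0 - r)


def stratified_batches(y, batch_size, rng):
    """Yield index batches that draw one instance from each target stratum.

    Instances are sorted by target value and split into `batch_size`
    equal-frequency strata (an assumed scheme), so every batch spans the
    full target range, including the high-value tail.
    """
    order = np.argsort(np.asarray(y, dtype=float))
    strata = [rng.permutation(s) for s in np.array_split(order, batch_size)]
    n_batches = max(len(s) for s in strata)
    for k in range(n_batches):
        # The final batch may be smaller if the strata have unequal sizes.
        yield np.array([s[k] for s in strata if k < len(s)])


rng = np.random.default_rng(0)
targets = rng.exponential(size=1000)  # synthetic, skewed targets
first_batch = next(stratified_batches(targets, 32, rng))
print(prob_batch_without_rare(1 / 1001, 32))
print(targets[first_batch].max())
```

With this sampler, each mini-batch always includes one instance from the top target-value stratum, whereas uniform sampling offers no such guarantee. The correlation term penalizes predictions that are close in squared error but track the targets poorly.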