This paper introduces MINERVA, a novel supervised feature selection method that uses neural estimation of mutual information to model the relationship between features and targets. Conventional feature filters can fail when the target depends on higher-order feature interactions rather than on individual feature contributions. To address this, we parameterize the approximation of mutual information with a neural network and perform feature selection with a carefully designed loss function augmented with a sparsity-inducing regularization term. MINERVA is implemented as a two-step process that separates representation learning from feature selection, which improves generalization performance and represents feature importance more accurately. Experiments on synthetic and real-world fraud datasets show that the method achieves accurate results.
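
The failure of per-feature filters on interaction-driven targets can be illustrated with a toy case. The sketch below is not MINERVA and uses no neural estimator: it computes exact plug-in mutual information on an XOR target, where each feature alone carries no information about the target but the pair determines it completely. The helper `mutual_information` and all variable names are hypothetical, introduced here only for illustration.

```python
from collections import Counter
from itertools import product
from math import log2


def mutual_information(pairs):
    """Plug-in mutual information, in bits, of the empirical joint
    distribution given by a list of (x, y) pairs."""
    n = len(pairs)
    joint = Counter(pairs)
    px = Counter(x for x, _ in pairs)
    py = Counter(y for _, y in pairs)
    return sum(
        (c / n) * log2((c / n) / ((px[x] / n) * (py[y] / n)))
        for (x, y), c in joint.items()
    )


# Illustrative assumption: two binary features, uniformly distributed,
# with the target defined as their XOR (a pure second-order interaction).
samples = [(x1, x2, x1 ^ x2) for x1, x2 in product((0, 1), repeat=2)]

# Score each feature on its own, as a univariate filter would.
mi_x1 = mutual_information([(x1, y) for x1, _, y in samples])
mi_x2 = mutual_information([(x2, y) for _, x2, y in samples])

# Score the two features jointly.
mi_joint = mutual_information([((x1, x2), y) for x1, x2, y in samples])

print(mi_x1, mi_x2, mi_joint)  # 0.0 0.0 1.0
```

A filter that ranks features by their individual scores would discard both features here, although together they carry one full bit about the target. This is the kind of dependence that a joint estimate of mutual information is meant to capture.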