This paper presents an algorithm for solving binary classification problems when additional data is unavailable and the existing dataset is not fully representative of the problem. The algorithm relies on a trained model with loose accuracy constraints, an iterative hyperparameter search-and-pruning procedure over a search space $\Theta$, and a data-generating function. The algorithm works by reconstructing, up to homology, the manifold containing the support of the underlying distributions. We analyze the accuracy and running-time complexity of the algorithm under ideal conditions and present an extension to deep neural networks. In the ideal case, for a search space containing $|\Theta|$ hyperparameter sets, the algorithm returns solutions that are at most $2(1 - 2^{-|\Theta|})$ times worse than the solution found by enumerating $\Theta$ and selecting the best model; this factor approaches 2 as $|\Theta|$ grows. As part of the analysis, we prove that the open cover of the dataset has the same homology as the manifold containing the support of the underlying probability distributions if and only if the dataset is learnable. The latter result serves as a formal basis for explaining the effectiveness of data augmentation techniques.
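To make the iterative search-and-prune procedure concrete, the following is a minimal sketch under our own assumptions; the scoring step, the halve-the-candidates pruning rule, and all names (`generate_data`, `train_and_score`) are illustrative placeholders, not the paper's actual procedure.

```python
def search_and_prune(theta, generate_data, train_and_score):
    """Iteratively score each hyperparameter set on freshly generated
    data and prune the worse-scoring half until one candidate remains.

    This is a hypothetical sketch of the abstract's loop: `theta` is the
    search space, `generate_data` stands in for the data-generating
    function, and `train_and_score` stands in for training a model with
    loose accuracy constraints and returning its score (higher = better).
    """
    candidates = list(theta)
    while len(candidates) > 1:
        data = generate_data()  # fresh sample from the data-generating function
        scored = sorted(
            ((train_and_score(h, data), h) for h in candidates),
            key=lambda pair: pair[0],
            reverse=True,
        )
        # Keep the better-scoring half of the candidates (at least one).
        candidates = [h for _, h in scored[: max(1, len(scored) // 2)]]
    return candidates[0]


if __name__ == "__main__":
    import random

    # Hypothetical usage with a toy search space and a dummy scorer.
    theta = [{"lr": lr, "depth": d} for lr in (0.1, 0.01) for d in (2, 4)]
    best = search_and_prune(
        theta,
        generate_data=lambda: [random.random() for _ in range(100)],
        train_and_score=lambda h, data: sum(data) / len(data) - h["lr"],
    )
    print(best)
```

Under this halving rule the loop runs for roughly $\log_2 |\Theta|$ rounds, which is where the practical savings over full enumeration of $\Theta$ would come from; the paper's exact pruning criterion may differ.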