This paper highlights the importance of transfer learning, which leverages the pre-trained weights of backbone networks, for training successful models from limited annotated data. Selecting an appropriate backbone is crucial, especially on small datasets, where final performance depends heavily on the quality of the initial feature representation. While previous research has benchmarked backbones across diverse datasets in search of a universally best-performing one, this paper demonstrates that backbone effectiveness is highly dataset-dependent, particularly in low-data scenarios where no single backbone performs consistently well. To address this limitation, we propose a new research direction, dataset-specific backbone selection, and investigate its practicality in low-data regimes. Because fully evaluating a large pool of backbones is computationally impractical, we formulate Vision Backbone Efficient Selection (VIBES): the problem of searching for high-performing backbones under computational constraints. We define a solution space, propose several heuristics, and evaluate them on four diverse datasets to demonstrate the feasibility of VIBES for low-data image classification. Our experiments show that even a simple search strategy can find a well-suited backbone from a pool of over 1,300 pre-trained models, outperforming common benchmark recommendations with a search time of less than ten minutes on a single GPU (NVIDIA RTX A5000).
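To make the VIBES setting concrete, the sketch below shows one simple budgeted search strategy of the kind the abstract alludes to. It is an illustrative assumption, not the paper's actual method: we assume the candidate pool comes from `timm` (a common source of 1,000+ pre-trained vision backbones), use random sampling as the search heuristic, and use linear probing of frozen features as the proxy evaluation; the names `random_search` and `budget_s` are hypothetical. It also assumes the dataloaders yield images at a resolution compatible with each sampled backbone, a simplification a real implementation would have to handle.

```python
import time
import numpy as np
import torch
import timm
from sklearn.linear_model import LogisticRegression


def extract_features(model, loader, device="cuda"):
    """Run the frozen backbone over a dataloader and collect pooled features."""
    model.eval().to(device)
    feats, labels = [], []
    with torch.no_grad():
        for x, y in loader:
            feats.append(model(x.to(device)).cpu().numpy())
            labels.append(y.numpy())
    return np.concatenate(feats), np.concatenate(labels)


def random_search(train_loader, val_loader, budget_s=600, device="cuda", seed=0):
    """Randomly sample pre-trained backbones and return the one with the best
    linear-probe validation accuracy found before the time budget runs out.
    (Illustrative sketch: random sampling stands in for the paper's heuristics.)
    """
    rng = np.random.default_rng(seed)
    pool = timm.list_models(pretrained=True)  # assumed candidate pool
    best_name, best_acc = None, 0.0
    start = time.time()
    while time.time() - start < budget_s:
        name = pool[rng.integers(len(pool))]
        try:
            # num_classes=0 makes timm return the backbone as a pooled
            # feature extractor with no classification head
            backbone = timm.create_model(name, pretrained=True, num_classes=0)
        except Exception:
            continue  # skip candidates that fail to download or build
        X_tr, y_tr = extract_features(backbone, train_loader, device)
        X_va, y_va = extract_features(backbone, val_loader, device)
        # cheap proxy evaluation: linear probe on frozen features
        probe = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        acc = probe.score(X_va, y_va)
        if acc > best_acc:
            best_name, best_acc = name, acc
    return best_name, best_acc
```

Even this naive loop captures the core trade-off the problem formalizes: each candidate evaluation consumes part of a fixed compute budget, so the search strategy must balance how many backbones it tries against how accurately it scores each one.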