This paper highlights the critical importance of reliable abstention for retrieval-augmented generation (RAG) systems in safety-critical domains such as women's health, where incorrect answers can cause harm. We present an energy-based model (EBM) that learns a smooth energy landscape over a dense semantic corpus of 2.6 million guideline-based questions, enabling the system to decide whether to generate or abstain. We evaluate the EBM against a calibrated softmax baseline and a k-nearest neighbor (kNN) density heuristic, with the hard case being semantically challenging queries that lie near the corpus distribution. On these hard cases the EBM achieves superior abstention performance, with an area under the ROC curve (AUROC) of 0.961 compared to 0.950 for the softmax baseline, and reduces the false positive rate at 95% true positive rate (FPR@95) from 0.331 to 0.235. Performance is similar on easy negatives; the EBM's advantage is most pronounced on the safety-critical hard cases. Comprehensive ablation studies using controlled negative sampling and fair data exposure show that robustness stems primarily from the energy score head, and that including or excluding specific negative types (hard, easy, or mixed) sharpens the decision boundary but is not essential for generalization to hard cases. These results indicate that energy-based scoring provides a more reliable confidence signal than probability-based softmax confidence, offering a scalable and interpretable foundation for safe RAG systems.
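As an illustration of the two reported metrics, the following minimal sketch computes AUROC and FPR@95 from scalar confidence scores. It is not the paper's evaluation code. It assumes that in-distribution queries (which should be answered) are the positive class, that the confidence score is the negative energy (higher means "answer"), and that FPR@95 is measured at a 95% true positive rate. The function names and the toy scores are hypothetical.

```python
import numpy as np

def auroc(pos_scores, neg_scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = np.asarray(pos_scores, dtype=float)[:, None]
    neg = np.asarray(neg_scores, dtype=float)[None, :]
    return float(np.mean((pos > neg) + 0.5 * (pos == neg)))

def fpr_at_tpr(pos_scores, neg_scores, tpr=0.95):
    """Return (FPR, threshold) at the highest threshold that still keeps TPR >= tpr."""
    pos = np.sort(np.asarray(pos_scores, dtype=float))  # ascending
    neg = np.asarray(neg_scores, dtype=float)
    # Allow at most floor((1 - tpr) * n) positives to fall below the threshold.
    k = int(np.floor((1.0 - tpr) * len(pos)))
    threshold = pos[k]
    # A false positive is a should-abstain query whose score clears the threshold.
    return float(np.mean(neg >= threshold)), float(threshold)

def should_generate(energy, threshold):
    """Assumed decision rule: generate when negative energy clears the threshold, else abstain."""
    return -energy >= threshold

# Toy scores (hypothetical): confidence = negative energy.
in_dist_conf = [3.0, 4.0, 5.0, 6.0]   # queries that should be answered
hard_neg_conf = [1.0, 2.0, 3.5, 0.0]  # queries that should trigger abstention

auc = auroc(in_dist_conf, hard_neg_conf)
fpr, thr = fpr_at_tpr(in_dist_conf, hard_neg_conf, tpr=0.75)
```

The same functions apply to a softmax or kNN baseline by substituting that method's confidence score, which is one way to compare methods at a matched operating point.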