This paper investigates how uncertainty estimation can enhance the safety and reliability of machine learning (ML) systems, which are increasingly deployed in high-risk, trust-critical domains. In particular, we focus on selective prediction, in which models abstain from predicting when their confidence is low. First, we show that a model's training path contains rich uncertainty signals that can be leveraged without altering the architecture or loss. By ensembling predictions from intermediate checkpoints, we obtain a lightweight, post-hoc abstention method that works across diverse tasks, avoids the cost of deep ensembles, and achieves state-of-the-art selective prediction performance. Importantly, this method is fully compatible with differential privacy (DP), enabling us to study how privacy noise affects uncertainty quality. While many methods degrade under DP, our path-based approach remains robust, and it yields a framework for decoupling the privacy-uncertainty tradeoff. Next, we develop a finite-sample decomposition of the selective classification gap (the deviation from the oracle accuracy-coverage curve) that identifies five interpretable sources of error and clarifies which interventions reduce the gap. The decomposition explains why calibration alone cannot correct ranking errors and suggests a method for improving uncertainty rankings. Finally, we demonstrate that adversarial manipulation of uncertainty signals can conceal errors or deny service while maintaining high accuracy, and we design a defense that combines calibration auditing with verifiable inference. Together, these contributions advance trustworthy ML by improving, evaluating, and protecting uncertainty estimates, enabling models that not only make accurate predictions but also know when to say "I don't know."
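
To make the checkpoint-ensembling idea concrete, the following is a minimal sketch (not the paper's implementation) of post-hoc abstention over intermediate training checkpoints. It assumes a PyTorch-style classifier and a list of saved checkpoints containing state dicts; the function name `selective_predict` and the fixed confidence threshold are illustrative assumptions.

```python
import torch
import torch.nn.functional as F

def selective_predict(model, checkpoint_paths, x, threshold=0.7):
    """Average softmax outputs over intermediate training checkpoints and
    abstain when the ensemble's confidence falls below `threshold`.
    (Illustrative sketch; threshold and checkpoint format are assumptions.)"""
    probs = []
    for path in checkpoint_paths:
        # Assumes each checkpoint file stores a state_dict for `model`.
        state = torch.load(path, map_location="cpu")
        model.load_state_dict(state)
        model.eval()
        with torch.no_grad():
            probs.append(F.softmax(model(x), dim=-1))
    # Ensemble over the training path: mean of checkpoint predictions.
    mean_probs = torch.stack(probs).mean(dim=0)
    confidence, prediction = mean_probs.max(dim=-1)
    abstain = confidence < threshold  # low confidence -> "I don't know"
    return prediction, confidence, abstain
```

Because the ensemble reuses checkpoints that training already produced, the extra cost is inference-only, in contrast to deep ensembles that require training several independent models.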