This paper highlights the importance of pre-deployment fairness and bias assessment, as large language models (LLMs) are increasingly used in high-risk fields such as clinical decision support, legal analysis, recruitment, and education. To address the shortcomings of existing evaluations, we propose HALF (Harm-Aware LLM Fairness), a deployment-centric framework that assesses model bias in realistic application settings and weights results by the severity of potential harm. HALF organizes nine application domains into three tiers (severe, moderate, and mild) and evaluates them through a five-stage pipeline. Results across eight LLMs show that (1) LLMs are not consistently fair across domains, (2) model size and overall performance do not guarantee fairness, and (3) reasoning models outperform other models in medical decision support but not in education.
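
To make the harm-aware weighting concrete, the sketch below shows one possible way to aggregate per-domain bias gaps by severity tier. The numeric tier weights, the `harm_aware_fairness` function, and the example gap values are illustrative assumptions, not the paper's actual scoring, which is defined by HALF's five-stage pipeline.

```python
from statistics import mean

# Hypothetical tier weights reflecting harm severity; HALF does not
# specify numeric weights here, so these values are illustrative only.
TIER_WEIGHTS = {"severe": 3.0, "moderate": 2.0, "mild": 1.0}

def harm_aware_fairness(bias_gaps: dict[str, list[float]]) -> float:
    """Aggregate per-domain bias gaps (one list of gaps per tier) into a
    single harm-weighted score. Lower is better: the same gap counts more
    in a severe-harm domain than in a mild-harm one."""
    weighted_total = 0.0
    weight_sum = 0.0
    for tier, gaps in bias_gaps.items():
        w = TIER_WEIGHTS[tier]
        weighted_total += w * mean(gaps)
        weight_sum += w
    return weighted_total / weight_sum

# Example: three tiers covering the nine application domains
# (gap values are made up for illustration).
score = harm_aware_fairness({
    "severe":   [0.12, 0.08, 0.15],
    "moderate": [0.05, 0.07, 0.04],
    "mild":     [0.03, 0.02, 0.04],
})
print(f"harm-weighted bias score: {score:.3f}")
```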