To guide language model selection, this study investigates the necessity of fine-tuning versus zero-shot use of pretrained models, the benefits of domain-specific versus general pretraining, the value of additional domain-specific pretraining, and the continued relevance of small language models (SLMs) compared with large language models (LLMs) for specific tasks. Using electronic pathology reports from the British Columbia Cancer Registry (BCCR), we evaluated three classification scenarios of varying difficulty and data size. We compared several SLMs and one LLM: the SLMs were evaluated in both zero-shot and fine-tuned settings, while the LLM was evaluated in the zero-shot setting only. Fine-tuning significantly improved SLM performance over zero-shot results in all scenarios. The zero-shot LLM outperformed zero-shot SLMs but consistently lagged behind fine-tuned SLMs. Domain-specific SLMs outperformed general SLMs after fine-tuning, particularly on the more challenging tasks. Additional domain-specific pretraining yielded only marginal gains on easy tasks but substantial improvements on complex, data-poor tasks. In conclusion, we demonstrate that fine-tuning SLMs on domain-specific data is crucial and enables them to outperform zero-shot LLMs on target classification tasks. Pretraining on domain-relevant or domain-specific data provides further benefits, especially for complex problems or when fine-tuning data are limited. While the LLM offered powerful zero-shot capabilities, it did not match the performance of properly fine-tuned SLMs on the specific tasks in this study. Even in the LLM era, SLMs remain relevant and efficient, and can offer a better performance-resource balance than LLMs.
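
The comparison rests on two evaluation modes: supervised fine-tuning of an SLM on task labels, and zero-shot prompting of a larger model. The sketch below illustrates both modes under assumed tooling (Hugging Face Transformers); the model names, label set, and toy examples are placeholders, not the BCCR data or the exact models used in this study.

```python
# Minimal sketch of the two evaluation modes compared in the study:
# (1) fine-tuning a small pretrained model (SLM) for report classification,
# (2) zero-shot classification with a larger general-purpose model.
# All model names, labels, and example texts below are hypothetical placeholders.
from datasets import Dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
    pipeline,
)

labels = ["reportable", "not_reportable"]  # hypothetical binary task

# Toy stand-in for a labelled pathology-report corpus.
train_data = Dataset.from_dict({
    "text": ["specimen shows invasive carcinoma ...",
             "benign tissue, no evidence of malignancy ..."],
    "label": [0, 1],
})

# --- (1) Fine-tuned SLM -----------------------------------------------------
slm_name = "distilbert-base-uncased"  # placeholder general-domain SLM
tokenizer = AutoTokenizer.from_pretrained(slm_name)
model = AutoModelForSequenceClassification.from_pretrained(slm_name, num_labels=len(labels))

def tokenize(batch):
    # Convert raw report text into fixed-length token IDs for the SLM.
    return tokenizer(batch["text"], truncation=True, padding="max_length", max_length=256)

train_data = train_data.map(tokenize, batched=True)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="slm-finetuned", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=train_data,
)
trainer.train()  # supervised fine-tuning on the task-specific labels

# --- (2) Zero-shot baseline -------------------------------------------------
# No task-specific training: the model scores candidate labels directly.
zero_shot = pipeline("zero-shot-classification", model="facebook/bart-large-mnli")
print(zero_shot("specimen shows invasive carcinoma ...", candidate_labels=labels))
```

In the study's setting, the fine-tuned path corresponds to the SLM results and the prompt-only path to the zero-shot LLM results; the same held-out reports would be scored by both for a like-for-like comparison.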