This paper focuses on the evaluation of large language models (LLMs), particularly in business scenarios. To address the inefficiency of existing manual evaluation methods, we propose TALEC, a model-based evaluation method that lets users flexibly define their own evaluation criteria. TALEC uses in-context learning (ICL) to teach these in-house criteria to the judge model, and combines zero-shot and few-shot prompting so that the judge model attends to more information. We further propose a prompting paradigm and an engineering approach to improve the judge model's accuracy. Experimental results show that TALEC achieves a correlation of over 80% with human judgments, exceeding inter-human correlation on some tasks. We also present results indicating that ICL can serve as an alternative to fine-tuning. The code is available on GitHub.
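As a rough illustration of the idea of combining user-defined criteria (zero-shot instructions) with labeled few-shot examples in a single judge prompt, consider the following sketch. This is not the paper's implementation: the prompt wording, the `build_judge_prompt` helper, and the `judge` function are assumptions made only for this example; any LLM completion callable can be plugged in.

```python
# Illustrative sketch (assumed names and prompt wording, not the paper's exact code):
# compose a judge prompt from user-defined criteria plus labeled few-shot examples,
# then send it to an arbitrary LLM completion callable.

from typing import Callable, List, Tuple

def build_judge_prompt(
    criteria: str,                      # user-defined evaluation criteria
    shots: List[Tuple[str, str, str]],  # (question, answer, human verdict) examples
    question: str,
    answer: str,
) -> str:
    """Compose one prompt: criteria first, then few-shot examples, then the case to judge."""
    parts = [
        "You are a strict evaluator. Judge the answer using ONLY these criteria:",
        criteria,
        "",
        "Here are labeled examples:",
    ]
    for q, a, verdict in shots:
        parts.append(f"Question: {q}\nAnswer: {a}\nVerdict: {verdict}\n")
    parts += [
        "Now judge the following case. Reply with a verdict and a short reason.",
        f"Question: {question}\nAnswer: {answer}\nVerdict:",
    ]
    return "\n".join(parts)

def judge(llm: Callable[[str], str], criteria: str, shots, question: str, answer: str) -> str:
    """Run any text-completion callable as the judge model on the composed prompt."""
    return llm(build_judge_prompt(criteria, shots, question, answer))

if __name__ == "__main__":
    # Dummy LLM stand-in so the sketch runs end to end.
    echo_llm = lambda prompt: "PASS (placeholder verdict)"
    print(judge(
        echo_llm,
        criteria="1. The answer must name the correct product.\n2. No fabricated prices.",
        shots=[("What is the warranty period?", "Two years.", "PASS")],
        question="Does the plan include roaming?",
        answer="Yes, in 30 countries.",
    ))
```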