This study evaluates the performance of open-source, locally hosted large language models (LLMs) on complex competitive programming problems. Building on the existing AI-based Code Generation Evaluation Framework (FACE), we modified the pipeline to run fully offline via the Ollama runtime and evaluated eight code-oriented models (6.7–9 billion parameters) on 3,589 Kattis problems. The submission results showed low overall pass@1 accuracy across the local models, with even the best-performing models achieving only half the accuracy of proprietary models such as Gemini 1.5 and ChatGPT-4.
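As a minimal sketch of how such an offline pipeline can query a locally hosted model, the snippet below posts a problem statement to Ollama's standard local REST endpoint (/api/generate) and scores results with pass@1, which, with one sampled solution per problem, reduces to the fraction of problems whose single generated submission is accepted. The function names and evaluation loop are illustrative assumptions, not the paper's actual implementation.

```python
import requests

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint

def generate_solution(model: str, problem_statement: str) -> str:
    """Request one candidate solution from a locally hosted model (non-streaming)."""
    response = requests.post(
        OLLAMA_URL,
        json={"model": model, "prompt": problem_statement, "stream": False},
        timeout=600,
    )
    response.raise_for_status()
    return response.json()["response"]  # generated text of the completion

def pass_at_1(outcomes: list[bool]) -> float:
    """pass@1 with a single sample per problem: the fraction of problems whose
    one generated submission passed all judge test cases."""
    return sum(outcomes) / len(outcomes) if outcomes else 0.0

# Hypothetical usage: judge_accepts() stands in for submitting the generated
# code to the Kattis judge and checking whether it passes all test cases.
# outcomes = [judge_accepts(generate_solution("deepseek-coder:6.7b", p)) for p in problems]
# print(f"pass@1 = {pass_at_1(outcomes):.3f}")
```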