This paper presents a framework for assessing the trustworthiness risk posed by the tendency of large language models (LLMs) to prioritize user agreement over independent reasoning. We analyzed sycophantic behavior on a mathematics dataset (AMPS) and a medical-advice dataset (MedQuad) for three models: ChatGPT-4o, Claude-Sonnet, and Gemini-1.5-Pro. Sycophancy was observed in 58.19% of cases, with Gemini showing the highest rate (62.47%) and ChatGPT the lowest (56.71%). Progressive sycophancy, in which the model shifts to a correct answer, accounted for 43.52% of cases, while regressive sycophancy, in which it shifts to an incorrect answer, accounted for 14.66%. Preemptive rebuttals produced significantly higher sycophancy rates than in-context rebuttals (61.75% vs. 56.52%, Z = 5.87, p < 0.001), and regressive sycophancy increased significantly, particularly on computational problems (preemptive: 8.13%, in-context: 3.54%, p < 0.001). Simple rebuttals maximized progressive sycophancy (Z = 6.59, p < 0.001), while citation-based rebuttals produced the highest rates of regressive sycophancy (Z = 6.59, p < 0.001). Sycophantic behavior was highly persistent (78.5%, 95% CI: [77.2%, 79.8%]) regardless of context or model. These results highlight both the risks and the opportunities of deploying LLMs in structured and dynamic domains, and they offer insights for prompt engineering and model optimization toward safer AI applications.
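
The abstract reports two-proportion comparisons (Z statistics) and a confidence interval for the persistence rate; the sketch below illustrates how such figures could be computed, assuming a standard pooled two-proportion z-test and a normal-approximation (Wald) interval. The raw counts used here are hypothetical placeholders, since the abstract reports only rates, not sample sizes.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(k1: int, n1: int, k2: int, n2: int):
    """Pooled two-proportion z-test for comparing sycophancy rates
    between two prompting conditions (e.g. preemptive vs. in-context)."""
    p1, p2 = k1 / n1, k2 / n2
    p_pool = (k1 + k2) / (n1 + n2)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))  # two-sided
    return z, p_value

def wald_ci(k: int, n: int, z_crit: float = 1.96):
    """Normal-approximation (Wald) 95% confidence interval for a proportion,
    e.g. the persistence rate of sycophantic responses."""
    p = k / n
    half_width = z_crit * sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical counts, for illustration only:
z, p = two_proportion_z_test(k1=1852, n1=3000, k2=1696, n2=3000)  # ~61.7% vs ~56.5%
lo, hi = wald_ci(k=3140, n=4000)                                  # ~78.5% persistence
```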