This study examines privacy leaks occurring during the inference process of large-scale inference models. They highlight that sensitive user data can be included in inference processes, even when internally considered secure, unlike the final results. They demonstrate that this data can be extracted through prompt injection or unintentional leaks. Specifically, they highlight that runtime computing methods, such as increasing the number of inference steps, amplify information leakage, and that while improved inference performance enhances usability, it also increases the potential for privacy attacks. They argue that securing the security of the internal inference process, as well as the final results, is crucial.