This paper addresses the ethical challenges posed by large language models (LLMs) and the new possibilities they open for toxic language detection. Whereas previous studies have relied on data built from simple semantic associations (e.g., the biased association of "he" with "programmer" and "she" with "housewife"), this study collects authentic, real-world toxic interaction data that evades online content moderation and that human evaluators have identified as requiring inference to recognize. Drawing on this data, we propose a novel prompting method, the Pragmatic Inference Chain (PIC), which leverages research in cognitive science and linguistics. We demonstrate that PIC prompting significantly improves the success rate of identifying implicit toxic language over existing prompting methods (e.g., CoT and rule-based prompting) across models including GPT-4o, Llama-3.1-70B-Instruct, DeepSeek-v2.5, and DeepSeek-v3, and that it produces clearer and more consistent inference processes. These results suggest that our method may generalize to other inference-intensive tasks, such as humor and metaphor comprehension.