This paper studies the self-explanation ability of large language models (LLMs), specifically the effectiveness of self-generated counterexample explanations (SCEs). Unlike existing post-hoc explanation methods, we focus on the setting in which LLMs explain their own outputs. We design and analyze tests that evaluate SCE generation across different LLMs, model sizes, temperature settings, and datasets. Our analysis reveals that LLMs sometimes struggle to generate SCEs, and even when they do, their original predictions sometimes disagree with their own inferences on the generated counterexamples.
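As an illustrative sketch only (not the paper's actual evaluation harness), the consistency test can be read as: obtain the model's prediction, ask it for a counterexample that it claims would flip that prediction, then re-query the model on its own counterexample and compare the two labels. The `query_model` function and the prompts below are hypothetical placeholders under that assumption.

```python
# Minimal sketch of an SCE consistency check, assuming a generic
# chat-completion style API. `query_model`, the prompts, and the label
# handling are illustrative placeholders, not the paper's exact setup.

def query_model(prompt: str, temperature: float = 0.0) -> str:
    """Placeholder for an LLM call (e.g., a chat-completion endpoint)."""
    raise NotImplementedError

def sce_consistency_check(instance: str, temperature: float = 0.0) -> dict:
    # 1) Original prediction on the instance.
    prediction = query_model(
        f"Classify the following input and answer with a single label.\n\n{instance}",
        temperature,
    )

    # 2) Ask the model for a self-generated counterexample (SCE):
    #    an edited input that it claims would receive a different label.
    counterexample = query_model(
        f"You labeled the input below as '{prediction}'. "
        f"Rewrite it minimally so that the correct label changes.\n\n{instance}",
        temperature,
    )

    # 3) Re-query the model on its own counterexample.
    counterexample_prediction = query_model(
        f"Classify the following input and answer with a single label.\n\n{counterexample}",
        temperature,
    )

    # The SCE is self-consistent only if the model's prediction on the
    # counterexample actually differs from its original prediction.
    return {
        "prediction": prediction,
        "counterexample": counterexample,
        "counterexample_prediction": counterexample_prediction,
        "consistent": counterexample_prediction != prediction,
    }
```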