Unlearning, which aims to remove the influence of specific data from large language models (LLMs), is typically evaluated with task-level metrics such as accuracy or perplexity. However, we show that these metrics can be misleading: a model may appear to have forgotten, yet its original behavior can be restored with minimal fine-tuning. This "reversibility" suggests that the information is suppressed rather than genuinely deleted. To diagnose this, we introduce a representation-level analysis framework that combines PCA-based similarity and shift, centered kernel alignment (CKA), and Fisher information. Using these diagnostics, we identify four distinct forgetting regimes along the axes of reversibility and catastrophic forgetting, and our analysis reveals that the ideal regime (irreversible yet non-catastrophic forgetting) is extremely difficult to achieve. By probing the limits of current unlearning methods, we also identify cases of seemingly irreversible, targeted forgetting, offering new insights for the design of more robust deletion algorithms.
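As an illustration of the kind of representation-level diagnostic the framework relies on, the sketch below computes linear CKA between activations of a layer before and after unlearning, assuming activations are collected as (samples × features) matrices. This is a minimal, illustrative implementation of the standard linear CKA formula, not the authors' code; the matrix shapes and the toy data are assumptions.

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two activation matrices.

    X, Y: (n_samples, n_features) hidden states collected on the same inputs,
    e.g. from a model before vs. after unlearning. Returns a similarity in
    [0, 1]; values near 1 indicate the representation geometry is largely
    preserved (consistent with suppression rather than deletion).
    """
    # Center each feature dimension
    X = X - X.mean(axis=0, keepdims=True)
    Y = Y - Y.mean(axis=0, keepdims=True)

    # Linear CKA: ||Y^T X||_F^2 / (||X^T X||_F * ||Y^T Y||_F)
    hsic = np.linalg.norm(Y.T @ X, ord="fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, ord="fro")
    norm_y = np.linalg.norm(Y.T @ Y, ord="fro")
    return hsic / (norm_x * norm_y)

# Toy usage (synthetic data, for illustration only):
rng = np.random.default_rng(0)
before = rng.normal(size=(256, 768))                   # original model activations
after = before + 0.05 * rng.normal(size=(256, 768))    # lightly perturbed activations
print(f"CKA(before, after) = {linear_cka(before, after):.3f}")
```

A high CKA score after unlearning, combined with quick behavioral recovery under fine-tuning, is the signature of reversible (suppressed) forgetting described above.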