This paper proposes LIR-ASR, an iterative error-correction framework for ASR transcripts that leverages large language models (LLMs) and is inspired by human auditory perception. Following a "listen-imagine-refine" strategy, LIR-ASR generates phonetic variations of the transcript and refines them based on context. To avoid local optima, it employs heuristic optimization driven by a finite state machine (FSM), together with rule-based constraints that preserve semantic fidelity. Experimental results on English and Chinese ASR outputs demonstrate that LIR-ASR improves transcription accuracy, reducing CER/WER by an average of 1.5 percentage points compared to baselines.
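The listen-imagine-refine loop could be sketched as below. This is a minimal illustration, not the paper's implementation: the state names, the variant generator, the scoring function, and the semantic constraint are all hypothetical placeholders (a real system would use an LLM to propose phonetically plausible candidates and to score them in context). It shows only how an FSM can gate acceptance of corrections so the search terminates instead of cycling on non-improving rewrites:

```python
from enum import Enum, auto

class State(Enum):
    LISTEN = auto()   # take in the current hypothesis
    IMAGINE = auto()  # filter candidate phonetic variations
    REFINE = auto()   # accept a candidate only if it scores better
    DONE = auto()

# Toy vocabulary standing in for an LLM's contextual plausibility model.
KNOWN = {"the", "cat", "sat", "on", "mat"}

def generate_variants(hypothesis):
    # Hypothetical stand-in for the "imagine" step: an LLM would
    # propose phonetically confusable alternatives here.
    return [hypothesis.replace("sad", "sat"), hypothesis]

def score(hypothesis):
    # Placeholder fluency score: count of in-vocabulary words.
    return sum(w in KNOWN for w in hypothesis.split())

def semantically_safe(original, candidate):
    # Toy rule-based constraint: bound the length drift so a rewrite
    # cannot stray far from the original transcript.
    return abs(len(candidate.split()) - len(original.split())) <= 1

def lir_asr(initial, max_iters=6):
    state, best, candidates = State.LISTEN, initial, []
    for _ in range(max_iters):
        if state is State.LISTEN:
            candidates = generate_variants(best)
            state = State.IMAGINE
        elif state is State.IMAGINE:
            candidates = [c for c in candidates if semantically_safe(initial, c)]
            state = State.REFINE
        elif state is State.REFINE:
            improved = max(candidates, key=score, default=best)
            if score(improved) > score(best):
                best = improved
                state = State.LISTEN  # an improvement restarts the cycle
            else:
                state = State.DONE    # FSM exit: no strictly better candidate
        else:
            break
    return best
```

With these placeholders, `lir_asr("the cat sad on the mat")` corrects the phonetic confusion "sad" to "sat" and then halts, since no further candidate strictly improves the score.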