This paper presents an empirical study of applying a diffusion-based large language model (DLLM), LLaDA, to automatic speech recognition (ASR). Using LLaDA as an external deliberation module that post-processes Whisper-LLaMA transcripts, we explore several masking strategies (random masking, low-confidence masking, and semi-autoregressive remasking) that exploit bidirectional attention and iterative denoising. On LibriSpeech, the best cascade achieves word error rates (WERs) of 2.25% on test-clean and 4.94% on test-other, a 12.3% relative improvement over the Whisper-LLaMA baseline on test-other. In contrast, a text-only LLaDA variant without acoustic features fails to improve accuracy, highlighting the importance of acoustically conditioned embeddings. We also evaluate Whisper-LLaDA as a standalone ASR decoder with diffusion-based and semi-autoregressive decoding; it achieves faster inference than the Whisper-LLaMA baseline in most experimental settings, at the cost of slightly lower recognition accuracy.
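To make the low-confidence masking strategy concrete, the following is a minimal sketch of an iterative remask-and-denoise loop over a first-pass hypothesis. It is illustrative only: the MASK_ID value, the predict_fn interface standing in for a bidirectional denoiser such as LLaDA, and the keep-the-top-half acceptance schedule are assumptions for exposition, not the exact configuration used in the experiments.

```python
import numpy as np

MASK_ID = -1  # placeholder id for the [MASK] token (assumption)

def low_confidence_remask(tokens, confidences, threshold, predict_fn, num_steps=4):
    """Iteratively refine a first-pass ASR hypothesis with a denoising LM.

    tokens      : list[int]   initial hypothesis token ids (e.g. from Whisper-LLaMA)
    confidences : list[float] per-token confidence from the first-pass decoder
    threshold   : float       tokens below this confidence are masked
    predict_fn  : callable    tokens -> (pred_ids, pred_probs) for every position;
                              hypothetical stand-in for a bidirectional denoiser
    """
    tokens = np.asarray(tokens)
    conf = np.asarray(confidences)

    # Step 0: mask the low-confidence positions of the first-pass hypothesis.
    masked = conf < threshold
    work = np.where(masked, MASK_ID, tokens)

    for _ in range(num_steps):
        if not masked.any():
            break
        pred_ids, pred_probs = predict_fn(work)
        # Accept the most confident predictions at currently masked positions,
        # and re-mask the rest for the next denoising step.
        still_masked = np.flatnonzero(masked)
        order = np.argsort(-pred_probs[still_masked])
        keep = still_masked[order[: max(1, len(still_masked) // 2)]]
        work[keep] = pred_ids[keep]
        masked[keep] = False

    return work
```

Random masking would replace the confidence-based mask in step 0 with a uniform random mask, and a semi-autoregressive schedule would restrict each denoising step to a left-to-right block of positions rather than the whole sequence.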