This paper proposes a novel inverse reinforcement learning (IRL) method that addresses the rigidity of fixed reward structures and the inflexibility of implicit reward regulation. Built on the maximum entropy IRL framework, the method incorporates a squared temporal-difference (TD) regularizer whose adaptive target evolves dynamically during training, imposing adaptive bounds on the recovered rewards and facilitating robust decision-making. To capture richer return information, distributional reinforcement learning is incorporated into the training process. Experimentally, the proposed method achieves expert-level performance on complex MuJoCo tasks and outperforms baseline methods on humanoid tasks using three demonstrations. Extensive experiments and ablation studies further validate the effectiveness of the method and provide insights into reward dynamics in imitation learning.
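To make the regularization idea concrete, the following is a minimal sketch of how a squared TD regularizer with an adaptive target could be attached to a maximum-entropy-style IRL reward loss. The network names, state/action dimensions, the coefficient `lambda_reg`, and the moving-average target update are illustrative assumptions, not the paper's exact formulation.

```python
# A minimal sketch (not the paper's exact objective): a max-ent-style IRL reward
# loss plus a squared TD regularizer pulled toward an adaptive target.
# All names, shapes, and hyperparameters below are illustrative assumptions.
import torch
import torch.nn as nn

state_dim, action_dim = 3, 1
reward_net = nn.Sequential(nn.Linear(state_dim + action_dim, 64), nn.ReLU(), nn.Linear(64, 1))
value_net = nn.Sequential(nn.Linear(state_dim, 64), nn.ReLU(), nn.Linear(64, 1))
gamma, lambda_reg = 0.99, 1.0
adaptive_target = torch.tensor(0.0)  # evolves during training rather than staying fixed

def regularized_irl_loss(expert_sa, policy_sa, s, a, s_next):
    # Max-ent-IRL-flavored term: push expert rewards up, policy-sampled rewards down.
    maxent_term = reward_net(policy_sa).mean() - reward_net(expert_sa).mean()
    # Squared TD regularizer: bound the recovered reward through the TD error,
    # measured against an adaptive target instead of a fixed constant.
    td = reward_net(torch.cat([s, a], dim=-1)) + gamma * value_net(s_next) - value_net(s)
    return maxent_term + lambda_reg * ((td - adaptive_target) ** 2).mean()

# One plausible adaptive-target update (assumption): an exponential moving
# average of observed TD errors, refreshed after each gradient step, e.g.
#   adaptive_target = 0.9 * adaptive_target + 0.1 * td.detach().mean()
```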