This paper presents Policy-Aware Matrix Completion (PAMC), a novel structural reward learning framework that addresses the challenges of sparse-reward reinforcement learning (RL). PAMC exploits the approximately low-rank and sparse structure of the reward matrix under policy-biased sampling. It employs inverse-propensity weights to obtain a recovery guarantee and establishes a visitation-weighted error-to-regret bound that links completion error to control performance. When its structural assumptions weaken, PAMC's confidence intervals widen and the algorithm falls back safely to standard exploration. Empirically, PAMC improves sample efficiency over standard RL baselines on the Atari-26, DM Control, Meta-World MT50, and D4RL offline benchmarks, and outperforms DrQ-v2, DreamerV3, Agent57, T-REX/D-REX, and PrefPPO in compute-normalized comparisons. These results position PAMC as a practical and principled tool when rewards exhibit exploitable structure, and as a first concrete instantiation of a broader structural reward learning perspective.
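To make the core mechanism concrete, the sketch below illustrates one plausible reading of the completion step: inverse-propensity-weighted low-rank factorization of a tabular state-action reward matrix observed under a behavior policy. The function name `complete_rewards`, the alternating-least-squares solver, and all hyperparameters are assumptions introduced for illustration; the paper's actual estimator and its confidence-interval fallback are not reproduced here.

```python
# Minimal, illustrative sketch of propensity-weighted low-rank reward completion.
# Everything here (function names, the ALS solver, hyperparameters) is a
# hypothetical stand-in, not the paper's implementation; the confidence-interval
# fallback described in the abstract is omitted.
import numpy as np


def complete_rewards(R_obs, mask, propensity, rank=5, lam=1e-2, iters=20, seed=0):
    """Estimate a dense reward matrix from policy-biased observations.

    R_obs      : (S, A) observed rewards (entries with mask == 0 are ignored)
    mask       : (S, A) binary matrix, 1 where a reward was observed
    propensity : (S, A) estimated visitation probabilities of the behavior policy
    """
    S, A = R_obs.shape
    rng = np.random.default_rng(seed)
    U = 0.1 * rng.standard_normal((S, rank))
    V = 0.1 * rng.standard_normal((A, rank))
    # Inverse-propensity weights debias the non-uniform, policy-induced sampling.
    W = mask / np.clip(propensity, 1e-3, None)
    reg = lam * np.eye(rank)
    for _ in range(iters):
        # Alternating weighted ridge regressions: state factors, then action factors.
        for i in range(S):
            w = W[i]
            U[i] = np.linalg.solve(V.T @ (w[:, None] * V) + reg,
                                   V.T @ (w * R_obs[i]))
        for j in range(A):
            w = W[:, j]
            V[j] = np.linalg.solve(U.T @ (w[:, None] * U) + reg,
                                   U.T @ (w * R_obs[:, j]))
    return U @ V.T


if __name__ == "__main__":
    # Toy check: recover a rank-2 reward matrix observed under biased sampling.
    rng = np.random.default_rng(1)
    true_R = rng.standard_normal((50, 2)) @ rng.standard_normal((2, 8))
    visit_prob = rng.uniform(0.05, 0.6, true_R.shape)
    mask = (rng.uniform(size=true_R.shape) < visit_prob).astype(float)
    R_hat = complete_rewards(true_R * mask, mask, visit_prob, rank=2)
    print("MAE on unobserved entries:",
          float(np.abs(R_hat - true_R)[mask == 0].mean()))
```

The inverse-propensity weights correct for the fact that frequently visited state-action pairs are over-represented among observed entries, which is what distinguishes policy-aware completion from matrix completion under uniform sampling.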