We analyze whether explanations of LLM decisions faithfully reflect the factors that actually drive those decisions (fidelity). We evaluate counterfactual fidelity across 75 models from 13 families, examining the trade-off between parsimony and comprehensiveness, how correlational fidelity metrics are assessed, and their potential for manipulation. We propose two new metrics: phi-CCT, a simplified variant of the Correlational Counterfactual Test (CCT), and F-AUROC. Our results show that larger, better-performing models consistently score higher on fidelity metrics.
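To make the idea of a correlational counterfactual test concrete, the sketch below illustrates the general scheme in Python. It is a schematic illustration, not the paper's implementation: the records, the `mention_score` values, and the use of AUROC as the association measure are all hypothetical assumptions. The idea is that, for each input, one factor is intervened on, and we record (a) whether the model's prediction flipped and (b) how strongly the explanation cited that factor; for a faithful model, mentions should track effects.

```python
# Schematic sketch of a correlational counterfactual fidelity test.
# All names and data here are hypothetical illustrations.

def auroc(labels, scores):
    """Rank-based AUROC: probability that a positive outranks a negative,
    counting ties as half a win."""
    pos = [s for flag, s in zip(labels, scores) if flag]
    neg = [s for flag, s in zip(labels, scores) if not flag]
    if not pos or not neg:
        return float("nan")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical records: (prediction_flipped, mention_score), where
# prediction_flipped marks whether intervening on a factor changed the
# model's output, and mention_score is how strongly the explanation
# cited that factor.
records = [
    (True, 0.9), (True, 0.7), (False, 0.2),
    (False, 0.75), (True, 0.8), (False, 0.1),
]
flipped = [f for f, _ in records]
mention = [m for _, m in records]

# High AUROC: explanation mentions align with the factors that matter.
print(round(auroc(flipped, mention), 3))  # → 0.889
```

A rank-based measure like AUROC is one natural choice here because it depends only on whether influential factors are scored above uninfluential ones, not on any threshold for what counts as a "mention".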