This paper presents an automated model evaluation (AutoEval) framework developed to reduce the manual annotation effort required to assess the performance of object detection models. Within this framework, we propose a novel metric, Prediction Consistency and Reliability (PCR), which estimates object detection performance without ground truth by measuring both the spatial consistency between bounding boxes before and after non-maximum suppression (NMS) and the reliability of the overlapping boxes. To enable a more realistic and scalable evaluation, we construct a meta-dataset by applying varying levels of image corruption.
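To make the intuition behind PCR concrete, the following is a minimal sketch of how a consistency-and-reliability proxy of this kind could be computed from a detector's raw outputs, using torchvision's nms and box_iou operations. The function name pcr_score, the confidence weighting, and the IoU threshold are illustrative assumptions, not the exact formulation defined in the paper.

import torch
from torchvision.ops import nms, box_iou

def pcr_score(boxes: torch.Tensor, scores: torch.Tensor,
              iou_thresh: float = 0.5) -> float:
    # Hypothetical PCR-style proxy (not the paper's exact definition):
    # spatial consistency = how tightly the suppressed (pre-NMS) boxes
    # overlap the boxes that survive NMS; reliability = the suppressed
    # boxes' confidence scores, used as weights.
    keep = nms(boxes, scores, iou_thresh)            # indices of surviving boxes
    suppressed = torch.ones(len(boxes), dtype=torch.bool)
    suppressed[keep] = False
    if suppressed.sum() == 0:
        return 1.0  # nothing suppressed: trivially consistent under this proxy

    # IoU of each suppressed box with its best-matching surviving box.
    ious = box_iou(boxes[suppressed], boxes[keep]).max(dim=1).values
    weights = scores[suppressed]
    return float((ious * weights).sum() / weights.sum().clamp_min(1e-8))

Under this sketch, a detector whose redundant pre-NMS boxes cluster tightly and confidently around the surviving detections yields a score near 1, while scattered or low-confidence overlaps pull the score down, mirroring the idea that such agreement correlates with detection accuracy in the absence of ground truth.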