This paper aims to develop an automated system that provides authors with useful feedback during peer review. To address the time constraints reviewers face, we propose four key dimensions that make reviews more useful: actionability, evidence and specificity, verifiability, and usability. To evaluate these dimensions and facilitate model development, we introduce the RevUtil dataset, which contains 1,430 human-labeled review comments and 10,000 synthetically labeled comments. The synthetic data also include rationales that explain the score assigned for each dimension. Using the RevUtil dataset, we benchmark fine-tuned models that assess these dimensions and generate rationales. Experimental results show that the fine-tuned models achieve agreement with humans comparable to, and in some cases surpassing, that of powerful closed-source models such as GPT-4o. However, machine-generated reviews generally score lower than human-written reviews on all four dimensions.