This paper addresses the evaluation of Link Prediction (LP) methods, an important problem in network science and machine learning. Existing evaluations of LP methods have been conducted in a uniform setting that ignores several factors arising from the specific requirements of the data and the application domain. This paper examines these factors, including network type, problem type, geodesic distance between endpoints and its distribution across classes, the characteristics and applicability of LP methods, the impact of class imbalance and early retrieval, and the choice of evaluation metrics, and presents an experimental setup that evaluates LP methods in a rigorous and controlled manner. Extensive experiments are conducted on a variety of real-world network datasets, and carefully designed hypotheses yield valuable insights into the interactions between these factors and LP performance. Based on these insights, recommendations are provided as best practices for evaluating LP methods.