This paper explores the use of synthetic data generation to reduce the cost of human annotation in natural language processing (NLP) systems. We analyze the effectiveness of gradually replacing human-generated training data with synthetic data for fact verification (FV) and question answering (QA) tasks across eight diverse datasets. Our experiments reveal that replacing up to 90% of the training data with synthetic data causes only minimal performance degradation, whereas replacing the remaining 10% as well leads to a significant drop. We further demonstrate that the performance of models trained purely on synthetic data can be improved by adding as few as 125 human-generated data points, while matching the gains from an additional 200 human-generated data points requires substantially larger amounts of synthetic data. These findings suggest that even when large-scale human annotation is infeasible, human annotation of a portion of the dataset can be valuable.
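
For illustration only, the following is a minimal sketch of the data-replacement protocol summarized above, assuming the human-annotated and synthetic examples are available as plain Python lists; the function and variable names here are hypothetical and are not taken from the paper.

```python
import random

def build_mixture(human_data, synthetic_data, synthetic_ratio, size, seed=0):
    """Build a fixed-size training set in which a `synthetic_ratio` fraction of
    examples is synthetic and the remainder is human-annotated.

    Hypothetical helper illustrating the replacement setup described in the
    abstract; not code from the paper.
    """
    rng = random.Random(seed)
    n_synthetic = int(size * synthetic_ratio)
    n_human = size - n_synthetic
    mixture = rng.sample(synthetic_data, n_synthetic) + rng.sample(human_data, n_human)
    rng.shuffle(mixture)
    return mixture

# Sweep from fully human-annotated (0.0) to fully synthetic (1.0) training data.
# `train_and_evaluate` is an assumed stand-in for a task-specific FV/QA pipeline.
# for ratio in (0.0, 0.5, 0.9, 0.95, 1.0):
#     train_set = build_mixture(human_train, synthetic_train, ratio, size=10_000)
#     score = train_and_evaluate(train_set, dev_set)
```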