This paper introduces STOPA, a novel dataset that addresses the lack of systematically curated datasets for source attribution of synthesized speech, a key area of deepfake voice detection research. STOPA comprises 700,000 samples from 13 distinct synthesizers, covering eight acoustic models (AMs), six vocoder models (VMs), and a wide range of parameter settings. Unlike existing datasets, STOPA provides a systematically controlled framework spanning key generative factors, such as acoustic model, vocoder model, and pre-trained weight selection, thereby improving attribution confidence.