Distinguishing real from synthetic speech is becoming increasingly important as the risks of fake information and identity theft grow. To overcome the limitations of existing synthetic speech analysis datasets, we propose Speech-Forensics, a dataset that extensively covers real, synthetic, and partially faked speech samples, the last containing multiple segments synthesized by various high-quality algorithms. We further propose a TEmporal Speech Localization Network (TEST) that simultaneously performs authenticity verification, localization of multiple fake segments, and recognition of the synthesis algorithms, without complex post-processing. TEST integrates LSTM and Transformer components to extract robust temporal speech representations and estimates synthetic segments via dense prediction on multi-scale pyramid features. The proposed model achieves an average mAP of 83.55% and an EER of 5.25% at the utterance level, and an EER of 1.07% and an F1-score of 92.19% at the segment level, demonstrating its capability for comprehensive synthetic speech analysis.
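The abstract does not specify implementation details, but the core idea of dense prediction on multi-scale pyramid features can be illustrated with a minimal NumPy sketch. Here the LSTM/Transformer backbone is abstracted away as a precomputed (T, D) matrix of frame-level features; the function names, the stride-2 pooling pyramid, the shared linear sigmoid head, and the frame hop are all hypothetical choices for illustration, not the authors' actual architecture.

```python
import numpy as np

def build_pyramid(feats, num_levels=3):
    """Turn a (T, D) frame-feature sequence into a multi-scale pyramid
    by repeated stride-2 average pooling along the time axis."""
    pyramid = [feats]
    for _ in range(num_levels - 1):
        x = pyramid[-1]
        t = x.shape[0] - x.shape[0] % 2          # drop a trailing odd frame
        pyramid.append(x[:t].reshape(-1, 2, x.shape[1]).mean(axis=1))
    return pyramid

def dense_predict(pyramid, w, b):
    """Per-frame fake probability at every pyramid level,
    using a single shared linear head followed by a sigmoid."""
    return [1.0 / (1.0 + np.exp(-(x @ w + b))) for x in pyramid]

def scores_to_segments(scores, hop_s, threshold=0.5):
    """Merge consecutive above-threshold frames into (start, end)
    segments in seconds, given the per-frame hop in seconds."""
    segments, start = [], None
    for i, p in enumerate(scores):
        if p >= threshold and start is None:
            start = i
        elif p < threshold and start is not None:
            segments.append((start * hop_s, i * hop_s))
            start = None
    if start is not None:
        segments.append((start * hop_s, len(scores) * hop_s))
    return segments
```

A finer pyramid level localizes short inserted segments precisely, while coarser levels see enough context to flag longer manipulations; thresholded frame scores at the finest level can then be merged directly into predicted fake segments, which is what lets this style of model skip complex post-processing.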