This is a page that curates AI-related papers published worldwide. All content here is summarized using Google Gemini and operated on a non-profit basis. Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.
PDeepPP is a deep learning framework that combines a pre-trained protein language model with a hybrid transformer-convolution architecture, enabling robust identification across a wide range of peptide features. It systematically extracts global and local sequence features, and it is supported by extensively curated benchmark datasets and strategies for addressing data imbalance. Extensive analysis, including dimensionality reduction and comparative studies, shows that PDeepPP learns robust and interpretable peptide representations and achieves state-of-the-art performance on 25 of 33 biological identification tasks. Specifically, it reaches high accuracy in antibacterial peptide identification (0.9726) and phosphorylation site identification (0.9984), attains 99.5% specificity in glycosylation site prediction, and significantly reduces false negatives in antimalarial tasks. By enabling large-scale, accurate peptide analysis, PDeepPP supports biomedical research and the discovery of novel therapeutic targets for disease treatment. All code, datasets, and pretrained models are publicly available on GitHub (https://github.com/fondress/PDeepPP) and Hugging Face (https://huggingface.co/fondress/PDeppPP).
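For readers less familiar with the metrics quoted above, the following is a minimal sketch of how accuracy, specificity, and false-negative counts are conventionally computed for a binary identification task. It is not taken from the PDeepPP codebase: the names `Confusion`, `confusion`, `accuracy`, `specificity`, and `recall` are hypothetical, and the labels are toy values chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    """Counts from a binary confusion matrix (hypothetical helper type)."""
    tp: int  # true positives
    fp: int  # false positives
    tn: int  # true negatives
    fn: int  # false negatives


def confusion(y_true, y_pred):
    """Tally TP/FP/TN/FN for labels encoded as 1 (positive) and 0 (negative)."""
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if t == 1 and p == 1:
            tp += 1
        elif t == 0 and p == 1:
            fp += 1
        elif t == 0 and p == 0:
            tn += 1
        else:
            fn += 1
    return Confusion(tp, fp, tn, fn)


def accuracy(c):
    """Fraction of all samples classified correctly."""
    total = c.tp + c.fp + c.tn + c.fn
    return (c.tp + c.tn) / total if total else 0.0


def specificity(c):
    """Fraction of true negatives among all actual negatives."""
    negatives = c.tn + c.fp
    return c.tn / negatives if negatives else 0.0


def recall(c):
    """Fraction of actual positives recovered; fewer false negatives raise it."""
    positives = c.tp + c.fn
    return c.tp / positives if positives else 0.0


# Toy labels, not data from the paper.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
c = confusion(y_true, y_pred)
print(c.fn, accuracy(c), specificity(c))  # prints: 1 0.75 0.8
```

On imbalanced datasets, accuracy alone can look high even when most positives are missed, which is why specificity and the false-negative count are reported alongside it.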