Daily Arxiv

This page curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

PLAME: Lightweight MSA Design Advances Protein Folding From Evolutionary Embeddings

Created by
  • Haebom

Authors

Hanqun Cao, Xinyi Zhou, Zijun Gao, Chenyu Wang, Xin Gao, Zhi Zhang, Chunbin Gu, Ge Liu, Pheng-Ann Heng

Outline

PLAME is a lightweight MSA design framework that addresses the weak performance of multiple sequence alignment (MSA)-based folding on low-similarity and orphan proteins, for which few homologous sequences are available. It uses evolutionary embeddings from pretrained protein language models to generate MSAs that better support downstream folding. MSA generation is combined with a conservation-diversity loss that balances agreement at conserved positions against coverage of plausible sequence variants. The authors also develop an MSA selection strategy that filters high-quality candidates and a sequence quality metric that predicts folding improvement. On AlphaFold2 low-similarity/orphan benchmarks, PLAME significantly improves structural accuracy (e.g., lDDT and TM-score), with consistent gains when paired with AlphaFold3. It also works as a lightweight adapter for ESMFold, reaching AlphaFold2-level accuracy while retaining ESMFold-level inference speed. Overall, PLAME offers a practical route to high-quality folding for proteins that lack strong evolutionary neighbors.
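To make the conservation-diversity trade-off concrete, here is a minimal Python sketch using two standard alignment statistics: per-column Shannon entropy as a proxy for conservation, and mean pairwise mismatch fraction as a proxy for diversity. This is not PLAME's loss or selection strategy, which this summary does not specify; the function names and the toy sequences are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter
from itertools import combinations


def column_entropy(msa, col):
    """Shannon entropy (in bits) of the residues in one alignment column.

    Low entropy means the column is conserved across sequences.
    """
    counts = Counter(seq[col] for seq in msa)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def mean_pairwise_distance(msa):
    """Average fraction of mismatched positions over all sequence pairs.

    Higher values mean the alignment covers more sequence variation.
    """
    pairs = list(combinations(msa, 2))
    if not pairs:
        return 0.0
    length = len(msa[0])
    return sum(
        sum(a != b for a, b in zip(s, t)) / length for s, t in pairs
    ) / len(pairs)


# Toy alignment (hypothetical sequences): equal length, no gaps.
toy_msa = ["MKTAY", "MKSAY", "MRTAF", "MKTGY"]

entropies = [column_entropy(toy_msa, i) for i in range(len(toy_msa[0]))]
diversity = mean_pairwise_distance(toy_msa)

print("per-column entropy:", [round(e, 3) for e in entropies])
print("mean pairwise distance:", round(diversity, 3))
```

In this toy alignment the first column is identical in every sequence, so its entropy is zero, while the other columns each contain one substitution. An objective in the spirit of the one described above would presumably reward keeping such conserved columns stable while still allowing variation elsewhere, but the exact formulation is the paper's and is not reproduced here.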

Takeaways, Limitations

Takeaways:
  • Improves protein structure prediction accuracy for low-similarity and orphan proteins, with gains reported for AlphaFold2, AlphaFold3, and ESMFold.
  • Increases computational efficiency through its lightweight design.
  • Improves MSA quality and prediction performance through an MSA selection strategy and a sequence quality metric.
  • Makes accurate structure prediction more accessible by raising ESMFold's accuracy while keeping its inference speed.
Limitations:
  • The paper focuses on specific protein language models and on AlphaFold-family structure prediction models, so further research is needed to determine how well PLAME generalizes to other models or methodologies.
  • Further validation is needed to determine whether PLAME's performance improvements are consistent across all low-similarity and orphan proteins.
  • Further research may be needed to optimize the MSA selection strategy and the sequence quality metric.