Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

Superior Molecular Representations from Intermediate Encoder Layers

Created by
  • Haebom

Author

Luis Pinto

Outline

Pre-trained molecular encoders have become essential tools in computational chemistry for tasks such as property prediction and molecule generation. However, existing approaches that rely solely on final-layer embeddings can discard valuable information. In this study, the authors analyze the information flow of five molecular encoders and find that intermediate layers preserve more general features, while the final layer specializes and compresses information. Layer-by-layer evaluation on 22 property-prediction tasks shows that using fixed embeddings from the optimal intermediate layer improves performance by an average of 5.4% (up to 28.6%) over the final layer. Furthermore, fine-tuning encoders truncated at intermediate depths yields even larger improvements, 8.5% on average (up to 40.8%), setting new state-of-the-art results across multiple benchmarks.
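The layer-selection step described above can be sketched as a simple probing loop: fit a lightweight probe (here, closed-form ridge regression) on each layer's frozen embeddings and keep the layer with the lowest validation error. This is an illustrative sketch, not the paper's code; the synthetic data, the five-layer setup, and the ridge probe are assumptions for demonstration.

```python
import numpy as np

def ridge_fit_predict(X_tr, y_tr, X_va, alpha=1.0):
    # Closed-form ridge regression: w = (X'X + alpha*I)^-1 X'y
    d = X_tr.shape[1]
    w = np.linalg.solve(X_tr.T @ X_tr + alpha * np.eye(d), X_tr.T @ y_tr)
    return X_va @ w

def select_best_layer(per_layer_X, y, train_idx, val_idx, alpha=1.0):
    """Return (layer index, val MSE) of the layer whose frozen
    embeddings give the lowest validation error under a ridge probe."""
    best_layer, best_mse = -1, np.inf
    for i, X in enumerate(per_layer_X):
        pred = ridge_fit_predict(X[train_idx], y[train_idx], X[val_idx], alpha)
        mse = float(np.mean((pred - y[val_idx]) ** 2))
        if mse < best_mse:
            best_layer, best_mse = i, mse
    return best_layer, best_mse

# Synthetic demo: a 5-"layer" encoder where layer 2 carries the clean signal.
rng = np.random.default_rng(0)
n, d, signal_layer = 200, 8, 2
Z = rng.normal(size=(n, d))                  # latent "true" features
y = Z @ rng.normal(size=d)                   # property depends linearly on Z
per_layer_X = [Z + rng.normal(scale=2.0 * abs(k - signal_layer), size=(n, d))
               for k in range(5)]            # noise grows away from layer 2
train_idx, val_idx = np.arange(150), np.arange(150, 200)
best, mse = select_best_layer(per_layer_X, y, train_idx, val_idx)
print(best, mse)
```

In a real pipeline, `per_layer_X` would hold the pooled hidden states from each encoder layer (e.g. from a transformer's `output_hidden_states`), and the probe could be any cheap downstream model; the point is only that the best layer is chosen on validation data rather than assumed to be the last.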

Takeaways, Limitations

Takeaways:
The intermediate layers of a molecular encoder carry more general features than the final layer and can be more useful for downstream tasks.
Using fixed embeddings from the optimal intermediate layer, or fine-tuning the encoder truncated at an intermediate depth, can improve performance.
Exploring the full representational depth of molecular encoders can improve both performance and computational efficiency.
The approach achieves new state-of-the-art results on multiple benchmarks.
Code is to be released.
Limitations:
The abstract does not explicitly state the study's specific limitations.