Daily Arxiv

This page curates AI-related papers published worldwide.
All summaries here are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Cross-Modality Controlled Molecule Generation with Diffusion Language Model

Created by
  • Haebom

Authors

Yunzhe Zhang, Yifei Wang, Khanh Vinh Nguyen, Pengyu Hong

Outline

Existing SMILES-based diffusion models for molecule generation support constraints from only a single modality. To overcome this limitation, the paper proposes Cross-Modality Controlled Molecule Generation with Diffusion Language Model (CMCM-DLM), which supports constraints from multiple modalities and allows new constraints to be added. CMCM-DLM adds a Structure Control Module (SCM) and a Property Control Module (PCM) to a pre-trained diffusion model and applies constraints from different modalities, such as molecular structure and chemical properties, in stages. The SCM establishes the molecular skeleton in the early stage of generation, and the PCM then fine-tunes the chemical properties of the generated molecules toward target values in the later stage. Experimental results demonstrate the efficiency and adaptability of CMCM-DLM, which the authors present as a significant advance for molecule generation in drug discovery.
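
The staged control described above can be illustrated with a minimal Python sketch. This is not the authors' implementation: the state is a toy list of floats rather than a SMILES sequence, and `base_denoise`, `make_pull`, `phased_sampling`, and `switch_step` are hypothetical names introduced only for illustration. The sketch shows just the scheduling idea, assuming a frozen base denoiser with a structure control active in the early steps and a property control active in the later steps.

```python
from typing import Callable, List, Tuple

Vector = List[float]
Control = Callable[[Vector], Vector]


def base_denoise(x: Vector) -> Vector:
    """Toy stand-in for one step of a frozen, pre-trained denoiser."""
    return [0.9 * v for v in x]


def make_pull(target: Vector, strength: float) -> Control:
    """Build a toy control that nudges the state toward a target vector.

    Hypothetical helper: the real SCM and PCM are learned modules.
    """
    def control(x: Vector) -> Vector:
        return [v + strength * (t - v) for v, t in zip(x, target)]
    return control


def phased_sampling(
    x: Vector,
    num_steps: int,
    switch_step: int,
    structure_control: Control,
    property_control: Control,
) -> Tuple[Vector, List[str]]:
    """Run the base denoiser, applying one control module per step.

    Steps before `switch_step` use the structure control (early stage);
    the remaining steps use the property control (later stage).
    Returns the final state and a log of which module ran at each step.
    """
    schedule: List[str] = []
    for step in range(num_steps):
        x = base_denoise(x)  # the base model itself is never retrained
        if step < switch_step:
            x = structure_control(x)
            schedule.append("SCM")
        else:
            x = property_control(x)
            schedule.append("PCM")
    return x, schedule


# Usage: two early "structure" steps followed by four "property" steps.
final_state, schedule = phased_sampling(
    x=[1.0, -1.0, 0.5],
    num_steps=6,
    switch_step=2,
    structure_control=make_pull([1.0, 1.0, 1.0], strength=0.5),
    property_control=make_pull([0.0, 0.0, 0.0], strength=0.5),
)
print(schedule)
```

Because the controls are passed in as plain callables, a further constraint could be added in this sketch by supplying another function, with `base_denoise` left untouched. How CMCM-DLM actually injects its modules into the diffusion model is not specified in this summary.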

Takeaways, Limitations

Takeaways:
Proposes a molecule generation model that supports multi-modal constraints, overcoming the single-modality limitation of existing models.
Presents an efficient way to leverage pre-trained models: new constraints can be added without retraining.
Shows potential for use in molecule generation across various fields, including new drug development.
Achieves effective control by applying the structure and property control modules separately.
Limitations:
Further research is needed on applying constraints from modalities other than the two presented (molecular structure and chemical properties).
Generalization performance evaluation is needed for molecules of various sizes and complexities.
Further research is needed on the model's interpretability and explainability.
Performance evaluation and scalability verification on large-scale datasets are required.