Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

MORPH: Shape-agnostic PDE Foundation Models

Created by
  • Haebom

Authors

Mahindra Singh Rautela, Alexander Most, Siddharth Mansingh, Bradley C. Love, Ayan Biswas, Diane Oyen, Earl Lawrence

Outline

This paper introduces MORPH, a shape-agnostic, autoregressive foundation model for partial differential equations (PDEs). MORPH is built on a convolutional vision transformer backbone that handles heterogeneous spatiotemporal data: varying dimensionality (1D-3D), different resolutions, and multiple fields with mixed scalar and vector components. The architecture combines (i) component-wise convolution, which jointly processes scalar and vector channels to capture local interactions; (ii) cross-field attention, which models and selectively propagates information between different physical fields; and (iii) axial attention, which factorizes full spatiotemporal self-attention along individual spatial and temporal axes to reduce computational cost while retaining expressiveness. The authors pretrain multiple model variants on diverse heterogeneous PDE datasets and evaluate their transfer to a range of downstream prediction tasks. With both full-model fine-tuning and parameter-efficient low-rank adapters (LoRA), MORPH outperforms models trained from scratch in zero-shot and full-shot generalization. In extensive evaluations, MORPH matches or surpasses strong baselines and state-of-the-art models.
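To make the low-rank adapter idea concrete, here is a minimal sketch of how a LoRA update is usually formed: a frozen weight W is combined with a trainable low-rank product, W + (alpha / r) * B A. This is the generic LoRA formulation, not code from MORPH; the helper names, the layer size, and the rank are all assumptions chosen for illustration, since the summary does not say which layers are adapted or at what rank.

```python
# Generic LoRA sketch. All names and sizes here are hypothetical and are
# not taken from the MORPH code base.

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]


def lora_effective_weight(w, a, b, alpha):
    """Return W + (alpha / r) * B @ A, where r is the adapter rank.

    w: frozen pretrained weight, shape (d_out, d_in)
    a: trainable down-projection, shape (r, d_in)
    b: trainable up-projection, shape (d_out, r)
    """
    r = len(a)
    scale = alpha / r
    delta = matmul(b, a)  # low-rank update, shape (d_out, d_in)
    return [[w_ij + scale * d_ij for w_ij, d_ij in zip(w_row, d_row)]
            for w_row, d_row in zip(w, delta)]


def lora_param_counts(d_out, d_in, r):
    """Trainable parameters for full fine-tuning vs. a rank-r adapter."""
    return d_out * d_in, r * (d_in + d_out)


# Assumed sizes: a 1024 x 1024 layer with a rank-8 adapter.
full, adapter = lora_param_counts(d_out=1024, d_in=1024, r=8)
print(full, adapter)
```

With these assumed sizes the adapter trains 16,384 parameters in place of 1,048,576 for the same layer, which is why the approach is called parameter-efficient. If B starts at zero, the adapted layer initially reproduces the pretrained weight exactly.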

Takeaways, Limitations

Presents a robust model architecture for handling heterogeneous spatiotemporal datasets with different data dimensions, resolutions, and mixed scalar and vector components.
Outperforms models trained from scratch in both zero-shot and full-shot generalization, and matches or surpasses strong baselines and state-of-the-art models.
Provides a flexible and robust backbone for learning from the heterogeneous and multimodal nature of scientific observations.
Uses axial attention to reduce computational burden.
Improves reproducibility by making the source code, datasets, and models public.
(The paper does not specify any limitations.)
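As a rough illustration of why axial attention reduces the computational burden, the sketch below counts query-key pairs for full self-attention versus attention factorized along each axis of a token grid. The grid shape and function names are assumptions for illustration only; the summary does not give MORPH's actual token counts.

```python
from math import prod


def full_attention_pairs(shape):
    """Every token attends to every token: N * N pairs, N = prod(shape)."""
    n = prod(shape)
    return n * n


def axial_attention_pairs(shape):
    """Each token attends only along its own line on each axis.

    For an axis of length L this costs N * L pairs, so the total over
    all axes is N * sum(shape).
    """
    n = prod(shape)
    return n * sum(shape)


# Hypothetical (time, height, width) token grid.
shape = (8, 32, 32)
full = full_attention_pairs(shape)
axial = axial_attention_pairs(shape)
print(full, axial, round(full / axial, 1))
```

For this assumed grid of 8,192 tokens, full attention scores about 67 million pairs while the axial factorization scores about 0.59 million, and the gap widens as the grid grows.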