Daily Arxiv

This page organizes papers related to artificial intelligence published around the world.
This page is summarized using Google Gemini and is operated on a non-profit basis.
The copyright of the paper belongs to the author and the relevant institution. When sharing, simply cite the source.

TMT: Cross-domain Semantic Segmentation with Region-adaptive Transferability Estimation

Created by
  • Haebom

Authors

Enming Zhang, Zhengyu Li, Yanru Wu, Jingge Wang, Yang Tan, Guan Wang, Yang Li, Xiaoping Zhang

Outline

Despite recent advances in Vision Transformers (ViTs), adapting them to new target domains remains difficult because of distribution shift. This paper proposes the Transferable Mask Transformer (TMT), a region-adaptive framework for cross-domain semantic segmentation that improves cross-domain representation learning through transferability guidance. TMT dynamically partitions each image into coherent regions based on structural and semantic similarity and estimates domain transferability for each region. These estimates are then integrated into the ViT self-attention mechanism, so that the model adaptively focuses on regions with low transferability and high semantic uncertainty. Extensive experiments on 20 diverse cross-domain settings show that TMT mitigates the performance degradation caused by domain shift and outperforms existing approaches.
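The idea of feeding region-level transferability into self-attention can be illustrated with a small sketch. This is not the authors' implementation: the function names, the additive key bias, the `strength` parameter, and the assumption that transferability scores lie in [0, 1] are all hypothetical choices made for illustration, and the semantic-uncertainty term mentioned above is left out.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def region_transferability_bias(region_ids, region_scores):
    # region_ids: (N,) region index of each token (hypothetical region map).
    # region_scores: (R,) transferability per region, assumed to lie in [0, 1].
    # Tokens in low-transferability regions receive a larger bias.
    return 1.0 - region_scores[region_ids]

def biased_self_attention(q, k, v, token_bias, strength=1.0):
    # Standard scaled dot-product attention with an additive per-key bias.
    # The additive form is an assumption, not the paper's formulation.
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)                     # (N, N)
    logits = logits + strength * token_bias[None, :]  # favor low-transferability keys
    weights = softmax(logits, axis=-1)
    return weights @ v, weights
```

With two regions scored 0.9 and 0.2, tokens in the second region get a bias of 0.8 against 0.1 for the first, so every query places relatively more attention on them than it would without the bias.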

Takeaways, Limitations

Takeaways:
Improves cross-domain performance by adapting ViT with region-level transferability estimates.
Outperforms existing approaches across 20 cross-domain settings.
Limitations:
The paper does not state specific limitations; it notes only that it aims to address the difficulties caused by distribution shift.