TMT: Cross-domain Semantic Segmentation with Region-adaptive Transferability Estimation
Created by
Haebom
Author
Enming Zhang, Zhengyu Li, Yanru Wu, Jingge Wang, Yang Tan, Guan Wang, Yang Li, Xiaoping Zhang
Outline
Despite recent advances in Vision Transformers (ViTs), adapting them to new target domains remains hampered by distribution shift. This paper proposes the Transferable Mask Transformer (TMT), a region-adaptive framework that enhances cross-domain representation learning through transferability guidance. TMT dynamically segments images into coherent regions based on structural and semantic similarity and estimates domain transferability locally for each region. These estimates are then integrated into the ViT's self-attention mechanism, allowing the model to adaptively focus on regions with low transferability and high semantic uncertainty. Extensive experiments across 20 diverse cross-domain settings show that TMT mitigates the performance degradation caused by domain shift and outperforms existing approaches.
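A minimal sketch of how per-region transferability scores could be injected into a ViT self-attention layer as an additive bias, so that keys from low-transferability regions receive more attention. The class name, the biasing scheme, and all parameters are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn as nn

class TransferabilityGuidedAttention(nn.Module):
    """Sketch: ViT self-attention whose logits are biased by per-token
    transferability scores (lower transferability -> stronger focus).
    Hypothetical names and scheme, for illustration only."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.num_heads = num_heads
        self.head_dim = dim // num_heads
        self.scale = self.head_dim ** -0.5
        self.qkv = nn.Linear(dim, dim * 3)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x: torch.Tensor, transferability: torch.Tensor) -> torch.Tensor:
        # x: (B, N, C) patch tokens; transferability: (B, N) scores in [0, 1]
        B, N, C = x.shape
        qkv = self.qkv(x).reshape(B, N, 3, self.num_heads, self.head_dim)
        q, k, v = qkv.permute(2, 0, 3, 1, 4)           # each (B, heads, N, head_dim)

        attn = (q @ k.transpose(-2, -1)) * self.scale   # (B, heads, N, N)

        # Bias attention toward keys with LOW transferability (hard-to-transfer regions).
        bias = (1.0 - transferability).unsqueeze(1).unsqueeze(2)  # (B, 1, 1, N)
        attn = (attn + bias).softmax(dim=-1)

        out = (attn @ v).transpose(1, 2).reshape(B, N, C)
        return self.proj(out)


if __name__ == "__main__":
    layer = TransferabilityGuidedAttention(dim=64, num_heads=4)
    tokens = torch.randn(2, 16, 64)        # 2 images, 16 patch tokens each
    t_scores = torch.rand(2, 16)           # hypothetical per-region transferability
    print(layer(tokens, t_scores).shape)   # torch.Size([2, 16, 64])
```

In this reading, region-level transferability acts as a soft attention prior rather than a hard mask, which keeps the layer differentiable and easy to drop into a standard ViT block.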
Takeaways, Limitations
•
Takeaways:
◦
Region-adaptive, transferability-guided ViT adaptation improves cross-domain segmentation performance.
◦
Outperforms existing methods across 20 diverse cross-domain settings.
•
Limitations:
◦
Specific limitations are not discussed in the paper (beyond the general difficulty posed by distribution shift that the method aims to address).