Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Transferable Mask Transformer: Cross-domain Semantic Segmentation with Region-adaptive Transferability Estimation

Posted by
  • Haebom

Author

Jianhua Liu, Zhengyu Li, Yanru Wu, Jingge Wang, Yang Tan, Ruizhe Zhao, Guan Wang, Yang Li

Outline

This paper proposes a region-level adaptation technique to address the performance degradation caused by cross-domain differences in semantic segmentation with Vision Transformers (ViTs). To overcome the limitations of existing global- or patch-level adaptation techniques, the Adaptive Cluster-based Transferability Estimator (ACTE) dynamically segments images into structurally and semantically consistent regions and estimates each region's transferability. The Transferable Masked Attention (TMA) module then integrates these region-level transferability maps into the ViT attention mechanism, prioritizing adaptation in regions with low transferability and high semantic uncertainty. A comprehensive evaluation on 20 cross-domain pairs demonstrates an average 2% mIoU improvement over existing methods.
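The core idea of TMA, injecting region-level transferability into attention so that hard-to-transfer regions receive more adaptation focus, can be illustrated with a toy sketch. This is an illustrative approximation, not the paper's exact formulation: the function name, the bias term `alpha * (1 - transferability)`, and the per-token (rather than per-region) granularity are all assumptions made for brevity.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def transferable_masked_attention(q, k, v, transferability, alpha=1.0):
    """Toy sketch: scaled dot-product attention biased by transferability.

    Tokens with LOW transferability get a larger additive bias on their
    attention logits, so they receive more attention mass (i.e., more
    adaptation focus). Hypothetical simplification of the paper's TMA.

    q, k, v: (n, d) arrays; transferability: (n,) scores in [0, 1].
    """
    d = q.shape[-1]
    logits = q @ k.T / np.sqrt(d)            # standard scaled dot-product
    bias = alpha * (1.0 - transferability)   # emphasize low-transferability keys
    logits = logits + bias[None, :]          # shift attention toward hard regions
    return softmax(logits, axis=-1) @ v
```

In the sketch, setting a token's transferability to 0 raises its bias to `alpha`, monotonically increasing the attention weight every query places on that token.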

Takeaways, Limitations

Takeaways:
We present a novel region-level adaptation method that effectively addresses the performance degradation of ViT-based semantic segmentation caused by inter-domain differences.
The ACTE and TMA modules efficiently estimate region-level transferability and incorporate it into the adaptation process.
Strong performance is verified experimentally across diverse cross-domain pairs.
The source code is released as open source.
Limitations:
The ACTE module may be computationally expensive.
Performance improvements may be limited for certain domain combinations.
Additional experiments on different architectures and datasets are needed.