Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized by Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Transferable Model-agnostic Vision-Language Model Adaptation for Efficient Weak-to-Strong Generalization

Created by
  • Haebom

Author

Jihwan Park, Taehoon Song, Sanghyeok Lee, Miso Choi, Hyunwoo J. Kim

Outline

This paper proposes TransMiter, a lightweight adapter for efficiently transferring adaptation knowledge between vision-language models (VLMs). TransMiter captures the knowledge gap between a pre-trained and a fine-tuned VLM in an unsupervised manner and transfers it without backpropagation. The adapter consists of only a few layers and adds negligible inference cost, and incorporating a small amount of labeled data can push performance beyond that of the fine-tuned stronger model. Experiments demonstrate that TransMiter transfers adaptation knowledge effectively and efficiently across VLMs of varying sizes and architectures while preserving their generalization ability.
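
To make the "no backpropagation" idea concrete, the sketch below fits a linear adapter in closed form (ridge regression) that maps a weak model's features onto a fine-tuned model's features. This is a minimal, hypothetical illustration under assumed shapes and names (`fit_adapter`, `weak_feats`, `strong_feats`), not the paper's actual TransMiter architecture or training procedure.

```python
import numpy as np

def fit_adapter(weak_feats: np.ndarray, strong_feats: np.ndarray,
                reg: float = 1e-3) -> np.ndarray:
    """Fit a linear adapter W minimizing ||weak_feats @ W - strong_feats||^2
    plus a ridge penalty, solved in closed form (no gradient descent).

    weak_feats:   (N, d) features from the pre-trained (weak) VLM
    strong_feats: (N, d) features from the fine-tuned (strong) VLM
    Returns a (d, d) adapter weight matrix.
    """
    d = weak_feats.shape[1]
    gram = weak_feats.T @ weak_feats + reg * np.eye(d)
    return np.linalg.solve(gram, weak_feats.T @ strong_feats)

# Toy usage with synthetic stand-ins for the two models' features.
rng = np.random.default_rng(0)
weak = rng.normal(size=(512, 64))           # pre-trained VLM features (assumed)
strong = weak @ rng.normal(size=(64, 64))   # fine-tuned VLM features (assumed)

W = fit_adapter(weak, strong)
adapted = weak @ W                          # adapter applied at inference time
print(float(np.abs(adapted - strong).max()))  # near zero: mapping recovered
```

In this toy setting the closed-form solve recovers the feature mapping exactly, which is one plausible reading of transferring a "knowledge gap" without backpropagation; the real method may use a different adapter form and objective.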

Takeaways, Limitations

Takeaways:
  • Presents an efficient method for adaptive knowledge transfer in VLMs without backpropagation.
  • Minimizes inference cost through a lightweight adapter design.
  • Improves performance by leveraging a small amount of labeled data.
  • Maintains strong performance and generalization ability across VLMs of various sizes and architectures.
Limitations:
  • TransMiter's performance gains may be limited to specific datasets or tasks.
  • Performance may degrade due to the inherent limitations of unsupervised learning.
  • Further validation of generalization across a wider range of VLM architectures is needed.