
Daily Arxiv

This page curates AI-related papers published worldwide.
All content is summarized using Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

HMID-Net: An Exploration of Masked Image Modeling and Knowledge Distillation in Hyperbolic Space

Created by
  • Haebom

Author

Changli Wang, Fang Yin, Jiafeng Liu, Rui Wu

Outline

In this paper, we propose HMID-Net, a novel method that integrates masked image modeling (MIM) and knowledge distillation to effectively learn the hierarchical structure of visual and semantic concepts in hyperbolic space. Building on MERU, which successfully applied multimodal learning in hyperbolic space, HMID-Net trains models more efficiently by incorporating MIM and knowledge distillation. In particular, it introduces a knowledge distillation loss function specialized for hyperbolic space to support effective knowledge transfer. Experimental results show that HMID-Net significantly outperforms existing models such as MERU and CLIP on image classification and retrieval tasks.
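The summary does not spell out the paper's distillation loss. As a minimal sketch of what distance-based distillation in hyperbolic space can look like, the example below uses the Poincaré ball model and matches teacher and student similarity distributions over a set of anchor embeddings with a KL divergence. The anchor set, temperature `tau`, and the distance-softmax formulation are illustrative assumptions, not the paper's actual definition.

```python
import math

def poincare_distance(x, y, eps=1e-9):
    # Geodesic distance between two points inside the Poincare ball:
    # d(x, y) = arccosh(1 + 2*||x - y||^2 / ((1 - ||x||^2) * (1 - ||y||^2)))
    sq_norm = lambda v: sum(c * c for c in v)
    diff = sq_norm([a - b for a, b in zip(x, y)])
    denom = (1 - sq_norm(x)) * (1 - sq_norm(y))
    arg = 1 + 2 * diff / max(denom, eps)
    return math.acosh(max(arg, 1.0))  # clamp guards against rounding below 1

def hyperbolic_kd_loss(student, teacher, anchors, tau=1.0):
    # Turn negative hyperbolic distances to the anchors into a softmax
    # distribution, then penalize KL(teacher || student).
    def probs(z):
        logits = [-poincare_distance(z, a) / tau for a in anchors]
        m = max(logits)
        exps = [math.exp(l - m) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]
    p_t, p_s = probs(teacher), probs(student)
    return sum(pt * math.log(pt / max(ps, 1e-12)) for pt, ps in zip(p_t, p_s))
```

The loss is zero when the student embedding coincides with the teacher's and grows as their distance-based similarity profiles diverge, which is the basic behavior any distillation objective in this setting should have.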

Takeaways, Limitations

Takeaways:
Demonstrates that efficient, high-performing multimodal models can be trained by applying MIM and knowledge distillation in hyperbolic space.
Proposes a new knowledge distillation loss function suited to hyperbolic space and verifies its effectiveness.
Achieves performance surpassing prior best-performing models on downstream tasks such as image classification and retrieval.
Limitations:
Further research is needed on the generalization of the proposed method.
Applicability to, and performance on, other types of multimodal data remain to be evaluated.
Theoretical analysis of MIM and knowledge distillation in hyperbolic space is lacking.
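For context on the MIM component the paper builds on: masked image modeling hides a large fraction of image patches and trains the model to predict the missing content. A minimal sketch of the masking step is below; the 75% mask ratio and the 14×14 ViT patch grid follow common practice (e.g., MAE-style setups), not necessarily this paper's settings.

```python
import random

def random_patch_mask(num_patches, mask_ratio=0.75, seed=0):
    # MIM-style masking: randomly select a fraction of patch indices to hide.
    # The encoder sees only the visible patches; the training objective is to
    # reconstruct (or predict features for) the masked ones.
    rng = random.Random(seed)
    n_mask = int(num_patches * mask_ratio)
    idx = list(range(num_patches))
    rng.shuffle(idx)
    return sorted(idx[:n_mask]), sorted(idx[n_mask:])  # (masked, visible)

# A 224x224 image with 16x16 patches yields a 14x14 = 196-patch grid.
masked, visible = random_patch_mask(196)
```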