Daily Arxiv

This page collects papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is run on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions; when sharing, please cite the source.

EmbeddingGemma: Powerful and Lightweight Text Representations

Created by
  • Haebom

Authors

Henrique Schechter Vera, Sahil Dua, Biao Zhang, Daniel Salz, Ryan Mullins, Sindhu Raghuram Panyam, Sara Smoot, Iftekhar Naim, Joe Zou, Feiyang Chen, Daniel Cer, Alice Lisak, Min Choi, Lucas Gonzalez, Omar Sanseviero, Glenn Cameron, Ian Ballantyne, Kat Black, Kaifeng Chen, Weiyi Wang, Zhe Li, Gus Martins, Jinhyuk Lee, Mark Sherwood, Juyeong Ji, Renjie Wu, Jingxiao Zheng, Jyotinder Singh, Abheesht Sharma, Divyashree Sreepathihalli, Aashi Jain, Adham Elarabawy, AJ Co, Andreas Doumanoglou, Babak Samari, Ben Hora, Brian Potetz, Dahun Kim, Enrique Alfonseca, Fedor Moiseev, Feng Han, Frank Palma Gomez, Gustavo Hernandez Abrego, Hesen Zhang, Hui Hui, Jay Han, Karan Gill, Ke Chen, Koert Chen, Madhuri Shanbhogue, Michael Boratko, Paul Suganthan, Sai Meher Karthik Duddu, Sandeep Mariserla, Setareh Ariafar, Shanfeng Zhang, Shijie Zhang, Simon Baumgartner, Sonam Goenka, Steve Qiu, Tanmaya Dabral, Trevor Walker, Vikram Rao, Waleed Khawaja, Wenlei Zhou, Xiaoqi Ren, Ye Warkentin, Armand Joulin, Tom Duerig, Mojtaba Seyedhosseini

Outline

We present EmbeddingGemma, a lightweight, open text embedding model built on the Gemma 3 language model. It captures knowledge from larger models through encoder-decoder initialization and geometric embedding distillation, and improves robustness and expressiveness with spread-out regularization. Generalization is strengthened by merging checkpoints from varied, optimized training mixtures. On the MTEB benchmark it achieves the best results among models with fewer than 500 million parameters, and its performance holds up under weight quantization and embedding output truncation, making it well suited to low-latency, high-throughput use cases.
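The truncation property mentioned above (keeping only a prefix of the embedding vector with little quality loss, as in Matryoshka-style training) can be sketched with plain NumPy. Note this is a minimal illustration with random stand-in vectors, not real EmbeddingGemma outputs; the function name `truncate_embedding` is an assumption for this sketch, not an API from the paper.

```python
import numpy as np

def truncate_embedding(emb: np.ndarray, dim: int) -> np.ndarray:
    """Keep the first `dim` components and re-normalize to unit length."""
    t = emb[..., :dim]
    return t / np.linalg.norm(t, axis=-1, keepdims=True)

# Random stand-ins for model outputs (a real model would produce these).
rng = np.random.default_rng(0)
full = rng.normal(size=(2, 768))
full /= np.linalg.norm(full, axis=-1, keepdims=True)

small = truncate_embedding(full, 256)  # shape (2, 256), unit-norm rows
sim = float(small[0] @ small[1])       # cosine similarity on truncated vectors
```

In practice one would load the released checkpoint (e.g., through an embedding library that exposes the model) and truncate its output vectors the same way before indexing or similarity search.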

Takeaways, Limitations

Takeaways:
  • Delivers a high-performance embedding model at low cost.
  • Well suited to low-latency, high-throughput settings such as on-device applications.
  • The open release lowers the barrier to further research.
Limitations:
  • The paper does not spell out specific limitations (e.g., degradation on particular tasks, data bias).
  • The study is limited to performance evaluation and the technical aspects of the model.