Daily Arxiv

This page organizes papers on artificial intelligence published around the world.
Summaries are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; when sharing, please cite the source.

FW-Merging: Scaling Model Merging with Frank-Wolfe Optimization

Created by
  • Haebom

Author

Hao Mark Chen, Shell Xu Hu, Wayne Luk, Timothy Hospedales, Hongxiang Fan

Outline

This paper presents Frank-Wolfe Merging (FW-Merging), a novel approach that addresses the limitations of model merging, a data-efficient technique for multi-task learning (MTL). Existing merging methods struggle to scale when combining many models drawn from diverse sources; FW-Merging tackles this by formulating model merging as a constrained optimization problem. Inspired by Frank-Wolfe optimization, it linearly approximates the objective function at each step and iteratively selects and merges the most relevant models. FW-Merging can be integrated with existing merging methods to improve their performance, applies to diverse model sources, and maintains a constant memory overhead.
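To make the iteration concrete, below is a minimal sketch of a classic Frank-Wolfe loop over the convex hull of candidate checkpoints: the objective is linearized at the current merged weights, the most relevant checkpoint is selected via the linear minimization oracle, and the iterate moves toward it. The flattened weight vectors, the `grad_fn` oracle, and the step-size schedule are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

def fw_merge(candidates, grad_fn, init, num_iters=10):
    """Sketch of Frank-Wolfe-style model merging (not the authors' code).

    candidates: list of flattened checkpoint weight vectors.
    grad_fn:    hypothetical oracle returning the gradient of the
                merging objective at the current merged weights.
    init:       starting point, e.g. a pretrained base model.
    """
    theta = init.copy()
    for t in range(num_iters):
        g = grad_fn(theta)
        # Linear minimization oracle: pick the checkpoint most aligned
        # with the descent direction of the linearized objective.
        scores = [np.dot(g, c) for c in candidates]
        s = candidates[int(np.argmin(scores))]
        # Standard Frank-Wolfe step size; keeps the iterate inside the
        # convex hull of the candidate checkpoints.
        gamma = 2.0 / (t + 2.0)
        theta = (1.0 - gamma) * theta + gamma * s
    return theta
```

Because each iteration touches only the current merged weights and one selected checkpoint, the working memory stays constant regardless of how many candidate models are in the pool, which matches the scalability claim above.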

Takeaways and Limitations

Takeaways:
  • It suits diverse model sources and remains effective even when model and task information is only partially known.
  • It delivers stable performance when merging many models and scales well.
  • It can further improve performance when integrated with existing merging methods.
  • Unlike data-driven merging methods, it maintains a constant memory overhead.
  • It outperforms state-of-the-art model merging techniques.
Limitations:
  • Specific limitations are not discussed in the paper.