Daily Arxiv

This page collects papers on artificial intelligence published around the world.
The summaries on this page are generated with Google Gemini, and the page is operated on a non-profit basis.
Copyright of each paper belongs to its authors and their institutions. When sharing, please cite the source.

UI-UG: A Unified MLLM for UI Understanding and Generation

Created by
  • Haebom

Authors

Hao Yang, Weijie Qiu, Ru Zhang, Zhou Fang, Ruichao Mao, Xiaoyu Lin, Maji Huang, Zhaosong Huang, Teng Guo, Shuoyang Liu, Hai Rao

Outline

This paper introduces UI-UG, a unified multimodal large language model (MLLM) that integrates UI understanding and UI generation. For understanding, UI-UG combines supervised fine-tuning (SFT) with Group Relative Policy Optimization (GRPO) to interpret complex UI data precisely. For generation, it uses Direct Preference Optimization (DPO) to produce UIs that match human preferences. The paper also proposes an industry-oriented workflow that covers the design of an LLM-friendly domain-specific language (DSL), the training strategy, the rendering process, and the evaluation metrics. Experimental results show that UI-UG achieves state-of-the-art performance on UI understanding tasks and competitive performance on UI generation.
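The summary names GRPO and DPO without showing what they compute. The sketch below illustrates the two standard formulas in their textbook form: the group-relative advantage used by GRPO and the per-pair DPO loss. It is not the authors' implementation. The function names, the `beta` default, and the `eps` term are assumptions chosen for illustration, and the inputs are assumed to be precomputed scalar rewards and sequence log-probabilities.

```python
import math


def group_relative_advantages(rewards, eps=1e-8):
    """GRPO-style advantages: normalize each reward against its sampling group.

    `rewards` holds one scalar reward per response sampled for the same prompt.
    `eps` (an assumed stabilizer) avoids division by zero when all rewards match.
    """
    if not rewards:
        raise ValueError("rewards must not be empty")
    mean = sum(rewards) / len(rewards)
    # Population standard deviation of the group.
    std = math.sqrt(sum((r - mean) ** 2 for r in rewards) / len(rewards))
    return [(r - mean) / (std + eps) for r in rewards]


def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a full response under the
    trained policy or the frozen reference model. `beta` is a hypothetical
    default, not a value taken from the paper.
    """
    margin = ((policy_chosen_logp - ref_chosen_logp)
              - (policy_rejected_logp - ref_rejected_logp))
    z = beta * margin
    # -log(sigmoid(z)) written in a numerically stable form.
    if z >= 0:
        return math.log1p(math.exp(-z))
    return -z + math.log1p(math.exp(z))
```

In this form, the DPO loss falls below log 2 when the policy prefers the chosen response more strongly than the reference model does, and rises above log 2 when it prefers the rejected one.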

Takeaways, Limitations

Takeaways:
Integrating UI understanding and generation in one model improves performance on both tasks.
The paper presents a workflow designed with industrial applications in mind.
UI-UG outperforms existing models on UI understanding tasks.
UI-UG achieves competitive UI generation performance at lower computational cost.
Limitations:
The paper alone does not reveal the specific limitations of UI-UG (e.g., performance degradation on particular UI types, or limitations of the DSL).
No concrete examples of successful industrial application of the proposed workflow are presented.