Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini, and the page is operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

REAL: Benchmarking Abilities of Large Language Models for Housing Transactions and Services

Created by
  • Haebom

Authors

Kexin Zhu, Yang Han

Outline

This paper presents REAL (Real Estate Agent Large Language Model Evaluation), the first benchmark for evaluating the agent abilities of large language models (LLMs) in real estate transactions and services. REAL contains 5,316 high-quality evaluation items across four topics (memory, understanding, reasoning, and hallucination), organized into 14 categories that assess the knowledge and abilities of LLMs in real estate transaction and service scenarios. Experimental results show that even state-of-the-art LLMs still have significant room for improvement before they can be applied to the real estate domain.
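To make the organization described above concrete, here is a minimal sketch of how items grouped by topic might be scored. It is illustrative only: the `Item` fields, the category labels, and exact-match accuracy as the metric are all assumptions, since this summary does not describe REAL's actual data format or scoring method.

```python
from collections import defaultdict
from dataclasses import dataclass

# The four topics named in the summary.
TOPICS = ("memory", "understanding", "reasoning", "hallucination")


@dataclass
class Item:
    """Hypothetical evaluation item; field names are assumptions."""
    topic: str     # one of TOPICS
    category: str  # one of the 14 categories (labels not given in the summary)
    answer: str    # reference answer


def accuracy_by_topic(items, predictions):
    """Return exact-match accuracy per topic (assumed metric)."""
    if len(items) != len(predictions):
        raise ValueError("items and predictions must have the same length")
    correct = defaultdict(int)
    total = defaultdict(int)
    for item, pred in zip(items, predictions):
        if item.topic not in TOPICS:
            raise ValueError(f"unknown topic: {item.topic}")
        total[item.topic] += 1
        correct[item.topic] += int(pred.strip() == item.answer)
    return {topic: correct[topic] / total[topic] for topic in total}
```

Reporting scores per topic rather than as a single number would show, for instance, whether a model is weaker on hallucination than on memory.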

Takeaways, Limitations

Takeaways: Presents REAL, the first standardized benchmark for evaluating LLM performance in real estate transactions and services. Provides an empirical analysis of how applicable LLMs are to the real estate domain. Describes the current state of LLMs and directions for improvement.
Limitations: The scope of the current evaluation benchmark, REAL, may be limited. It may not fully reflect the complexity of actual real estate transactions. The evaluation items may be affected by subjectivity and bias. The benchmark should be expanded to cover more diverse and complex scenarios in the future.