Daily Arxiv

This page curates AI-related papers published worldwide.
All summaries are generated with Google Gemini, and the site is operated on a non-profit basis.
Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.

SKA-Bench: A Fine-Grained Benchmark for Evaluating Structured Knowledge Understanding of LLMs

Created by
  • Haebom

Authors

Zhiqiang Liu, Enpei Niu, Yin Hua, Mengshu Sun, Lei Liang, Huajun Chen, Wen Zhang

Outline

This paper proposes SKA-Bench, a novel benchmark for evaluating the structured knowledge (SK) understanding ability of large language models (LLMs). SKA-Bench covers four types of SK: knowledge graphs (KGs), tables, KG+text, and table+text; each benchmark instance consists of a question, an answer, positive knowledge units, and negative knowledge units. To evaluate SK understanding precisely, the benchmark probes four abilities: robustness to noise, insensitivity to the order of knowledge units, information integration, and rejection of negative information. Experiments on eight representative LLMs show that existing LLMs still struggle with SK understanding and that their performance is affected by factors such as the amount of noise, the order of knowledge units, and hallucinations. The dataset and code are available on GitHub.
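To make the instance structure concrete, here is a minimal Python sketch of how a SKA-Bench-style instance might be represented and how the amount of noise and the ordering of knowledge units could be varied when building a prompt. The field names, triple format, and helper function are illustrative assumptions, not the authors' released schema or code.

```python
# Hypothetical sketch of a SKA-Bench-style instance and a noise/order probe.
# Field names, the triple format, and the prompt layout are assumptions.
import random
from dataclasses import dataclass

@dataclass
class SKAInstance:
    question: str
    answer: str
    positive_units: list[str]  # knowledge units required to answer
    negative_units: list[str]  # distractors the model should ignore

def build_context(inst: SKAInstance, noise_ratio: float, seed: int = 0) -> str:
    """Mix positive units with a controlled share of negative (noise) units
    and shuffle the result, so noise level and unit order can be varied."""
    rng = random.Random(seed)
    n_noise = int(len(inst.negative_units) * noise_ratio)
    units = inst.positive_units + rng.sample(inst.negative_units, n_noise)
    rng.shuffle(units)
    return "\n".join(units)

# Usage: build prompts at increasing noise levels and compare model answers.
inst = SKAInstance(
    question="Which river flows through the capital of France?",
    answer="Seine",
    positive_units=["(Paris, capital_of, France)",
                    "(Seine, flows_through, Paris)"],
    negative_units=["(Thames, flows_through, London)",
                    "(Berlin, capital_of, Germany)"],
)
for ratio in (0.0, 0.5, 1.0):
    prompt = f"Context:\n{build_context(inst, ratio)}\n\nQuestion: {inst.question}"
    print(prompt, "\n")
```

Holding the question fixed while sweeping the noise ratio and the shuffle seed isolates two of the four evaluated abilities (noise robustness and order sensitivity) from the model's underlying question-answering skill.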

Takeaways, Limitations

Takeaways:
We present SKA-Bench, a new benchmark that comprehensively and rigorously assesses LLMs' ability to understand structured knowledge.
We uncover the limitations of existing LLMs' structured knowledge understanding across multiple aspects (noise, order, information integration, and rejection of negative information).
We suggest research directions for improving LLM performance.
The publicly available dataset and code support follow-up research.
Limitations:
The types of structured knowledge covered by SKA-Bench may be limited.
There is room for improvement in evaluation metrics and methodology.
The set of LLMs evaluated in the experiments could be more diverse.