This page curates AI-related papers published worldwide. All content is summarized using Google Gemini, and the site is operated on a non-profit basis. Copyright for each paper belongs to its authors and their institutions; please credit the source when sharing.
RELRaE: LLM-Based Relationship Extraction, Labeling, Refinement, and Evaluation
Created by
Haebom
Author
George Hannah, Jacopo de Berardinis, Terry R. Payne, Valentina Tamma, Andrew Mitchell, Ellen Piercy, Ewan Johnson, Andrew Ng, Harry Rostron, Boris Konev
Outline
This paper presents a method for enriching XML schemas to lay the groundwork for ontology schemas when transforming the massive XML data generated by laboratory robotic experiments into knowledge graphs. To this end, it proposes RELRaE, a framework that employs large language models (LLMs) at several stages to extract and accurately label the relationships implicit in XML schemas. The authors investigate and evaluate the ability of LLMs to generate accurate relationship labels, showing that LLMs can effectively support relationship label generation in laboratory automation settings and, more broadly, can play an important role in semi-automated ontology generation frameworks.
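The paper summary above describes the pipeline only at a high level; the following minimal Python sketch illustrates the kind of step such a framework automates: walking an XML schema for implicitly related element pairs and prompting an LLM to propose a relationship label. The schema snippet, the prompt wording, and the `query_llm` helper are illustrative assumptions, not the authors' implementation.

```python
# Sketch (not the paper's code): find parent-child element pairs nested in
# an XML schema and build an LLM prompt asking for a relationship label.
import xml.etree.ElementTree as ET

# Toy XSD fragment standing in for a lab-automation schema (assumption).
XSD = """<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Experiment">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="Sample" type="xs:string"/>
        <xs:element name="Instrument" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>"""

XS = "{http://www.w3.org/2001/XMLSchema}"  # XML Schema namespace prefix

def implicit_relationships(xsd_text):
    """Yield (parent, child) name pairs for elements nested in the schema."""
    root = ET.fromstring(xsd_text)
    for parent in root.iter(f"{XS}element"):
        for child in parent.iter(f"{XS}element"):
            if child is not parent:
                yield parent.get("name"), child.get("name")

def label_prompt(parent, child):
    """Build a prompt asking the LLM to label one implicit relationship."""
    return (
        f"An XML schema nests the element '{child}' inside '{parent}'. "
        f"Propose a concise ontology relationship label (e.g. 'hasSample') "
        f"for this implicit relationship. Answer with the label only."
    )

for parent, child in implicit_relationships(XSD):
    prompt = label_prompt(parent, child)
    # label = query_llm(prompt)  # hypothetical LLM call, client not shown
    print(prompt)
```

In the full framework, candidate labels produced at this stage would then pass through the refinement and evaluation stages named in the RELRaE acronym.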
Takeaways, Limitations
•
Takeaways:
◦
Presents an effective method for extracting and labeling relationships in XML schemas using LLMs.
◦
Contributes to improved data interoperability in laboratory automation environments.
◦
Improves the efficiency of semi-automated ontology generation frameworks.
◦
Demonstrates the potential of applying LLMs to ontology creation.
•
Limitations:
◦
Lack of detailed information on the performance evaluation of the RELRaE framework (specific metrics, datasets, etc.).
◦
Further research is needed to determine generalizability to different types of XML data and laboratory environments.
◦
A complementary strategy is needed to address the limitations of LLMs (e.g., the possibility of generating incorrect labels).
◦
The computational cost and resource consumption of large language models need to be considered.