Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Dhati+: Fine-tuned Large Language Models for Arabic Subjectivity Evaluation

Created by
  • Haebom

Author

Slimane Bellaouar, Attia Nehar, Soumia Souffi, Mounia Bouameur

Outline

This paper presents a novel approach to subjectivity analysis in Arabic. Although Arabic is linguistically rich and morphologically complex, the lack of large-scale annotated data hinders the development of accurate tools. The study combines existing Arabic datasets (ASTD, LABR, HARD, SANAD) into a comprehensive dataset, AraDhati+, and fine-tunes state-of-the-art Arabic language models (XLM-RoBERTa, AraBERT, ArabianGPT) for subjectivity classification. An ensemble decision-making step over the fine-tuned models achieves a high accuracy of 97.79%, demonstrating that the approach is effective at addressing the resource constraints of Arabic language processing.
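The paper does not specify the exact ensemble rule, but a common decision-making scheme for combining several fine-tuned classifiers is a simple majority vote over their predicted labels. The sketch below is a minimal, hypothetical illustration of that idea; the function name and labels are illustrative, not taken from the paper.

```python
from collections import Counter

def ensemble_vote(predictions):
    """Majority vote over per-model subjectivity labels for one text.

    predictions: list of labels (e.g. "subjective"/"objective"),
    one from each fine-tuned model. Ties resolve to the label
    that appears first in the list.
    """
    counts = Counter(predictions)
    label, _ = counts.most_common(1)[0]
    return label

# Hypothetical outputs from three fine-tuned models
# (e.g. XLM-RoBERTa, AraBERT, ArabianGPT) for one input text:
votes = ["subjective", "subjective", "objective"]
print(ensemble_vote(votes))  # subjective
```

In practice, weighted voting or averaging of class probabilities are common alternatives when the individual models differ noticeably in accuracy.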

Takeaways, Limitations

Takeaways:
A novel approach to Arabic subjectivity analysis and the AraDhati+ dataset are presented.
High accuracy (97.79%) is achieved by fine-tuning state-of-the-art Arabic language models.
The work helps address the resource shortage in Arabic natural language processing.
Ensemble techniques are shown to offer further performance gains.
Limitations:
Lack of detail on the composition and quality of the AraDhati+ dataset.
Lack of detail on the Arabic language models used and the rationale for selecting them.
No comparative analysis against other subjectivity analysis methodologies.
No review of the dataset for bias or generalizability.
No performance evaluation in real-world applications.