Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Learning to Defer in Congested Systems: The AI-Human Interplay

Created by
  • Haebom

Authors

Thodoris Lykouris, Wentao Weng

Outline

This paper presents a model for improving decision-making systems that combine artificial intelligence (AI) and human reviewers, with social media content moderation as the motivating example. Existing AI-human pipelines rely on simple fixed-threshold heuristics that ignore the uncertainty in the AI's risk estimates, time-varying content arrivals and human review capacity, and selective sampling (humans only review the content the AI chooses to send them). In the proposed model, the AI observes contextual information for each incoming task, makes a classification decision and a review decision, and schedules the admitted tasks for human review. Human review corrects AI errors and yields new training data, but this feedback is delayed by congestion in the review system. The goal is to minimize the cost of tasks that end up misclassified. The authors propose a near-optimal learning algorithm that balances three losses: the classification loss from learning on a selectively sampled dataset, the inherent loss of unreviewed tasks, and the delay loss caused by congestion in the human review system. Numerical experiments on an online content dataset show that the algorithm substantially reduces misclassifications compared to existing methods. The authors describe this as the first result on online learning in a contextual queueing system.
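To make the contrast concrete, here is a minimal Python sketch of the kind of fixed-threshold heuristic the outline describes, next to a toy variant that also looks at estimate uncertainty and queue length. This is not the authors' algorithm. All names, thresholds, and the interval-based admission rule (`Decision`, `remove_at`, `review_at`, `uncertainty`, `capacity`) are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class Decision:
    remove: bool  # classification: auto-remove the content now
    review: bool  # admission: send the task to the human review queue


def fixed_threshold_policy(risk: float,
                           remove_at: float = 0.9,
                           review_at: float = 0.5) -> Decision:
    """Baseline heuristic: compare the AI risk score to two fixed thresholds.

    It ignores how uncertain the score is and how busy the reviewers are.
    """
    return Decision(remove=risk >= remove_at, review=risk >= review_at)


def congestion_aware_policy(risk: float,
                            uncertainty: float,
                            queue_len: int,
                            capacity: int,
                            remove_at: float = 0.9) -> Decision:
    """Hypothetical variant (illustration only, not the paper's method).

    A task is sent to review only if the interval
    [risk - uncertainty, risk + uncertainty] straddles the removal
    threshold and the review queue still has room.
    """
    remove = risk >= remove_at
    ambiguous = (risk - uncertainty) < remove_at <= (risk + uncertainty)
    return Decision(remove=remove, review=ambiguous and queue_len < capacity)
```

In this sketch the second policy stops admitting tasks once the queue is full, and skips tasks whose classification would not plausibly change after review. Those are two of the effects (congestion and estimation uncertainty) that the outline says fixed thresholds ignore.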

Takeaways, Limitations

Takeaways:
Presents a new model and algorithm for improving the efficiency of AI-human collaboration systems.
Demonstrates experimentally that misclassifications can be significantly reduced in online content moderation systems.
Provides new results on online learning in contextual queueing systems.
Limitations:
Further research and validation are needed for practical application of the proposed model.
Because these experimental results are based on a dataset from a specific social media platform, further research is needed to determine generalizability.
The model may not sufficiently reflect human reviewer fatigue or subjectivity.