Daily Arxiv

This is a page that curates AI-related papers published worldwide.
All content here is summarized using Google Gemini and operated on a non-profit basis.
Copyright for each paper belongs to the authors and their institutions; please make sure to credit the source when sharing.

Time-RA: Towards Time Series Reasoning for Anomaly with LLM Feedback

Created by
  • Haebom

Authors

Yiyuan Yang, Zichuan Liu, Lei Song, Kai Ying, Zhiguang Wang, Tom Bamford, Svitlana Vyetrenko, Jiang Bian, Qingsong Wen

Outline

To overcome the limitations of conventional time-series anomaly detection based on binary classification, this paper proposes Time-RA (Time-series Reasoning for Anomaly), a novel generative, reasoning-driven task for time-series anomalies that leverages large language models (LLMs). It also presents RATs40K, a multimodal benchmark dataset of approximately 40,000 real-world samples. Each sample includes numerical time-series data, contextual text, visual representations, a fine-grained anomaly type (14 univariate and 6 multivariate types), and structured explanatory reasoning. A sophisticated annotation framework based on GPT-4 is used to ensure accuracy and interpretability. Extensive benchmarking of LLMs and multimodal LLMs demonstrates the capabilities and limitations of current models and emphasizes the importance of supervised fine-tuning. The dataset and code are publicly available to support future research.
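
To make the sample structure described above concrete, here is a minimal sketch of how one record might be represented. It is not the RATs40K schema: the class name, every field name, the example values, and the "spike" label are hypothetical placeholders chosen for illustration, and the released dataset may organize its data differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AnomalyReasoningSample:
    """Hypothetical container for one multimodal sample (not the official schema)."""

    values: List[List[float]]     # numeric series, one inner list per time step
    context: str                  # contextual text describing the series
    image_path: Optional[str]     # path to a rendered plot, if any
    is_anomalous: bool            # coarse label used by classical detectors
    anomaly_type: Optional[str]   # fine-grained category; None for normal data
    reasoning: str                # structured explanation for the label

    @property
    def is_multivariate(self) -> bool:
        # More than one channel per time step means a multivariate series.
        return bool(self.values) and len(self.values[0]) > 1


def to_binary_label(sample: AnomalyReasoningSample) -> int:
    """Collapse a rich sample to the 0/1 target of binary anomaly detection."""
    return int(sample.is_anomalous)


# Illustrative values only; none of this comes from the dataset.
sample = AnomalyReasoningSample(
    values=[[0.1], [0.2], [9.7], [0.2]],
    context="Hourly sensor reading (made-up example).",
    image_path=None,
    is_anomalous=True,
    anomaly_type="spike",  # placeholder name, not one of the paper's categories
    reasoning="The third point deviates sharply from its neighbours.",
)

print(sample.is_multivariate, to_binary_label(sample))
```

The helper `to_binary_label` shows what a binary-classification setup keeps from such a record: a single 0/1 target. The anomaly type and the explanation are the parts a reasoning-oriented task additionally asks a model to produce.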

Takeaways, Limitations

Takeaways:
Proposes the new Time-RA task, which moves beyond traditional binary-classification anomaly detection and enables fine-grained classification and explanatory reasoning for anomalies.
Releases RATs40K, a multimodal (numerical, text, visual) time-series anomaly benchmark dataset built from real-world data.
Builds a high-quality dataset using a sophisticated annotation framework based on GPT-4.
Suggests future research directions through a performance evaluation of LLMs and multimodal LLMs.
Supports follow-up research by releasing the code and dataset publicly.
Limitations:
The analysis of the performance and limitations of current models may lack detail.
Further validation of generalization performance on the RATs40K dataset is needed.
Further research is needed on the task's applicability to a wider variety of time-series data.