Core Concepts
Large language models (LLMs) show promise for zero-shot time series anomaly detection, particularly when leveraging their forecasting capabilities, but they still lag behind state-of-the-art deep learning models in performance.
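To make the forecasting-based route concrete: the idea is to obtain predictions for upcoming values, compare them against the observed series, and flag points whose prediction error is unusually large. The sketch below illustrates this residual-thresholding step in NumPy; the rolling z-score rule, the `window` and `threshold` parameters, and the placeholder forecast are illustrative assumptions, not the paper's DETECTOR implementation.

```python
import numpy as np

def detect_anomalies(series: np.ndarray,
                     forecast: np.ndarray,
                     window: int = 50,
                     threshold: float = 4.0) -> np.ndarray:
    """Flag points whose forecast error deviates strongly from a rolling error profile.

    `forecast` stands in for predictions from any forecaster
    (hypothetically, an LLM prompted with the preceding context).
    """
    errors = np.abs(series - forecast)
    flags = np.zeros(len(series), dtype=bool)
    for i in range(len(series)):
        lo = max(0, i - window)
        local = errors[lo:i + 1]              # rolling window of recent errors
        mu, sigma = local.mean(), local.std()
        if sigma > 0 and errors[i] > mu + threshold * sigma:
            flags[i] = True                   # error far outside the local profile
    return flags

# Toy usage: a sine wave with an injected spike, and a forecast that misses the spike.
t = np.linspace(0, 20, 500)
signal = np.sin(t)
signal[300] += 5.0                            # injected anomaly
naive_forecast = np.sin(t)                    # placeholder for model-generated forecasts
print(np.where(detect_anomalies(signal, naive_forecast))[0])  # -> [300]
```

The key design point is that the forecaster never sees anomaly labels; detection comes entirely from how far observations stray from what the model expected, which is what makes the approach zero-shot.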
Statistics
DETECTOR outperforms PROMPTER by 135% in F1 score.
LLMs perform 30% worse than the best deep learning models.
DETECTOR outperforms Anomaly Transformer in 7 out of 11 datasets.
LLMs achieve a 14.6% higher F1 score than a simple moving average baseline.
The DETECTOR pipeline surpasses ARIMA in 4 out of 11 datasets.
Quotes
"If LLMs are genuine anomaly detectors, and can be employed directly in zero-shot (without any additional training), they could serve as off-the-shelf anomaly detectors for users, lifting a considerable amount of this burden."
"Our findings, captured in Figure 1(left), show that LLMs improve on a simple moving average baseline. Moreover, they outperform transformer-based models such as Anomaly Transformer."
"However, there is still a gap between classic and deep learning approaches and LLMs."