MONSTER: Monash Scalable Time Series Evaluation Repository
Angus Dempster, Navid Mohammadi Foumani, Chang Wei Tan, Lynn Miller, Amish Mishra, Mahsa Salehi, Charlotte Pelletier, Daniel F. Schmidt, Geoffrey I. Webb
2025-02-25
Summary
This paper introduces MONSTER, a new collection of large datasets designed to help researchers test and improve AI models for time series classification, the task of assigning labels to data that varies over time.
What's the problem?
Current benchmarks for time series classification are built from small datasets, which makes it hard to develop models that can handle larger, real-world problems. These small datasets also favor models tuned to achieve low error rates on many small tasks, with little attention to scalability or other practical challenges.
What's the solution?
The researchers created MONSTER, a repository of large datasets that better represent the challenges of real-world time series classification. These datasets are designed to test how well AI models can handle larger amounts of data while maintaining accuracy and scalability. The repository also provides a standardized way to evaluate different models.
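To make the idea of standardized evaluation concrete, here is a minimal sketch of benchmarking a classifier on a time series dataset and reporting 0-1 classification error. The data here is synthetic and the shapes, split, and nearest-centroid model are illustrative assumptions, not the actual MONSTER format or evaluation protocol (the repository's real datasets are far larger and use predefined folds).

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for a time series classification dataset.
# Shape convention (examples, channels, length) is an assumption.
X = rng.normal(size=(1000, 1, 50))
y = rng.integers(0, 3, size=1000)  # 3 hypothetical classes

# Fixed, reproducible train/test split. A benchmark repository would
# instead ship predefined folds so all models are compared identically.
idx = rng.permutation(len(X))
train, test = idx[:800], idx[800:]

# A trivial nearest-centroid classifier as a placeholder model.
centroids = np.stack(
    [X[train][y[train] == c].mean(axis=0) for c in range(3)]
)
# Distance from each test series to each class centroid.
dists = np.linalg.norm(X[test][:, None] - centroids[None], axis=(2, 3))
pred = dists.argmin(axis=1)

# 0-1 classification error, the headline metric for such benchmarks.
error = (pred != y[test]).mean()
print(f"classification error: {error:.3f}")
```

With random labels the error here hovers near chance; the point is only the shape of the workflow: fixed split in, predictions out, one comparable error number per model.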
Why does it matter?
This matters because it pushes the field of time series classification forward by encouraging the development of AI models that can tackle bigger and more complex problems. It helps ensure that research in this area is more relevant to real-world applications, like weather forecasting, stock market analysis, and medical monitoring.
Abstract
We introduce MONSTER (the MONash Scalable Time Series Evaluation Repository), a collection of large datasets for time series classification. The field of time series classification has benefitted from common benchmarks set by the UCR and UEA time series classification repositories. However, the datasets in these benchmarks are small, with median sizes of 217 and 255 examples, respectively. In consequence they favour a narrow subspace of models that are optimised to achieve low classification error on a wide variety of smaller datasets, that is, models that minimise variance, and give little weight to computational issues such as scalability. Our hope is to diversify the field by introducing benchmarks using larger datasets. We believe that there is enormous potential for new progress in the field by engaging with the theoretical and practical challenges of learning effectively from larger quantities of data.