EBES: Easy Benchmarking for Event Sequences

Dmitry Osin, Igor Udovichenko, Viktor Moskvoretskii, Egor Shvetsov, Evgeny Burnaev

2024-10-09

Summary

This paper introduces EBES, a new benchmarking tool designed to evaluate how well different models handle event sequences, which are important in fields like healthcare and finance.

What's the problem?

Event sequences often have irregular timing and a mix of different types of data, making it hard to compare how well different models perform on them. Currently, there are no standard benchmarks for these tasks, which can lead to confusion about which models are truly effective.
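To make the data structure concrete, here is a minimal sketch of a single event sequence with irregular timing and mixed feature types. All names here are illustrative assumptions, not part of the EBES library:

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical representation of one event sequence: timestamps are
# irregularly spaced, and each event carries both categorical and
# numerical features. The target is sequence-level, as in EBES's tasks.
@dataclass
class EventSequence:
    timestamps: List[float]            # irregular sampling intervals
    categorical: Dict[str, List[str]]  # e.g. an event type per event
    numerical: Dict[str, List[float]]  # e.g. an amount per event
    target: float                      # one label for the whole sequence

seq = EventSequence(
    timestamps=[0.0, 0.7, 3.2],  # uneven gaps between events
    categorical={"event_type": ["login", "purchase", "logout"]},
    numerical={"amount": [0.0, 42.5, 0.0]},
    target=1.0,
)

# The gaps between consecutive events differ, which is exactly what
# makes these sequences harder to model than regularly sampled series.
gaps = [b - a for a, b in zip(seq.timestamps, seq.timestamps[1:])]
print(gaps)
```

The uneven gaps (0.7 vs. 2.5 here) are what distinguish event sequences from ordinary time series with a fixed sampling rate.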

What's the solution?

The authors created EBES to provide a standardized way to evaluate models on event sequences. The tool includes predefined evaluation scenarios, a library with a unified interface that makes it easy to add datasets and integrate new methods, and a novel synthetic dataset alongside preprocessed real-world datasets. They also analyzed existing datasets to identify which ones are actually suitable for comparing models, and investigated how well models exploit the timing and ordering of events.
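The idea of a unified interface can be sketched as a small registry that runs any registered model on any registered dataset under one fixed protocol. None of these class or method names come from the actual EBES library; this is only an illustration of the design:

```python
from typing import Callable, Dict, List, Tuple

Dataset = Tuple[List[list], List[int]]

class Benchmark:
    """Hypothetical unified interface: register datasets and models once,
    then evaluate any pairing under the same protocol."""

    def __init__(self) -> None:
        self.datasets: Dict[str, Callable[[], Dataset]] = {}
        self.models: Dict[str, Callable[[], object]] = {}

    def register_dataset(self, name: str, loader: Callable[[], Dataset]) -> None:
        self.datasets[name] = loader

    def register_model(self, name: str, factory: Callable[[], object]) -> None:
        self.models[name] = factory

    def run(self, dataset: str, model: str) -> float:
        X, y = self.datasets[dataset]()
        clf = self.models[model]()
        clf.fit(X, y)
        return clf.score(X, y)  # same metric for every model

class MajorityClass:
    """Trivial baseline model used only to make the sketch runnable."""

    def fit(self, X: List[list], y: List[int]) -> "MajorityClass":
        self.label = max(set(y), key=y.count)
        return self

    def score(self, X: List[list], y: List[int]) -> float:
        return sum(self.label == t for t in y) / len(y)

bench = Benchmark()
bench.register_dataset("toy", lambda: ([[0.1], [0.2], [0.3]], [1, 1, 0]))
bench.register_model("majority", MajorityClass)
acc = bench.run("toy", "majority")
```

Because every model goes through the same `run` protocol, results are directly comparable, which is the core benefit the paper attributes to a standardized benchmark.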

Why it matters?

This research is significant because it establishes a clear framework for evaluating models that work with event sequences. By providing standardized benchmarks, EBES helps researchers compare results more easily, leading to better understanding and improvements in the field. This can ultimately enhance applications in various industries that rely on analyzing complex sequential data.

Abstract

Event sequences, characterized by irregular sampling intervals and a mix of categorical and numerical features, are common data structures in various real-world domains such as healthcare, finance, and user interaction logs. Despite advances in temporal data modeling techniques, there are no standardized benchmarks for evaluating their performance on event sequences. This complicates result comparison across different papers due to varying evaluation protocols, potentially misleading progress in this field. We introduce EBES, a comprehensive benchmarking tool with standardized evaluation scenarios and protocols, focusing on regression and classification problems with sequence-level targets. Our library simplifies benchmarking, dataset addition, and method integration through a unified interface. It includes a novel synthetic dataset and provides preprocessed real-world datasets, including the largest publicly available banking dataset. Our results provide an in-depth analysis of datasets, identifying some as unsuitable for model comparison. We investigate the importance of modeling temporal and sequential components, as well as the robustness and scaling properties of the models. These findings highlight potential directions for future research. Our benchmark aims to facilitate reproducible research, expediting progress and increasing real-world impacts.