Glocal Information Bottleneck for Time Series Imputation

Jie Yang, Kexin Zhang, Guibin Zhang, Philip S. Yu, Kaize Ding

2025-10-09

Summary

This paper tackles the problem of filling in missing values in time series: data collected over time, such as stock prices or weather measurements.

What's the problem?

When time series data has many missing values, current methods can appear to work well during training but often fail when they actually have to predict the missing parts. This happens because they focus on accurately recreating the known data points and pay too little attention to the overall structure and relationships within the entire series. Essentially, they get caught up in local details and lose sight of the big picture, leading to inaccurate imputations and a distorted understanding of the data.

What's the solution?

The researchers propose a new training paradigm called Glocal-IB. It is model-agnostic, so it can be added to any existing time series imputation model, and it introduces a Global Alignment loss that encourages the model to keep a global understanding of the data. The loss works by aligning the model's latent representation of the masked (partially missing) input with that of the fully observed input, forcing it to capture overall patterns and relationships even when much of the information is hidden. It's like giving the model a check to make sure it isn't just memorizing the known data points but is actually learning the underlying trends.
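To make the idea concrete, here is a minimal sketch of what such an alignment term could look like in PyTorch. The paper derives its Global Alignment loss from a tractable mutual information approximation; the cosine-similarity form below is an illustrative stand-in, not the authors' exact formulation, and `encoder`, `x_observed`, and `x_masked` are assumed names.

```python
import torch.nn.functional as F

def global_alignment_loss(encoder, x_observed, x_masked):
    """Hypothetical helper: pull the latent of the masked series toward
    the latent of the fully observed series (inputs: batch x time x features)."""
    # Treat the observed-input latent as the target; detaching it is one
    # plausible design choice so the masked-input branch moves toward it
    # rather than both representations collapsing together.
    z_obs = encoder(x_observed).detach()
    z_mask = encoder(x_masked)
    # Minimize 1 - cosine similarity, averaged over the batch, so the two
    # latent vectors point in the same direction.
    return (1.0 - F.cosine_similarity(z_obs, z_mask, dim=-1)).mean()
```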

Why it matters?

This research is important because accurately filling in missing data in time series is crucial for many real-world applications. Better imputation leads to more reliable analysis and predictions in fields like finance, healthcare, and environmental monitoring. By addressing the issue of overfitting to local noise and promoting a global understanding of the data, this method improves the accuracy and robustness of time series analysis, especially when dealing with datasets that have a lot of missing values.

Abstract

Time Series Imputation (TSI), which aims to recover missing values in temporal data, remains a fundamental challenge due to the complex and often high-rate missingness in real-world scenarios. Existing models typically optimize the point-wise reconstruction loss, focusing on recovering numerical values (local information). However, we observe that under high missing rates, these models still perform well in the training phase yet produce poor imputations and distorted latent representation distributions (global information) in the inference phase. This reveals a critical optimization dilemma: current objectives lack global guidance, leading models to overfit local noise and fail to capture global information of the data. To address this issue, we propose a new training paradigm, Glocal Information Bottleneck (Glocal-IB). Glocal-IB is model-agnostic and extends the standard IB framework by introducing a Global Alignment loss, derived from a tractable mutual information approximation. This loss aligns the latent representations of masked inputs with those of their originally observed counterparts. It helps the model retain global structure and local details while suppressing noise caused by missing values, giving rise to better generalization under high missingness. Extensive experiments on nine datasets confirm that Glocal-IB leads to consistently improved performance and aligned latent representations under missingness. Our code implementation is available at https://github.com/Muyiiiii/NeurIPS-25-Glocal-IB.
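Since Glocal-IB is model-agnostic, its training objective plausibly combines the usual point-wise reconstruction loss with an alignment term like the one sketched earlier. The step below shows one way the two could be wired together; `model`, the `model.encode` hook, and the weight `lam` are assumptions for illustration, not the interface of the released code.

```python
def train_step(model, optimizer, x, mask, lam=0.1):
    """x: (batch, time, features) ground-truth series;
    mask: same shape, 1.0 where a value is observed, 0.0 where it is hidden."""
    x_masked = x * mask                      # hide the "missing" entries
    x_hat = model(x_masked, mask)            # any imputation backbone
    # Point-wise reconstruction loss, computed on the observed entries only.
    recon = (((x_hat - x) * mask) ** 2).sum() / mask.sum().clamp(min=1.0)
    # Global alignment between the latents of the observed and masked
    # inputs, reusing the global_alignment_loss sketch above.
    align = global_alignment_loss(model.encode, x, x_masked)
    loss = recon + lam * align
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```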