Predicting the Unpredictable: Reproducible BiLSTM Forecasting of Incident Counts in the Global Terrorism Database (GTD)

Oluwasegun Adegoke

2025-10-22

Summary

This study forecasts the number of terrorist incidents per week, using Global Terrorism Database (GTD) records from 1970 to 2016. The goal is to test whether a deep learning model can forecast these counts more accurately than simpler methods, and to provide a reproducible starting point for future research.

What's the problem?

Forecasting terrorism is difficult because incidents are driven by many interacting factors. Existing methods are often inaccurate, and different approaches are hard to compare because code and evaluation setups are not always shared openly. The challenge is to build a forecasting model that beats simpler methods and is transparent enough for others to build on.

What's the solution?

The study builds a forecasting pipeline around a Bidirectional LSTM (BiLSTM), a type of neural network that reads each input sequence in both the forward and backward direction. The BiLSTM is compared against simpler methods (a seasonal-naive forecast, linear lag regression, and a standard statistical model called ARIMA) and against a second deep model, LSTM-Attention. The data are divided using fixed time-based splits, so every model is evaluated on the same held-out period that it never saw during training. Ablation experiments then vary how information is fed into the model, including the lookback length, the amount of training history, the spatial grain, and the feature groups, to see what works best.
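The data setup described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's released code: the function names, the segment lengths, the 52-week season, and the synthetic series are all hypothetical choices made here for demonstration.

```python
from typing import List, Sequence, Tuple


def chronological_split(
    series: Sequence[float], val_weeks: int, test_weeks: int
) -> Tuple[List[float], List[float], List[float]]:
    """Split a weekly series by time: oldest weeks train, newest weeks test."""
    n = len(series)
    if val_weeks < 0 or test_weeks < 0 or val_weeks + test_weeks >= n:
        raise ValueError("validation and test segments must leave room for training")
    train_end = n - val_weeks - test_weeks
    val_end = n - test_weeks
    return (
        list(series[:train_end]),
        list(series[train_end:val_end]),
        list(series[val_end:]),
    )


def make_windows(
    series: Sequence[float], lookback: int
) -> Tuple[List[List[float]], List[float]]:
    """Pair each run of `lookback` consecutive weeks with the week that follows."""
    if lookback <= 0:
        raise ValueError("lookback must be positive")
    inputs: List[List[float]] = []
    targets: List[float] = []
    for end in range(lookback, len(series)):
        inputs.append(list(series[end - lookback:end]))
        targets.append(series[end])
    return inputs, targets


def seasonal_naive(series: Sequence[float], season: int = 52) -> List[float]:
    """Forecast week t with the value observed at week t - season.

    Forecasts start at index `season`, the first week with a full season behind it.
    The 52-week season is an assumption for weekly data, not a value from the paper.
    """
    return [series[t - season] for t in range(season, len(series))]


if __name__ == "__main__":
    # Synthetic placeholder counts, NOT GTD data.
    weekly_counts = [float((3 * t) % 17) for t in range(200)]
    train, val, test = chronological_split(weekly_counts, val_weeks=30, test_weeks=30)
    windows, targets = make_windows(train, lookback=24)
    print(len(train), len(val), len(test), len(windows))
```

Splitting before windowing keeps every training window strictly earlier than the validation and test periods, which is what makes the comparison between models fair.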

Why it matters?

This work provides a strong, publicly available baseline for forecasting terrorism incidents. On the held-out test set, the BiLSTM reached an RMSE of 6.38 against 9.19 for LSTM-Attention, and it also beat the linear lag-regression baseline, so it is more accurate at predicting weekly incident counts. By releasing the code, configs, and result tables, along with a statement on GTD licensing and research-only use, the author helps other researchers reproduce and improve these forecasts and potentially contribute to efforts to understand and prevent terrorism. The study also indicates which inputs matter most: lagged incident counts and rolling statistics contribute the most, with geographic and casualty features adding a smaller gain.

Abstract

We study short-horizon forecasting of weekly terrorism incident counts using the Global Terrorism Database (GTD, 1970-2016). We build a reproducible pipeline with fixed time-based splits and evaluate a Bidirectional LSTM (BiLSTM) against strong classical anchors (seasonal-naive, linear/ARIMA) and a deep LSTM-Attention baseline. On the held-out test set, the BiLSTM attains RMSE 6.38, outperforming LSTM-Attention (9.19; +30.6%) and a linear lag-regression baseline (+35.4% RMSE gain), with parallel improvements in MAE and MAPE. Ablations varying temporal memory, training-history length, spatial grain, lookback size, and feature groups show that models trained on long historical data generalize best; a moderate lookback (20-30 weeks) provides strong context; and bidirectional encoding is critical for capturing both build-up and aftermath patterns within the window. Feature-group analysis indicates that short-horizon structure (lagged counts and rolling statistics) contributes most, with geographic and casualty features adding incremental lift. We release code, configs, and compact result tables, and provide a data/ethics statement documenting GTD licensing and research-only use. Overall, the study offers a transparent, baseline-beating reference for GTD incident forecasting.
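The error metrics and the percentage gain quoted in the abstract can be reproduced with a short sketch. It assumes the gain is measured as the relative reduction in RMSE with respect to the baseline, which is consistent with the reported 6.38 versus 9.19 figures. The function names are hypothetical, and skipping zero-count weeks in MAPE is an assumption made here, not a detail taken from the paper.

```python
import math
from typing import Sequence


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute percentage error over weeks with a nonzero actual count.

    Weeks with zero incidents are skipped to avoid division by zero
    (an assumption of this sketch).
    """
    terms = [abs(a - p) / abs(a) for a, p in zip(actual, predicted) if a != 0]
    if not terms:
        raise ValueError("MAPE is undefined when every actual value is zero")
    return 100.0 * sum(terms) / len(terms)


def relative_gain(baseline_error: float, model_error: float) -> float:
    """Percentage reduction in error relative to the baseline."""
    return 100.0 * (baseline_error - model_error) / baseline_error


if __name__ == "__main__":
    # RMSE values reported in the abstract: BiLSTM 6.38, LSTM-Attention 9.19.
    print(round(relative_gain(9.19, 6.38), 1))  # 30.6
```

Under this definition, (9.19 - 6.38) / 9.19 is about 0.306, matching the 30.6% figure reported for the LSTM-Attention comparison.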
