# BOOM: Benchmark of Observability Metrics

Canonical source: https://huggingface.co/datasets/Datadog/BOOM

Official code: https://github.com/DataDog/toto/tree/main/boom

Official leaderboard: https://huggingface.co/spaces/Datadog/BOOM

Introducing source: [Toto](../../wiki/sources/toto-2025.md)

## Dataset Type

Observability metrics forecasting benchmark.

## Temporal Structure

BOOM represents each metric query as one time series, either univariate or multivariate. When a query returns several groups, those groups become the related variates of a single series. The data comes from Datadog's internal pre-production monitoring and includes metadata for sampling frequency, series length, and variate count.
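The query-to-series mapping can be illustrated with a minimal sketch. The `QuerySeries` container, the `from_groups` helper, the query string, and the group names are all hypothetical and chosen for illustration; they do not reflect BOOM's actual schema or file layout.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class QuerySeries:
    """Hypothetical container: one metric query maps to one time series."""
    query: str                 # illustrative query string, not taken from BOOM
    variate_names: List[str]   # one name per group returned by the query
    values: List[List[float]]  # rows are timesteps, columns are variates
    freq_seconds: int          # sampling interval in seconds

    @property
    def length(self) -> int:
        return len(self.values)

    @property
    def num_variates(self) -> int:
        return len(self.variate_names)

    @property
    def is_multivariate(self) -> bool:
        return self.num_variates > 1


def from_groups(query: str, groups: Dict[str, List[float]], freq_seconds: int) -> QuerySeries:
    """Align the groups of one query into the variates of a single series."""
    names = sorted(groups)
    # All groups must share the same time axis to form one series.
    if len({len(groups[name]) for name in names}) != 1:
        raise ValueError("all groups must share one time axis")
    # Transpose per-group columns into per-timestep rows.
    rows = [list(row) for row in zip(*(groups[name] for name in names))]
    return QuerySeries(query, names, rows, freq_seconds)


# A made-up query that returns two groups, sampled once per minute.
series = from_groups(
    "avg:example.cpu.usage by {host}",
    {"host:a": [0.2, 0.4, 0.3], "host:b": [0.7, 0.6, 0.9]},
    freq_seconds=60,
)
print(series.length, series.num_variates, series.is_multivariate)
```

A query that returns a single group yields a univariate series under the same structure, with `is_multivariate` evaluating to `False`.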

## Actions Or Interventions

None. BOOM is a passive forecasting benchmark. It does not include deployments, rollbacks, autoscaling changes, traffic-control commands, remediations, or other operator actions as first-class forecast-conditioning channels.

## Reported Scale

- About 350 million time-series points.
- 2,807 metric queries.
- 32,887 variates.
- Up to 100 variates per benchmark entry.
- Metric taxonomy includes gauge, rate, distribution, and count.
- Domain taxonomy includes application usage, infrastructure, database, networking, and security.

The Toto paper also defines BOOMlet as a smaller representative subset with 32 metric queries, 1,627 variates, and about 23 million observation points.
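As a rough sanity check on the figures above, the following sketch derives per-query and per-variate averages and the share of BOOM covered by BOOMlet. The point counts are the rounded "about" values reported above, so the derived numbers are approximate. The dictionary layout and the `summarize` and `share` helpers are illustrative only.

```python
# Reported totals, with point counts rounded as stated in the text above.
BOOM = {"queries": 2807, "variates": 32887, "points": 350_000_000}
BOOMLET = {"queries": 32, "variates": 1627, "points": 23_000_000}


def summarize(stats):
    """Derive simple averages from the reported totals."""
    return {
        "variates_per_query": stats["variates"] / stats["queries"],
        "points_per_variate": stats["points"] / stats["variates"],
    }


def share(subset, full):
    """Fraction of the full benchmark covered by the subset, per quantity."""
    return {key: subset[key] / full[key] for key in full}


boom_summary = summarize(BOOM)
boomlet_summary = summarize(BOOMLET)
boomlet_share = share(BOOMLET, BOOM)

for name, summary in (("BOOM", boom_summary), ("BOOMlet", boomlet_summary)):
    print(
        f"{name}: {summary['variates_per_query']:.1f} variates/query, "
        f"{summary['points_per_variate']:.0f} points/variate"
    )
print({key: f"{value:.1%}" for key, value in boomlet_share.items()})
```

By these averages, BOOMlet queries carry roughly 51 variates each against roughly 12 for the full benchmark, so the subset leans toward wider multivariate entries while covering about 1% of the queries.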

## Suitability Note

BOOM is the most directly relevant public benchmark in this repository for observability-style high-dimensional forecasting. It captures high cardinality, nonstationarity, missing intervals, heavy tails, sparse spikes, scale changes, and grouped metric variates. It is not sufficient for action-conditioned observability world models because it does not expose operational interventions.

## Access And License Notes

The Hugging Face dataset page lists the license as Apache-2.0. Datadog states that BOOM was generated from internal monitoring of pre-production systems and does not include customer data.
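The dataset repository named at the top of this page can be fetched with the `huggingface_hub` client. This is a minimal sketch, assuming that package is installed and network access is available. The `fetch_boom` helper and the `local_dir` default are hypothetical, and nothing here assumes a particular file layout inside the repository.

```python
REPO_ID = "Datadog/BOOM"  # dataset repository from the canonical source URL
REPO_TYPE = "dataset"


def fetch_boom(local_dir: str = "data/boom") -> str:
    """Download a snapshot of the dataset repository and return its local path.

    The import is deferred so this module can be loaded without
    huggingface_hub installed; the download only runs when this is called.
    """
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id=REPO_ID,
        repo_type=REPO_TYPE,
        local_dir=local_dir,  # assumed target directory, adjust as needed
    )
```

Calling `fetch_boom()` downloads the whole repository, which is large given the scale reported above. Passing an `allow_patterns` filter to `snapshot_download` is one way to restrict the download once the repository's file names have been checked on the dataset page.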