GAIA / MicroSS

Source

Core Claim

GAIA is a collection of AIOps datasets for anomaly detection, log analysis, fault localization, and related tasks. MicroSS is the subset relevant to graph- and system-level work: a business-simulation microservice system with metrics, traces, business logs, and anomaly-injection records.

Dataset Notes

  • MicroSS contains more than 6,500 metrics, more than 7,000,000 log items, and detailed trace data collected over an initial two-week window.
  • The trace schema includes service names, host IPs, trace IDs, span IDs, parent IDs, URLs, status codes, and messages.
  • Companion Data contains 406 anomaly-detection and metric-prediction series, including 279 labeled series, plus about 218,736 log records.
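  • MicroSS is system-level, but the public layout is not a single graph object. The service call structure has to be reconstructed from the traces, for example by linking each span to its parent through the span ID and parent ID fields.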
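
A minimal sketch of that reconstruction, assuming each span has already been loaded as a dict. The key names (trace_id, span_id, parent_id, service_name), the toy service names, and the helper build_call_edges are hypothetical and chosen for illustration; the actual MicroSS column names and file layout may differ.

```python
from collections import Counter


def build_call_edges(spans):
    """Count caller -> callee service edges from span records.

    Each span is a dict with the hypothetical keys 'trace_id', 'span_id',
    'parent_id', and 'service_name'. Real MicroSS field names may differ.
    """
    # Index spans by (trace_id, span_id) so a parent span can be looked up.
    service_of = {
        (s["trace_id"], s["span_id"]): s["service_name"] for s in spans
    }
    edges = Counter()
    for s in spans:
        caller = service_of.get((s["trace_id"], s["parent_id"]))
        # Skip root spans, spans whose parent is missing from the data,
        # and calls that stay inside a single service.
        if caller is None or caller == s["service_name"]:
            continue
        edges[(caller, s["service_name"])] += 1
    return edges


# Toy spans for illustration only, not taken from the dataset.
spans = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "service_name": "frontend"},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "service_name": "orders"},
    {"trace_id": "t1", "span_id": "c", "parent_id": "b", "service_name": "db"},
    {"trace_id": "t1", "span_id": "d", "parent_id": "b", "service_name": "orders"},
    {"trace_id": "t2", "span_id": "e", "parent_id": None, "service_name": "frontend"},
    {"trace_id": "t2", "span_id": "f", "parent_id": "e", "service_name": "orders"},
    {"trace_id": "t2", "span_id": "g", "parent_id": "missing", "service_name": "db"},
]

edges = build_call_edges(spans)
for (caller, callee), count in sorted(edges.items()):
    print(caller, "->", callee, count)
```

The result is a weighted edge list rather than a graph object, so it can be passed to whichever graph library a benchmark uses. Spans with unresolvable parents are dropped silently here; a real pipeline would probably want to count them, since they indicate truncated or sampled traces.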

Why It Matters

GAIA/MicroSS is useful when a benchmark needs AIOps-style metrics, logs, traces, business logs, and anomaly-injection records from a whole microservice scenario. It is less graph-native than ChronoGraph but richer in trace/log context.

Gotchas

  • Companion Data should not be treated as a whole-service-graph dataset; it is closer to single-series KPI and log-task data.
  • License metadata is inconsistent: the repository LICENSE file is GPL-2.0, while the README license section says Apache 2.0.
  • Anomaly injections are exogenous benchmark events, not logged operator remediations.