GAIA / MicroSS
Source
- Dataset metadata snapshot: gaia-micross-2021
- Official GitHub: https://github.com/CloudWise-OpenSource/GAIA-DataSet
Core Claim
GAIA is an AIOps dataset collection for anomaly detection, log analysis, fault localization, and related tasks. MicroSS is the graph/system-relevant subset: a business-simulation microservice system with metrics, traces, business logs, and anomaly-injection records.
Dataset Notes
- MicroSS contains more than 6,500 metrics, more than 7,000,000 log items, and detailed trace data collected over an initial two-week window.
- The trace schema includes service names, host IPs, trace IDs, span IDs, parent IDs, URLs, status codes, and messages.
- Companion Data contains 406 anomaly-detection and metric-prediction series, including 279 labeled series, plus about 218,736 log records.
- MicroSS is system-level, but the public layout is not a single graph object. Service call structure is reconstructed from traces.
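Because the call structure has to be rebuilt from traces, a minimal sketch of that reconstruction may help. It assumes each span has already been parsed into a dict with the hypothetical keys `trace_id`, `span_id`, `parent_id`, and `service_name`; the actual column names in the MicroSS trace files may differ, and the service names in the sample are made up for illustration.

```python
from collections import Counter

def build_call_graph(spans):
    """Count caller -> callee service edges from a list of span dicts.

    Key names (trace_id, span_id, parent_id, service_name) are assumptions,
    not the confirmed MicroSS column names.
    """
    # Index spans by (trace_id, span_id) so a parent is only resolved
    # within its own trace, even if span IDs repeat across traces.
    by_id = {(s["trace_id"], s["span_id"]): s for s in spans}
    edges = Counter()
    for s in spans:
        parent = by_id.get((s["trace_id"], s.get("parent_id")))
        if parent is None:
            continue  # root span, or parent missing from the sample
        caller, callee = parent["service_name"], s["service_name"]
        if caller != callee:  # skip spans internal to one service
            edges[(caller, callee)] += 1
    return edges

# Hypothetical spans; service names are illustrative only.
spans = [
    {"trace_id": "t1", "span_id": "a", "parent_id": None, "service_name": "frontend"},
    {"trace_id": "t1", "span_id": "b", "parent_id": "a", "service_name": "orders"},
    {"trace_id": "t1", "span_id": "c", "parent_id": "b", "service_name": "db"},
    {"trace_id": "t1", "span_id": "d", "parent_id": "b", "service_name": "orders"},
    {"trace_id": "t2", "span_id": "a", "parent_id": None, "service_name": "frontend"},
    {"trace_id": "t2", "span_id": "b", "parent_id": "a", "service_name": "orders"},
]

edges = build_call_graph(spans)
for (caller, callee), count in sorted(edges.items()):
    print(f"{caller} -> {callee}: {count}")
```

The resulting edge counts can be loaded into any graph library as a weighted directed graph. Keying the lookup on the trace ID as well as the span ID is a defensive choice in case span IDs are only unique within a trace.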
Why It Matters
GAIA/MicroSS is useful when a benchmark needs AIOps-style metrics, logs, traces, business logs, and anomaly-injection records from a whole microservice scenario. It is less graph-native than ChronoGraph but richer in trace/log context.
Gotchas
- Companion Data should not be treated as a whole-service-graph dataset; it is closer to single-series KPI and log-task data.
- License metadata is inconsistent: the repository LICENSE file is GPL-2.0, while the README license section says Apache 2.0.
- Anomaly injections are exogenous benchmark events, not logged operator remediations.