# RCAEval

Canonical source: <https://github.com/phamquiluan/RCAEval>
Official dataset: <https://zenodo.org/records/14590730>
Official package: <https://pypi.org/project/RCAEval/>
Introducing source: [RCAEval](../../wiki/sources/rcaeval-2025.md)

## Dataset Type

RCAEval is a root-cause-analysis benchmark and evaluation framework for microservice systems. It packages datasets, loaders, metrics, and reproducible RCA baselines for coarse-grained root-cause service ranking and fine-grained root-cause indicator ranking.

## System Structure

The benchmark covers Online Boutique, Sock Shop, and Train Ticket. These are complete microservice systems rather than isolated per-service time series. The service dependency graph is mostly implicit in the system topology and trace relationships, while the framework exposes data to methods as metrics, logs, traces, anomaly timestamps, and tabular (data-frame) interfaces.

## Dataset Suites

- RE1: 375 metric-only cases across three systems, covering CPU, memory, disk, delay, and packet loss faults.
- RE2: 270 multi-source cases with metrics, logs, and traces, adding socket faults.
- RE3: 90 multi-source code-level fault cases.

The three suites together contain 735 cases. The README describes nine datasets because each suite is split across the three systems. The Zenodo record lists compressed dataset files totaling about 5.16 GB.

## Data Structure

Each case directory follows a benchmark-service-fault-instance convention and may include:

- `metrics.json`: time-series metrics.
- `inject_time.txt`: fault injection timestamp.
- `logs.csv`: logs for RE2 and RE3.
- `traces.csv`: traces for RE2 and RE3.
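
As a rough illustration of the layout above, the sketch below checks which of the listed files a case directory contains and reads the injection timestamp. `describe_case` is a hypothetical helper, not part of the RCAEval package, and it assumes `inject_time.txt` holds a single numeric epoch value.

```python
from pathlib import Path

# File names taken from the list above; everything else here is illustrative.
EXPECTED_FILES = ("metrics.json", "inject_time.txt", "logs.csv", "traces.csv")


def describe_case(case_dir):
    """Report which telemetry files a case directory contains.

    Hypothetical helper, not part of the RCAEval package.
    """
    case_dir = Path(case_dir)
    present = {name: (case_dir / name).is_file() for name in EXPECTED_FILES}

    inject_time = None
    if present["inject_time.txt"]:
        # Assumption: the file holds one numeric epoch timestamp.
        text = (case_dir / "inject_time.txt").read_text().strip()
        inject_time = int(float(text))

    return {"present": present, "inject_time": inject_time}
```

Because logs and traces exist only for RE2 and RE3, a loader along these lines should treat their absence as normal rather than as an error.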

Many baselines in the framework also expect telemetry inputs as pandas data frames with a `time` column.
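
A common first step for methods that compare behaviour before and after a fault is to split such a frame on its `time` column. The following is a minimal sketch under assumptions: `split_at_timestamp` is a hypothetical helper rather than an RCAEval function, the column name `cart_cpu` and all values are made up, and `time` is assumed to share a unit with the timestamp.

```python
import pandas as pd


def split_at_timestamp(df, timestamp):
    """Split a telemetry frame into rows before and from the timestamp.

    Hypothetical helper; assumes `time` and `timestamp` use the same unit.
    """
    if "time" not in df.columns:
        raise ValueError("expected a 'time' column")
    before = df[df["time"] < timestamp]
    after = df[df["time"] >= timestamp]
    return before, after


# Toy data with made-up column names and values.
toy = pd.DataFrame(
    {
        "time": [100, 101, 102, 103, 104],
        "cart_cpu": [0.20, 0.21, 0.19, 0.85, 0.90],
    }
)
normal, anomalous = split_at_timestamp(toy, 103)
```

Here `normal` holds the three rows before time 103 and `anomalous` holds the remaining two.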

## Inputs And Outputs

Inputs are metrics, logs, traces, and the anomaly-detected timestamp. Outputs are ranked root-cause services or root-cause indicators. Evaluation metrics include `AC@k` and `Avg@k`.
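
As these metrics are commonly defined in the RCA literature, `AC@k` is the fraction of cases whose true root cause appears in the top `k` ranked results, and `Avg@k` is the mean of `AC@1` through `AC@k`. The sketch below follows those definitions. It assumes one true root cause per case, and the function names and toy rankings are illustrative, not RCAEval's own implementation.

```python
def ac_at_k(ranked_lists, true_causes, k):
    """Fraction of cases whose true root cause is within the top-k ranking."""
    if k < 1:
        raise ValueError("k must be at least 1")
    if not true_causes or len(ranked_lists) != len(true_causes):
        raise ValueError("need one non-empty ranking list per true cause")
    hits = sum(
        truth in ranking[:k]
        for ranking, truth in zip(ranked_lists, true_causes)
    )
    return hits / len(true_causes)


def avg_at_k(ranked_lists, true_causes, k):
    """Mean of AC@1 through AC@k."""
    total = sum(ac_at_k(ranked_lists, true_causes, j) for j in range(1, k + 1))
    return total / k


# Toy example: three cases, each with "cart" as the true root cause.
rankings = [
    ["cart", "payment", "frontend"],   # true cause ranked 1st
    ["frontend", "cart", "payment"],   # true cause ranked 2nd
    ["payment", "frontend", "cart"],   # true cause ranked 3rd
]
truths = ["cart", "cart", "cart"]
```

For this toy input, `AC@1`, `AC@2`, and `AC@3` are 1/3, 2/3, and 1, so `Avg@3` is 2/3.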

## Reported Baselines

RCAEval includes metric-based, trace-based, and multi-source RCA baselines, including RUN, CausalRCA, CIRCA, RCD, MicroCause, EasyRCA, MSCRED, BARO, epsilon-Diagnosis, TraceRCA, MicroRank, PDiagnose, multi-source BARO, multi-source RCD, multi-source CIRCA, and TORAI.

## Actions Or Interventions

Fault injections are controlled experimental events. RCAEval does not expose logged operator remediations, rollback decisions, autoscaling changes, or other controllable action channels for action-conditioned world-model training.

## Access And License Notes

The Zenodo dataset record lists the license as CC-BY-4.0. The GitHub framework repository is MIT-licensed. This knowledge base records metadata only and does not mirror dataset archives.