RCAEval
Source
- Dataset metadata snapshot: rcaeval-2025
- Official GitHub: https://github.com/phamquiluan/RCAEval
- Official dataset: https://zenodo.org/records/14590730
- Official package: https://pypi.org/project/RCAEval/
- arXiv: https://arxiv.org/abs/2412.17015
- ACM WWW 2025: https://dl.acm.org/doi/10.1145/3701716.3715290
Core Claim
RCAEval is a reproducible root-cause-analysis benchmark for microservice systems. It contributes datasets, loaders, evaluation metrics, and baseline implementations for metric-based, trace-based, and multi-source RCA.
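RCA benchmarks of this kind are usually scored with top-k accuracy (AC@k, the fraction of failure cases whose annotated root cause appears in a method's top k ranked candidates) and its average over k=1..5 (Avg@5). A minimal sketch of those metrics, with function names of my own choosing rather than RCAEval's actual API:

```python
def ac_at_k(ranked_cases, k):
    """AC@k: fraction of failure cases whose annotated root cause
    appears in the top-k of the method's ranked candidate list."""
    hits = sum(1 for ranks, truth in ranked_cases if truth in ranks[:k])
    return hits / len(ranked_cases)

def avg_at_5(ranked_cases):
    """Avg@5: mean of AC@1..AC@5, a single summary score."""
    return sum(ac_at_k(ranked_cases, k) for k in range(1, 6)) / 5

# Toy example: two failure cases, each a (ranked candidates, ground truth) pair.
cases = [
    (["cart", "checkout", "frontend"], "checkout"),  # hit at rank 2
    (["frontend", "cart", "payment"], "payment"),    # hit at rank 3
]
print(ac_at_k(cases, 1))  # 0.0
print(ac_at_k(cases, 3))  # 1.0
print(avg_at_5(cases))    # 0.7
```

The same case-level ranking output works for service-level and indicator-level evaluation; only the ground-truth label changes.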
Dataset Notes
- RCAEval covers Online Boutique, Sock Shop, and Train Ticket.
- It organizes nine datasets under three settings, RE1, RE2, and RE3, covering 735 failure cases across 11 fault types.
- RE1 is metric-only. RE2 and RE3 include metrics, logs, and traces.
- Each failure case includes annotated root-cause service and root-cause indicator labels.
- The Zenodo archives total about 5.16 GB compressed.
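Since failure cases ship as tabular files, a typical first step is to load the metrics and split them at the fault-injection timestamp into pre-failure and post-failure windows. The column layout and timestamp below are illustrative, not the archive's exact schema:

```python
import io
import pandas as pd

# Illustrative metrics snapshot; real cases are larger CSVs with one
# column per service-level metric (the layout here is assumed).
csv = io.StringIO(
    "time,cart_cpu,cart_mem,checkout_cpu\n"
    "0,0.2,0.3,0.1\n"
    "60,0.2,0.3,0.1\n"
    "120,0.9,0.3,0.1\n"
)
df = pd.read_csv(csv)

inject_time = 100  # hypothetical fault-injection timestamp
normal = df[df["time"] < inject_time]      # baseline behavior
anomalous = df[df["time"] >= inject_time]  # post-failure behavior
print(len(normal), len(anomalous))  # 2 1
```

Most metric-based baselines consume exactly this pre/post split, so the loader is shared across methods.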
Reported Baselines
The framework includes RUN, CausalRCA, CIRCA, RCD, MicroCause, EasyRCA, MSCRED, BARO, epsilon-Diagnosis, TraceRCA, MicroRank, PDiagnose, multi-source BARO, multi-source RCD, multi-source CIRCA, and TORAI.
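To make the baseline category concrete, here is a crude metric-ranking sketch in the spirit of the simple statistical methods above: rank each metric by how far its post-failure mean drifts from the pre-failure baseline, in baseline standard deviations. This is a simplification for illustration, not any listed method's exact algorithm:

```python
import statistics

def rank_metrics(normal, anomalous):
    """Rank metrics by post-failure drift from the pre-failure mean,
    normalized by the pre-failure standard deviation (a z-score).
    A stand-in for statistical baselines, not any paper's exact method."""
    scores = {}
    for name, values in normal.items():
        mu = statistics.mean(values)
        sigma = statistics.pstdev(values) or 1e-9  # guard constant metrics
        drift = statistics.mean(anomalous[name]) - mu
        scores[name] = abs(drift) / sigma
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical per-metric samples before and after the failure.
normal = {"cart_cpu": [0.2, 0.21, 0.19], "db_cpu": [0.5, 0.52, 0.48]}
anomalous = {"cart_cpu": [0.9, 0.95], "db_cpu": [0.5, 0.51]}
print(rank_metrics(normal, anomalous))  # cart_cpu ranked first
```

Feeding such rankings into the AC@k metric is what makes the many baselines directly comparable.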
Why It Matters
RCAEval is the most practical evaluation harness in this group for comparing RCA methods. It is less graph-native than ChronoGraph or ops-lite, but stronger as a reproducible benchmark environment with many baselines.
Gotchas
- The input surface is typically flattened into files or data frames, so graph structure must usually be reconstructed from system knowledge or from the traces themselves.
- Fault injections are benchmark events, not an operator action channel.
- The dataset record is CC-BY-4.0, while the framework code is MIT.
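One way to recover graph structure from the trace data is to derive a service call graph from span parent-child links. The span schema below is hypothetical; real trace files use Jaeger-style layouts, so field names will differ:

```python
from collections import defaultdict

# Hypothetical span records: (span_id, parent_span_id, service).
spans = [
    ("s1", None, "frontend"),
    ("s2", "s1", "checkout"),
    ("s3", "s2", "payment"),
    ("s4", "s1", "cart"),
]

# Map each span to its owning service, then count caller->callee edges.
service_of = {sid: svc for sid, _, svc in spans}
edges = defaultdict(int)  # (caller, callee) -> call count
for sid, parent, svc in spans:
    if parent is not None:
        edges[(service_of[parent], svc)] += 1

print(sorted(edges))
# [('checkout', 'payment'), ('frontend', 'cart'), ('frontend', 'checkout')]
```

The resulting edge list is enough to drive graph-based RCA methods that the flattened file format does not directly support.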