OpenRCA

Source

Core Claim

OpenRCA reframes root cause analysis (RCA) for microservice and software systems as a benchmark for LLMs and LLM agents. A model receives a natural-language query and must inspect telemetry to identify the root cause's datetime, component, and reason.
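
To make the shape of the task concrete, the three root-cause elements can be sketched as a small record. This is only an illustration: the class name `RootCauseAnswer`, its field names, and the example values are hypothetical and are not taken from the OpenRCA repository. The optional fields assume that a query may ask for only some of the three elements.

```python
from dataclasses import dataclass, asdict
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class RootCauseAnswer:
    """Hypothetical container for the three root-cause elements."""
    occurred_at: Optional[datetime] = None  # root-cause datetime
    component: Optional[str] = None         # root-cause component
    reason: Optional[str] = None            # root-cause reason

    def provided_fields(self) -> dict:
        """Return only the elements that were actually filled in."""
        return {k: v for k, v in asdict(self).items() if v is not None}


# Placeholder values for illustration only; not drawn from the dataset.
answer = RootCauseAnswer(
    occurred_at=datetime(2021, 3, 4, 14, 57),
    component="example-service",
    reason="example reason",
)
partial = RootCauseAnswer(component="example-service")
```

Keeping the answer as structured fields rather than free text makes it straightforward to compare a prediction against a reference element by element.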

Dataset Notes

  • The OpenReview paper reports 335 failures from three enterprise software systems and more than 68 GB of telemetry.
  • The GitHub README names the systems as Telecom, Bank, and Market.
  • The telemetry directory contains logs, metrics, and traces under date-stamped folders.
  • Inputs include KPI time series, dependency trace graphs, semi-structured logs, and natural-language queries.
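
The date-stamped layout described above could be indexed with a few lines of Python. This is a sketch under assumptions: the subfolder names in `KINDS`, the date folder name, and the file name in the demo are all hypothetical, so check the actual download before relying on them.

```python
import tempfile
from pathlib import Path

# Assumed subfolder names for the three telemetry kinds (hypothetical).
KINDS = ("log", "metric", "trace")


def index_telemetry(root):
    """Map each date-stamped folder to the files found per telemetry kind."""
    index = {}
    date_dirs = sorted(p for p in Path(root).iterdir() if p.is_dir())
    for date_dir in date_dirs:
        for kind in KINDS:
            kind_dir = date_dir / kind
            if kind_dir.is_dir():
                files = sorted(f.name for f in kind_dir.iterdir() if f.is_file())
                index.setdefault(date_dir.name, {})[kind] = files
    return index


# Demo against a throwaway directory that mimics the assumed layout.
with tempfile.TemporaryDirectory() as tmp:
    day = Path(tmp) / "2021_03_04"  # hypothetical date-stamped folder name
    for kind in KINDS:
        (day / kind).mkdir(parents=True)
    (day / "metric" / "example_metric.csv").write_text("timestamp,value\n")
    demo_index = index_telemetry(tmp)
```

An index like this lets later steps load only the files for the date a query refers to instead of scanning the whole download.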

Reported Baselines

OpenRCA introduces RCA-agent, which retrieves and analyzes telemetry with Python rather than loading all of it into the LLM context. The repository also includes standard, balanced, and oracle-style evaluation scripts.
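
The general idea of reducing telemetry in code before anything reaches the model can be sketched as follows. This is not RCA-agent's implementation: the function `summarize_window`, the `(timestamp, component, value)` row schema, and the 15-minute default window are assumptions chosen for illustration.

```python
from datetime import datetime, timedelta
from statistics import mean


def summarize_window(rows, center, half_width=timedelta(minutes=15)):
    """Condense metric rows near `center` into a short text summary.

    `rows` is an iterable of (timestamp, component, value) tuples,
    which is a hypothetical schema rather than the dataset's own.
    """
    start, end = center - half_width, center + half_width
    by_component = {}
    for ts, component, value in rows:
        if start <= ts <= end:  # keep only rows inside the window
            by_component.setdefault(component, []).append(value)
    lines = []
    for component in sorted(by_component):
        values = by_component[component]
        lines.append(
            f"{component}: n={len(values)} "
            f"mean={mean(values):.2f} max={max(values):.2f}"
        )
    return "\n".join(lines)


# Placeholder rows for illustration only.
rows = [
    (datetime(2021, 1, 1, 12, 0), "svc-a", 1.0),
    (datetime(2021, 1, 1, 12, 10), "svc-a", 3.0),
    (datetime(2021, 1, 1, 13, 0), "svc-a", 100.0),  # outside the window
    (datetime(2021, 1, 1, 12, 5), "svc-b", 2.0),
]
summary = summarize_window(rows, center=datetime(2021, 1, 1, 12, 5))
```

The summary string is a few lines long regardless of how many raw rows fall in the window, which is the property that keeps the LLM context small.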

Why It Matters

OpenRCA is the best fit in this group for evaluating how LLM agents investigate large volumes of telemetry. It is not designed for training a purely numeric graph time-series model.

Gotchas

  • The scoring is strict: an answer counts as correct only if it matches every root-cause element the query requires.
  • The README links telemetry through Google Drive and does not state a separate telemetry dataset license.
  • OpenRCA is diagnostic; it does not provide operator remediation actions.
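
The strict-scoring gotcha above amounts to an all-or-nothing comparison, which can be sketched as below. The function `strict_match` and the dictionary keys are hypothetical, and the repository's evaluation scripts may normalize values or tolerate differences that this sketch does not.

```python
def strict_match(predicted: dict, expected: dict) -> bool:
    """Return True only if every required element in `expected` is matched.

    A missing or wrong value for any required key fails the whole answer;
    there is no partial credit in this sketch.
    """
    return all(predicted.get(key) == value for key, value in expected.items())


# Placeholder values for illustration only.
expected = {"component": "example-service", "reason": "example reason"}

fully_correct = strict_match(
    {"component": "example-service", "reason": "example reason"}, expected
)
one_wrong = strict_match(
    {"component": "example-service", "reason": "other reason"}, expected
)
one_missing = strict_match({"component": "example-service"}, expected)
```

Under this rule an agent that finds the right component but the wrong reason scores the same as one that finds nothing, which is worth keeping in mind when reading reported accuracy.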