Context is Key

Summary

Context is Key (CiK) is a ServiceNow benchmark for probabilistic forecasting with essential natural-language context. It is the wiki’s main dataset anchor for testing whether a model can combine numeric time-series history with textual context that changes or constrains the forecast.

Official Artifacts

Dataset Shape

CiK contains 71 manually designed tasks, five deterministic instances per task, and 355 examples in the Hugging Face test split. The paper reports 2,644 underlying time series from seven domains, with observations ranging from 10-minute to monthly frequency.
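The counts above are internally consistent: 71 tasks times five instances gives 355 examples. A minimal sketch of that sanity check follows, assuming each record carries a task identifier under a hypothetical `task` key. The actual column names in the Hugging Face dataset may differ, and the records below are synthetic stand-ins, not CiK data.

```python
from collections import Counter

N_TASKS = 71             # manually designed tasks (from the text above)
INSTANCES_PER_TASK = 5   # deterministic instances per task


def check_shape(records, task_key="task"):
    """Return True if `records` match the described CiK shape.

    `task_key` is a hypothetical field name used for illustration only.
    """
    counts = Counter(r[task_key] for r in records)
    return (
        len(counts) == N_TASKS
        and all(c == INSTANCES_PER_TASK for c in counts.values())
    )


# Synthetic stand-in records, not real CiK data.
fake_records = [
    {"task": f"task_{t}"}
    for t in range(N_TASKS)
    for _ in range(INSTANCES_PER_TASK)
]
```

Running `check_shape` on a freshly loaded test split is a cheap way to catch a partial download or an accidental filter before computing any scores.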

Role In The Wiki

Use CiK when the question is whether a time-series model can use textual context as a first-class conditioning signal. It is especially relevant for text-conditioned time series, event-aware forecasting, context compression, and world-model interfaces that need domain knowledge or scenario information.
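One way to make "uses textual context" concrete is a context ablation: score the same forecaster with and without the text and compare. The sketch below is illustrative only. The `Instance` fields, the `Forecaster` signature, and `toy_model` are all hypothetical, and it uses point forecasts with mean absolute error for brevity, whereas CiK itself scores probabilistic forecasts with RCRPS.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Instance:
    # Hypothetical container; field names are not taken from the dataset.
    history: List[float]
    context: str
    target: List[float]


# (history, context or None, horizon) -> point forecast of length horizon
Forecaster = Callable[[List[float], Optional[str], int], List[float]]


def mean_abs_error(pred: List[float], target: List[float]) -> float:
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(target)


def context_gain(model: Forecaster, instances: List[Instance]) -> float:
    """Average error without context minus average error with context.

    A positive value means the text helped the model.
    """
    gains = []
    for inst in instances:
        horizon = len(inst.target)
        err_without = mean_abs_error(model(inst.history, None, horizon), inst.target)
        err_with = mean_abs_error(model(inst.history, inst.context, horizon), inst.target)
        gains.append(err_without - err_with)
    return sum(gains) / len(gains)


def toy_model(history, context, horizon):
    # Toy rule: repeat the last value unless the text announces a shutdown.
    if context is not None and "shut down" in context:
        return [0.0] * horizon
    return [history[-1]] * horizon


example = Instance(
    history=[5.0, 5.0, 5.0],
    context="The plant is shut down for the whole forecast window.",
    target=[0.0, 0.0],
)
```

A gain near zero across tasks suggests the model is ignoring the text, which is the failure mode the benchmark is designed to expose.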

CiK should not be used as evidence that a model handles action-conditioned world modeling. Some contexts describe future events, constraints, or causal relationships, but the benchmark does not expose a policy-controlled action channel.

Gotchas

  • Benchmark-first, not training-corpus-first: the official Hugging Face dataset has only evaluation splits.
  • Natural-language context is essential by design, so numeric-only forecasters are being tested on an intentionally incomplete interface.
  • Current CiK tasks are univariate, and their context is text-only.
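  • Results depend on the Region of Interest CRPS (RCRPS) metric and on how tasks are weighted, so CiK scores are not directly comparable with leaderboards that report other metrics or unweighted averages.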
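To illustrate why the metric and the task weighting both matter, here is a minimal sketch of the standard sample-based CRPS estimator together with a weighted aggregation across tasks. This is plain CRPS, not RCRPS: it omits whatever region-of-interest and constraint handling the paper defines. The weights and scores below are made-up values, not the benchmark's.

```python
def crps_from_samples(samples, y):
    """Sample-based CRPS estimate: E|X - y| - 0.5 * E|X - X'|."""
    n = len(samples)
    abs_err = sum(abs(x - y) for x in samples) / n
    spread = sum(abs(a - b) for a in samples for b in samples) / (2 * n * n)
    return abs_err - spread


def weighted_task_score(scores_by_task, weights):
    """Average instance scores within each task, then take a weighted mean.

    `weights` maps task name to a non-negative weight (hypothetical values).
    """
    total_weight = sum(weights[task] for task in scores_by_task)
    weighted_sum = sum(
        weights[task] * (sum(scores) / len(scores))
        for task, scores in scores_by_task.items()
    )
    return weighted_sum / total_weight


# Made-up per-instance scores for two tasks.
scores = {"task_a": [1.0, 3.0], "task_b": [10.0]}
equal = weighted_task_score(scores, {"task_a": 1.0, "task_b": 1.0})
skewed = weighted_task_score(scores, {"task_a": 3.0, "task_b": 1.0})
```

The same per-instance scores give different aggregates under the two weightings, which is why a CiK number should be read alongside the weighting scheme that produced it.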