FADE

Summary

FADE, or Forgetting through Adaptive Decay, is an online continual-learning method that adapts per-parameter weight decay rates with meta-gradients. It treats weight decay as a learnable memory horizon rather than only as regularization.

Interface

  • Memory unit: neural-network weight or final-layer parameter.
  • Forgetting control: per-parameter decay rate lambda_i.
  • Adaptation signal: online meta-gradient through the weight update.
  • Main use case in the paper: online non-stationary learning where some targets are stable and others change.

Role In The Wiki

FADE is the local object card for controlled parameter forgetting. It is useful because the wiki often treats forgetting as a retention failure, but continual agents also need to discard stale mappings. The durable distinction is selective forgetting versus destructive forgetting.

For time-series and operational world models, FADE is not direct evidence yet. It suggests a design principle: stable dynamics, rare safety-relevant knowledge, temporary incident facts, and changing policies may need different memory horizons.

Evidence