A Contextual-Bandit Approach to Personalized News Article Recommendation
Source
- Raw Markdown: paper_yahoo-contextual-bandit-2010.md
- PDF: paper_yahoo-contextual-bandit-2010.pdf
Core Claim
The paper on Yahoo! Front Page news recommendation uses logged traffic collected under a randomized article-selection policy to evaluate contextual bandit policies offline, with each candidate article treated as an action.
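To make the evaluation idea concrete, here is a minimal sketch of a replay-style estimator. It assumes the logged actions were chosen uniformly at random and that the policy under test is fixed (it does not learn during replay). The record layout, `replay_evaluate`, `greedy_policy`, and the toy log are hypothetical illustrations, not taken from the paper or from Yahoo! data.

```python
from typing import Callable, Iterable, Sequence, Tuple

# Hypothetical record layout: (context, logged_action, reward).
Event = Tuple[Sequence[float], int, float]


def replay_evaluate(policy: Callable[[Sequence[float]], int],
                    events: Iterable[Event]) -> Tuple[float, int]:
    """Estimate a fixed policy's average reward from randomized logs.

    An event counts only when the policy picks the same action that was
    logged; every other event is discarded. Assumes the logging policy
    chose actions uniformly at random.
    """
    total_reward = 0.0
    matched = 0
    for context, logged_action, reward in events:
        if policy(context) == logged_action:
            total_reward += reward
            matched += 1
    estimate = total_reward / matched if matched else 0.0
    return estimate, matched


def greedy_policy(context: Sequence[float]) -> int:
    """Toy policy (assumption): action 0 if the first feature dominates."""
    return 0 if context[0] > context[1] else 1


# Synthetic toy log for illustration only.
toy_log = [
    ([1.0, 0.0], 0, 1.0),
    ([1.0, 0.0], 1, 0.0),
    ([0.0, 1.0], 1, 1.0),
    ([0.0, 1.0], 0, 0.0),
    ([1.0, 0.0], 0, 0.0),
]

estimate, matched = replay_evaluate(greedy_policy, toy_log)
print(f"matched={matched} estimate={estimate:.3f}")
```

On this toy log, three of the five events match the policy's choice and two of those were clicked, so the estimate is 2/3. Discarding non-matching events is what makes the randomized logging essential: without it, the retained events would be biased toward whatever the logging policy preferred.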
Action-Time-Series Notes
- The data is a time-ordered log of recommendation events, each recording a context, the article shown (the action), and a click reward.
- It is valuable for action-response modeling but often lacks a rich next-state observation.
- For world-model comparison, it belongs in a weak-time-series / bandit category: each step carries a context, an action, and a reward, but no modeled state transition.
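The notes above can be sketched as a record type plus a simple action-response summary. This is an illustration under assumptions: the field names, `BanditEvent`, `ctr_by_action`, and the toy events are hypothetical and do not reflect the actual log schema. The record deliberately has no next-state field, mirroring the note that such logs often lack a rich next-state observation.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class BanditEvent:
    """One logged decision; field names are illustrative assumptions."""
    timestamp: int
    context: Tuple[float, ...]
    action: str   # identifier of the article shown
    reward: int   # 1 = click, 0 = no click
    # Intentionally no next_state field: a bandit log records the
    # response to an action, not a resulting state.


def ctr_by_action(events: List[BanditEvent]) -> Dict[str, float]:
    """Per-action click-through rate: a minimal action-response summary."""
    shows: Dict[str, int] = defaultdict(int)
    clicks: Dict[str, int] = defaultdict(int)
    for event in events:
        shows[event.action] += 1
        clicks[event.action] += event.reward
    return {action: clicks[action] / shows[action] for action in shows}


# Synthetic toy events for illustration only.
toy_events = [
    BanditEvent(1, (1.0, 0.0), "article_a", 1),
    BanditEvent(2, (0.0, 1.0), "article_b", 0),
    BanditEvent(3, (1.0, 0.0), "article_a", 0),
    BanditEvent(4, (0.0, 1.0), "article_b", 0),
]

ctr = ctr_by_action(toy_events)
print(ctr)
```

A full world-model dataset would typically store transitions with a next state as well; here each step is only a (context, action, reward) triple ordered by time, which is why the weak-time-series label fits.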