AI539: Introduction to Online Learning – Fall 2024

Course Description

In this course, we will study algorithms for online learning and sequential decision-making, including online convex optimization and bandits, examine their theoretical guarantees, and explore their applications to real-world machine learning problems such as information retrieval (advertising, recommender systems, web search, and ranking).

Course Information

Lectures

Tuesday & Thursday 2 - 3:50pm, KEC 1005

Instructor

Huazheng Wang
Email: huazheng.wang [at] oregonstate.edu
Office: KEC 3097
Office hours: Thursday 4 - 6pm (TBD)

Contact

We will use Canvas for slides and assignments, and Discord for communication. See the Canvas announcements for the link to the Discord channel.

Prerequisites

  • Familiarity with probability, statistics, linear algebra, calculus, and machine learning.

  • Python. We will use Python for programming assignments.

Schedule

Week     Date   Lecture                                                Readings  Notes
Week 1   9/26   Introduction to the course
Week 2   10/1   Review of linear algebra, statistics and optimization
         10/3   Online Gradient Descent
Week 3   10/8   Online-to-Batch Conversion                                       HW1 posted.
         10/10  Follow the Regularized Leader
Week 4   10/15  Online Learning with Expert Advice
         10/17  Stochastic Multi-Armed Bandits
Week 5   10/22  Regret Lower Bounds                                              HW1 due. HW2 posted.
         10/24  Multi-Armed Bandits and Linear Bandits
Week 6   10/29  Thompson Sampling
         10/31  Generalized Linear Model, Kernel, and Neural Bandits
Week 7   11/5   Bandit Convex Optimization                                       HW2 due. HW3 posted.
         11/7   Contextual Bandits
Week 8   11/12  Non-stationary Regret Minimization
         11/14  Online Learning to Rank
Week 9   11/19  Combinatorial Bandits Learning                                   HW3 due. HW4 posted.
         11/21  Vulnerability and Robustness of Online Learning
Week 10  11/26  Online Reinforcement Learning
         11/28  Thanksgiving
Week 11  12/3   Project presentations                                            HW4 due.
         12/5   Project presentations

Grading

  • Homework – (4*15%) 60%

  • Paper presentation – 10%

  • Final project – (proposal 5%, presentation 10%, report 15%) 30%

  • Total – 100%

Resources

Suggested readings:
A Modern Introduction to Online Learning by Francesco Orabona
Bandit Algorithms by Tor Lattimore and Csaba Szepesvári