AI539: Introduction to Online Learning – Fall 2024

Course Description

In this course, we will focus on algorithms for online learning and sequential decision-making, including online convex optimization and bandits. We will examine their theoretical guarantees and their applications to real-world machine learning problems such as information retrieval (advertising, recommender systems, web search, and ranking).

Course Information

Lectures

Tuesday & Thursday 2 - 3:50pm, KEC 1005

Instructor

Huazheng Wang
Email: huazheng.wang [at] oregonstate.edu
Office: KEC 3097
Office hours: Thursday 4 - 6pm (TBD)

Contact

We will use Canvas for slides and assignments, and Discord for communication. See the Canvas announcements for the link to the Discord channel.

Prerequisites

Familiarity with probability, statistics, linear algebra, calculus, and machine learning.
Python programming. We will use Python for the programming assignments.

Schedule

Week	Date	Lecture	Readings	Notes
Week 1	9/26	Introduction to the course
Week 2	10/1	Review of linear algebra, statistics and optimization
	10/3	Online Gradient Descent
Week 3	10/8	Online-to-Batch Conversion		HW1 posted.
	10/10	Follow the Regularized Leader
Week 4	10/15	Online Learning with Expert Advice
	10/17	Stochastic Multi-Armed Bandits
Week 5	10/22	Regret Lower Bounds		HW1 due. HW2 posted.
	10/24	Multi-Armed Bandits and Linear Bandits
Week 6	10/29	Thompson Sampling
	10/31	Generalized Linear Model, Kernel, and Neural Bandits
Week 7	11/5	Bandit Convex Optimization		HW2 due. HW3 posted.
	11/7	Contextual Bandits
Week 8	11/12	Non-stationary Regret Minimization
	11/14	Online Learning to Rank
Week 9	11/19	Combinatorial Bandit Learning		HW3 due. HW4 posted.
	11/21	Vulnerability and Robustness of Online Learning
Week 10	11/26	Online Reinforcement Learning
	11/28	Thanksgiving
Week 11	12/3	Project presentations		HW4 due.
	12/5	Project presentations

Grading

Homework – (4*15%) 60%
Paper presentation – 10%
Final project – (proposal 5%, presentation 10%, report 15%) 30%
Total – 100%

Resources

Suggested readings:
A Modern Introduction to Online Learning by Francesco Orabona
Bandit Algorithms by Tor Lattimore and Csaba Szepesvári