Recent Changes

Friday, April 15

  1. page Lecture Notes and references edited
    Lecture 23 (Wed April 13): X-armed bandits, Monte Carlo Tree Search
    Lecture Slides {MCTS-Zooming.pdf}{MTCSSlidesPartI.pdf}
    Figures in these slides were copied from Munos, 2014 (reference below).
    Main Reference
    1:08 pm
  2. 1:07 pm
  3. file LectureSlidesPartI.pdf (deleted) uploaded Deleted File
    1:06 pm
  4. file LectureSlidesPartI.pdf (deleted) uploaded
    1:06 pm

Thursday, April 14

  1. page Assignments edited
    ...
    Homework 3 {HW3.pdf}
    Due on Mar 30 at the beginning of class, or before class by email to ieor8100-01-spring2016@columbia.edu.
    Solutions {HW3sol.pdf}
    Homework 2 (with corrections and hints) {HW2.pdf}
    Due on Feb 24 at the beginning of class, or before class by email to ieor8100-01-spring2016@columbia.edu.
    1:19 pm
  2. file HW3sol.pdf uploaded
    1:19 pm
  3. page Lecture Notes and references edited
    ...
    Remi Munos. From Bandits to Monte-Carlo Tree Search: The Optimistic Principle Applied to Optimization and Planning. 2014. <hal-00747575v5> [ pdf {MCTSMunos.pdf} ]
    Other related references
    ...
    by Levente Kocsis, Csaba Szepesvári, ECML
    UCT algorithm: first optimistic (UCB-like) approach to Monte Carlo Tree Search
    Sylvain Gelly, Yizao Wang, Remi Munos, Olivier Teytaud. Modification of UCT with Patterns in Monte-Carlo Go. [Research Report] 2006.
    Application of Optimistic tree-search algorithm UCT to Computer Go
    Lecture 21-22 (Wed April 6, Mon April 11): UCRL2 algorithm.
    3:53 am
  4. page Lecture Notes and references edited
    ...
    Optimistic Linear Programming gives Logarithmic Regret for Irreducible MDPs
    Lecture 20 (Mon April 4): Exploration-exploitation in MDP/reinforcement learning.
    Lecture Notes {Lecture 20.pdf}
    Lecture 19 (Wed Mar 30): Bayesian bandit problem and Introduction to MDP.
    Lecture notes {Lecture 19.pdf}
    3:51 am
  5. file Lecture 20.pdf uploaded
    3:50 am
  6. page Lecture Notes and references edited
    ...
    Lecture Notes {Lecture 4.pdf}
    Lecture 3 (Wednesday Jan 27, 2016) : UCB algorithm, lower bounds
    Lecture notes part 1 {Lecture 3
    includes UCB algorithm and regret analysis
    Coming soon: lecture notes part 2 with lower bound proof
    References:
    Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. Finite-time analysis of the multiarmed bandit problem. Machine Learning, 47(2-3), 2002.
    Regret analysis for UCB algorithm
    These lecture notes from Robert Kleinberg's class at Cornell contain a proof of the problem-independent lower bound discussed in class.
    Lecture 2 (Monday Jan 25, 2016) : UCB algorithm, lower bounds
    Lecture Notes {Lecture 2.pdf} (Last edited on 1/31/2016)
    3:01 am
