Panel: The future of reinforcement learning

  • Published 23 Jan 2022
  • Speakers:
    Geoff Gordon, Partner Researcher, Microsoft Research Montreal
    Emma Brunskill, Associate Professor, Computer Science Department, Stanford University
    Craig Boutilier, Principal Scientist, Google
    Sham Kakade, Senior Principal Researcher, Microsoft Research NYC
    Joelle Pineau, Managing Director, Facebook AI Research; Associate Professor, McGill University
    Csaba Szepesvari, Team Lead, DeepMind; Professor, University of Alberta
    This panel brings together experts from industry and academia to discuss the question: what is the future of reinforcement learning? Reinforcement learning is currently an important research area in AI, and it has been central to the study of human and animal behavior since at least the mid-20th century. More recently, reinforcement learning research has been energized by a series of positive results, often based on deep models, in areas such as personalization and game-playing. However, a wide variety of open questions remain, both theoretical and practical. We’ll gather expert perspectives on which open questions are the most important, as well as where the likely answers might come from.
    Learn more about the 2021 Microsoft Research Summit: Aka.ms/researchsummit
  • Science & Technology

COMMENTS • 3

  • @bananabatsy3708 2 years ago +1

    Please make this unlisted. More people need to see this. Thanks!

  • @user-to9ub5xv7o 6 months ago

    Chapters Summary:
    1. Introduction and Panelist Introductions: (0:06 - 0:40)
    - Geoff Gordon introduces the panel on the future of reinforcement learning at the Microsoft Research Summit. Panelists are asked to introduce themselves and highlight an important trend in RL research.
    2. Emma Brunskill's Insights: (0:40 - 2:30)
    - Emma Brunskill, an Associate Professor at Stanford, discusses her work in AI systems, especially in reinforcement learning (RL). She emphasizes the increasing application of RL across various fields and the gap between academic research in RL and its practical applications.
    3. Csaba Szepesvari's Perspective: (2:30 - 4:10)
    - Csaba Szepesvari from the University of Alberta and DeepMind highlights the potential and challenges in making RL algorithms more robust and scalable. He expresses interest in applications and the need for further research, particularly in areas like partial observability.
    4. Sham Kakade's Viewpoint: (4:10 - 5:26)
    - Sham Kakade, a professor at the University of Washington and researcher at Microsoft, discusses his work in AI and machine learning, with a focus on RL and natural language processing. He finds the results in program synthesis and mathematical reasoning using planning algorithms to be particularly intriguing.
    5. Joelle Pineau's Contributions: (5:26 - 7:31)
    - Joelle Pineau from McGill University and Facebook AI Research Labs talks about her journey in RL research, with a focus on healthcare and robotics applications. She emphasizes the importance of reproducibility, scientific integrity, and considering broader criteria like fairness and safety in RL.
    6. Craig Boutilier's Remarks: (7:31 - 10:55)
    - Craig Boutilier from Google Research discusses his focus on user-centric recommender systems and the critical role of RL in understanding user preferences. He identifies gaps in RL, particularly in dealing with latent state, complex action spaces, and behavioral phenomena.
    7. Panel Discussion on Combinatorial Thinking in RL: (10:55 - 32:42)
    - Initial discussion on handling combinatorial challenges in RL, focusing on generalization and compositionality in action and state spaces (10:55 - 20:26).
    - Further exploration of theoretical aspects, such as the interplay between representations and dynamics, and the impact of RL's interactive nature on learning and generalization (20:26 - 32:42).
    8. Exploring Special vs. General-Purpose Architectures in RL: (32:42 - 43:56)
    - Introduction of the topic comparing special-purpose architectures versus general-purpose architectures in RL, with specific references to value iteration networks and transformers (32:42 - 33:59).
    - Discussion on the efficacy of different architectures in RL, the importance of rigorous experimentation, and the potential limitations and strengths of these architectures (33:59 - 43:56).
    9. Closing Remarks: (43:56 - 44:19)
    - Geoff Gordon thanks the panelists for their insights and contributions, suggesting the need for a follow-up discussion given the depth of topics covered.