Eindhoven Reinforcement Learning Seminar
  • 21
  • 24 960

Videos

Mark Tensen "Sculpting with Reinforcement Learning"
76 views, 2 years ago
marktension.nl onformative.com
Sasha Vezhnevets "OPtions as REsponses"
126 views, 2 years ago
Paper: dl.acm.org/doi/pdf/10.5555/3524938.3525840
Towards Interpretable Deep Reinforcement Learning
589 views, 2 years ago
Paul Weng paul.weng@sjtu.edu.cn A Survey on Interpretable Reinforcement Learning arxiv.org/abs/2112.13112 Differentiable logic machines arXiv.org/abs/2102.11529 Hierarchical rule induction arXiv.org/abs/2112.13418
An Inherently Interpretable Reward Free HRL Model for Discrete Environments
114 views, 2 years ago
Alexander Demin and Denis Ponomaryov alexandredemin@yandex.ru, ponom@iis.nsk.su Manuscript: arxiv.org/abs/2202.07414
Deceptive Diversity of Modern RL methods
169 views, 2 years ago
Join us in our attempt to de-obfuscate the field
Ben Eysenbach "Diversity is All you Need"
1.3K views, 3 years ago
Speaker: Ben Eysenbach, ben-eysenbach.github.io/ Eysenbach et al. "Diversity is all you need" arxiv.org/pdf/1802.06070.pdf Gupta, Eysenbach et al. "Unsupervised Meta-Learning for Reinforcement Learning" arxiv.org/pdf/1806.04640.pdf
Vadim Liventsev "Multi-task and Transfer Reinforcement Learning"
197 views, 3 years ago
Speaker: Vadim Liventsev, vadim.me Feel free to email questions to v.liventsev [at] tue.nl Board: view.ziteboard.com/shared/39102430732612 Schaul et al. "Universal Value Function Approximators" proceedings.mlr.press/v37/schaul15.pdf Rusu, Rabinowitz et al. "Progressive Neural Networks" arxiv.org/pdf/1606.04671.pdf Andrychowicz et al. "Hindsight Experience Replay" arxiv.org/pdf/1707.01495.pdf
Vadim Liventsev "Distributed computing with HOGWILD!"
93 views, 3 years ago
Speaker: Vadim Liventsev, vadim.me Feel free to email questions to v.liventsev [at] tue.nl Slides: drive.google.com/file/d/1mJq7Pv_-kPizkhabyvLFSqsbsF94_567/view?usp=sharing HOGWILD! Paper: arxiv.org/abs/1106.5730 Source for the story about HOGWILD! origins: ua-cam.com/video/c5T7600RLPc/v-deo.html
Max Ryabinin "Learning@Home"
168 views, 3 years ago
Tweet questions at Max at m_ryabinin or email them to mryabinin0 [at] gmail.com Slides: surfdrive.surf.nl/files/index.php/s/ExXDbvq4xH9IVc7/download Paper: arxiv.org/abs/2002.04013v3 Project website: learning-at-home.github.io/
Nathanaël Fijalkow "Distribution-based search for programming by example"
95 views, 3 years ago
Speaker: Nathanaël Fijalkow, nathanael-fijalkow.github.io/
Vadim Liventsev "Deep Genetic Programming to Improve Clinical Protocols"
103 views, 3 years ago
Speaker: Vadim Liventsev, vadim.me Feel free to email questions to v.liventsev [at] tue.nl Slides: surfdrive.surf.nl/files/index.php/s/945GmmuyoI8Ji4Z/download References: BF++: a language for general-purpose program synthesis: arxiv.org/abs/2101.09571 Neurogenetic Programming Framework for Explainable Reinforcement Learning: arxiv.org/abs/2102.04231 CodeBERT: A Pre-Trained Model for Programming ...
Zeyu Sun "TreeGen: A Tree-Based Transformer Architecture for Code Generation"
242 views, 3 years ago
Speaker: Zeyu Sun zysszy.github.io/ Paper: arxiv.org/abs/1911.09983
Błażej Osiński "Data-Driven Driving"
120 views, 3 years ago
Błażej Osiński "Model-based Reinforcement Learning for Atari"
460 views, 3 years ago
Wouter Kool "Attention, Learn to Solve Routing Problems!"
3.9K views, 3 years ago
Reza Nazari "Reinforcement Learning for Solving the Vehicle Routing Problem"
8K views, 3 years ago
Elias B. Khalil "Learning Combinatorial Optimization Algorithms over Graphs"
2.1K views, 3 years ago

COMMENTS

  • @phucnguyenthang4808, 8 months ago

    Thanks for your sharing❤

  • @magi-1, 11 months ago

    Shoulda discussed the model architecture in more detail because the 3 attention mechanisms used are non trivial. Would like to understand the reasoning behind the design.

  • @alihusseinwheeb835, 1 year ago

    Very good explanation

  • @InquilineKea, 1 year ago

    Does this work if you play different games and adopt policies learned on some games to other games?

  • @InquilineKea, 1 year ago

    Or novelty quotient (read on lesswrong) or kevin frans!

  • @marouaghamri4762, 1 year ago

    Hello. Could you please tell me how you integrate AI in simulation?

  • @ksymbol7404, 2 years ago

    young young young

  • @CharlesVanNoland, 2 years ago

    @16:18 The idea of maximizing future diversity reminds me of a paper from 2013 titled "Causal Entropic Forces" which talks about intelligence being a force which maximizes future entropy. www.alexwg.org/publications/PhysRevLett_110-168702.pdf

  • @dealcooper4988, 2 years ago

    Nice work, and thanks for sharing! I am curious how to input a specific problem's parameters, such as the depot and customer locations and demands, in the code. I can't find an appropriate file in which to write my problem's parameters.

  • @true-4ce, 2 years ago

    Thx a lot, great presentation!

  • @astaragmohapatra9, 3 years ago

    Can you provide the slides of this talk?

  • @come4pvp, 3 years ago

    Can we use a distance matrix instead of Cartesian coordinates and euclidean distance (that would be more similar to a real-world problem) in order to use your solution?

    • @pranavdave6973, 1 year ago

      did u find something for this? I am facing a similar dilemma

  • @ΝίκοςΚουδούνας-η7υ

    Hello and great work! Is the reward the tour length or the negative tour length?

  • @eindhovenreinforcementlear2884, 3 years ago

    The paper on airplane schedule optimization I am referring to at 42:00 is this: www.tandfonline.com/doi/full/10.1080/03081060.2017.1355887