Eindhoven Reinforcement Learning Seminar
Netherlands
Joined 9 Dec 2020
Videos
Mark Tensen "Sculpting with Reinforcement Learning"
76 views · 2 years ago
marktension.nl onformative.com
Sasha Vezhnevets "OPtions as REsponses"
126 views · 2 years ago
Paper: dl.acm.org/doi/pdf/10.5555/3524938.3525840
Towards Interpretable Deep Reinforcement Learning
589 views · 2 years ago
Paul Weng, paul.weng@sjtu.edu.cn
A Survey on Interpretable Reinforcement Learning: arxiv.org/abs/2112.13112
Differentiable logic machines: arXiv.org/abs/2102.11529
Hierarchical rule induction: arXiv.org/abs/2112.13418
An Inherently Interpretable Reward Free HRL Model for Discrete Environments
114 views · 2 years ago
Alexander Demin and Denis Ponomaryov, alexandredemin@yandex.ru, ponom@iis.nsk.su
Manuscript: arxiv.org/abs/2202.07414
Deceptive Diversity of Modern RL methods
169 views · 2 years ago
Join us in our attempt to de-obfuscate the field
Ben Eysenbach "Diversity is All you Need"
1.3K views · 3 years ago
Speaker: Ben Eysenbach, ben-eysenbach.github.io/
Eysenbach et al., "Diversity is All You Need": arxiv.org/pdf/1802.06070.pdf
Gupta, Eysenbach et al., "Unsupervised Meta-Learning for Reinforcement Learning": arxiv.org/pdf/1806.04640.pdf
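For readers unfamiliar with DIAYN, here is a minimal sketch of the skill-conditioned pseudo-reward the paper describes, r = log q(z|s) - log p(z), assuming a uniform prior over a discrete set of skills. The function and variable names are illustrative, not taken from the authors' code.

import numpy as np

def diayn_intrinsic_reward(discriminator_logits, skill_id, num_skills):
    """DIAYN pseudo-reward: log q(z|s) - log p(z), with a uniform skill prior."""
    logits = np.asarray(discriminator_logits, dtype=np.float64)
    # log-softmax over skills (numerically stable): log q(z|s) for every skill z
    log_q = logits - (logits.max() + np.log(np.exp(logits - logits.max()).sum()))
    log_p_z = -np.log(num_skills)  # uniform prior over the discrete skills
    return log_q[skill_id] - log_p_z

# Example: 8 skills; the discriminator is fairly confident the state was
# produced by skill 3, so that skill receives a high intrinsic reward.
logits = np.array([0.1, -0.2, 0.0, 2.5, 0.3, -1.0, 0.2, 0.1])
print(diayn_intrinsic_reward(logits, skill_id=3, num_skills=8))

The discriminator itself is trained jointly to predict the skill from visited states; the snippet only shows how its output turns into the learning signal.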
Vadim Liventsev "Multi-task and Transfer Reinforcement Learning"
197 views · 3 years ago
Speaker: Vadim Liventsev, vadim.me
Feel free to email questions to v.liventsev [at] tue.nl
Board: view.ziteboard.com/shared/39102430732612
Schaul et al., "Universal Value Function Approximators": proceedings.mlr.press/v37/schaul15.pdf
Rusu, Rabinowitz et al., "Progressive Neural Networks": arxiv.org/pdf/1606.04671.pdf
Andrychowicz et al., "Hindsight Experience Replay": arxiv.org/pdf/1707.01495.pdf
Vadim Liventsev "Distributed computing with HOGWILD!"
93 views · 3 years ago
Speaker: Vadim Liventsev, vadim.me
Feel free to email questions to v.liventsev [at] tue.nl
Slides: drive.google.com/file/d/1mJq7Pv_-kPizkhabyvLFSqsbsF94_567/view?usp=sharing
HOGWILD! paper: arxiv.org/abs/1106.5730
Source for the story about HOGWILD! origins: ua-cam.com/video/c5T7600RLPc/v-deo.html
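For context on the talk's topic: HOGWILD! runs SGD workers in parallel that update shared parameters without any locking. Below is a minimal toy sketch of that idea on a least-squares objective; the names and hyperparameters are illustrative, and this is not the paper's implementation.

import numpy as np
from multiprocessing import Array, Process

def hogwild_worker(shared_w, X, y, lr, steps, seed):
    # NumPy view onto the shared parameter buffer; updates are applied
    # without any lock, in the Hogwild spirit.
    w = np.frombuffer(shared_w, dtype=np.float64)
    rng = np.random.default_rng(seed)
    for _ in range(steps):
        i = rng.integers(len(X))
        grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x·w - y)^2
        w -= lr * grad                   # unsynchronized write to shared memory

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(1000, 5))
    true_w = np.arange(1.0, 6.0)
    y = X @ true_w
    shared_w = Array("d", 5, lock=False)  # shared parameters, no lock
    workers = [Process(target=hogwild_worker, args=(shared_w, X, y, 0.01, 5000, s))
               for s in range(4)]
    for p in workers:
        p.start()
    for p in workers:
        p.join()
    print(np.frombuffer(shared_w, dtype=np.float64))  # should be close to true_w

The paper's argument is that for sufficiently sparse problems these unsynchronized updates still converge at essentially the serial rate; the dense toy above only illustrates the mechanics.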
Max Ryabinin "Learning@Home"
168 views · 3 years ago
Tweet questions to Max at m_ryabinin or email them to mryabinin0 [at] gmail.com
Slides: surfdrive.surf.nl/files/index.php/s/ExXDbvq4xH9IVc7/download
Paper: arxiv.org/abs/2002.04013v3
Project website: learning-at-home.github.io/
Nathanaël Fijalkow "Distribution-based search for programming by example"
95 views · 3 years ago
Speaker: Nathanaël Fijalkow, nathanael-fijalkow.github.io/
Vadim Liventsev "Deep Genetic Programming to Improve Clinical Protocols"
103 views · 3 years ago
Speaker: Vadim Liventsev, vadim.me
Feel free to email questions to v.liventsev [at] tue.nl
Slides: surfdrive.surf.nl/files/index.php/s/945GmmuyoI8Ji4Z/download
References:
BF++: a language for general-purpose program synthesis: arxiv.org/abs/2101.09571
Neurogenetic Programming Framework for Explainable Reinforcement Learning: arxiv.org/abs/2102.04231
CodeBERT: A Pre-Trained Model for Programming ...
Zeyu Sun "TreeGen: A Tree-Based Transformer Architecture for Code Generation"
242 views · 3 years ago
Speaker: Zeyu Sun, zysszy.github.io/
Paper: arxiv.org/abs/1911.09983
Błażej Osiński "Model-based Reinforcement Learning for Atari"
460 views · 3 years ago
Wouter Kool "Attention, Learn to Solve Routing Problems!"
3.9K views · 3 years ago
Reza Nazari "Reinforcement Learning for Solving the Vehicle Routing Problem"
8K views · 3 years ago
Elias B. Khalil "Learning Combinatorial Optimization Algorithms over Graphs"
2.1K views · 3 years ago
Thanks for sharing ❤
The model architecture should have been discussed in more detail, because the three attention mechanisms used are non-trivial. I would like to understand the reasoning behind the design.
Very good explanation
Does this work if you play different games and adapt policies learned on some games to other games?
Or novelty quotient (read about it on LessWrong), or Kevin Frans!
Hello, could you please tell me how you could integrate AI into a simulation?
young young young
@16:18 The idea of maximizing future diversity reminds me of a paper from 2013 titled "Causal Entropic Forces" which talks about intelligence being a force which maximizes future entropy. www.alexwg.org/publications/PhysRevLett_110-168702.pdf
This, good sir, is why we keep the comments open. Thank you!
Nice work, and thanks for sharing! I am curious about how to input a specific problem's parameters, like the depot and customer locations and demands, in the code. I can't find an appropriate file in which to write my problem's specific parameters.
Thanks a lot, great presentation!
Can you provide the slides of this talk?
Can we use a distance matrix instead of Cartesian coordinates and Euclidean distances (that would be more similar to a real-world problem) in order to use your solution?
Did you find something for this? I am facing a similar dilemma.
Hello, and great work! Is the reward the tour length or the negative tour length?
negative tour length
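To expand on the answer above: a minimal sketch of a negative-tour-length reward, assuming 2D node coordinates and Euclidean distances; the function and variable names are illustrative, not taken from the paper's code.

import numpy as np

def tour_reward(coords, tour):
    """Reward for a completed route: the negative total Euclidean tour length."""
    route = np.asarray(coords)[list(tour)]
    # Leg lengths between consecutive nodes, including the closing leg
    # back to the starting node.
    legs = np.linalg.norm(np.roll(route, -1, axis=0) - route, axis=1)
    return -legs.sum()  # shorter tour => higher (less negative) reward

# Example: five random nodes visited in index order.
coords = np.random.rand(5, 2)
print(tour_reward(coords, [0, 1, 2, 3, 4]))

Maximizing this reward is the same as minimizing the tour length.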
The paper on airplane schedule optimization I am referring to at 42:00 is this: www.tandfonline.com/doi/full/10.1080/03081060.2017.1355887