AlphaZero

  • Published 26 Sep 2024

COMMENTS • 14

  • @connor-shorten
    @connor-shorten  4 years ago +2

    1:05 Overview, AlphaGo Zero Recap
    2:10 AZ vs. AGZ (Reward Scale)
    2:35 AZ vs. AGZ (Data Augmentation)
    3:35 AZ vs. AGZ (Self-Play Selection)
    4:22 AZ vs. AGZ (Hyperparameter Tuning)
    5:08 Go and Convolutions
    6:02 Challenges with Chess and Shogi
    7:33 State Representations for New Games
    8:58 MCTS vs. Alpha-Beta Search
    10:38 Handcrafted Chess Features
    11:17 Training Details
    12:10 Results
    13:17 Result with respect to Thinking Time
    13:39 Discovered Chess Opening Moves

  • @citiblocsMaster
    @citiblocsMaster 4 years ago +7

    This channel is so underrated. Keep up the good stuff

  • @typedeckeralt7628
    @typedeckeralt7628 3 years ago +1

    This was a really good explanation, thanks a lot. I hope you get more attention on the platform; you are underrated. Keep up the great work, my friend!

    • @connor-shorten
      @connor-shorten  3 years ago

      Thank you so much, I really appreciate that, really glad you found the video useful!

    • @typedeckeralt7628
      @typedeckeralt7628 3 years ago

      @@connor-shorten Also, can you explain which network (policy or value) they first used while generating self-play games, and which network was trained on them? If they used both, then how would they improve over time, given that they were being fed games based on their own predictions and thus wouldn't have found any new ideas? Also, if they generated self-play games based on MCTS, did they have a handcrafted evaluation function at the start to generate these games?

    • @typedeckeralt7628
      @typedeckeralt7628 3 years ago

      @@connor-shorten You mentioned at 4:08 that it generates games with the latest parameters. But if it trains on the games it generates using its own parameters, won't every game be the same? And wouldn't that mean the parameters would not change, since it always trains on games generated with the previous parameters and thus has no higher target to reach than the parameters it already has?
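
      A minimal sketch of the loop may help untangle this thread (here `new_game`, `net`, and `mcts_search` are hypothetical stand-ins, not DeepMind's code). The key is that MCTS acts as a policy improvement operator: the visit counts it returns are stronger than the raw network policy that guided it, so the network always has a higher target to chase. Because moves are sampled rather than taken greedily, self-play games with the same parameters are not all identical, and no handcrafted evaluation function is needed at the start, since MCTS can run on the (noisy) values of a randomly initialized network.

      ```python
      import numpy as np

      def self_play_game(new_game, net, mcts_search, n_sims=800, temperature=1.0):
          """Play one game; return (state, mcts_policy, outcome) training triples."""
          game, examples = new_game(), []
          while not game.is_over():
              # A single network with policy and value heads guides the search;
              # MCTS returns visit counts that are stronger than the raw policy.
              visit_counts = mcts_search(game, net, n_sims)
              pi = visit_counts ** (1.0 / temperature)
              pi = pi / pi.sum()                       # improved policy target
              examples.append((game.encode(), pi))
              # Sampling (not argmax) keeps self-play games diverse.
              game.play(np.random.choice(len(pi), p=pi))
          z = game.outcome()                           # +1 / 0 / -1
          return [(s, p, z) for (s, p) in examples]

      def training_iteration(new_game, net, mcts_search, games_per_iter=100):
          """Both heads train toward the MCTS outputs: pi for policy, z for value.
          Loss (as in the paper): (z - v)^2 - pi . log p + L2 regularization."""
          batch = []
          for _ in range(games_per_iter):
              batch.extend(self_play_game(new_game, net, mcts_search))
          net.update(batch)
      ```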

  • @gaieepo.jeffrey
    @gaieepo.jeffrey 1 year ago

    Awesome video. One detail I'd like to ask about: the paper states that "The board is oriented to the perspective of the current player", and I'm not clear how that affects the feature encoding for the model input. Take chess as an example. Normally I orient the board with black on the top rows and white on the bottom rows, so for the first move the first 6 layers (one per piece type) would be binary planes indicating the white pieces, with the subsequent 6 layers for black. When it comes to the second move, should I perform a vertical flip (a 180° rotation does not work because of the king/queen positions) and then make the first 6 feature planes the "new white" (old black, with flipped positions) and the next 6 the "new black"? If that is the case, how is the "color to move" encoding going to help with the model training process? Hopefully you can help me clarify this issue.
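
    One plausible reading of "oriented to the perspective of the current player" is exactly the flip-and-swap described above; a sketch of that convention (not a confirmed detail of DeepMind's encoder):

    ```python
    import numpy as np

    def encode_board(piece_planes, white_to_move):
        """Orient a (12, 8, 8) chess encoding to the side to move.

        Planes 0-5: white P, N, B, R, Q, K; planes 6-11: the same for black.
        One plausible convention, not DeepMind's exact encoder.
        """
        if white_to_move:
            return piece_planes
        # Black to move: flip the ranks (a vertical flip, NOT a 180° rotation,
        # so the king/queen files stay put) and swap the colour blocks so the
        # mover's pieces always occupy planes 0-5.
        flipped = piece_planes[:, ::-1, :]
        return np.concatenate([flipped[6:], flipped[:6]], axis=0)

    board = np.zeros((12, 8, 8), dtype=np.float32)
    board[0, 1, :] = 1.0                    # white pawns on their starting rank
    encoded = encode_board(board, white_to_move=False)
    assert encoded[6, 6, :].all()           # opponent's pawns, seen from black
    ```

    Under this convention every input "looks like" white to move, so for chess's colour-symmetric rules the colour plane is arguably redundant; in Go, though, the colour plane is genuinely needed because komi breaks the symmetry between the players.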

  • @simonstrandgaard5503
    @simonstrandgaard5503 4 years ago

    Great explanation.

  • @abelkidanemariam6485
    @abelkidanemariam6485 4 years ago +3

    Thanks very much... can you please help us with how to read the math part of the research paper? I found it very hard.

    • @connor-shorten
      @connor-shorten  4 years ago

      Could you be more specific as to which part of this paper you are having trouble with?

    • @abelkidanemariam6485
      @abelkidanemariam6485 4 years ago +1

      @@connor-shorten Thanks for your reply. I think the hard part of most papers is combining the math with coding and implementing it yourself, and it's harder when the authors don't include a GitHub repo. For example, take the dropout learning algorithm: it's pretty simple, maybe one line of code in Keras, but for someone trying to understand it from the paper it would be challenging. You can take any paper as an example; I think what most people would like to know is how to test even the simpler ones.

    • @connor-shorten
      @connor-shorten  4 years ago +1

      @@abelkidanemariam6485 Hey Abel, I would recommend using either TensorFlow or PyTorch to get better control over low-level implementations like custom layers such as dropout. I think it helps to understand the dimensions of the tensor variables that flow through neural networks, and to frame your thinking about layers and transformations as operations on those data structures. Additionally, I would recommend staying patient with your learning overall, and not being overwhelmed if you can't instantly understand complicated math. I was recently frustrated trying to understand the math behind Contrastive Predictive Coding, and then I watched an explanation of Word2Vec that made the CPC loss click for me. It's tough to say when you will achieve that breakthrough in understanding, but if you stay at it, it seems to work itself out. Good luck with your learning journey!
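
      Picking up the dropout example from the earlier comment: here is inverted dropout written out in plain NumPy, the "one line in Keras" spelled out by hand. A minimal sketch for study purposes, not the Keras implementation itself.

      ```python
      import numpy as np

      def dropout(x, rate=0.5, training=True, rng=None):
          """Inverted dropout: zero each activation with probability `rate`
          during training, and scale survivors by 1/(1 - rate) so the expected
          activation is unchanged; at inference, pass the input through as-is."""
          if not training or rate == 0.0:
              return x
          rng = rng or np.random.default_rng()
          mask = rng.random(x.shape) >= rate   # keep with probability 1 - rate
          return x * mask / (1.0 - rate)

      # Roughly half the entries are zeroed; the survivors are scaled to 2.0.
      print(dropout(np.ones((2, 4)), rate=0.5))
      ```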

  • @mim8312
    @mim8312 3 years ago

    I think too many people are focusing on the game playing, which I also follow, as if this were an ordinary player. Since I have significant knowledge, and since I believe that Hawking and Musk were right, I am really made anxious by the self-taught nature of this AI.
    This particular AI is not the worrisome thing, although it has obvious potential applications in military logistics, military strategy, etc. The really scary part is how fast this was developed after AlphaGo debuted.
    We are not creeping up on the goal of human-level intelligence. We are likely to shoot past that goal amazingly soon, without even realizing it, if things continue progressing as they have.
    The first AIs will be narrow and not very competent or threatening, even if they become "superhuman" in intelligence. They will also be harmless idiot savants at first.
    Upcoming Threat to Humanity.
    The scary thing is that computer speed (and thereby, probably, eventual AI intelligence) doubles about every year, and will likely double faster once super-intelligent AIs start designing chips, working with quantum computers as co-processors, etc. How fast will our AIs progress to levels where they become indispensable -- while their utility makes hopeless any attempt to regulate them or retroactively impose restrictions on beings smarter than their designers?
    At first, they may have only base functions, like the reptilian portion of our brain. However, when will they act like Nile crocodiles and react to any threat with aggression? Ever gone skinny dipping with Nile crocodiles?
    I fear that very soon, before we realize it, we will all be doing the equivalent of skinny dipping with Nile crocodiles, because of how fast AIs will develop by the time the children born today reach their teens or middle age. Like crocodiles raised by humans, AIs may like us for a while; I sure hope that lasts. As was said on Jeopardy long ago about a program that was probably not really an advanced AI: I, for one, welcome our future AI overlords.