Just clearing up one thing I left ambiguous: I was wrong to point out that DeepMind did use search (for training), since they obviously meant they did not use it at runtime. I should've at least mentioned that.
The reason: I had always misunderstood how AlphaZero worked (and hadn't read the AlphaZero paper). It does use MCTS at runtime in addition to the CNN, although not for deep searches; it's a bit more complicated than that.
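For the curious, here's the rough shape of AlphaZero-style inference as a toy sketch. The tiny game, the "network", and the PUCT constant below are made-up stand-ins, not anything from DeepMind's code; the point is only that the network's priors and a short MCTS work together at runtime.

```python
import math

# Toy stand-ins so this runs on its own; in AlphaZero these would be the
# real game rules and the trained network.
def legal_moves(state):                      # state: an int in [-3, 3]
    return [m for m in (-1, 1) if abs(state + m) <= 3]

def network(state):
    """Placeholder for the CNN: uniform move priors plus a crude value."""
    moves = legal_moves(state)
    return {m: 1.0 / len(moves) for m in moves}, state / 3.0

class Node:
    def __init__(self, state):
        self.state = state
        self.children = {}                   # move -> Node
        self.priors, self.value = network(state)
        self.visits = {m: 0 for m in self.priors}
        self.q = {m: 0.0 for m in self.priors}

def select(node, c=1.5):
    """PUCT: balance the network's prior against observed results."""
    total = sum(node.visits.values()) + 1
    return max(node.priors, key=lambda m: node.q[m]
               + c * node.priors[m] * math.sqrt(total) / (1 + node.visits[m]))

def simulate(node):
    """One pass: descend by PUCT, expand a leaf, back the value up.
    (The sign flip for alternating players is omitted to keep this short.)"""
    move = select(node)
    if move not in node.children:
        node.children[move] = Node(node.state + move)
        value = node.children[move].value
    else:
        value = simulate(node.children[move])
    node.visits[move] += 1
    node.q[move] += (value - node.q[move]) / node.visits[move]
    return value

root = Node(0)
for _ in range(100):                         # a shallow search, not a deep one
    simulate(root)
print("move picked at runtime:", max(root.visits, key=root.visits.get))
```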
Why in the world is this channel so underrated?
I genuinely thought this was some big YouTuber.
Nice video, bro!
The channel is still very young, no worries. Thanks for the kind words tho!
@7:55 What they mean is that it doesn't search at runtime. During training it uses Stockfish to find the best moves, and Stockfish does use a search tree. But once training is complete and you actually play against it, it can't search.
A more accurate title would be "grandmaster-level chess without 'runtime' search".
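Roughly, the split looks like this, at least as I understand the paper. Everything below (`encode`, the label function, the model) is a placeholder stub, not DeepMind's actual pipeline, and it assumes the python-chess package for move generation:

```python
import random
import chess  # python-chess: pip install chess

def encode(board, move):
    # Stub input encoding; the real paper uses a FEN-like tokenization.
    return board.fen() + " " + move.uci()

def stockfish_label(board, move):
    # TRAINING TIME: the search happens here -- Stockfish evaluates the move
    # (the paper uses win percentages). Random stub for illustration only.
    return random.random()

def model(encoded_input):
    # Stand-in for the trained transformer: one forward pass, no search.
    return random.random()

def training_example(board, move):
    return encode(board, move), stockfish_label(board, move)

def play_move(board):
    # RUNTIME: no search tree -- score every legal move with a single
    # network call each and pick the best.
    return max(board.legal_moves, key=lambda m: model(encode(board, m)))

print(play_move(chess.Board()))
```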
There's an interesting video, "ChatGPT rêve-t-il de cavaliers électriques ?" (roughly: "Does ChatGPT dream of electric knights?"), that provides some counterpoints. It cites a paper showing that a sequence model trained on Othello develops an internal representation of the board: "Emergent World Representations: Exploring a Sequence Model Trained on a Synthetic Task". And it points out that one of the issues may be how we format the questions: if you use PGN format and ask the model to fill in the next move, its level increases significantly.
I'm in the camp that thinks LLMs will never be as good as top-level engines at chess (other than cheating by just recreating/asking a top-level engine), but it's food for thought.
PS: Sorry that my recommended video is in French.
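To make the formatting point concrete, the same position can be asked two ways; this is only a sketch (example opening, model call omitted):

```python
# Style 1: PGN continuation -- the model just completes the game text.
# Style 2: a conversational question about the same position.
moves = ["e4", "e5", "Nf3", "Nc6", "Bb5"]   # just an example opening

def pgn_movetext(moves):
    out = []
    for i in range(0, len(moves), 2):
        out.append(f"{i // 2 + 1}. " + " ".join(moves[i:i + 2]))
    return " ".join(out)

pgn_prompt = pgn_movetext(moves) + " "      # -> '1. e4 e5 2. Nf3 Nc6 3. Bb5 '
chat_prompt = f"We played {', '.join(moves)}. What should Black play now?"

print(pgn_prompt)
print(chat_prompt)
```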
Thanks for the link; that video definitely goes more in depth than mine. Regarding Othello: I've quickly gone through the paper. They do seem to claim that, but I'd like to verify it, and throw other problems at it too. I'd imagine some problems, even if they look 2D, might be easier to conceptualize than others.
@@FutureIsAmazing569 Chess must be more complicated: Othello has a single rule that's hard to visualize (the flipping of the other player's pieces), while chess has a variety of pieces and an unintuitive notation system (it's never explicitly stated where a piece comes from). But I'd argue that, in a purely informal sense, they're the same class of problem. If a system can visualize Othello, then a more powerful version can visualize chess.
That, however, doesn't mean it can calculate deeply.
PS: I'm taking the paper at face value; it's possible they're being dishonest.
@@mnm1273 There's also the point that the paper is somewhat similar to the DeepMind chess paper, in that it trains a smaller, specialized transformer model, not an LLM. In general, I can totally accept the idea that even simple perceptrons are able to form 2D abstractions; that's really not a problem. The problem arises when you ask a general-purpose LLM to do it.
Very interesting! I actually talked with my brother about AI and poker just a week ago, so I'm looking forward to your next video :)
Very good quality videos!
Good video! I tried giving it a screenshot of my game and it had no idea what my moves had been. It tried suggesting moves that would move my pieces through my opponent's, and it also got the colors mixed up.
Great video! I've always been interested in how LLMs would do at chess, and this is a very insightful explanation of that. Thanks!
great video 👍
Shockingly underrated
I love that they can play the game even in a rudimentary way, because it means their capabilities are essentially the same as HAL-9000 from 2001: A Space Odyssey.
Well, HAL-9000 could both play chess and explain what was happening on the board at the same time. Not sure what level it was playing at, though. But its ability to reason was certainly way better than the current generation of LLMs.
LLMs are "next-step prediction machines" to me, so those predictions depend on training data that creates a database of decisions. It feels like overfitting, a --search engine--, to me. It finds patterns and knowledge in the data during the training phase. As a computer scientist I'm impressed by the current developments in LLMs, but I still think it's just a huge database with linear algebra on top of it -_- Real intelligence needs more abstract structures that emerge from those basic elements, and current models struggle to build such structures. Chain-of-thought is a very primitive one, but it is one of those abstract structures.
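To put the "linear algebra on top of a database" intuition in toy form (all sizes and weights below are made up, and mean-pooling is a crude stand-in for attention):

```python
import numpy as np

# Toy illustration of "next-step prediction": the model is just learned
# matrices; predicting the next token is linear algebra plus a softmax.
vocab = ["e4", "e5", "Nf3", "Nc6", "Bb5", "a6"]
rng = np.random.default_rng(0)
embed = rng.normal(size=(len(vocab), 8))     # token embeddings
W_out = rng.normal(size=(8, len(vocab)))     # output projection

def next_token(context_ids):
    h = embed[context_ids].mean(axis=0)      # crude stand-in for attention
    logits = h @ W_out
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return vocab[int(np.argmax(probs))]      # the most likely "next step"

print(next_token([0, 1, 2]))                 # context: e4 e5 Nf3
```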
@@djan0889 In our brains, abstract structures often emerge not from going back and forth, but from multiple neurons entering synchrony and firing at once. That is certainly not happening in the architectures we currently use. So yeah, I would tend to agree with you.
@@FutureIsAmazing569 Yes, I agree, but I didn't say it should go 'back and forth'; that's just a primitive version of one of the brain's features. Those structures emerged through evolution. Also, I think we don't need to copy the human brain. Silicon chips may develop brand-new cognitive skills through an evolutionary approach, by simulating lots of architectures and functions. The current trend is searching in vector space -_- And AGI won't need a lot of data to get started, unlike LLMs. We are on the wrong path, with false promises :/