This is a must-watch for anyone studying Gaussian processes, scaling laws, or parameter tuning. Greg's unconventional career path is the stuff of movies. Watch till the end - it beautifully ties together multiple lines of research.
😮😅😂
Great guest. I had to listen again to the year of graduation. Excellent way to begin with Greg's story - and to hear about his early conversations with Freedman. I wish I had more time to listen to the whole lecture. In any case, thanks to both of you.
Yes, I love Greg's story! We need more stories like his.
I really enjoyed listening to him talk about his story. He seems so down to earth.
I can't believe this has as few views as it does. This is really such quality content! Thank you for making these videos for us!
The viewers aren't the regular 14-year-old Mr. Beast fans. I reckon many are working professionals in relevant fields.
uhh I think maybe you forgot that the vast majority of people can't understand and have no interest in random matrices and neural networks; they're more "LET'S WATCH THIS TRAIN GO INTO A BIG HOLE WHOAAAA HAHAHHAHA"
First episode I've seen of the show and it was great. You managed technical complexity and audience understanding really well, so I managed to get something from it despite being out of my depth at some points. Thanks a lot for sharing!
THANK YOU, TWO AMAZING PEOPLE, FOR SHARING SUCH A GREAT LESSON!
Really fascinating stuff! Thanks so much for giving it exposure here. The conversational format was fun - hard to follow at times, but enjoyable.
I also fucking love math. Mathematics rules. I love hearing it so forcefully stated.
tears in my eyes, v inspirational
Your channel is so awesome. Please keep making these videos.
Amazing interview! Really appreciate it!
ChatGPT below:
"Let's connect these three concepts: the Law of Large Numbers, the Central Limit Theorem, and Random Matrix Theory.
Law of Large Numbers (LLN): This fundamental theorem in probability theory tells us that as the number of independent trials increases, the sample mean will converge to the population mean. It forms the backbone of statistics and probability, underpinning the concept of statistical convergence.
Central Limit Theorem (CLT): This theorem builds upon the LLN by stating that the sum of a large number of independent and identically distributed random variables will approximate a normal distribution, regardless of the shape of the original distribution (given they have a defined mean and variance). This is why the normal distribution appears so frequently in nature and why many statistical methods are based on the assumption of normality.
Random Matrix Theory (RMT): This field studies the properties of matrices whose entries are random variables. While it might initially seem like a separate domain, it connects to the LLN and CLT in the sense that it deals with high-dimensional analogues of these principles. For instance, the behavior of eigenvalues of large random matrices (as described by Wigner's semicircle law) is essentially a high-dimensional version of the CLT.
Here's one way to think about the connections:
The Law of Large Numbers is about the behavior of averages of random variables. It tells you what happens if you compute a certain quantity (the average) many times independently.
The Central Limit Theorem is about the distribution of those averages. It tells you that the averages themselves have a specific distribution (normal), regardless of the distribution of the underlying random variables (as long as certain conditions are met).
Random Matrix Theory can be seen as an extension of these concepts to the case where the quantities of interest are not just single numbers (like averages), but entire matrices. In this context, the analogue of an "average" might be something like an eigenvalue, and we are interested in the distribution of these eigenvalues."
- ChatGPT
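For anyone who wants to see these three results side by side, here is a minimal Python sketch (my own, not from the video; it assumes NumPy and Matplotlib are available, and the scalings are the standard textbook normalizations):

```python
import numpy as np
import matplotlib.pyplot as plt

rng = np.random.default_rng(0)

# LLN: the running mean of iid Uniform(0,1) samples converges to 0.5.
x = rng.uniform(0, 1, 10_000)
running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)

# CLT: standardized sums of n iid uniforms look standard normal.
n, trials = 1_000, 5_000
sums = rng.uniform(0, 1, (trials, n)).sum(axis=1)
z = (sums - n * 0.5) / np.sqrt(n / 12)   # Var(Uniform(0,1)) = 1/12

# RMT: eigenvalues of a large symmetric Gaussian matrix, with entries
# scaled to have variance 1/N, follow Wigner's semicircle law on [-2, 2].
N = 2_000
A = rng.normal(size=(N, N))
H = (A + A.T) / np.sqrt(2 * N)
eigs = np.linalg.eigvalsh(H)

fig, axes = plt.subplots(1, 3, figsize=(12, 3))
axes[0].plot(running_mean)
axes[0].set_title("LLN: running mean")
axes[1].hist(z, bins=60, density=True)
axes[1].set_title("CLT: standardized sums")
axes[2].hist(eigs, bins=60, density=True)
t = np.linspace(-2, 2, 200)
axes[2].plot(t, np.sqrt(4 - t**2) / (2 * np.pi))  # semicircle density
axes[2].set_title("RMT: Wigner semicircle")
plt.tight_layout()
plt.show()
```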
I guess the algorithm really wanted me to see this, fascinating topic though 😊
What do you have to assume about the probability measure of the variables for the Central Limit Theorem to work? Is it something like the probability densities needing to belong to Schwartz space? I'm under the impression that you need the Fourier transform in the proof. It's probably difficult to make the Fourier transform work rigorously with arbitrary probability measures?
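For what it's worth, the classical proof does use Fourier methods, but via characteristic functions, which are Fourier transforms of the probability measure itself rather than of a density, and those exist for every probability distribution; the iid CLT only needs finite mean and variance. A quick numerical check of this (my sketch, assuming NumPy and SciPy are installed) for a Bernoulli distribution, which has no density at all:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
p, n, trials = 0.3, 2_000, 20_000

# Standardized sums of iid Bernoulli(p) variables: no density, let alone
# a Schwartz-class one, yet the CLT still applies (finite variance suffices).
samples = rng.binomial(1, p, (trials, n))
z = (samples.sum(axis=1) - n * p) / np.sqrt(n * p * (1 - p))

# Maximum gap between the empirical CDF and the standard normal CDF.
grid = np.linspace(-3, 3, 13)
empirical = np.array([(z <= g).mean() for g in grid])
print(np.abs(empirical - stats.norm.cdf(grid)).max())  # small, on the order of 0.01
```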
I was okay (barely) up until the kernel regime, exposing my inexperience with neural nets. I think I'll go research some more NN fundamentals, then watch this video 10 more times. Probably need to exercise some math & code in that direction as well. Thanks GY+TN for trying to educate this old dog .. 🥝🐐
I once learned the central limit theorem as follows: no matter what the parents look like, the children will always be normal. Is this true?
Thanks for the new video, your channel is great! Are you also planning to add some content about Rene Descartes' philosophical writings?
I plan on focusing on math and science for the foreseeable future. I'm not opposed to doing more philosophical stuff later.
@@TimothyNguyen Would also recommend Al-Ghazali in that discussion of Descartes' work, as they thought about the same things.
@@TimothyNguyen what do you mean by "foreseeable future"?
@@asaaa978 The future based on extrapolation from today's settings (which might be inaccurate because some revolutionary addition occurs later that wasn't factored in).
Oh, I didn't realize that was specifically with regard to my comment about content for CC. Foreseeable future means I have enough interest in math/physics to keep that topic going exclusively for now. But now that I've had Sean Carroll on, it's easier for me to get philosophical with topics without being an outright philosophy topic.
Love the chat.. Where can we hear some of Greg's 'beats'? :D
soundcloud.com/officialzeta
Ha! That's pretty great. I studied math at Cal, and made electronic music too. Cool and interesting guy.
Came here after Timothy Nguyen's tweets about Greg Yang.
fwiw, you can avoid using limits if you express continuity as the quotient of the number of continuous elements divided by the total number of elements. If the total number is equal to the sum of the number of preserved state components, those added and those removed, then a canonically smooth curve will always have continuity greater than zero. A filtration will always have continuity equal to 1, and a maximally coarse operator will always have continuity zero.
If the total is not defined by content, it can also be parameterized by any structure which is at least 2nd-cryptomorphic to a presentation of a Taylor series.
Meandering in his story
Wow, he's a genius! His math skills surpass mine by far. At least let me listen to the music he produced. Can you tell me where I can find his music?
Thank you
Actual talk begins at 33:00
Just started watching it; I'm more of a fan of math than really knowledgeable. A question about the beginning, in the definition of the law of large numbers: if we drop the identical-distribution requirement, will the whole thing converge to the average of the individual expectations? Like averaging all the expectations?
You need some control over the distributions if they are not iid. Imagine each distribution being a delta distribution (a single outcome) on either 0 or 1 (i.e. a coin flip with a definite outcome). By interleaving arbitrarily long sequences of delta distribution at 0 or delta distribution at 1, you can make the running average fluctuate arbitrarily.
@@TimothyNguyen Ah, right! I see, thank you. And thanks for the videos. This was the second one I've watched, after the Grant Sanderson one. I enjoy the format very much!
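A tiny simulation makes the counterexample above concrete (my own sketch; the tripling block lengths are an arbitrary choice, any sufficiently fast growth works):

```python
import numpy as np

# Deterministic "coin flips" in alternating blocks of 0s and 1s, each block
# three times longer than the last. Independence holds trivially, but the
# distributions are not identical, and the running average oscillates between
# roughly 0.25 and 0.75 forever instead of converging.
blocks = []
value, length = 0, 1
while sum(len(b) for b in blocks) < 100_000:
    blocks.append(np.full(length, value))
    value, length = 1 - value, length * 3

x = np.concatenate(blocks)
running_mean = np.cumsum(x) / np.arange(1, len(x) + 1)
ends = np.cumsum([len(b) for b in blocks]) - 1
print(running_mean[ends])  # e.g. 0.00, 0.75, 0.23, 0.75, 0.25, ...
```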
Language and theory are a part of nature, not all of it. You'd better always keep that firmly in mind.
"I don't want to steal the show" - Timothy Nguyen, after he has interrupted 20 times to say obvious things about, or things not really related, or things he just say so that everybody shall understand how smart he is. Greg Yang has an interesting and non-standard approach to explaining LLN and CLT, Nguyen over and over tries to steer it into the intuition and proofs that he is used to. Would be way better to let Yang talk, and maybe Nguyen could actually get some to him new intuitions about these theorems.
Thanks for the feedback. The Cartesian Cafe is a discussion between two scientific peers where questions and clarifications happen in real time (as is necessary to be on the same page). If you prefer to hear Greg talk solo and would find that more helpful, you can watch his other talks on YouTube, which my conversation is not trying to replicate.
The video is good, but the handwriting makes the material difficult to follow.
I see you're interviewing one of those raving radicals.
All of your math could have been learned by doing computer graphics before setting foot on the Harvard campus. Literally all of it applies in high-end computer graphics. Everything from time and memory usage calculations to the distribution of vertices.
1:00:34
Ha, now at xAI.
You may know math but you don't know how to teach.
This lecture is amazing for other people too. Maybe if we watch this video again another day, we'll know how to approach the material.
he needs to say the word LIKE a bit more
Like, a hundred times more?