Lecture starts at 5:13
Hope they invite Jeff Hawkins of Numenta to one of these.
amazing content
A brilliant man who is nonetheless wrong about the brain and innate processing. He's right that it needn't work that way, but he's wrong that it doesn't work that way.
Do you have some pointers for me on that? It is quite an interesting topic, but whenever I (as a computer scientist) look into some studies, they say 'it's more genetic than we thought' (e.g. people's personalities).
@@martinschulze5399 There is a huge literature on the molecular mechanisms that wire up brains. The brain isn't all that different from a face in some ways. We all have faces, but identical twins have the same face, and that is 99% due to their genes. The brain is the same. The fact that brains wire up almost identically in all humans is a remarkable testimony to the information contained within our genomes.
@@edwardruthazer1849 Yeah, as a computer scientist with some experience in genomics, I was shocked by his comments on that. Worse yet, he made the most elementary mistake about evolution possible: he suggested that learned connections in the brain can somehow be passed on. There is no known biological mechanism for anything close to encoding the brain's trillions of connections into the genome. All that can be encoded is a bias toward a set of connections; if that bias is reproductively successful, it gets passed on. The only question is how specific that bias can get. Comparing the number of neurons in the brain and the variability of their connections with the size of the genome (3 billion nucleotides, about 6 gigabits) shows a hard limit on how much can be encoded genetically. This implies that there is a large learned component and that maybe the exact connections don't matter that much, but it is far from implying that all of these connections are learned.
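A rough back-of-the-envelope version of that capacity argument, as a minimal Python sketch (the neuron and synapse counts are the usual order-of-magnitude estimates, not figures from the talk):

```python
# Back-of-the-envelope: how many genome bits are available per connection?
# All figures are rough, commonly cited estimates.

genome_bases = 3e9            # ~3 billion nucleotides
bits_per_base = 2             # 4 possible bases -> 2 bits each
genome_bits = genome_bases * bits_per_base    # ~6e9 bits total

neurons = 8.6e10              # ~86 billion neurons
synapses_per_neuron = 1e4     # order-of-magnitude estimate
connections = neurons * synapses_per_neuron   # ~1e15 connections

print(f"genome capacity : {genome_bits:.1e} bits")
print(f"connections     : {connections:.1e}")
print(f"bits/connection : {genome_bits / connections:.1e}")
# -> on the order of 1e-5 bits per connection: the genome can only
#    specify broad wiring biases, not an explicit connection list.
```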
It is incredible that a presentation of this caliber on artificial intelligence, given by none other than one of its main creators, is of so little interest to people. I say this because of the small number of likes.
The quick answer is no.
The back propagation question probably only applies to certain parts of the brain, such as those involved in reasoning and thinking. The parts of the brain doing visual or audio processing are likely feed-forward, such as the neural signals coming down the auditory canal or the visual signals coming down the optic nerve. These have to be feed-forward because their primary job is to process inputs in real time. The issue in vision is whether certain sets of neurons in the visual cortex are 'predisposed' to fire given certain signals or whether this is "learned" behavior. In my opinion, the brain's neural networks likely have low-level networks that are predisposed to certain behaviors, while higher-level networks adapt through experience. The issue for computer science is that the neural networks used in AI are static, monolithic, fixed data structures that have no direct relationship to how the brain actually works. There is no monolithic network of neurons in the brain; instead there is a cloud of various networks of neurons, each representing features, that together represent the entirety of what is seen or heard. A more accurate way of looking at it is that the brain is "feature encoding" all the minute details of what it sees, as a biological form of digital signal processing. The end result is a 'mental image' of the real world, stitched together or reconstructed from all the groups of neural networks representing features of the real world. From that "cloud" of data, a separate part of the brain can then reason over the content of the neural cloud, where the "reasoning" is basically weighting taking place across MULTIPLE networks of data, not simply smashing all features down into a single output as computer neural networks do.
The difference is like this: a human brain sees a dog because it sees fur, it sees a shape low to the ground, it sees some legs, it sees a tail, it sees a certain shape of the head, etc. Now the REASONING part goes like this. If there is a shape that looks like a head with a snout and a certain shape of nostrils, plus something that looks like fur and something that looks like four legs, it is most likely a dog. That is the reasoning part. There are no labels involved. And it is intelligent because the reasoning can account for not seeing all four legs, or only seeing the snout and head, or even just seeing the back and the tail, and still make a reasonable assumption that it is a dog based on the features detected. That reasoning layer happens in a separate part of the brain from the pure image-feature-processing part, which simply collects all the visual data into a coherent mental image of the real world. Other animals don't have that higher-order reasoning side of the brain, or at least not as complex a one. To make something like this in computer neural networks, you would have to create feature encoders that take images and convert them into networks of feature data, with the only rule being that they are consistent and coherent: if the same calibration image is shown to such a program, it will always produce the same neural network output. THAT becomes the data used for inference and for training the "thinking" neural network, which uses back propagation and other techniques to "reason" about what those networks of neurons represent. And the key difference is this: the current way cannot answer "why" it is a dog, whereas the latter system would say because of x, y and z, meaning the parameters and weights are not hidden in a network; they are first-order functions in the "thinking" part of the brain that change based on input and preserve parameters, weights and values.
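A toy sketch of that "reasoning over detected features" idea in Python (the feature names and weights are invented for illustration, not taken from any real system):

```python
# Reason over separately detected features instead of a single opaque score.
# Feature names and weights are made up for the example.

def dog_evidence(features):
    """Accumulate evidence for 'dog' from whichever features were detected."""
    weights = {
        "fur": 0.25, "snout": 0.30, "four_legs": 0.20,
        "tail": 0.15, "low_body": 0.10,
    }
    reasons = [f for f in features if f in weights]
    score = sum(weights[f] for f in reasons)
    return score, reasons

# Partial view: no legs visible, yet the evidence still points to "dog",
# and the system can report *which* features drove the decision ("why").
score, reasons = dog_evidence({"snout", "fur", "tail"})
print(f"dog score: {score:.2f}, because: {reasons}")
```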
What you have described with "feature encoders" sounds very similar to convolutional maps in convolutional neural network models. If you have not already, you might enjoy learning about them.
@@MASQUALER0 Convolutions are a computer algorithm designed for processing pixels in computer images to produce certain effects, such as Gaussian blur. The problem with them is that they are expensive and do not work like the "feature encoders" I am describing, because they operate after the fact. Encoding in computer vision starts in the imaging hardware, such as the CCD or CMOS sensor, which often uses Bayer encoding to produce an R, G, B image. Computers have no way of knowing what an individual RGB pixel value represents in an image file, as it is just a set of byte values. So most neural networks have to do extra work to 'figure out' what each RGB value "represents" in an image, which often means using convolution layers, which are expensive. Whether you are using neural networks and machine learning or traditional computer programs, this same algorithm serves the same purpose regardless. Yes, you can visualize these convolution layers, but how you got there is the issue.
When I talk about "feature encoding" in the human brain, I am talking about the efficient process of converting light signals into networks (plural) of neural signals as part of the low-level process of vision found in most animals with eyes. This task is basically analogous to the analog-to-digital conversion done by computer imaging chips. The difference is that vision systems in biology output neural signals and therefore don't need a separate step to "convert" the data, which is what is happening in convolution layers. Because biological vision systems have evolved over millions of years, this process is very efficient, whereas convolution algorithms are very expensive. Not to mention that the "conversion" step and the "thinking" step are combined in a single model, which means any learned "features" are embedded in one model rather than existing as a network of separate models that are independent features unto themselves.
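For reference, a convolution layer is just a weighted sliding window over the pixels; a minimal NumPy sketch (random stand-in image, fixed 3x3 blur kernel) shows what a single convolutional feature map actually computes and where the cost comes from:

```python
import numpy as np

# A single 2-D convolution: slide a small kernel over the image and take
# weighted sums. This is all one convolutional feature map does.
def conv2d(image, kernel):
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(8, 8)        # stand-in grayscale image
blur = np.ones((3, 3)) / 9.0        # fixed smoothing ("DSP-style") kernel
print(conv2d(image, blur).shape)    # (6, 6) feature map
# Cost is roughly H * W * kh * kw multiplications per feature map, which is
# why stacks of these layers get expensive.
```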
@@willd1mindmind639 "Computers have no way of knowing what an individual RGB pixel value represents"
Why can't I say "brains have no way of knowing what one activation of a retinal neuron represents" (neurons have 3 opsin types for different wavelengths of light, approximately corresponding to RGB)? As for the rest of your argument, people have tried learning from raw image data before DSP (digital signal processing). All that happens is that the lower-level convolution layers learn to do DSP. My question to you is: why do you see signal processing as fundamentally different from feature processing? They both take an input signal and give you an output signal that contains most of the same information as the input but encoded differently, so that it is more useful for the next task. You can learn the parameters of your feature processors (e.g. convolutional parameters), or you can have fixed, predetermined ones like those used for DSP.
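To make that last point concrete, here is a minimal PyTorch sketch (assuming PyTorch is available; the Sobel filter is just a standard example of a fixed, DSP-style kernel) of the same "feature processor" slot with predetermined versus learnable parameters:

```python
import torch
import torch.nn as nn

# A fixed, DSP-style edge filter (Sobel) ...
sobel_x = torch.tensor([[-1., 0., 1.],
                        [-2., 0., 2.],
                        [-1., 0., 1.]]).view(1, 1, 3, 3)

fixed = nn.Conv2d(1, 1, kernel_size=3, bias=False)
with torch.no_grad():
    fixed.weight.copy_(sobel_x)          # predetermined parameters
fixed.weight.requires_grad_(False)       # never updated by training

# ... and the same operation with parameters left free to be learned.
learned = nn.Conv2d(1, 1, kernel_size=3, bias=False)

x = torch.randn(1, 1, 8, 8)              # stand-in image batch
print(fixed(x).shape, learned(x).shape)  # both produce the same kind of feature map
```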
@@brooklyna007 At base level, all computing is based on the movements of electrons and therefore there are only two discrete "types" in that architecture at a physical level. Biology is based on molecules which have a much more diverse range of "types" that can be discretely encoded physically. That is the main distinction here that I am calling out. And that physical discreteness in biology is what allows for intrinsic distinction between things on a fundamental level.
Everything in computing is a calculation based on mathematical models that have no true capability to distinguish types because there is no discreteness in the data itself on a fundamental level. Meaning all pixels in a file look the same because they are all the same numeric data type and there is nothing physically that distinguishes one from another. Same with a set of text tokens in a language model and so forth.
Therefore the difference in how biological neural networks operate over data is that those distributions are based on values that are already discrete. And those discrete values give every item encoded by biological vision a distinct and discrete signature automatically, based purely on the inherent properties of how biology works. No such thing exists in computing, and this is why you require labels or human-defined examples as the initial kernel to use in training. So in my example about pixels, to understand a specific color, humans have to give an example of that color value in pixels to use in training. Whereas "green" as a color is already going to be a discrete element in the brain, based on the molecules used to represent any color in the visual spectrum. So there is already a physically discrete distinction between that color and other colors, and that will automatically trigger network activation, learning and understanding without any pre-training or labels. A good example of this biological discreteness can be seen in the camouflage systems of certain animals such as octopus and cuttlefish.
@@willd1mindmind639 Your first sentence was a massive and false assumption. There are many-valued logic computers that have been researched. It is just that simulating multi-valued systems with binary logic is just as powerful. That has been proven mathematically and confirmed with experiments.
Then you mention physical discreteness as some driving factor to allow distinction. But digital computers are physically discrete as well.
"Everything in computing is a calculation based on mathematical models that have no true capability to distinguish types because there is no discreteness in the data itself on a fundamental level."
Please learn some statistics, information theory and harmonic analysis. This statement is just ignorant and bold.
"Meaning all pixels in a file look the same because they are all the same numeric data type and there is nothing physically that distinguishes one from another. "
Pixels have positions (relative position is what matters, i.e. their context) that distinguish them from other pixels. Pixels simply share structure but they are distinct.
"Same with a set of text tokens in a language model and so forth. "
Text components also have a context that distinguishes them from other seemingly identical text components.
"Therefore the difference between how neuron networks operate over data is that those distributions are based on already discrete values."
False. Not only were none of your premises true but this conclusion is directly provably false. Digital computers are discrete systems. That is literally what the word "Digital" means in the context of computing. It is discrete computing.
As for the innate answer (48:30), it's quite erroneous and in fact self-contradictory, since Hinton first says that nothing is innate and then explains how innate parameters are learned by evolution. He might be making the claim that he can learn the innate bits as part of his parameter search, since his search is much faster than what the brain is doing (so he can do evolutionary search + parameter tuning; see the sketch below), but that is not separating the innate and the learned bits. Language typology is quite a well-developed and reproducible field, and you can't be serious if you're going to claim that all of it is part of the general cognitive learning mechanism.
The rest of the talk is fascinating as are all the rest of Hinton's similar talks on this subject. The transpose redundancy is downright enigmatic.
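A toy sketch of that "evolutionary search + parameter tuning" framing in Python (the task, fitness function and learning rule are all invented for illustration): an outer loop evolves the innate starting point, and an inner loop does lifetime learning from it.

```python
import random

TARGET = 3.0  # the "environment": individuals must get a parameter near this value

def lifetime_learning(innate_w, steps=5, lr=0.3):
    """Inner loop: crude gradient-like tuning during an individual's lifetime."""
    w = innate_w
    for _ in range(steps):
        w -= lr * (w - TARGET)
    return w

def fitness(innate_w):
    """Fitness of an innate value is judged *after* lifetime learning."""
    return -abs(lifetime_learning(innate_w) - TARGET)

# Outer loop: evolution searches over innate starting values only.
population = [random.uniform(-10, 10) for _ in range(20)]
for generation in range(30):
    population.sort(key=fitness, reverse=True)
    parents = population[:5]
    population = [p + random.gauss(0, 0.5) for p in parents for _ in range(4)]

best = max(population, key=fitness)
print(f"best innate starting value after evolution: {best:.2f}")
# Evolution only has to find a good starting bias; learning finishes the job.
```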
I think his claim was more along the lines of, innate skills are by definition innate, but we are not explicitly told what is innate and what isn't innate.
In order to make that distinction, we must go through an evolutionary learning process.
An analogy: suppose we are given a set of Legos that forms a ship, along with an instruction manual containing an incomplete, unsorted set of pictures showing the building progression of the Lego set. Since the pictures are unsorted and incomplete, this operates like a "fog of war" in video games. The only thing we are given and know for certain is the "initial value": where to start.
Thus the only way to build the Lego set is to learn by trial and error: which pieces go together and which pieces don't.
By learning which pieces go together (validated by the fact that the Lego pieces start looking more and more like a ship), we start to validate the instruction manual and gain a better understanding of it.
@@evankim4096
I've been seriously thinking about this for about ten years or so. No disrespect intended, but the question is quite empirical; there is no way to explain why human languages are all so similar. It is as nonsensical as reaching an alien planet where everybody speaks different versions of Java and claiming, yes, they learn Java because Java is easiest to learn. It's ridiculous.
There are things that guide human cognition. Some of them might be at the neural level, but in most interesting problems, and especially in what is uniquely human, they are more likely to be found in modular structure or inter-modular connectivity, and I don't think Hinton would disagree with this point if the question were put to him.
On the point I referred to, he seems quite uncharacteristically (but quite in step with the times) disrespectful towards Chomsky's work and towards linguists in general. As a linguist who does NLP, and thus an admirer, I must say that he might very well be mistaken on this point. I mean, it could be possible to do both the work of evolution and that of the language-acquiring baby in a single search, but it isn't trivial to see whether that is the case, and in any case evolution doesn't seem to have found it that easy, since language only happened once.
BTW, just to elaborate on "innate": it is at least as innate as the facial structure and feature recognition module in the brain, which is quite well known and has been somewhat characterized by now.
@@veltzerdoron I agree, he did seem quite dismissive of linguists in general and of any possible insight they could contribute towards NLP models in computer science.
My knowledge of NLP is limited, but in general it's hard to accept points of view which are dismissive of the other side (maybe he just has a sarcastic personality). When he was asked which side to work on (computer science vs neuroscience), he seemed quite dismissive of other computer scientists in general when he indicated that they were working in the opposite direction from him (his direction being towards neuroscience).
I like the general direction taken on re-engineering backpropagation, and in particular the idea of feedback involved, but I think the solution is forced and unnatural and focuses too much on individual neurons. It would be much simpler to introduce the idea of a "chemical", which in computing terms would be a signal to rapidly change neural connection weightings, much like what happens when we get pleasure as "positive feedback" from our actions. Also, ultimately the neural networks need to be much more interconnected rather than layered. All IMO.
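A minimal sketch of that "chemical" idea as a reward-modulated (three-factor) Hebbian update in Python, where a global scalar reward gates the weight change instead of a backpropagated per-weight error (the sizes, rates and reward rule are all invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, size=(4, 4))          # connection weights
lr = 0.05

for trial in range(100):
    pre = rng.random(4)                       # presynaptic activity
    post = np.tanh(W @ pre)                   # postsynaptic activity
    reward = 1.0 if post.sum() > 1.0 else -0.1  # global "chemical" signal
    W += lr * reward * np.outer(post, pre)    # reward gates the Hebbian change

print(W.round(2))
```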
Totally agree with you, and I've been thinking the same. Do you know whether such approaches exist in the literature or somewhere?
But ANNs work even without dropout.
And thus most of them do not generalize well.
Hmm, I could not follow the explanation.
Too much text; it's easy to get lost. Hard to follow his presentation.