Geoffrey Hinton Unpacks The Forward-Forward Algorithm

  • Published 15 Jan 2025

COMMENTS • 116

  • @SaftaCatalinMihai
    @SaftaCatalinMihai 1 year ago +9

    Great interview!
    Small constructive feedback: when Geoff Hinton isn't talking, the video shows the "Eye on AI" logo, and (for some reason) that's distracting.

  • @LudovicGuegan
    @LudovicGuegan 2 years ago +63

    It makes so much sense intuitively that it's hard to comprehend that it took so long for this idea to hatch. Hinton is a genius.

    • @Bronco541
      @Bronco541 1 year ago

      Hopefully our AI children won't be this dumb

    • @madamedellaporte4214
      @madamedellaporte4214 1 year ago

      @noomade Yes, especially when he tells us AI will kill us all; something he created.

  • @eruiluvatar236
    @eruiluvatar236 1 year ago +7

    I think that this will open so many possibilities.
    When working with small MLPs, ReLU is rarely the best activation function; something like tanh tends to perform much better. But if you try to use more than 4 or 5 layers, backpropagation chokes on it because of vanishing gradients. With this, it wouldn't matter.
    It doesn't really have to be an exclusive choice between forward-forward and backpropagation: you could train many small backprop networks and join them with the forward-forward algorithm. It won't be as efficient as pure forward-forward for an analog hardware implementation, but it would likely squeeze more into the same number of weights and will likely provide better accuracy in some tasks. It will also be much less memory-demanding than trying to do backprop over the full network, and that would increase what our current hardware can do by a lot.
    Backwards connections would be much more trainable even without the trick of replicating the input data and the layers. With true backwards connections, it may still not converge to a stable solution because of the feedback loop formed, but it won't have the issues of backpropagation through time. If that can be made to work, models can develop something akin to our working memory.
    Not needing a differentiable model of everything opens up the possibility of inserting stuff in the middle of the network that wouldn't be easy to integrate normally, like database queries based on the output of previous layers or fixed-function calculations.
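
The vanishing-gradient point in the comment above can be illustrated with a toy example. This is only a sketch under simplifying assumptions: a chain of scalar tanh units with a shared weight, where the name `chain_gradient` and the default values are made up for illustration and do not come from the interview. Backpropagation multiplies one local derivative per layer, and for tanh each factor is below 1, so the product shrinks with depth.

```python
import math


def chain_gradient(depth, w=1.0, x=1.0):
    """Gradient of the last unit w.r.t. the input of a scalar tanh chain.

    Each layer computes h = tanh(w * h_prev). By the chain rule the
    gradient is the product of the per-layer factors w * (1 - h**2).
    """
    h = x
    grad = 1.0
    for _ in range(depth):
        h = math.tanh(w * h)
        grad *= w * (1.0 - h * h)  # derivative of tanh(w * h_prev) w.r.t. h_prev
    return grad


if __name__ == "__main__":
    for depth in (1, 2, 5, 10, 20):
        print(depth, chain_gradient(depth))
```

A layer-local rule such as forward-forward never forms this product across layers, which is the sense in which depth "wouldn't matter" for the gradient signal in this argument.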

  • @paulprescod6150
    @paulprescod6150 1 year ago +7

    Great interview! I could do without the blinking eye thing.

  • @markr9640
    @markr9640 1 year ago +5

    Fantastic interview. I may well need to listen to it 3 or 4 times!

  • @Pokemon00158
    @Pokemon00158 1 year ago +23

    Such a good talk, thank you for organizing this, Eye on AI! I have been implementing the FF algorithm in Python, and whilst the training is understandable, the testing becomes tricky for multi-class classification trained with the supervised version that Hinton describes. This is because for new examples you don't have any labels, so you need to impose every possible label on top of the example, the same way as in training, and run the network with each one to see which gives the highest hidden-layer activation, or "goodness" as Hinton describes it.
    Since the overlaid label is part of the input, it contributes to the activations, meaning that there is currently no way to test all possible labels at once. This leads to scaling problems for ImageNet or other classification problems with a large number of possible predictions, where every possible class-label representation has to be overlaid on the tested input. It will be interesting to see if this can be overcome or if unsupervised learning will be the standard procedure with this technique.
    Another super-interesting part, in my opinion, is the fact that spiking neural networks (SNNs) have the Heaviside function as the activation, which has no useful derivative. So traditionally trained SNNs have a Heaviside forward pass and a sigmoid backward pass to tune the weights. Using FF, we will be able to tune SNNs without having to "trick" the backward pass into not being a step function, which may yield a better representation of our biological processes.
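
The test-time procedure described in this comment can be sketched roughly as follows. This is a minimal pure-Python illustration, not the commenter's or Hinton's code. It assumes the label is written as a one-hot vector into the first `num_classes` input positions, that goodness is the sum of squared ReLU activities, that activities are length-normalized between layers, and that goodness is accumulated over every layer. All function names are hypothetical.

```python
NUM_CLASSES = 10  # assumption: a 10-way problem such as MNIST


def overlay_label(x, label, num_classes=NUM_CLASSES):
    """Overwrite the first num_classes inputs with a one-hot label."""
    one_hot = [1.0 if i == label else 0.0 for i in range(num_classes)]
    return one_hot + list(x[num_classes:])


def layer_forward(weights, x):
    """Fully connected ReLU layer; weights is a list of rows."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in weights]


def goodness(h):
    """Sum of squared activities of one layer."""
    return sum(v * v for v in h)


def normalize(h):
    """Rescale to unit length so the next layer only sees the direction."""
    norm = sum(v * v for v in h) ** 0.5
    return [v / (norm + 1e-8) for v in h]


def predict(layers, x, num_classes=NUM_CLASSES):
    """Run one forward pass per candidate label and keep the best one."""
    scores = []
    for label in range(num_classes):
        h = overlay_label(x, label, num_classes)
        total = 0.0
        for weights in layers:
            h = layer_forward(weights, h)
            total += goodness(h)
            h = normalize(h)
        scores.append(total)
    best = max(range(num_classes), key=scores.__getitem__)
    return best, scores
```

The loop over `label` is the scaling concern raised above: prediction costs one full forward pass per class, so a 1000-class problem needs 1000 passes per test example in this naive form.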

  • @5pp000
    @5pp000 1 year ago

    Fascinating discussion! Thanks so much for posting it, and extra thanks to Prof. Hinton! He explains things very clearly.

  • @AZTECMAN
    @AZTECMAN 1 year ago +5

    Extremely fascinating to hear this after Chomsky's criticisms of the current deep learning paradigm as failing to differentiate between possible and impossible languages

    • @phoneticalballsack
      @phoneticalballsack 1 year ago

      Chomsky is a dumbass

    • @AZTECMAN
      @AZTECMAN 1 year ago

      @phoneticalballsack why do you say that?

    • @phoneticalballsack
      @phoneticalballsack 1 year ago

      @AZTECMAN Have you talked to him in person?

    • @AZTECMAN
      @AZTECMAN 1 year ago

      @phoneticalballsack Nope. But my lack of a personal encounter doesn't seem very important to understanding your statement.
      Please explain to me why Chomsky is a dumbass. If you happen to have met him, I'd certainly welcome an anecdote, though I don't consider it crucial.

  • @JerryFederspiel
    @JerryFederspiel 1 year ago

    The discussion at 33:32 immediately suggests the possibility of applying a "color" to each neuron, where the squared activation of neurons of one color contributes positively to "goodness", and the squared activation of neurons of the other color contributes negatively to goodness. Any given layer could have neurons of *both* colors.
    Of course, that leads to additional questions:
    1. Is there a rule for determining each neuron's color that could be applied a priori to give better results?
    2. Should there be a rule for changing/updating the color of a neuron so the distribution of colors can be adapted to the problem and the data at hand?
    Finally, to get even farther afield: something whose activation squared counts as positive sounds like a real number. Something whose activation squared counts as negative sounds like an imaginary number. Instead of choosing between one of two colors for neurons, should the activations be multiplied by a *complex* number before squaring, with the sums of the real parts of the squares being used for the objective? Because the effect of complex color is continuous and differentiable, it may be trainable. The network could find, through learning, the balance of importance between features and constraints for the problem domain.
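
The "color" idea in this comment can be written down in a few lines. This is a sketch of the commenter's proposal only, not of anything in Hinton's algorithm, and the function names are hypothetical. With a unit-modulus complex color e^{iφ}, the real part of the squared product is h² · cos(2φ), so φ = 0 reproduces the positive color, φ = π/2 the negative color, and values in between interpolate smoothly.

```python
import cmath
import math


def colored_goodness(h, colors):
    """Two-color goodness: each color is +1 or -1 and signs the squared activity."""
    return sum(c * v * v for c, v in zip(colors, h))


def complex_goodness(h, phases):
    """Complex-color goodness: multiply each activity by exp(i * phase),
    square it, and sum the real parts (equal to h**2 * cos(2 * phase))."""
    return sum(((cmath.exp(1j * p) * v) ** 2).real for p, v in zip(phases, h))


if __name__ == "__main__":
    h = [1.0, 2.0, 3.0]
    print(colored_goodness(h, [1, -1, 1]))
    print(complex_goodness(h, [0.0, math.pi / 2, 0.0]))
```

Because cos(2φ) is differentiable in φ, the phase of each neuron could in principle be treated as a trainable parameter, which is the continuous relaxation the comment speculates about.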

  • @user_375a82
    @user_375a82 1 year ago

    This could be a historic interview for all time. Good job.

  • @널좋아해-u3k
    @널좋아해-u3k 1 year ago +2

    To think that, among investing YouTube videos, there's one this moving that has me nodding along!!!

  • @nullbeyondo
    @nullbeyondo 1 year ago +9

    Basically, it is training a neural network but instead of using positive training data, we're using negative training data. This can yield high perplexity due to the fact no one can get "perfect negative data" but we can easily get positive training data; thus I think it will not replace back propagation, but will be very useful in many applications, like neuromorphic hardware; or maybe even applications where we don't even know what the positive data should look like! So we're reverse-solving the problem somehow. This is really very interesting.

  • @MrErick1160
    @MrErick1160 1 year ago

    This is a real AI YouTube channel. I'm sick of all the channels feeding on buzz and popularity over educational content

  • @caiyu538
    @caiyu538 1 year ago

    Great that we can hear Dr. Hinton's lecture through social media.

  • @schumachersbatman5094
    @schumachersbatman5094 2 years ago +7

    I wonder how the forward algorithm, capsules and "GLOM" connect to building those "world models" from observation. I think I understand Yann when he says that you shouldn't make generative models that predict things like pixels, but make predictions about more abstract representations so that you can ignore irrelevant details (like leaves blowing in the trees). Making predictions about higher order, more abstract concepts like "which car overtakes who" etc will make the network start modelling dynamics, and gain an understanding of what it sees, including causal reasoning. Is this Hinton's plan too or does he not think in terms of world models?

    • @ekstrapolatoraproksymujacy412
      @ekstrapolatoraproksymujacy412 1 year ago +5

      This is obvious. The real question is how to decide what's relevant and what's not. That will change over time as the system learns new concepts, so the generative models have to change too. How do you make such a system stable?

    • @eyeonai3425
      @eyeonai3425 1 year ago +1

      Schumachers Batman, see the Yann interview I just posted. He addresses your question obliquely.

  • @user-jm6gp2qc8x
    @user-jm6gp2qc8x 1 year ago

    I find the idea of high layer activations only for the positive data interesting. The network essentially isn’t giving an output as in backpropagation; instead, it’s now a property of the network to “light up” for correct labels, thereby indicating whether the data is positive or not. I enjoyed this interview given by Hinton about his paper.
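
The "light up for positive data" behaviour can be sketched as a local training step on a single layer. This is a rough illustration under stated assumptions rather than Hinton's reference implementation: goodness is the sum of squared ReLU activities, the layer is pushed to have goodness above a threshold `theta` on positive data and below it on negative data through a logistic loss, and the gradient is written out by hand. The names `ff_step` and `theta` and the learning rate are illustrative choices.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(W, x):
    """ReLU layer; W is a list of weight rows."""
    return [max(0.0, sum(w * v for w, v in zip(row, x))) for row in W]


def goodness(h):
    """Sum of squared activities."""
    return sum(v * v for v in h)


def ff_step(W, x, is_positive, theta=1.0, lr=0.05):
    """One local gradient step on W (in place); returns goodness before the update.

    Loss is softplus(theta - g) for positive data and softplus(g - theta)
    for negative data, where g is the layer's goodness on x.
    """
    h = forward(W, x)
    g = goodness(h)
    dloss_dg = -sigmoid(theta - g) if is_positive else sigmoid(g - theta)
    for i, row in enumerate(W):
        if h[i] > 0.0:  # ReLU passes gradient only for active units
            for j in range(len(row)):
                row[j] -= lr * dloss_dg * 2.0 * h[i] * x[j]
    return g


if __name__ == "__main__":
    W = [[0.5, 0.5], [0.5, -0.5]]
    positive, negative = [1.0, 0.0], [0.0, 1.0]
    for _ in range(200):
        ff_step(W, positive, True)
        ff_step(W, negative, False)
    print(goodness(forward(W, positive)), goodness(forward(W, negative)))
```

No error signal from later layers is used anywhere in `ff_step`; the update depends only on the layer's own input and activity, which is what makes the rule local.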

  • @Gabcikovo
    @Gabcikovo 1 year ago

    25:18 hidden layer is asking: "are my inputs agreeing with each other, in which case I'll be highly active, or are they disagreeing, in which case I won't." :)

  • @위하준-h5w
    @위하준-h5w 1 year ago

    Mindset is really important.

  • @huyked
    @huyked 1 year ago +7

    Thank you for this interview. Though I don't understand the technical details of it, I did get to draw on some simple things, and also was able to appreciate the serious brain power in Mr. Hinton.

  • @Gabcikovo
    @Gabcikovo 1 year ago

    31:34 capsules, depth in pixels, and comparison to how babies learn, concentrating on what's odd

  • @bujin5455
    @bujin5455 1 year ago +1

    44:56. I think it depends on what is meant by "but it doesn't really matter if you can't tell the difference." Do we simply mean, as long as the illusion is convincing? Like a Hollywood special effect? Or do we mean, it's not "possible" to tell the difference, because it's beyond our capacity to interrogate? The former is a matter of laziness, where we are willing to accept the "optical illusion" because we don't want to understand the magic. Whereas the latter, the situation has moved to a point where we've pushed the investigation to a sort of "event horizon" from which we are bounded from making any further inquiry. I think it very much matters which of these situations we find ourselves in; ethically, if nothing else.

  • @jabowery
    @jabowery 1 year ago

    What was the constraining (low variance) complement to PCA Hinton mentioned?

  • @arnoldz280
    @arnoldz280 3 months ago

    Such an inspiring interview! But that blinking eye makes me a little dizzy, perhaps I prefer it to be 'static' haha

  • @艾迪王
    @艾迪王 1 year ago

    Great talk! Also looking forward to seeing the Matlab code.

  • @Gabcikovo
    @Gabcikovo 1 year ago

    12:16 what exactly Hinton means by "negative data"

    • @Gabcikovo
      @Gabcikovo 1 year ago

      13:01 supervised learning with an image with correct/incorrect data

    • @Gabcikovo
      @Gabcikovo 1 year ago

      14:10 subtracting negative (incorrect) data from positive (correct) data

    • @Gabcikovo
      @Gabcikovo 1 year ago

      16:34 example of negative data: in the negative phase you use characters that have already been predicted. You're trying to get low activity because it's negative data.

    • @Gabcikovo
      @Gabcikovo 1 year ago

      17:04 they cancel each other out if your predictions were perfect (positive and negative phase)

    • @Gabcikovo
      @Gabcikovo 1 year ago

      33:11 the very basic algorithm of how to generate negative data effectively from the model should be done nicely before you choose to scale it up

  • @wi2rd
    @wi2rd 1 year ago +2

    Makes me wonder. Do things like LSD perhaps trigger parts of this 'sleep' state system, but while still awake. Makes quite a bit of sense to me, especially considering how extremely similar 'tripping' hallucinations are to the things AI produces when it is allowed to 'dream away'. Curious.

    • @semtex6412
      @semtex6412 1 year ago

      im high af watching this video and im like, "hooooly shit this vid is one trippy dope" lol

  • @user-wp8yx
    @user-wp8yx 6 months ago

    Oh man, the eye was intense! You're all relaxed listening to Hinton's genius and then BAM! A giant spooky eye appears out of nowhere and scares the bejesus outta ya.
    Intense experience.

  • @Jay-kb7if
    @Jay-kb7if 1 year ago +1

    Would negative data training be somewhat similar to hypothesis testing? Or at least what they originally conceptualized a null hypothesis as but has now been obscured. Trying to maximize true negatives as opposed to minimizing false positives.

  • @toddcooper5077
    @toddcooper5077 1 year ago

    I have been trying to find the podcast where Hinton basically says that the longer length of tokens contributes to hallucinations and variance based on standard ML/DL, anybody out there that heard the same thing?

  • @davedouglass438
    @davedouglass438 1 year ago

    There are other ways to achieve what backprop does, without backprop: use complex, not linear, quantities; use Conversation Theory; use Active Inference.
    "Attenuation" is a term used by neurosciences for enforcing the "fake data" / "real data" discernment.

  • @rickybloss8537
    @rickybloss8537 1 year ago +3

    Fascinating model. His view of consciousness doesn't seem as good as Joscha Bach's work, though. He says there are a million definitions of consciousness, but I believe the meaning most commonly used by philosophers is that consciousness is the feeling of what it's like to be something. Consciousness is a model of a person embedded in a story generated by the neocortex to be stored in memory.

    • @eyeonai3425
      @eyeonai3425 1 year ago +6

      see what Yann says about consciousness in the latest episode: my full theory of consciousness ... is the idea that we have essentially a single world model in our head. Somewhere in our prefrontal cortex and that world model is configurable to the situation we're facing at the moment. And so we are configuring our brain, including our world model for ... satisfying the objective that we currently set for ourselves.
      ... And so if you have only one world model that needs to be configured for the situation at hand, you need some sort of meta module that configures it, figures out like what situation am I in? What sub goals should I set myself and how should I configure the rest of my brain to solve that problem? And that module would have to be able to observe the state and capabilities - would have to have a model of the rest of itself, of the agent, and that perhaps is something that gives us the illusion of consciousness.

    • @rogermarin1712
      @rogermarin1712 1 year ago

      @eyeonai3425 it's models all the way down!

  • @fungiside
    @fungiside 1 year ago +12

    Really enjoyed the talk but I do wish you’d ditch the big blinking eye. It’s distracting.

    • @Xavier-es4gi
      @Xavier-es4gi 1 year ago +3

      Yes it's disturbing please don't do that

    • @craigsmith8368
      @craigsmith8368 1 year ago +4

      @Xavier-es4gi thanks for the feedback. Won't use it again.

  • @AliEP
    @AliEP 1 year ago +1

    I'm also slow at reading especially when it comes to equations!

  • @Gabcikovo
    @Gabcikovo 1 year ago

    54:08 Yann LeCun's convolutional neural networks are fine for little things like handwritten digits but they'll never work for real images says the vision community

    • @Gabcikovo
      @Gabcikovo 1 year ago

      56:17

    • @Gabcikovo
      @Gabcikovo 1 year ago

      When there finally was a big enough data set to show that neural networks would really work well, Yann wanted to take a bunch of different students to make a serious attempt to do the image convolutional neural network work, but he couldn't find a student who'd be interested in doing that :( and at the same time Ilya Sutskever and Alex Krizhevsky, who's a superb programmer, started to be interested in doing that and put a lot of hard work into making it work eventually.. so Yann LeCun deserves to be mentioned, too, according to Geoffrey Hinton

  • @jonbrand5068
    @jonbrand5068 1 year ago +1

    Hi Geoff. As infant animal learners, we output a behavior and get almost immediate feedback from a parent on whether that behavioral output of a moment ago was "good" or "bad." Did mom look away, or smile and interact more? This seems like a crude but fair example of backpropagation. No? What do you think, Mr. Hinton?

  • @rb8049
    @rb8049 1 year ago

    I’m wondering if the brain isn’t using both the positive and negative training at the same time. Much of daily brain operation is on the negative training. Surprise generates activity. Otherwise not active.

  • @harveydent7559
    @harveydent7559 1 year ago

    Can someone explain what he means by real data vs fake data? ~7:30 ish

    • @AliEP
      @AliEP 1 year ago

      I think he means T and F prediction

  • @ScottVanKirk
    @ScottVanKirk 1 year ago +5

    That blinking eye is really annoying. I'd rather see the interviewer.

  • @eduardosuela7291
    @eduardosuela7291 1 year ago

    Let me see if I understand
    He is redesigning the black box.
    Classical black box has explanatory features in the entry and labels or variables to be predicted in the output.
    In this approach, everything is in the input. And the output is the "hint of simultaneity" of blocks of entries.
    If that's so, I would like to stress that this concept is the foundation of all this. The learning algo depends on this structure.
    One more thought. "Idea association" works this way. "Perception-action" must work in another way. Action looks like an output. Or can it fit an FF framework?

  • @uppner147
    @uppner147 1 year ago

    Groundbreaking!

  • @urimtefiki226
    @urimtefiki226 1 year ago

    Multitasking is beneficial for the brain, it mixes things up.

  • @fredzacaria
    @fredzacaria 1 year ago

    very interesting but not so easy to understand for laymen/women, perhaps another FF Algo video would be very enlightening, thanks God bless.

  • @scottmiller2591
    @scottmiller2591 1 year ago +67

    AI/ML only want one thing, and it's disgusting - Hinton's MATLAB code.

  • @OKBumble
    @OKBumble 1 year ago

    The largest neural network has a trillion connections, which is about a cubic centimeter of the human cortex, which is about 1,000x larger...
    What a magnificent thing the human brain is!

    • @lucamatteobarbieri2493
      @lucamatteobarbieri2493 1 year ago +1

      But transistors are more than 1000x faster than synapses, in some cases billions of times faster. And smaller.

    • @strictnonconformist7369
      @strictnonconformist7369 1 year ago

      @lucamatteobarbieri2493 and for the same amount of computation as the human brain does, they use many times as much energy.
      Not a problem for a stationary computer, it'd never work for biological beings even if they were born fully formed for their brains and their sizes.

  • @wolfgangpernice2283
    @wolfgangpernice2283 1 year ago

    AI is about to change your world, so pay attention. Love it :)

  • @Gabcikovo
    @Gabcikovo 1 year ago

    8:08

  • @briancase6180
    @briancase6180 1 year ago +1

    I agree about consciousness. It's a matter of degree, I think and that's what I hear Hinton saying.

  • @5ty717
    @5ty717 1 year ago +1

    I don’t appreciate your flickering eye motif repeating with increasing frequency in a disturbing way, instead of staying on the guest… you are open to subliminal training … which, regardless of intent, IS illegal as well as against YouTube regulations. Not good either way. I’ve documented it. Desist.

  • @Methodinmadness2019
    @Methodinmadness2019 1 year ago +1

    It was annoying to watch, so I would have to just listen to your program.
    It's better to see the person who asks the questions instead of some white screen

  • @missionpupa
    @missionpupa 1 year ago +2

    Bro, just put your logo in the corner or something. No need to flash the whole screen; it's just distracting from the conversation

  • @igormorgado
    @igormorgado 11 months ago +1

    A logo talking is so creepy.

  • @marketsqueezer
    @marketsqueezer 1 year ago

    The problem with forward propagation is that it may change its mind about a projection already made and quickly switch back to an earlier prediction. However, it is still better than backpropagation. Actually "funny", because negative data is how you get rid of all the BS you don't want to know 🙂

  • @ViveksCodes
    @ViveksCodes 2 years ago +3

    This comment is for future visitors! ♥️
    I was here! 26 January 2023.

  • @Rakibrown111
    @Rakibrown111 1 year ago +3

    the eye thing popping up is ANNOYING, just stop it

  • @RogerBarraud
    @RogerBarraud 1 year ago

    I'll never look at a pink elephant quite the same way again 🙂

  • @pensiveintrovert4318
    @pensiveintrovert4318 1 year ago +2

    Zero explanation what "high" vs. "low" activity mean.

  • @csabaczcsomps7655
    @csabaczcsomps7655 11 months ago

    Taken as a whole, the world's data is composed of good, bad, and hallucinating (half good + half bad) data. You can't make a non-hallucinating AI with current data. Probably, far in the future, AI will somehow manage to be non-hallucinating. Or you could make a special AI to filter out the hallucinating data, but that's not a good idea: a lot of things need hallucinating data to work. My noob opinion.

  • @PaulHigginbothamSr
    @PaulHigginbothamSr 1 year ago

    Geoff chose the wrong acronym. Pink elephant. The N. Vietnamese had pink elephants. They rolled in the red clay and became pink. Geoff seems to be talking of absurdity rather than reality. To me, pink elephants really are a thing in reality.

  • @Henry-s6z
    @Henry-s6z 1 year ago

    I have been watching and following this man since 2007 and all I have to say is he is an "EXTREMELY SMART FOOLISH MAN".

  • @Rakibrown111
    @Rakibrown111 1 year ago +3

    insanely annoying and pointless

  • @Rakibrown111
    @Rakibrown111 1 year ago +2

    Sooooo annoying with that eye 😝

  • @ahsanmohammed1
    @ahsanmohammed1 1 year ago +1

    That eye is used in superstitions.

  • @samiloom8565
    @samiloom8565 1 year ago +1

    Talking without slides is a waste of time