Neural Network Backpropagation Example With Activation Function

  • Published 12 Jun 2024
  • The simplest possible backpropagation example, done with the sigmoid activation function.
    Some brief comments on how gradients are calculated in actual implementations.
    Edit: there is a slight omission/error in the da/dw expression, as pointed out by Laurie Linnett. The video has da/dw = a(1-a), but it should be i*a(1-a), because the argument of the sigmoid is the product i*w, whose derivative with respect to w is i.
  • Science & Technology
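
A minimal worked sketch of the single-neuron setup described above, using the corrected derivative i*a*(1-a). This code is illustrative and not from the video; the input i, weight w, target y, and learning rate lr are assumed values.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Assumed example values: input i, weight w, target y, learning rate lr.
    i, w, y, lr = 1.5, 0.8, 0.5, 0.1

    for step in range(3):
        a = sigmoid(i * w)            # forward pass
        cost = 0.5 * (a - y) ** 2     # squared-error cost
        dc_da = a - y                 # dC/da
        da_dw = i * a * (1 - a)       # corrected derivative: includes the inner factor i
        w -= lr * dc_da * da_dw       # chain rule dC/dw, then gradient-descent update
        print(step, round(a, 4), round(cost, 6), round(w, 4))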

COMMENTS • 38

  • @maxim25o2 4 years ago +8

    There are many people teaching backpropagation, but after watching tons of videos I think not many of them really know how it works. Nobody calculates and shows the actual numbers in the equations. This is the first tutorial that answers all my questions about backpropagation. Many other people just copy somebody else's work without understanding it.
    The tutorial is great: step by step, explaining the equations and breaking them down to the simplest, most understandable form. Great job!

  • @ss5380 3 months ago

    You are a life saver!! Thank you for breaking the whole process down in such an understandable way!!

  • @jackmiller2614 4 years ago

    Thanks so much for this video -- I have spent hours looking for a clean explanation of this and I have finally found it!

  • @laurielinnett8072 4 years ago +21

    I think da/dw should be i*a*(1-a).
    Let z=i*w, then a=1/(1+exp(-z)) and da/dz=a*(1-a).
    Then dz/dw=i, so da/dw=(da/dz)*(dz/dw)=i*a*(1-a)
    Nevertheless, an excellent presentation, Mikael, showing backpropagation and weight updating for a simple example without distracting subscripts and superscripts. Keep up the good work.
    LML

    • @mikaellaine9490 4 years ago +9

      Darn, you're correct! I forgot to add the derivative of the inner function w*i, which would indeed be i as a multiplier to a(1-a).

    • @yasserahmed2781 3 years ago +7

      I've been repeating the calculations several times on paper, trying to understand how the "i" disappeared; I even thought the video implicitly assumed that i was 1 or something, haha. I should always check the comments right away.
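
A quick numerical cross-check of the correction discussed in this thread (an illustrative sketch, not code from the video): a central finite difference agrees with i*a*(1-a), while a*(1-a) alone is off by the factor i. The values of i, w, and eps are assumed.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    i, w, eps = 2.0, 0.7, 1e-6          # assumed example values

    a = sigmoid(i * w)
    analytic = i * a * (1 - a)          # da/dw via the chain rule
    numeric = (sigmoid(i * (w + eps)) - sigmoid(i * (w - eps))) / (2 * eps)

    print(analytic, numeric)            # these two agree; a*(1-a) alone would not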

  • @redditrewindeverybody-subs9336 4 years ago +3

    Thanks for your videos! I'm finally able to implement backpropagation because I (kinda) understood the maths behind it thanks to you! Please keep the vids coming!

  • @Sandium 3 years ago

    I was having difficulties wrapping my head around Backpropagation.
    Thank you very much for this video!

  • @jiangfenglin4359 3 years ago

    Thank you so much for making these videos! I love your explanations. :)

  • @dmdjt 4 years ago

    Thank you very much for your effort and excellent explanation!

  • @nasirrahim5610 1 year ago

    Your explanation is amazing 👏

  • @flavialan4544 3 years ago

    You are a real teacher!

  • @raymond5887 4 years ago

    Thanks for the awesome explanation! I finally know how to do back prop now haha.

  • @obsidianhead 3 years ago

    Thank you for this excellent video

  • @justchary 1 year ago

    Thank you very much. This was very helpful.

  • @kyju77 2 years ago

    Hi,
    I'll join the others in thanking you for this video! Amazing explanation.
    Just one question: your example was done with, let's say, a single "training session".
    When I have dozens or hundreds of "training sessions", I calculate the average for the final error. What about da/dw, for example? Should I also calculate the average over all the training samples and then apply it?
    Or is there another approach?
    Thanks again.
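
One common answer to the question above, sketched under the assumption of plain mini-batch gradient descent (not necessarily what the video intends): compute dC/dw for each training sample with the same chain rule, average the gradients over the batch, and apply a single weight update. The batch values, weight, and learning rate below are assumed.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    # Assumed batch of (input, target) pairs and a single weight.
    batch = [(1.0, 0.2), (1.5, 0.5), (2.0, 0.8)]
    w, lr = 0.6, 0.1

    grad_sum = 0.0
    for i, y in batch:
        a = sigmoid(i * w)
        grad_sum += (a - y) * i * a * (1 - a)   # dC/dw for this sample

    w -= lr * grad_sum / len(batch)             # one update with the averaged gradient
    print(w)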

  • @vincentjr8013 3 years ago +4

    How will the bias be updated in a multilayer network?

  • @benwan8927 2 years ago

    good and clear explanation

  • @trevortyne534 1 year ago

    Excellent explanation, Mikael! Trev T, Sydney

  • @sumayyakamal8857 3 years ago

    Thank you so much. I often hear that Hadamard multiplication is used, but what is it used for?

  • @amukh1_dev274 1 year ago

    Thank you! You earned a sub ❤🎉

  • @FPChris 2 years ago

    As you go back, when do you update each weight? Do you go back to w1, adjust it, do a new forward pass, then go back only to w2, do a new forward pass, then go back only to w3?
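
A common convention, sketched here as an assumption rather than as the video's prescription: one forward pass, then one backward pass that computes every gradient, then all weights are updated together before the next forward pass. The two-weight chain and all numeric values below are illustrative.

    import math

    def sigmoid(x):
        return 1.0 / (1.0 + math.exp(-x))

    i, y, lr = 1.0, 0.5, 0.1      # assumed input, target, learning rate
    w1, w2 = 0.4, 0.9             # two weights in series: i -> w1 -> a1 -> w2 -> a2

    # One forward pass.
    a1 = sigmoid(i * w1)
    a2 = sigmoid(a1 * w2)

    # One backward pass: compute both gradients before touching any weight.
    delta2 = (a2 - y) * a2 * (1 - a2)        # dC/dz2, where z2 = a1*w2
    dw2 = delta2 * a1                        # dC/dw2
    dw1 = delta2 * w2 * a1 * (1 - a1) * i    # dC/dw1, reusing delta2

    # Update all weights together; the next forward pass uses the new values.
    w1 -= lr * dw1
    w2 -= lr * dw2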

  • @nickpelov 1 year ago

    Question: there are faster activation functions, but how do they affect backpropagation? With the sigmoid function, the output itself appears in the derivative; that's not the case for other functions. Is it worth the effort if backpropagation would be a lot slower? Then again, once the network is trained it will be used many times, so I guess you can spend a lot more computing power on learning and then run the network on a device with less computing power. Correct me if I'm wrong.
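
Regarding the question above: each activation contributes its own derivative to the chain rule. Sigmoid conveniently reuses its stored output as a*(1-a), while ReLU's derivative is even cheaper, just a comparison. An illustrative sketch (not from the video); the value z is assumed.

    import math

    def sigmoid(z):
        return 1.0 / (1.0 + math.exp(-z))

    def d_sigmoid_from_output(a):
        return a * (1 - a)              # reuses the forward-pass output a

    def relu(z):
        return max(0.0, z)

    def d_relu(z):
        return 1.0 if z > 0 else 0.0    # no exponential needed at all

    z = 0.3                             # assumed pre-activation value
    print(d_sigmoid_from_output(sigmoid(z)), d_relu(z))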

  • @BB-sd6sm 3 years ago

    great video mate

  • @onesun3023 4 years ago

    Why do you use lowercase Phi for the activation?

  • @zamanmakan2729 3 years ago +1

    Sorry, a(1-a) is the derivative of what? I didn't get how we arrived there.

  • @TheRainHarvester 1 year ago

    It seems like picking up the stored numbers would require indirection / pointer following / a fetch from slow memory, whereas just recalculating them would take fewer clock cycles.

    • @TheRainHarvester 1 year ago

      Storing probably wins versus the recursive calculations that would be required for multiple branches of a wide NN.

  • @nickpelov 1 year ago

    I don't understand why you would calculate da/dw in advance rather than during backpropagation. Do we use it more than once? On each iteration da/dw has a different value, so I don't see why we should calculate it up front. We can just take the output a and calculate a(1-a) during backpropagation.

  • @nickpelov 1 year ago

    In the table at 12:37 there is no way to see when you should stop. Maybe you should have included the actual output y, or at least shown y on screen. So the goal is to reach a = 0.5, right?

  • @kishorb.surwade6722 3 years ago

    Nice explanation. One special request: if you could give an illustration in MS Excel, it would make it even easier to understand.

  • @edwardmontague2021 1 year ago

    Defined as a function in Maxima CAS:
    sigmoid(x):=1/(1+exp(-x))$
    Regarding da/dw = d sigmoid(w*x + b)/dw, where x == a from the previous layer:
    using Maxima CAS, I obtain (x*%e^(w*x+b))/(2*%e^(w*x+b)+%e^(2*w*x)+%e^(2*b)).
    Whereas with a = sigmoid(w*x + b) and the derivative defined as a*(1-a), I obtain
    (%e^(w*x+b))/(2*%e^(w*x+b)+%e^(2*w*x)+%e^(2*b)), which differs by the multiplier x.
    Which is correct?
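
A symbolic cross-check of the question above, using SymPy instead of Maxima (an illustrative sketch, not the commenter's session): differentiating sigmoid(w*x + b) with respect to w simplifies to x*a*(1-a), so the version with the extra multiplier x is the correct one.

    import sympy as sp

    w, x, b = sp.symbols('w x b')
    a = 1 / (1 + sp.exp(-(w * x + b)))   # sigmoid(w*x + b)

    deriv = sp.diff(a, w)                # da/dw computed symbolically
    claim = x * a * (1 - a)              # x * a * (1 - a)

    print(sp.simplify(deriv - claim))    # prints 0: the two expressions agree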

  • @youssryhamdy4923 2 years ago

    The sound of this video is low; please try to make it louder.
    Thanks

  • @andreaardemagni6401 8 months ago

    Unfortunately the volume of this video is too low to watch it on a phone. Such a shame :(

  • 4 years ago

    Please make videos about neural networks in Python.

  • @knowledgeanddefense1054 1 year ago

    Fun fact, did you know Einstein and Hawking were socialists? Just thought you may find that interesting :)

  • @vidumini23 4 years ago

    Thank you so much for the clear, excellent explanation and effort.