Machine Learning / Deep Learning Tutorials for Programmers playlist: ua-cam.com/play/PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU.html
Keras Machine Learning / Deep Learning Tutorial playlist: ua-cam.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html
This course will stand the test of time. Thank you so much!
Loss in a neural network? More like "I'm lost in the absolute greatness"...of these videos. Thanks so much for making and sharing them!
This channel is a gold mine!
Thank u so much
couldn't agree more
Very nice presentation. Well structured, easy to understand, smooth, and at a speed that I'm comfortable with - not too slow and not too fast.
Thank you very much for the video! The simple conceptual explanations help me understand quickly!
So, quick question: how much difference between validation loss and test loss is acceptable?
I just loved how you teach; all the videos I have watched are so well and deeply explained. You are a really good teacher. Thanks for sharing the knowledge!
Hello, very nice video, but I noticed that in the end, the loss became 0.3. Is that a good thing or a bad thing? Is it high or is it low? Thank you.
Thanks, obinna! You want the loss to be as close to zero as possible. So, when you think about whether a particular value for loss is "good" or "bad," you should consider how close it is to zero. I would say that the 0.3 value in the video is decent, but not great. We'd like to get loss down even closer to zero if we can.
deeplizard, thanks a lot. I ran something similar and had a loss of 0.36; someone told me it was high, and I wanted to know what the threshold between good and bad was. So thanks a lot!
No one explained it this easily... no one! I know when I see a gem of a course! Don't pay a penny for any course before you finish these.. every video here.
Question: in the description of MSE, deeplizard, you had the calculation of 0.25 - 0, but I'm confused by this. 0.25 was the prediction of the label cat, and 0 was the label's index, not the ideal value. When the model is performing correctly, wouldn't a 1.00 (or 100%) prediction of cat be the better outcome? So I would have expected the MSE calc to square (0.25 - 1) as the first error value to average. Did I miss something?
Yeah I was a bit confused by this myself. I think what might be happening is that with only two categories, you can assign one to correspond to 0, the other to correspond to 1, and then have the neural net output a number in the interval [0, 1] to specify its confidence in each category. With more than two categories, I think you may need to code things a different way, and perhaps the idea you brought up would be more characteristic of that. Either way I'm very unsure of the finer points and I plan to watch the rest of these videos to find out. Going back to the example, I think that when she said the neural net picked 0.25 that implied it was 0.25 away from "cat" but 0.75 away from "dog".
I am in love with this voice..!! Sorry for that. She explains so nicely, without any interruption!!
Quality content! Well explained with easy examples!
Thanks, Jimmy!
Wonderful explanation as always
Thanks for these very simple and short explanations!
You're welcome, Jacob!
You're gem of a person!
Thank you. Fantastic videos :D
I seldom make comments, but this is literally better than the 100-dollar Udemy course I took.
Why is the loss NaN?
{
"question": "At the end of each epoch of training, loss is calculated on a model's:",
"choices": [
"Predictions",
"Weights",
"Biases",
"Observations"
],
"answer": "Predictions",
"creator": "Chris",
"creationDate": "2019-12-07T04:58:15.954Z"
}
Thank you for all of your quiz question contributions, Chris! They are all on the corresponding blogs at deeplizard.com now.
How does sparse categorical cross entropy work?
Hey Krish - Sparse categorical cross entropy is the same loss function as categorical cross entropy (also called _log loss)._ If you need more on plain categorical cross entropy/log loss, check out this post: www.kaggle.com/dansbecker/what-is-log-loss
Now once you know what categorical cross entropy is, we can discuss the difference between that and _sparse_ categorical cross entropy in Keras. The only difference between the two is the format that the labels must be in before fitting the model.
For example, if we had labels formatted where a label for a cat was 0 and a label for a dog was 1, we would use sparse categorical cross entropy because it accepts "sparse" labels (integers). If instead the labels were one-hot-encoded so that cat = [1,0] and dog = [0,1], we'd use categorical cross entropy.
My video on one-hot-encoding: ua-cam.com/video/v_4KWmkwmsU/v-deo.html
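To make the label-format difference concrete, here is a minimal sketch (the model shape and label arrays are my own assumptions, not code from the video) of compiling the same Keras model with each loss:

import numpy as np
from tensorflow import keras

model = keras.Sequential([
    keras.layers.Dense(16, activation="relu", input_shape=(8,)),
    keras.layers.Dense(2, activation="softmax"),  # two classes: cat and dog
])

# Integer ("sparse") labels: cat = 0, dog = 1
sparse_labels = np.array([0, 1, 1, 0])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")

# One-hot labels: to_categorical turns 0 into [1, 0] and 1 into [0, 1]
one_hot_labels = keras.utils.to_categorical(sparse_labels, num_classes=2)
model.compile(optimizer="adam", loss="categorical_crossentropy")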
Very nice video! I will recommend the playlist to my friends.
Absolutely fantastic videos. Love the explanations and the depth of the concept. Total amazeballs!
PS: You open milkyway.PNG every time you record a video? Was there no better way to do it?... lol
Thank you! Lol and yes, in later videos, you'll no longer see the open PNG. Was just learning how to edit in these early ones 😆
I'm trying to find an example of using the loss of one neural network (ideally shallow) and using it to update the weights of another neural network. Ideally, both neural networks are shallow neural networks that begin with equal weights. Can you point me to a research paper, website primer, patent, or any document available online whatsoever that describes this design? Thanks.
I am working on an ANN with an image dataset and have a query about an ANN model for image classification.
I'm getting InvalidArgumentError: Incompatible shapes: [100,240,240,1] vs. [100,1]. Can you please suggest some solutions?
Great video. So the loss function has to be specified in two places: within the compile line and additionally as an attribute of the model? Could you comment on that please? Thanks!
Just need to specify loss once in the compile line.
Thank you very much for the great video. Is it possible to get a hold of the data you are using, so that I can run the code that you show in the video?
We show how to generate this data set here:
deeplizard.com/learn/video/UkzhouEk6uY
thank you a lot for your help
thank you very much for this clear and helpful explanation.
These videos are great
Great video! Auto-generated subtitles are in Dutch! Changing them to English would be of great help to others!
The loss is not only calculated and backpropagated after each epoch (1 epoch = one pass through all the training data), but after each training step (after each training sample is passed through the network).
I don't think so.
It makes more sense to take the mean over different samples and then update the weights and biases, because if you're doing it after every sample, the chance of overfitting is pretty high.
Please correct me if I'm wrong! :)
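To illustrate the idea being debated here, a small NumPy sketch with made-up numbers: with mini-batch training, the squared errors are averaged over the batch, and that single mean value drives one weight update.

import numpy as np

predictions = np.array([0.25, 0.90, 0.10, 0.75])  # model outputs for a batch of 4 samples
labels = np.array([0.0, 1.0, 0.0, 1.0])           # true labels (cat = 0, dog = 1)

batch_loss = ((predictions - labels) ** 2).mean() # mean squared error over the batch
print(batch_loss)                                 # 0.03625 -> one weight update per batch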
Which dataset did you use?
I'm trying to learn specifically about deep Q-networks/learning, and I don't understand how the loss function works in this case. There is no model, so there is no value to compare the output to? Can anyone explain?
For DQNs, check the Reinforcement Learning course:
deeplizard.com/learn/playlist/PLZbbT5o_s2xoWNVdDudn51XM8lOuZ_Njv
As far as I understand (still a noob), MSE is used for linear regression and binary cross entropy for logistic regression. In your case, you have used MSE (or at least explained it) for a binary classification model. Why? (Thanks!)
Hi, when you said accumulate all the individual errors for each input, does that mean individual errors for each sample? Because what I understand is, each node in the input layer represents a feature from each sample. So let's say I have 100 patients (100 samples); there will be 100 errors, right? And their accumulated error will be computed with one of the loss function algorithms?
Kindly increase the font size in your videos. The explanations are good and would be more meaningful with clear text.
Thanks for the feedback, sourav. In later videos, the font size is increased, and I zoom in on the code :)
These videos are great.
But this one is not quite correct, I think. At 0:45 the NN outputs 2 values. If the upper one is higher than the lower one, it thinks that it is a cat, as the correct label for the example would be (1, 0) ("100% cat, 0% dog").
So the error for this example would be
(upperVal-1)^2+(lowerVal-0)^2
You're right. I think the formula for error would be 0.25 - 1 (instead of 0.25 - 0) (cat needs to be 1, as it is the true value), because normally the true value takes 1 (100% probability) and all other possibilities (classes) take 0 (one-hot encoding).
@@analytics48 For your first comment: Almost. The thing is that the NN shown outputs two values. Let's say 0.7 for the upper (cat) and 0.1 for the lower (dog). Then the error is
(0.7-1)^2+(0.1-0)^2
@@analytics48 For your second comment:
If the image is an elephant, then the correct label is (0,0,1,0).
Your NN should have 4 output values called (val_0, val_1, val_2, val_3).
Then the loss value would be
(val_0-0)^2+(val_1-0)^2+(val_2-1)^2+(val_3-0)^2
To be clear: The correct label for elephant is 2, which corresponds to the "label vector" (0,0,1,0).
You can get the loss function by computing
NN output vector - label vector
and then the sum of the squares of each entry.
One last thing:
Everything in the video would actually be fine if the NN had only one output, and the closer it is to 0, the more it thinks that it is a cat.
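A small NumPy sketch of the computation described in this thread, using the elephant example with made-up output values:

import numpy as np

output = np.array([0.7, 0.1, 0.1, 0.1])  # NN output values (val_0, val_1, val_2, val_3)
label = np.array([0.0, 0.0, 1.0, 0.0])   # one-hot label vector for elephant (class 2)

loss = np.sum((output - label) ** 2)     # sum the squares of each entry of (output - label)
print(loss)                              # (0.7)^2 + (0.1)^2 + (0.1-1)^2 + (0.1)^2 = 1.32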
The diagram has an array of inputs and outputs. The explanation said to calculate the output for each input, but in this context you are talking about a labeled image, not an array element.
Awesome playlist! I am a fan.
Question - Isn't it target prediction - actual prediction? So desired value - calculated value?
Thanks lizzy!
Depending on which loss function you use, the order of the desired value and the calculated value may not matter. (Example: desired value - calculated value OR calculated value - desired value) This would be true if you were squaring the errors for example. If you square the difference, it will not matter which value you subtract from which. Let me know if this helps.
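A quick numeric check of this point, with made-up values:

desired, calculated = 1.0, 0.25      # made-up values
print((desired - calculated) ** 2)   # 0.5625
print((calculated - desired) ** 2)   # 0.5625 -> identical, so the order doesn't matter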
@@deeplizard Alright, so for, let's say, MSE, this order doesn't matter, right? Thanks ;)
Thanks for the question, I was also wondering. And thanks deeplizard for answering, you guys are awesome.
Thank you for your videos :)
Nice video!
quality content, thank you
Please make a video on loss vs. accuracy.. they seem to be the opposite of each other, but they don't add up to one!
Hey Ameer - The accuracy is just the percentage of samples the network predicted correctly. If I have 10 samples, and my network predicted 8 correctly, then accuracy is 0.8, or 80%. Loss, on the other hand, is all about what is covered in this video. Loss and accuracy do not sum to 1.
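A minimal sketch of that calculation, with made-up predictions:

import numpy as np

predicted = np.array([0, 1, 1, 0, 1, 0, 0, 1, 1, 0])  # made-up predictions for 10 samples
actual = np.array([0, 1, 1, 0, 1, 0, 1, 0, 1, 0])     # true labels

accuracy = (predicted == actual).mean()  # fraction of samples predicted correctly
print(accuracy)                          # 0.8 -> 8 of 10 correct; not equal to 1 - loss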
Why does loss go up sometimes, not always down?
I have a question: why is this channel named deeplizard? Can you explain that?
_I could a tale unfold whose lightest word_
_Would harrow up thy soul._
👻🦎
@@deeplizard Thank you very much for your great training content and for replying to this message. I am more interested in you. I hope you could share more about yourself and your views on what interests you; it would be more fun and valuable. What are you up to? Love!!!
Thanks, Alice! I'm Mandy 😊
My partner Chris and I are the creators of deeplizard. We have a second channel where we vlog and share more about ourselves and our experiences. You can check it out and say hi to us there at ua-cam.com/users/deeplizardvlog
Loss is calculated at the end of each training sample, not at the end of each epoch; accuracy is calculated optionally. Please give a correction tip, or correct me if I am wrong.
amazing
Why use MSE if it's a non-convex function?
It won't guarantee the global optimum.
Instead, the log loss function should be used.
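For reference, a tiny sketch of the binary log loss this comment refers to, with made-up values:

import numpy as np

y, p = 1.0, 0.25  # made-up true label and predicted probability
log_loss = -(y * np.log(p) + (1 - y) * np.log(1 - p))
print(log_loss)   # -ln(0.25) ≈ 1.386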
I can't activate the subtitles in English.
Greetings Jobson! We have enabled auto-generated English subtitles for all videos. I'm not sure why the three videos you commented on are not showing English subtitles. We will look into this. In the meantime, you can visit the blog pages on deeplizard.com for any of our UA-cam videos. There are fully written lecture notes in English for each video there.
Nice videos. However, I don't think I like the way you pronounce the word "input".
I can't understand your English!
There is a text version here: deeplizard.com/learn/video/Skc8nqJirJg