Loved all the videos and extremely clear with the concepts and the foundations of ML, often we run models but don't have in depth understanding of what exactly it is. Your explanation is by far the best across all videos I have seen. I can actually go ahead and explain the concepts to others with full clarity. Thank you so much for your efforts. One request, I think there is one concept that got missed, " regularizers ". It will be nice to have a short video on that too. Thanks again for your precious time and super awesome explanation. Looking forward to being an expert like you :)
{ "question": "The test set differs from the train and validation sets by:", "choices": [ "Being applied after training and being unlabeled", "Being applied after training and being labeled", "Being randomly selected data", "Being hand-selected data" ], "answer": "Being applied after training and being unlabeled", "creator": "Chris", "creationDate": "2019-12-11T04:29:35.828Z" }
Best video series I've found so far explaining the concepts of neural networks :) ... One small suggestion: it would be better if the font size in the Jupyter Notebook were a bit bigger, so the code would be easier to read :)
Thanks for the suggestion, Hiroshi! In later videos that show code, I've started zooming in to maximize the individual code cells I'm covering. As an example, you can see the code starting at 7:33 in this video: ua-cam.com/video/ZjM_XQa5s6s/v-deo.htmlm33s Let me know what you think of this technique.
I have one question. In Tensorflow's Object detection API they tell us to create a training directory and a test directory and as usual 90-10 distribution. But we gotta label all of them. So this means the test directory in case of Tensorflow's API is actually Validation set right?
Hey Nirbhay - Not necessarily. Sometimes we'll label our test sets so we can see the stats from how well the model predicted on the test data. For example, we may want to plot a confusion matrix with the results from the test set. More on this here: ua-cam.com/video/km7pxKy4UHU/v-deo.html If the test set is labeled, we just have to take extra care to make sure that the labels are not made available to the model, like they are for the training and validation sets.
Ok I have to dig up more in order to understand it. By the way the page at the weblink you sent ain't available. Could you please post it again? or perhaps the title of the video? Thanks :)
One question: At 1:40 you said weights won't be updated based on validation loss. If so, how does validation set help us? Since we are not using it to update the model... Later at 1:57 you said it's used so model doesn't overfit. How? When does it come into play? Goes without saying, great video! I'm on a spree!
Hey Ivan - We, as trainers of the model, can use the validation set metrics as a monitor to tell whether or not the model is overfitting. If it is, we can make appropriate adjustments to the model. The model itself though is not using the validation set for learning purposes or weight updates.
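One way to picture the monitoring described above is a small check over the per-epoch loss histories. This is only a sketch: the function name overfitting_start, the patience parameter and the loss values are hypothetical and not from the video. It flags the epoch with the best validation loss once validation loss has stopped improving while training loss keeps falling.

```python
def overfitting_start(train_losses, val_losses, patience=2):
    """Return the epoch (0-based) with the best validation loss if, after it,
    validation loss failed to improve for `patience` epochs in which training
    loss was still decreasing. Return None if that never happens."""
    best_val = float("inf")
    best_epoch = None
    stale = 0
    for epoch, (train, val) in enumerate(zip(train_losses, val_losses)):
        if val < best_val:
            # Validation loss improved: remember this epoch and reset the counter.
            best_val, best_epoch, stale = val, epoch, 0
        elif epoch > 0 and train < train_losses[epoch - 1]:
            # Training loss still falls but validation loss does not improve.
            stale += 1
            if stale >= patience:
                return best_epoch
    return None


# Hypothetical loss histories, one value per epoch.
train_hist = [0.70, 0.55, 0.45, 0.38, 0.30, 0.24]
val_hist = [0.72, 0.60, 0.52, 0.50, 0.53, 0.57]
print(overfitting_start(train_hist, val_hist))
```

Note that nothing here touches the weights. The validation numbers only inform the person training the model, who may then stop earlier, add regularization, or change other hyperparameters.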
Thank you for the video! Super concise and clear. If you could shortly mention some real world examples in the future videos, that would be great, I see in the comments that people have been wondering about similar things as I have. Or maybe you have done that, I'm about to check the other videos as well :)
You're welcome, Dragana! And thank you! Yes, as the playlist progresses, I do introduce some examples. More hands-on examples (with code) are shown in the Keras and TensorFlow.js series below. Those series more so focus on how to implement the fundamental concepts we cover in this series. I hope you enjoy the next videos you check out! Keras: ua-cam.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html TensorFlow.js: ua-cam.com/play/PLZbbT5o_s2xr83l8w44N_g3pygvajLrJ-.html
You're like the 3blue1brown of deep learning! You deserve waaay more subs. Maybe if you include tensorflow tutorials in this format you could get a crap ton of more subs because there'll be others out there looking for explanations of how tensorflow works in intuitive ways, who aren't mathematically literate. Key here is to reduce the need for mathematical literacy and make the concepts more intuitive and easier to get into. If you were to introduce math literacy needed to explain these concepts, then you'd need to hope that the people who are looking to understand these concepts have figured out that by watching the likes of 3blue1brown (assuming that they've found him in the process of wanting to understand the math (hint: most people don't want to learn the math, they just want to understand the code)). So there you have it, a possible method for you to gain more subs :P
Hello, do you have videos explaining different types of activation functions, when to use a specific one? And do you have a video about optimizers ? Like Momentum
This episode explains activation functions: deeplizard.com/learn/video/m0pIlLfpXWE No specific episode on optimizers, although we do have several explaining how a NN learns via SGD optimization.
Fantastic video, one aspect I am confused about is what is the algorithm doing when it is 'training' the data? How does it train on data and how do we know it is correct? Do you have any videos on this question or know where I could look to understand? Thank you.
I believe you're wrong about the test set being unlabeled. As far as I remember from Andrew Ng course in Stanford, the training set is used for model tuning for multiple models; the validation set is used for model selection (this is where you compare different models to check which one best performs on data not used for training). Once you choose a definitive model, you still have to check if it generalizes well for data never seen before, that does not carry any bias on model selection. At this point, you don't do any further tuning. Besides, having a labeled test set allows you to define test error. If data are unlabeled, this term doesn't make any sense, does it?
The test set's labels just cannot be known to the model in the way that the train and validation sets are. So as far as the model knows, the test set is unlabeled. You may have the test labels stored elsewhere though to do your own analysis.
One question. Data in the test set does have labels, right? But it's not known to the classifier... It's only labeled so that we could calculate all the metrics at the end more easily... right?
Sometimes we'll have labels for the test set, and other times we may not. When we do have the labels, you're correct that the network will not be aware of the labels that correspond to the samples. The network will understand that there are, say 10 different classes, for which it will need to classify the data from the test set, but it will not know the individual labels that correspond to each sample.
I'm kind of confused about this. For a very large dataset, say 10 million records, how would the train/test split generally be done in a production environment to evaluate how our model is working? -> I have heard in a few resources that it is okay to split the data into 98% for training, 1% for validation (100,000 rows) and 1% for testing (100,000 rows). The theory behind this is that 1% of the data most probably represents the maximum variance in the data. -> And some say we have to split the data more or less 70% for training, 15% for validation and 15% for testing. The theory behind this is that if we have a large amount of data for validation and testing and the model gives good accuracy on it, then we can say with "confidence" that it would work nearly the same in real time as well. If either of these is right or wrong, could you please explain why?
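To put rough numbers on the two proposals above, here is a small sketch. It computes the row counts for each split and the binomial standard error of a measured accuracy, sqrt(p * (1 - p) / n), which assumes independent test samples. The 0.9 accuracy is a made-up value for illustration, and the helper names are hypothetical.

```python
import math


def split_sizes(n_rows, train_pct, valid_pct, test_pct):
    """Row counts for a train/validation/test split given integer percentages."""
    if train_pct + valid_pct + test_pct != 100:
        raise ValueError("percentages must sum to 100")
    return tuple(n_rows * pct // 100 for pct in (train_pct, valid_pct, test_pct))


def accuracy_standard_error(accuracy, n_samples):
    """Binomial standard error of an accuracy measured on n_samples rows."""
    return math.sqrt(accuracy * (1 - accuracy) / n_samples)


n_rows = 10_000_000
for percentages in [(98, 1, 1), (70, 15, 15)]:
    n_train, n_valid, n_test = split_sizes(n_rows, *percentages)
    # 0.9 is an assumed accuracy, used only to compare the two test-set sizes.
    error = accuracy_standard_error(0.9, n_test)
    print(percentages, n_train, n_valid, n_test, round(error, 5))
```

Under these assumptions, both test sets pin the accuracy down to well under a tenth of a percentage point, which is one way to read the argument that 1% of a very large dataset can already be enough for evaluation.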
Could you give a real-world example of a training set and a validation set? For example, say I want to train a model to tell whether a flower is blue or red depending on its height and width, and I use k-nearest neighbours. What would the validation set consist of?
Hey Andy, So, sticking with your example of flowers-- You would start out by gathering the data for red and blue flowers. This data would presumably be numerical data containing the height and width of the flowers, and each sample from your data set would be labeled with "blue" or "red." You would then split this data up into a training set and validation set. A common split is 80% training / 20% validation. You would then train your model on the data in the training set, and validate the model on the data in the validation set. Does this example make things more clear?
You actually have the labels for your test set, but you don't pass them through your model. You wait until the model makes its predictions, then compare them with the labels you held back at first, and calculate the accuracy based on how many of them match.
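The comparison described above can be sketched in a few lines. The labels and predictions below are invented for illustration. In practice the predictions would come from the trained model, which only ever sees the test samples and never the held-back labels.

```python
def accuracy(predictions, true_labels):
    """Fraction of predictions that match the held-back labels."""
    if len(predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    if not true_labels:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(p == t for p, t in zip(predictions, true_labels))
    return correct / len(true_labels)


# Hypothetical values: labels kept aside by us, predictions returned by the model.
held_back_labels = [1, 0, 1, 1, 0]
model_predictions = [1, 0, 0, 1, 0]
print(accuracy(model_predictions, held_back_labels))  # 4 of 5 match -> 0.8
```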
I have a question: why can't we use the validation approach in classical machine learning? Why do we only use it in deep learning problems to prevent overfitting?
Explanation was really good ma'am but the white screen console that you showed could not be read. Please make those contents brighter and in big fonts.
Thanks for the feedback, Sayantani. In later videos, I zoom in on the code, so it is much easier to read. Also, note that most videos have corresponding text-based blogs that you can read at deeplizard.com The blog for this video can be found at deeplizard.com/learn/video/Zi-0rlM4RDs
Thank you for the video. I just have one question. How do we know how well our model is performing on the test set if we don't have labels to tell us the "correct answer" and if we don't even know what the correct answer is ourselves. How do we then know that the model performed well or badly on the test set? Thanks again.
Hey dazzaondmic - You're welcome! If we don't know the labels to the test set ourselves, then the only way we can gauge the performance of the model is based on the metrics observed during training and validating. We won't have a solid way of judging the exact accuracy of the model on the test set. If we have a decently sized validation set though, and the data contained in it is a good representation of what the model will be exposed to in the test set and in the "real world," then that increases confidence in the model's ability to perform well on new, unseen data if the model indeed performs well on the validation data. Does this help clarify?
IMO, the test dataset should have labels so that we can at least have some accuracy numbers to look at in the end. The only difference between the validation dataset and the test dataset is that we still have a chance to update the model based on the validation results by tuning hyperparameters. The test dataset, however, only provides us with a final accuracy number; even if it is bad, we won't perform additional training.
Thanks very much for the videos!! So the samples of the training set and the array of its labels should be ordered to match properly, correct? For instance, if index 0 of the training array is the picture of a cat, then index 0 of the label array should be 0 (and 1 if a dog)?
I hope I'll get a reply. My question is: do I have to dig deeper into machine learning concepts before starting deep learning, or are these fundamentals enough to start deep learning? By the way, thank you for providing us with such valuable content for free.
Yes, you can start with this DL Fundamentals course without prior ML experience/knowledge. Check out the recommended deep learning roadmap on the homepage of deeplizard.com You can also see the prerequisites for each course there as well.
fit() does NOT have validation? Most of the time, people's code looks like this: clf = svm.SVC().fit(X_train, y_train) The validation_set and validation_split are nowhere to be found, even the sklearn docs don't mention them. What is going on? How come these models don't overfit without a validation set?
Hey Nada - It's typically going to be up to the engineer of the network to determine what is considered acceptable regarding their results. In general, I would say that you typically want your training and validation accuracy to be as close as you can get them to each other. If the validation accuracy is considerably greater than the training accuracy, then you may want to take steps to decrease the difference between those metrics. If the model used in this video was one I planned to deploy to production, for example, then I would take steps to close this gap. This would be considered a problem of underfitting. I talk more about that here: ua-cam.com/video/0h8lAm5Ki5g/v-deo.html
There's an error in using validation_data in model.fit. The format should be a tuple of NumPy arrays, i.e. valid_set = (np.array([0.6,0.5]), np.array([1,1]))
It's indeed helpful and understandable. As I am at beginner level, I wonder if there any way to get the demo code you are using for making these videos. Thanks in advance.
Thanks, Arifur! Download access to code files and notebooks is available as a perk for the deeplizard hivemind. Check out the details regarding deeplizard perks and rewards at: deeplizard.com/hivemind
Hello, thank you for this playlist. It's awesome! My question is: in some cases, we don't specify a validation set. Why? And when is it not important to set validation data?
@@nandinisarker6123 Well, I found these 2 links: stats.stackexchange.com/questions/19048/what-is-the-difference-between-test-set-and-validation-set machinelearningmastery.com/difference-test-validation-datasets/
Using Google Colab on this. I've set the same model and hyper-parameters. I've also used the same code to preprocess the data and the same params to train the model (batch_size, Adam's lr, validation_split, epochs) but I'm not getting the same metrics as you while training no matter how much I try. The validation accuracy plateaus around 0.75 and the val_loss starts at around 0.68, decreases, then starts increasing around the 12th epoch to end around 0.66. This is bugging me and I can't figure it out. PS: I also tried with Theano as a backend for Keras.
Hey Yassine - Are you using the same data to train your model? This data was created in the Keras series. If you did use the same data, then be sure that you caught the reference in the Keras validation set video to reverse the order of the for-loops that generates the data. Let me know.
@@deeplizard Thank you for getting back to me. Yes, I generated the data in the same fashion as you did in the preprocessing data video from the Keras series. I caught the for-loop reverse reference in that same series, and after rectifying my code the validation accuracy and loss behaved normally while fitting. But I'm not sure why the behaviour changed. As you explained, doing the validation split will take a percentage of the training set prior to fitting (I don't know if the validation set is generated after a shuffle or not in this case) and won't regenerate it on each shuffle in each epoch. But why did switching the for-loop order matter? You are still taking the bottom 10 or 20% of your data regardless of the for-loop order, and you end up with the same validation data in each epoch. Also, I used a sigmoid function in the output as you did in this series, yet my prediction probabilities don't sum to 1 as you depicted in the prediction video within the same playlist. Using a softmax function like in the Keras API series works fine. It would help if you could clear up this confusion.
The validation_split parameter takes the last x% of the data in the training set (10% in our example), and doesn't shuffle it. With the way I had the for-loops organized originally, the validation split would completely capture all of the data in the second for loop, which was the 5% of younger individuals who did experience side effects and the 5% of older individuals who did not experience side effects. Therefore, none of the data in the second for-loop would be captured in the training set. With the re-ordering of the for-loops, the training set is made up of the data that is now generated in both for-loops. Another (better) approach we could have taken is, after generating all the data with both for-loops (regardless of which order the loops are in), we could shuffle all the data completely, and then pass that shuffled data to our model. With shuffled data, the training set and validation sets would be a more accurate depiction of the true underlying data since there would be no real ordering to it. As long as your data in the real world is shuffled before you pass it to your model, you shouldn't experience this problem.
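The effect described above can be reproduced without Keras. The sketch below is an illustration under assumptions: tail_split is a hypothetical stand-in that mimics "take the last x% without shuffling", and the "typical" and "outlier" strings stand in for the samples produced by the two for-loops.

```python
import random


def tail_split(samples, validation_fraction):
    """Split off the LAST validation_fraction of samples, with no shuffling."""
    n_valid = round(len(samples) * validation_fraction)
    cut = len(samples) - n_valid
    return samples[:cut], samples[cut:]


# Hypothetical ordered data: 90 typical rows generated first, 10 outlier rows last.
samples = ["typical"] * 90 + ["outlier"] * 10

# Unshuffled: every outlier lands in the validation set, none in training.
train, valid = tail_split(samples, 0.10)
print(sorted(set(train)), sorted(set(valid)))

# Shuffling first mixes both groups into the training set.
shuffled = samples[:]
random.Random(42).shuffle(shuffled)
train_shuffled, valid_shuffled = tail_split(shuffled, 0.10)
print(sorted(set(train_shuffled)))
```

With the ordered data, the model never trains on an outlier yet is validated only on outliers, which is why the validation metrics looked off until the generation order changed or the data was shuffled.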
For prediction, we split a dataset into two sets: one is the train set and the other is the test set. But when we are given separate datasets for training and testing, do we include the dependent variable (response variable or the Y variable) in the testing dataset? In one of my simple logistic regression analyses, I was given three separate datasets: training, validation and testing. In the testing dataset, I don't have the response variable, i.e. the Y variable. So this is my question: can we test on a dataset without the response variable Y?
I see. Yes, many times, we don't have the labels for the test data. This is completely fine. The labels for training and validation data are required, but labels for test data are not required.
You can also use a confusion matrix for non-binary output. I show an example of this towards the end of this video: ua-cam.com/video/FNqp4ZY0wDY/v-deo.html
Either convert your data to a supported data type, or manually create a separate validation set. More details in this blog: deeplizard.com/learn/video/dzoh8cfnvnI
Universities are getting obsolete when you have UA-cam. I mean I have seriously learned more from UA-cam on machine learning and C# coding than from professors at University. Thanks for this great explanation. 🙏
Thanks, it made my mind clear you deserve a sub. But I have a question , what if we don't train? so it means like the accuracy will drop? like 0%? will give an error? or maybe like training is a must and what a stupid question I am asking hahahaha btw learning NLP here using python - nltk
Hey App Inventor- The code files for this series are available as a perk for the deeplizard hivemind at the following link: www.patreon.com/posts/code-for-deep-19266563 Check out the details regarding deeplizard perks and rewards at: deeplizard.com/hivemind
Thank you, I'm glad you're enjoying the videos! I'll add 3D convolutions to my list of potential topics to cover in future videos. Thanks for the suggestion. In the mean time, I do have a video on CNNs in general below if you've not yet seen that one. ua-cam.com/video/YRhxdVk_sIs/v-deo.html
By validating the model against the validation set, we can choose to adjust our hyperparameters based on the validation metrics. The weights of the network, however, are not adjusted during training based on the validation set. (Note that weights are not hyperparameters.) The weights are only adjusted according to the training set. This is what is stated in the video. Hope it's clear now.
Dear the artificial intelligence community I am pleased to introduce DIDA dataset, which is the largest handwritten digit dataset. I will be grateful, if you could help me to introduce this dataset to the community. Thanks
not helpful at all. why cant anybody just show an EXAMPLE so that we can really wrap our head around this. I have a vague idea of validation. I take it the validation just "updates" because of the new information?? if that is it, then why are there 10 different definitions. what is an example of the validation demonstrating underfitting or overfitting?
The role of the validation set in adjusting weights is still unclear to me after listening to that part 3 times. So not a great explanation. Maybe you need to make a video specific to this.
This is wrong, the test set must also be labelled. Otherwise you cannot evaluate the model at the end. The video should be corrected because it is teaching incorrect information to people.
Machine Learning / Deep Learning Tutorials for Programmers playlist: ua-cam.com/play/PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU.html
Keras Machine Learning / Deep Learning Tutorial playlist: ua-cam.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html
I love how concise (waffle free) your videos are.
Thank you, Andre!
Easy, clear, and complete. Perfect explanation ! 🌹
I have finally understood the difference between the validation and test sets as well as the importance of the validation set. Thanks for the clear and sample explanation.
Mr. mohamed could you please tell me the importance of validation set and how to prepare because i don't understand it well?
thank you
@@adanegebretsadik8390 Let's consider that you have a dataset D which we will split as follows:
1- 70% of D as train set = T'
2- 30% of D as test set = S
We will further split the train set T' as follows:
1- 70% of T' as train set = T
2- 30% of T' as valid set = V
To construct a good classifier model we need it to learn all the important information related to T, then validate it first on V, and then test it at the end on S.
The perfect model will have a good score on both V and S.
A simple explanation of this setup would be this: if you are learning a new course (Machine Learning) D, you study the course material (T) and you will have to pass some labs (V).
If you have achieved a good score in V, you are eligible to take the final test S with confidence. Otherwise you will have to re-learn the course material T and test yourself a second time on V until you achieve good results in V.
@@mohamedlichouri5324 Thank you so much bro, I finally understood. But one thing that is not clear to me is how to split V from the train set in Keras (Python)?
Again, thank you.
@@adanegebretsadik8390 I often use the train_test_split function like this:
from sklearn.model_selection import train_test_split
# 1- Split the data into 70% T' and 30% test S.
X_trn, X_test, y_trn, y_test = train_test_split(X, y, test_size=0.30)
# 2- Re-split T' into 70% train T and 30% valid V.
X_train, X_valid, y_train, y_valid = train_test_split(X_trn, y_trn, test_size=0.30)
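To see what the two-stage split above does to the overall proportions, here is a dependency-free sketch. The split helper is a hypothetical stand-in written for illustration; it is not scikit-learn's implementation, and the 1000-element list is just placeholder data.

```python
import random


def split(items, holdout_fraction, seed):
    """Shuffle a copy of items and split off holdout_fraction of it as a hold-out set."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_holdout = round(len(shuffled) * holdout_fraction)
    cut = len(shuffled) - n_holdout
    return shuffled[:cut], shuffled[cut:]


data = list(range(1000))  # placeholder for the dataset D

# Stage 1: D -> T' (70%) and test set S (30%).
t_prime, test = split(data, 0.30, seed=0)
# Stage 2: T' -> train set T (70% of T') and validation set V (30% of T').
train, valid = split(t_prime, 0.30, seed=1)

print(len(train), len(valid), len(test))
```

Since the second split is taken from T' rather than from D, the final shares of D are about 49% train, 21% validation and 30% test.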
Thank you, sir. I would like to contact you to learn from your deep knowledge of machine learning, since all the tips I have gotten from you so far have been essential for me. Do you mind if I contact you through social media? For your information, I am a master's student in computer engineering, so it may help me with my thesis.
Because of you, i am learning ANN during corona lockdown. Thank you very much.
And me trying to recall as fast as possible Xd
You said that “test set is unlabeled” but actually it is a labeled dataset. Of course it could be unlabeled because it isn’t adding anything to the model while it is training, but we use a labeled test set to quickly determine our model's performance when it has finished training.
Hi @Gábor Pelesz
... That's what I thought too. I was wondering if I could get your insights on the main difference between the validation and test sets. From what I understand, the validation set is used with training. Meaning, after training say a Logistic Regression model (~100,000 iterations with specific hyperparameters)... then we run the validation set through this trained model... after which some metrics are calculated. If the error is bad, then we tune the hyperparameters, or do whatever is necessary... and then train some more based on the changes. Then after that... we validate again using the validation data... and this goes on until we get a satisfactory error chart.
Wouldn't the TEST set now be redundant? Since we already achieved good performance on the validation set. From what I've self-learned, we basically sample all sets (training, validation and test) from basically the same distribution... right? Would appreciate any insights.
@@alfianabdulhalin1873 1. In an ideal world, where we can train with data that completely covers the space of the variables, the test set might be useless because it isn't adding any information for us (i.e. it is redundant, it was already in the training set). Therefore our model's performance would be exactly what it achieved while training. But sadly we are so far from this world that, even with additional test sets, we are only able to estimate the performance of our models. So summing up, the training and validation sets are, let's say, 80%. The 20% that's left is more likely (and it is also important) to be unique and different.
2. We are training with the training set, so our model is most biased towards the training set. Let's assume the model is tested against the validation set after it has gone through all our data once and is about to start over for another iteration (i.e. if we have 10 training samples, then after every 10th step we test against the validation set). While validating, we modify some hyperparameters accordingly (e.g. learning rate). What's important is that we change things after seeing how our validation tests performed, thus our model is also biased towards the validation set (although not as much as towards the training set). This emphasizes the relevance of a test set, a set of datapoints that the model has probably never seen before (it is also important for the test set to be unique, different from the others, to make sense).
Hope your questions are answered!
Yes, the reason we pass labels with test data is to determine the accuracy, otherwise, those labels play no other role. It is like, you pass your unlabelled test data through the model collect all the predictions and then using the correct labels to compute the accuracy.
@@tamoorkhan3262 Do you mean test dataset is another validation set? After all, they are the same in the sense that their labels will not be used to update model parameter, and their labels are only used to generate some accuracy numbers.
@@aroonsubway2079 To the best of my knowledge, the main point in distinguishing between the validation set and the test set is the following. During the training phase, we want to maximize the performance (accuracy) calculated on the validation set. By doing this, after a while we are adjusting hyperparameters (number of neurons, activation functions, number of epochs...) to perform well on "that particular" validation set! (That's why cross-validation is generally a good choice)
The test set should be considered "one shot". We do not generally adjust hyperparameters to have a better performance on test set, because that was the role of the validation set. (Also the test set is labelled)
It's an approximation but in general:
👉 train set -> to adjust weights of our model
👉 valid set -> to adjust hyperparameters
👉 test set -> calculate final accuracy
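The three roles listed above can be sketched with a toy regression problem. This is only a minimal sketch, assuming NumPy is available; the synthetic sine data, the candidate polynomial degrees, and the helper name `mse` are made up for illustration and do not come from the video.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (an assumption for illustration): noisy samples of a sine curve.
x = rng.uniform(-3, 3, size=300)
y = np.sin(x) + rng.normal(scale=0.2, size=x.shape)

# Shuffle once, then split 60% / 20% / 20% into train / validation / test.
idx = rng.permutation(len(x))
train_idx, valid_idx, test_idx = idx[:180], idx[180:240], idx[240:]

def mse(coeffs, xs, ys):
    """Mean squared error of a polynomial with the given coefficients."""
    return float(np.mean((np.polyval(coeffs, xs) - ys) ** 2))

best_degree, best_coeffs, best_valid_mse = None, None, float("inf")
for degree in [1, 3, 5, 9]:  # the hyperparameter being tuned
    # Train set -> adjusts the weights (here: the polynomial coefficients).
    coeffs = np.polyfit(x[train_idx], y[train_idx], deg=degree)
    # Validation set -> compares hyperparameter choices; it never fits coefficients.
    valid_mse = mse(coeffs, x[valid_idx], y[valid_idx])
    if valid_mse < best_valid_mse:
        best_degree, best_coeffs, best_valid_mse = degree, coeffs, valid_mse

# Test set -> used once, after the degree is fixed, for the final number.
test_mse = mse(best_coeffs, x[test_idx], y[test_idx])
print(f"chosen degree={best_degree}, valid MSE={best_valid_mse:.3f}, test MSE={test_mse:.3f}")
```

Because the degree was picked to look good on the validation points, the validation error is a slightly optimistic estimate; the test error is the one number that played no part in any choice.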
Really helpful. Finally understood the difference between Validation and Test set.
I really appreciate how simply you explained the concept. Your videos really help me to get the basic concept of DNN
Ur videos are neat. I have even to pause them and digest all the information before moving on sometimes. Thanks for your work.
Best video series so far found which explains the concepts of Neural networks :)
Thank you, Hiroshi!
Loved all the videos and extremely clear with the concepts and the foundations of ML, often we run models but don't have in depth understanding of what exactly it is. Your explanation is by far the best across all videos I have seen. I can actually go ahead and explain the concepts to others with full clarity. Thank you so much for your efforts. One request, I think there is one concept that got missed, " regularizers ". It will be nice to have a short video on that too. Thanks again for your precious time and super awesome explanation. Looking forward to being an expert like you :)
Thanks, sunaina! Happy to hear how much you enjoyed the course :)
Btw, regularization is covered here:
ua-cam.com/video/iuJgyiS7BKM/v-deo.html
Thanks a lot for these videos.. I was trying using CNN and Keras without explanation and I was just lost - now I get it.. Thx again angel
Perfect rate at which you speak. Perfect.
well done; straightforward and clear; thanks a lot
Very helpful, precise definition. I appreciate it :)
well i was reading deep learning with python and i got a bit lost, this video explained it to me very well so thank you and keep the hard work
Very clearly explained, thanks.
Thanks Teacher!!! Gratefull
{
"question": "The test set differs from the train and validation sets by:",
"choices": [
"Being applied after training and being unlabeled",
"Being applied after training and being labeled",
"Being randomly selected data",
"Being hand-selected data"
],
"answer": "Being applied after training and being unlabeled",
"creator": "Chris",
"creationDate": "2019-12-11T04:29:35.828Z"
}
Thanks, Chris! Just added your question to deeplizard.com
Thank you very much. I understood everything literally. Big thanks
Loving this series. Concise and to the point. (Y)
comments = "Thank you for your videos"
response = "You're welcome, Ra-ki Papa!"
@@deeplizard Perfect💖
Thank you so much for your videos! You make machine learning so much more understandable and fun :) I really appreciate your passion! Keep it up!!!
Best video series so far found which explains the concepts of Neural networks :) ... One small suggestion.. better if the font size of the 'Jupyter Notebook' is a bit bigger. So it will be easier to check the code :)
Thanks for the suggestion, Hiroshi! In later videos that show code, I've started zooming in to maximize the individual code cells I'm covering. As an example, you can see the code starting at 7:33 in this video: ua-cam.com/video/ZjM_XQa5s6s/v-deo.htmlm33s
Let me know what you think of this technique.
Thank you very much for this video! It helps me get a quick understanding of the use of these 3 separate datasets (i.e. Train, Test and Validation)!
thank you very much for this clear and helpful explanation.
Thank you for your clear explanation
Such a great video!!!!! Thank you!!!!!!
Really cleared my mind! Thank you :) Keep up the good work.
I have one question. In Tensorflow's Object detection API they tell us to create a training directory and a test directory and as usual 90-10 distribution. But we gotta label all of them. So this means the test directory in case of Tensorflow's API is actually Validation set right?
Hey Nirbhay - Not necessarily. Sometimes we'll label our test sets so we can see the stats from how well the model predicted on the test data. For example, we may want to plot a confusion matrix with the results from the test set. More on this here: ua-cam.com/video/km7pxKy4UHU/v-deo.html
If the test set is labeled, we just have to take extra precaution to make sure that the labels are not made available to the model, like they are for training and validation sets.
Ok I have to dig up more in order to understand it. By the way the page at the weblink you sent ain't available. Could you please post it again? or perhaps the title of the video? Thanks :)
The ")" was caught on the end of the URL. Here's the link: ua-cam.com/video/km7pxKy4UHU/v-deo.html
One question:
At 1:40 you said weights won't be updated based on validation loss. If so, how does validation set help us? Since we are not using it to update the model... Later at 1:57 you said it's used so model doesn't overfit. How? When does it come into play?
Goes without saying, great video! I'm on a spree!
Hey Ivan - We, as trainers of the model, can use the validation set metrics as a monitor to tell whether or not the model is overfitting. If it is, we can make appropriate adjustments to the model. The model itself though is not using the validation set for learning purposes or weight updates.
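As a rough illustration of the monitoring described above, the sketch below scans per-epoch loss histories and reports the first epoch at which validation loss starts a run of consecutive increases while training loss keeps falling. The function name `first_overfit_epoch`, the `patience` parameter, and the loss values are all hypothetical, chosen only to show the pattern.

```python
def first_overfit_epoch(train_loss, val_loss, patience=2):
    """Return the index of the epoch where validation loss begins a run of
    `patience` consecutive increases while training loss keeps decreasing.
    Returns None if no such run exists."""
    run = 0
    for epoch in range(1, len(val_loss)):
        val_up = val_loss[epoch] > val_loss[epoch - 1]
        train_down = train_loss[epoch] < train_loss[epoch - 1]
        run = run + 1 if (val_up and train_down) else 0
        if run == patience:
            return epoch - patience + 1  # first epoch of the run
    return None

# Made-up histories: training loss keeps improving, validation loss turns upward.
train_loss = [0.70, 0.55, 0.45, 0.38, 0.33, 0.29, 0.26]
val_loss = [0.72, 0.60, 0.52, 0.50, 0.53, 0.57, 0.62]

print(first_overfit_epoch(train_loss, val_loss))  # 4
```

The model's weights are never touched here; the function only gives the person training the model a signal that it may be time to stop earlier, add regularization, or change other hyperparameters.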
@@deeplizard we adjust the model manually! Ok, it makes sense now :)
Thank you!
Thank you for the video! Super concise and clear. If you could shortly mention some real world examples in the future videos, that would be great, I see in the comments that people have been wondering about similar things as I have. Or maybe you have done that, I'm about to check the other videos as well :)
You're welcome, Dragana! And thank you!
Yes, as the playlist progresses, I do introduce some examples. More hands-on examples (with code) are shown in the Keras and TensorFlow.js series below. Those series more so focus on how to implement the fundamental concepts we cover in this series.
I hope you enjoy the next videos you check out!
Keras: ua-cam.com/play/PLZbbT5o_s2xrwRnXk_yCPtnqqo4_u2YGL.html
TensorFlow.js: ua-cam.com/play/PLZbbT5o_s2xr83l8w44N_g3pygvajLrJ-.html
You're like the 3blue1brown of deep learning! You deserve waaay more subs.
Maybe if you include tensorflow tutorials in this format you could get a crap ton of more subs because there'll be others out there looking for explanations of how tensorflow works in intuitive ways, who aren't mathematically literate. Key here is to reduce the need for mathematical literacy and make the concepts more intuitive and easier to get into.
If you were to introduce math literacy needed to explain these concepts, then you'd need to hope that the people who are looking to understand these concepts have figured out that by watching the likes of 3blue1brown (assuming that they've found him in the process of wanting to understand the math (hint: most people don't want to learn the math, they just want to understand the code)).
So there you have it, a possible method for you to gain more subs :P
Hehe thank you, Jennifer! :D
Quality content , thanks
@Deeplizard u explained very well...
Hello, do you have videos explaining different types of activation functions, when to use a specific one?
And do you have a video about optimizers ? Like Momentum
This episode explains activation functions:
deeplizard.com/learn/video/m0pIlLfpXWE
No specific episode on optimizers, although we do have several explaining how a NN learns via SGD optimization.
thank you for this interesting video
Fantastic video, one aspect I am confused about is what is the algorithm doing when it is 'training' the data? How does it train on data and how do we know it is correct? Do you have any videos on this question or know where I could look to understand? Thank you.
Yes, check out the Training and Learning lessons in the course:
deeplizard.com/learn/playlist/PLZbbT5o_s2xq7LwI2y8_QtvuXZedL6tQU
Check 3blue1brown's Neural Network videos
I believe you're wrong about the test set being unlabeled. As far as I remember from Andrew Ng course in Stanford, the training set is used for model tuning for multiple models; the validation set is used for model selection (this is where you compare different models to check which one best performs on data not used for training). Once you choose a definitive model, you still have to check if it generalizes well for data never seen before, that does not carry any bias on model selection. At this point, you don't do any further tuning. Besides, having a labeled test set allows you to define test error. If data are unlabeled, this term doesn't make any sense, does it?
The test set's labels just cannot be known to the model in the way that the train and validation sets are. So as far as the model knows, the test set is unlabeled. You may have the test labels stored elsewhere though to do your own analysis.
One question. Data in the test set does have labels, right? But it's not known to the classifier... It's only labeled so that we could calculate all the metrics at the end more easily... right?
Sometimes we'll have labels for the test set, and other times we may not. When we do have the labels, you're correct that the network will not be aware of the labels that correspond to the samples. The network will understand that there are, say 10 different classes, for which it will need to classify the data from the test set, but it will not know the individual labels that correspond to each sample.
I'm kinda confused about this, for a very large dataset - say (10 million records).
In general, in a production environment, how would the train/test split be done to evaluate how our model is working?
-> I have heard in a few resources that it is okay to split the data into 98% for training , 1% validation (100,000 rows) and 1% testing (100,000 rows). The theory behind this is that 1% of the data is most probably representing the maximum variance in the data.
-> And some say, we have to split the data more or less 70% for training, 15% for validation and 15% for testing. The theory behind this is that if we have a large data for validation, testing and if it is giving good accuracy on that, then we can say with "confidence" that it would work nearly the same in real time as well.
If any of this is right or wrong, could you please explain me with a reason.
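One way to reason about the two options above is to look at how many rows each split leaves for testing and how noisy an accuracy estimate from that many rows would be. The sketch below uses the binomial standard error sqrt(p(1-p)/n) as a rough guide. The assumed true accuracy of 0.90 and the helper names are illustrative assumptions, and the formula assumes independent, representative samples.

```python
import math

def split_sizes(n_rows, train_pct, valid_pct):
    """Return (train, valid, test) row counts; the test set gets the remainder."""
    n_train = n_rows * train_pct // 100
    n_valid = n_rows * valid_pct // 100
    return n_train, n_valid, n_rows - n_train - n_valid

def accuracy_std_error(accuracy, n_samples):
    """Binomial standard error of an accuracy estimate from n_samples rows."""
    return math.sqrt(accuracy * (1 - accuracy) / n_samples)

n_rows = 10_000_000
assumed_accuracy = 0.90  # hypothetical true accuracy, for illustration only

for train_pct, valid_pct in [(98, 1), (70, 15)]:
    n_train, n_valid, n_test = split_sizes(n_rows, train_pct, valid_pct)
    se = accuracy_std_error(assumed_accuracy, n_test)
    print(f"{train_pct}/{valid_pct}/{100 - train_pct - valid_pct}: "
          f"test rows={n_test:,}, std. error of accuracy ~ {se:.5f}")
```

Under these assumptions both splits give a standard error of roughly a tenth of a percentage point or less, which is the usual argument for why a 1% test set can be enough at this scale. With only a few thousand rows in total, the same 1% would be far noisier.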
Could you give a real world example of training set and validating set ?!
kinda like i want to train if it's blue or red flower depending on its height or width ... and i use k-nearest neighbour so what validation set consists of ?
Hey Andy,
So, sticking with your example of flowers-- You would start out by gathering the data for red and blue flowers. This data would presumably be numerical data containing the height and width of the flowers, and each sample from your data set would be labeled with "blue" or "red." You would then split this data up into a training set and validation set. A common split is 80% training / 20% validation. You would then train your model on the data in the training set, and validate the model on the data in the validation set.
Does this example make things more clear?
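Sticking with the flower example, here is a minimal k-nearest-neighbour sketch in plain Python. The measurements are randomly generated under the made-up assumption that blue flowers tend to be taller and wider, and `knn_predict` is a hypothetical helper, not code from the video.

```python
import random

random.seed(0)

# Made-up flower measurements (height_cm, width_cm) with a colour label.
flowers = []
for _ in range(50):
    flowers.append(((random.gauss(10, 1.5), random.gauss(4, 0.8)), "red"))
    flowers.append(((random.gauss(16, 1.5), random.gauss(7, 0.8)), "blue"))

# Shuffle, then split 80% training / 20% validation.
random.shuffle(flowers)
split = len(flowers) * 80 // 100
train_set, valid_set = flowers[:split], flowers[split:]

def knn_predict(sample, labelled_points, k=3):
    """Predict by majority vote among the k nearest labelled points."""
    by_distance = sorted(
        labelled_points,
        key=lambda item: (item[0][0] - sample[0]) ** 2 + (item[0][1] - sample[1]) ** 2,
    )
    votes = [label for _, label in by_distance[:k]]
    return max(set(votes), key=votes.count)

# Validation: predict each held-out flower from the training flowers only,
# then compare the prediction against its known label.
correct = sum(knn_predict(x, train_set) == label for x, label in valid_set)
valid_accuracy = correct / len(valid_set)
print(f"validation accuracy: {valid_accuracy:.2f}")
```

So the validation set is simply the 20% of labelled flowers that the classifier never gets to use as neighbours. Trying a few values of `k` and keeping the one with the best validation accuracy would be the hyperparameter-tuning step.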
If the test set is unlabeled, how can you measure accuracy? How can you know that the model works?
You actually have the labels of your test set, but you don't pass them through your model. You wait until the model makes its predictions, then compare them with the labels you held back, and calculate the accuracy based on how closely they match.
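A minimal sketch of that comparison, with made-up predictions and held-back labels (the names `accuracy`, `held_back_labels` and `model_predictions` are illustrative only):

```python
def accuracy(predictions, true_labels):
    """Fraction of predictions that match the held-back labels."""
    if len(predictions) != len(true_labels):
        raise ValueError("predictions and labels must have the same length")
    matches = sum(p == t for p, t in zip(predictions, true_labels))
    return matches / len(true_labels)

# Hypothetical example: the model only ever saw the test samples, not these labels.
held_back_labels = ["cat", "dog", "dog", "cat", "dog"]
model_predictions = ["cat", "dog", "cat", "cat", "dog"]

print(accuracy(model_predictions, held_back_labels))  # 4 of 5 match -> 0.8
```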
I have a question? Why we can not use validation approach in normal machine learning? Why we only use it in deep learning problems to prevent overfitting?
Explanation was really good ma'am but the white screen console that you showed could not be read. Please make those contents brighter and in big fonts.
Thanks for the feedback, Sayantani. In later videos, I zoom in on the code, so it is much easier to read. Also, note that most videos have corresponding text-based blogs that you can read at deeplizard.com
The blog for this video can be found at
deeplizard.com/learn/video/Zi-0rlM4RDs
Thanks
Ma'am, can you just tell me, do you have an NLP (natural language processing) playlist?
Not yet
Thank you for the video. I just have one question. How do we know how well our model is performing on the test set if we don't have labels to tell us the "correct answer" and if we don't even know what the correct answer is ourselves. How do we then know that the model performed well or badly on the test set? Thanks again.
Hey dazzaondmic - You're welcome! If we don't know the labels to the test set ourselves, then the only way we can gauge the performance of the model is based on the metrics observed during training and validating. We won't have a solid way of judging the exact accuracy of the model on the test set. If we have a decently sized validation set though, and the data contained in it is a good representation of what the model will be exposed to in the test set and in the "real world," then that increases confidence in the model's ability to perform well on new, unseen data if the model indeed performs well on the validation data.
Does this help clarify?
IMO,the test dataset should have labels so that we can at least have some accuracy numbers to look at in the end. The only difference btw validation dataset and test dataset is that, we still have chance to update model based on the validation results by tuning hyperparameters. However, test dataset only provide us a final accuracy number, even it is bad, we won't perform additional training.
thanks very much for the videos!! then the labels of the training set and the array of its labels should be ordered to match properly, correct? f.i. if index 0 of the training array is the picture of a cat, then index 0 of label array should be 0 (and 1 if a dog)?
Yes, correct!
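Because the pairing is purely positional, anything that reorders the samples has to reorder the labels in exactly the same way. A small sketch with hypothetical file names, shuffling both lists with one shared permutation:

```python
import random

samples = ["cat_01.jpg", "dog_01.jpg", "cat_02.jpg", "dog_02.jpg"]  # hypothetical file names
labels = [0, 1, 0, 1]  # 0 = cat, 1 = dog; labels[i] belongs to samples[i]

# Shuffle both lists with the same permutation so each pair stays together.
order = list(range(len(samples)))
random.Random(42).shuffle(order)
shuffled_samples = [samples[i] for i in order]
shuffled_labels = [labels[i] for i in order]

for name, label in zip(shuffled_samples, shuffled_labels):
    print(name, label)
```

Shuffling the two lists independently would silently break the pairing and leave the model training on wrong labels.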
i hope that i'll get reply, my question is that do i have to dig deeper in machine learning concepts for starting deep learning or all these fundamentals are fair enough to start deep learning btw thankyou for providing us such valuable content for free
Yes, you can start with this DL Fundamentals course without prior ML experience/knowledge. Check out the recommended deep learning roadmap on the homepage of deeplizard.com
You can also see the prerequisites for each course there as well.
@@deeplizard thank you so much, i'm falling in love with lizard for the first time xD
FIT does NOT have validation ?
most of the time, people code looks like this:
clf = svm.SVC().fit(X_train, y_train)
the validation_set and validation_split are no where to be found, even sklearn doc doesn't mention it.
What is going on, how come these model don't get overfitting without a validation set ?
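scikit-learn's `fit` does not take a validation argument, so a held-out set has to be created and scored separately. A minimal sketch, assuming scikit-learn is installed; the synthetic data from `make_classification` and the candidate values of `C` are placeholders, not anything from the video:

```python
from sklearn import svm
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

# Placeholder data standing in for a real X, y.
X, y = make_classification(n_samples=500, n_features=10, random_state=0)

# Hold out 20% of the data as a validation set before fitting.
X_train, X_valid, y_train, y_valid = train_test_split(
    X, y, test_size=0.20, random_state=0
)

# fit() only sees the training data; score() reports accuracy on whatever it is given.
results = {}
for C in [0.1, 1.0, 10.0]:
    clf = svm.SVC(C=C).fit(X_train, y_train)
    results[C] = (clf.score(X_train, y_train), clf.score(X_valid, y_valid))
    print(f"C={C}: train acc={results[C][0]:.2f}, valid acc={results[C][1]:.2f}")
```

A model fitted without any held-out data can still overfit; it just goes unnoticed. A large gap between training and validation accuracy is the signal to look for, and `sklearn.model_selection.cross_val_score` is a common alternative to a single held-out split.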
Thank you for this video. In the results of the example, the validation accuracy is higher than the training accuracy. Is this considered a problem?
Hey Nada - It's typically going to be up to the engineer of the network to determine what is considered acceptable regarding their results. In general, I would say that you typically want your training and validation accuracy to be as close as you can get them to each other. If the validation accuracy is considerably greater than the training accuracy, then you may want to take steps to decrease the difference between those metrics. If the model used in this video was one I planned to deploy to production, for example, then I would take steps to close this gap. This would be considered a problem of underfitting. I talk more about that here: ua-cam.com/video/0h8lAm5Ki5g/v-deo.html
I have trained and validate data. Now how can i test an image in the model?
Hey Pooja - Check out this video, and let me know if it answers your question.
ua-cam.com/video/bfQBPNDy5EM/v-deo.html
There's an error in using validation_data in model.fit. The format should be a tuple of NumPy arrays, i.e. valid_set = (np.array([0.6,0.5]), np.array([1,1]))
Yes, this is specified in blog for the episode below:
deeplizard.com/learn/video/dzoh8cfnvnI
silly question: can you also pass in pandas df's or is it irrelevant? and numpy is enough?
It's indeed helpful and understandable. As I am at beginner level, I wonder if there any way to get the demo code you are using for making these videos. Thanks in advance.
Thanks, Arifur! Download access to code files and notebooks are available as a perk for the deeplizard hivemind. Check out the details regarding deeplizard perks and rewards at: deeplizard.com/hivemind
If i have 1000 rows in dataset. Then how can select 800 rows for training and 200 for testing instead of select randomly in splitting?
Keras can automatically split out a percentage of your training data for validation only.
More here: deeplizard.com/learn/video/dzoh8cfnvnI
You are the best :)
validation set is used to tweak the hyperparameters of a model.
Hello,
Thank you for this playlist . It's awesome!
My question is : In some cases, we don't specify a validation set. Why ? and when is not important to set a validation data ?
This is my question too. Hope someone answers.
@@nandinisarker6123 Well, I found these 2 links:
stats.stackexchange.com/questions/19048/what-is-the-difference-between-test-set-and-validation-set
machinelearningmastery.com/difference-test-validation-datasets/
why can't you use Test after each epoch of training, since no weight will be updated from it?
Where to get the dataset?
does this work for unsupervised and reinforcement learning?
No, RL works differently. Check out our RL course:
deeplizard.com/learn/playlist/PLZbbT5o_s2xoWNVdDudn51XM8lOuZ_Njv
@@deeplizard ty
Using Google Colab on this.
I've set the same model and hyper-parameters. I've also used the same code to preprocess the data and the same params to train the model (batch_size, Adam's lr, validation_split, epochs) but I'm not getting the same metrics as you while training no matter how much I try.
the validation accuracy plateaus around 0.75 and the val_loss starts at around 0.68 decreases then starts increasing around the 12th epoch to end around 0.66. This is bugging me and I can't figure it out.
PS: I also tried with theano as a backend for Keras
Hey Yassine - Are you using the same data to train your model? This data was created in the Keras series. If you did use the same data, then be sure that you caught the reference in the Keras validation set video to reverse the order of the for-loops that generates the data. Let me know.
@@deeplizard Thank you for reaching back.
Yes, I generated the data in the same fashion as you did in the preprocessing data video from the Keras series. I caught the for-loop reverse reference in that same series, and after rectifying my code the validation accuracy and loss behaved normally while fitting. But I'm not sure why the behaviour changed. As you explained, the validation split takes a percentage of the training set prior to fitting (I dunno if the validation set is generated after a shuffle or not in this case) and isn't regenerated on each shuffle at each epoch. But why does switching the order of the for-loops matter? You are still taking the bottom 10 or 20% of your data regardless of the for-loop order, and you end up with the same validation data on each epoch.
Also, I used a sigmoid function in the output as you did in this series yet my prediction probabilities don't sum up to 1 as you depicted in the prediction video within the same playlist. Using a softmax function like in Keras API series works fine. It helps if you could clear up this confusion.
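On the sigmoid question above: a sigmoid is applied to each output unit independently, so nothing forces the outputs to add up to 1, whereas softmax normalizes across the units. A small sketch in plain Python with made-up logits for a two-unit output layer:

```python
import math

def sigmoid(z):
    """Squash a single value into (0, 1), independently of any other unit."""
    return 1.0 / (1.0 + math.exp(-z))

def softmax(zs):
    """Normalize a list of values into probabilities that sum to 1."""
    shifted = [z - max(zs) for z in zs]  # subtract the max for numerical stability
    exps = [math.exp(z) for z in shifted]
    total = sum(exps)
    return [e / total for e in exps]

logits = [1.2, -0.4]  # hypothetical raw outputs of a two-unit output layer

sigmoid_out = [sigmoid(z) for z in logits]
softmax_out = softmax(logits)

print("sigmoid:", sigmoid_out, "sum =", sum(sigmoid_out))
print("softmax:", softmax_out, "sum =", sum(softmax_out))
```

With these logits the sigmoid outputs sum to more than 1, while the softmax outputs sum to 1 (up to floating-point rounding).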
The validation_split parameter takes the last x% of the data in the training set (10% in our example), and doesn't shuffle it. With the way I had the for-loops organized originally, the validation split would completely capture all of the data in the second for loop, which was the 5% of younger individuals who did experience side effects and the 5% of older individuals who did not experience side effects. Therefore, none of the data in the second for-loop would be captured in the training set.
With the re-ordering of the for-loops, the training set is made up of the data that is now generated in both for-loops.
Another (better) approach we could have taken is, after generating all the data with both for-loops (regardless of which order the loops are in), we could shuffle all the data completely, and then pass that shuffled data to our model. With shuffled data, the training set and validation sets would be a more accurate depiction of the true underlying data since there would be no real ordering to it. As long as your data in the real world is shuffled before you pass it to your model, you shouldn't experience this problem.
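The effect described above can be reproduced without any model. The sketch below mimics the stated behaviour (hold out the last x% of the data, without shuffling) on hypothetical data built in two stages; the sample counts and the helper `split_last_fraction` are made up for illustration and are not the series' actual data or Keras code.

```python
import random

# Hypothetical data built in two stages, like two back-to-back for-loops:
# 1900 samples following the "typical" pattern, then 100 "atypical" ones.
data = ["typical"] * 1900 + ["atypical"] * 100

def split_last_fraction(samples, percent):
    """Hold out the last `percent`% of the samples for validation, unshuffled."""
    n_valid = len(samples) * percent // 100
    if n_valid == 0:
        return samples[:], []
    return samples[:-n_valid], samples[-n_valid:]

# Unshuffled: every atypical sample lands in the validation set.
train, valid = split_last_fraction(data, 10)
print("unshuffled:", train.count("atypical"), "atypical in train,",
      valid.count("atypical"), "in validation")

# Shuffled first: both sets contain a mix of the two patterns.
shuffled = data[:]
random.Random(0).shuffle(shuffled)
train_s, valid_s = split_last_fraction(shuffled, 10)
print("shuffled:  ", train_s.count("atypical"), "atypical in train,",
      valid_s.count("atypical"), "in validation")
```

In the unshuffled case the training set never sees an atypical sample, so the model is validated on a pattern it had no chance to learn.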
All fine, but I came to know that the dependent variable are not included in the training set, how is that? Thank you
Hey balabodhi - I'm not sure what you mean by "dependent variable." Can you please elaborate?
For prediction we split a dataset into two sets: one is the train set and the other is the test set. But when we are given separate datasets for training and testing, do we include the dependent variable (response variable or the Y variable) in the testing dataset? In one of my simple logistic regression analyses, I was given three separate datasets: training, validation and testing. In the testing dataset, I don't have the response variable, i.e. the Y variable. So this is my question: can we test on a dataset without the response variable Y?
I see. Yes, many times, we don't have the labels for the test data.
This is completely fine. The labels for training and validation data are required, but labels for test data are not required.
ok that's fine, but I couldn't understand what is labels
how to test accuracy when predicting non-binary output ?
(as far as i know , they use ''confusion matrix'' when output are binary)
You can also use a confusion matrix for non-binary output. I show an example of this towards the end of this video: ua-cam.com/video/FNqp4ZY0wDY/v-deo.html
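For more than two classes the matrix simply gets one row and one column per class. A minimal sketch in plain Python with a made-up three-class problem (the class names and predictions are hypothetical):

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix

classes = ["cat", "dog", "lizard"]  # made-up three-class problem
y_true = ["cat", "cat", "dog", "dog", "lizard", "lizard", "lizard"]
y_pred = ["cat", "dog", "dog", "dog", "lizard", "cat", "lizard"]

cm = confusion_matrix(y_true, y_pred, classes)
for name, row in zip(classes, cm):
    print(f"{name:>7}: {row}")

# Overall accuracy is the diagonal (correct predictions) over the total.
accuracy = sum(cm[i][i] for i in range(len(classes))) / len(y_true)
print(f"accuracy: {accuracy:.2f}")
```

The off-diagonal entries show which classes get confused with which, something a single accuracy number hides.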
what happens if i give same set of images for both validation and training datasets?
You will not be able to identify overfitting or see how your model is generalizing to data it wasn't trained on.
@@deeplizard i got an error :"`validation_split` is only supported for Tensors or NumPy " what should I do now?
Either convert your data to a supported data type, or manually create a separate validation set. More details in this blog:
deeplizard.com/learn/video/dzoh8cfnvnI
Universities are getting obsolete when you have UA-cam. I mean I have seriously learned more from UA-cam on machine learning and C# coding than from professors at University. Thanks for this great explanation. 🙏
My dear Channel,
I only wish for you to change the ominous music in the beginning!
TY :-(
Lol it has been changed in later episodes :D
@@deeplizard Looking forward! Thanks again for the video
You are fast!
what is the different about model.evaluate() and model.predict()
model.predict() got the lower accuracy than model.evaluate()
I love your videos
Thanks, it made my mind clear you deserve a sub. But I have a question , what if we don't train? so it means like the accuracy will drop? like 0%? will give an error? or maybe like training is a must and what a stupid question I am asking hahahaha
btw learning NLP here using python - nltk
model? you mean the software or system right?
If you don't train the model, then it will likely perform no better than chance for the given task. By "model," I mean the neural network.
Please share the code as well
Hey App Inventor- The code files for this series are available as a perk for the deeplizard hivemind at the following link: www.patreon.com/posts/code-for-deep-19266563
Check out the details regarding deeplizard perks and rewards at: deeplizard.com/hivemind
thanks alot, this is really excellent tutorial, explained so well in a simple manner
i request you to please provide a tutorial on 3d convolution algorithm, to process the medical image files
Thank you, I'm glad you're enjoying the videos!
I'll add 3D convolutions to my list of potential topics to cover in future videos. Thanks for the suggestion.
In the mean time, I do have a video on CNNs in general below if you've not yet seen that one.
ua-cam.com/video/YRhxdVk_sIs/v-deo.html
1:00 - 1:55 You lost me there. First you said Validation is for HP tuning and then you say that it is Not
-_- off to another video/article
By validating the model against the validation set, we can choose to adjust our hyperparameters based on the validation metrics. The weights of the network, however, are not adjusted during training based on the validation set. (Note that weights are not hyperparameters.) The weights are only adjusted according to the training set. This is what is stated in the video. Hope it's clear now.
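The distinction can be seen in a bare-bones training loop. This is a toy sketch, not the Keras internals: a single weight `w` is fitted by gradient descent on hypothetical data for y = 2x, and all names are made up for illustration.

```python
# Toy data for y = 2x (hypothetical). The model is a single weight w.
train = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
valid = [(4.0, 8.0), (5.0, 10.0)]

def mean_squared_error(w, data):
    return sum((w * x - y) ** 2 for x, y in data) / len(data)

def mse_gradient(w, data):
    """Gradient of the mean squared error with respect to w."""
    return sum(2 * (w * x - y) * x for x, y in data) / len(data)

learning_rate = 0.05  # a hyperparameter: chosen by us, not learned
w = 0.0               # a weight: learned from the training set

for epoch in range(20):
    # The weight update uses the training set only.
    w -= learning_rate * mse_gradient(w, train)
    # The validation set is only measured; it never feeds into the update.
    val_loss = mean_squared_error(w, valid)
    if epoch % 5 == 0:
        print(f"epoch {epoch}: w={w:.3f}, val_loss={val_loss:.4f}")
```

If `val_loss` looked poor, the person training the model might change `learning_rate` or the number of epochs and rerun. That manual step is the only way the validation set influences the result.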
Dear the artificial intelligence community
I am pleased to introduce DIDA dataset, which is the largest handwritten digit dataset. I will be grateful, if you could help me to introduce this dataset to the community.
Thanks
I think you should slow down when you are explaining to let the information sink in :)
I have in later videos. In the mean time, each video has a corresponding written blog on deeplizard.com that you can check out for a slower pace :)
@@deeplizard Thank you so much the course helped me to understand better
Just talk more slowly. I had to put you at 0.75 speed and you sound like you are drunk.
Lol
The blogs are helpful for a slower pace as well:
deeplizard.com/learn/video/Zi-0rlM4RDs
haha same issue here
wayway
wayway
I'm lost. Are we really talking about weight training or something else?!
Use 0.5x speed.
Thank me later.
???? 0.5 ????
why is the music so scary?
speaking very fast as we are computers to catch up with the speed.
not helpful at all. why cant anybody just show an EXAMPLE so that we can really wrap our head around this. I have a vague idea of validation. I take it the validation just "updates" because of the new information?? if that is it, then why are there 10 different definitions. what is an example of the validation demonstrating underfitting or overfitting?
The role of validation set in adjusting weights is still unclear to me after listening that part 3 times. So not great explanation. Maybe you need to make a video specific to this.
I think you could redo this video and speak slowly and calmly.
Too fast for me.I kept on rewinding
You can use the corresponding blogs for every video on deeplizard.com to move at a slower pace as well.
deeplizard.com/learn/video/Zi-0rlM4RDs
man!!! you speak soooo fast it is hard to keep up with the video and what you are saying, great content, but you need to slow down......
Maybe talk slower. It's hard to understand people when they're racing.
speak slowly
too fast and too bad quality about telling concept
This is wrong, the test set must also be labelled. Otherwise you cannot evaluate the model at the end. The video should be corrected because it is teaching incorrect information to people.