First of all thanks for the video.
Bagging: we take different models and train them in parallel, each getting a subset of data from the total data, and each model has high variance and low bias.
Boosting: same idea, but instead of training them in parallel, the output of one model informs the next (each model learns from the mistakes of the previous one), and each model should have high bias and low variance.
Well said!
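A minimal sketch of that contrast in scikit-learn (the dataset and models here are illustrative, not the ones from the video):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Bagging: deep (high-variance, low-bias) trees trained independently on bootstrap samples.
bagging = BaggingClassifier(DecisionTreeClassifier(max_depth=None),
                            n_estimators=100, random_state=42)

# Boosting: shallow (high-bias, low-variance) stumps trained sequentially,
# each one focusing on the mistakes of the previous ones.
boosting = AdaBoostClassifier(DecisionTreeClassifier(max_depth=1),
                              n_estimators=100, random_state=42)

for name, model in [("bagging", bagging), ("boosting", boosting)]:
    model.fit(X_train, y_train)
    print(name, "accuracy:", model.score(X_test, y_test))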
I've been struggling to understand this for quite a few hours now. Finally, got it. Thank you so much!
Glad it helped!
That is very nicely explained. Thank you, Sir.
Hello Aman Sir, Thank you for the great video, simple explanation.
Could you please elaborate on how the meta-model is built and used for the testing / real-test set?
Like here, the meta-model uses logistic regression, right? How does a logistic regression work to stack the results from the base models?
I am new to this field and was trying to understand this concept; I referred to many webpages and watched many videos. You explain it very nicely. I got the concept.
Thanks for watching.
Thanks a lot... Was struggling with this Stacking approach.... Now it's clear!
Cheers Ranajay :)
Great explanation of the concept. Thank you for also showing the python samples to really bring it home.
I beg your pardon... I was struggling with this technique.
Very clearly understood, and the code got executed too!!
Thanks a lot
Thanks for watching Sri.
I love that you thoroughly explained the theory before you dove into the code. Great job!
Thank you.
Laudable teaching. Learnt a lot.
Absolutely very good explanation, better than my professor.
Thanks a lot. Your comments motivate me.
Wow, fast and clear, thanks.
You're welcome!
Can you make a separate video for Blending with detailed example and implementation without the libraries?
finished watching
crystal clear explanation
Greatly explained💥👌
Eid Mubarak Farhan. Tc
Bagging is Bootstrap Aggregation, which is used primarily to reduce variance; it relies on the CLT to do so. Boosting improves the base learners by learning from the mistakes of the previous model using homogeneous weak learners; it helps in reducing bias.
Thanks Bharat.
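A tiny numerical illustration of that variance-reduction point (purely synthetic numbers, only to show the averaging effect):

import numpy as np

rng = np.random.default_rng(0)
true_value = 1.0

# One high-variance learner: 500 noisy predictions with std = 1.
single = true_value + rng.normal(0, 1, size=500)

# Bagging-style: average 50 such (assumed independent) learners per prediction.
bagged = (true_value + rng.normal(0, 1, size=(500, 50))).mean(axis=1)

print("variance of a single learner:", round(single.var(), 3))   # roughly 1.0
print("variance of the average     :", round(bagged.var(), 3))   # roughly 1/50

In real bagging the trees are correlated (they share training rows), so the reduction is smaller than this idealized 1/N, but the direction is the same.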
Thanks a lot bro! ... Helped a lot for one of my projects!!
Welcome Harsha.
Beautiful, man, got my concepts cleared; you deserve more reach.
Thanks a lot Akshay. Kindly share in the data science groups you are part of :)
Good content
Thanks a lot.
Sir, Namaskar. Is the code you did in Python for stacking or blending? Kindly say.
Nice class
Thanks
Thanks for your sharing.
My pleasure
Very well explained. Can you also explain K x K cross-validation and go in depth on the meta model?
Thank you for explaining. Can you suggest which ensemble technique is suitable for a deep learning model for a video classification task?
Yes
Simply wonderful!
Many thanks!
Good Explanation... Thank you
You are welcome
Thanks for the video, sir. Can I perform stacking between different CNN models and feature fusion between these models?
How to use stacking regressor models from sklearn and Keras?
In bagging, we make different subsets of the dataset using row sampling with replacement; we pass each subset to a different model to make predictions, and at the end we combine or aggregate all of the models' predictions.
Thank you.
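A rough sketch of that row-sampling-with-replacement idea, with a simple majority vote as the aggregation step (all model and dataset choices here are illustrative):

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
models = []
for _ in range(25):
    # Row sampling with replacement: a bootstrap sample of the training rows.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    models.append(DecisionTreeClassifier(random_state=0).fit(X_train[idx], y_train[idx]))

# Aggregate: majority vote across the 25 trees.
votes = np.mean([m.predict(X_test) for m in models], axis=0)
y_pred = (votes >= 0.5).astype(int)
print("bagged accuracy:", (y_pred == y_test).mean())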
Bagging is a parallel process: in this method we can choose rows and columns with replacement, and an example is random forest. Boosting, in contrast, is a sequential process, and an example is XGBoost.
Correct. Thank you.
In bagging we take base learner models with high variance and low bias.
E.g., in random forest we typically take decision trees (fully grown to their max depth, with max_depth=None), as such decision trees are high-variance models.
The main aim of bagging is to reduce the high variance of the overall/final model.
In bagging we have bootstrap (row sampling and column sampling) and aggregation steps, which help achieve a low-variance final model.
Also, every base learner is trained on a sampled dataset, not on the whole dataset, so every base learner learns something unique or different from the other base learners.
Thanks!
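For reference, this is roughly how those choices map onto scikit-learn's RandomForestClassifier (defaults written out explicitly; a sketch, not the video's code):

from sklearn.ensemble import RandomForestClassifier

# Fully grown, high-variance trees (max_depth=None), each fit on a bootstrap
# row sample (bootstrap=True), with column sampling at every split
# (max_features="sqrt"); voting over many such trees lowers the variance.
rf = RandomForestClassifier(
    n_estimators=200,
    max_depth=None,
    bootstrap=True,
    max_features="sqrt",
    random_state=42,
)
# rf.fit(X_train, y_train) / rf.predict(X_test) as with any sklearn estimator.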
Thank you sir for this lecture.
I want to know one thing... the data that goes to the meta model consists of the independent variables and the actual output value (target variable Y), along with predictions from weak learners like LR, SVM, NN... so how does the meta model use the predictions from the weak learners to produce the final prediction?
Does the meta model treat the predictions from the weak learners as additional independent variables (along with the existing independent variables), with the target variable as the dependent variable, and then give the final prediction?
Please help.
Good question Tushar.
The meta model will take the predictions from the weak learners as features (no original features).
@@UnfoldDataScience Thanks for the reply. So the predictions from the weak learners are taken as independent variables and the original target variable as the dependent variable... right?
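A hedged sketch of that setup: the base learners' out-of-fold predictions become the meta model's independent variables, and the original target stays the dependent variable (the models and dataset below are placeholders, not the video's exact code):

import numpy as np
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict, train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

base_models = [SVC(probability=True), RandomForestClassifier(random_state=0)]

# Each column = one base learner's out-of-fold predicted probability on the training set.
meta_features = np.column_stack([
    cross_val_predict(m, X_train, y_train, cv=5, method="predict_proba")[:, 1]
    for m in base_models
])

# The meta model (logistic regression here) is trained on those predictions vs. the original y.
meta_model = LogisticRegression().fit(meta_features, y_train)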
Love your study sir..
Thanks Sangram.
Bagging helps in reducing the variance caused by overfitting in decision trees, and to further reduce bias, boosting is used. Hence, ultimately we achieve a model with low bias and low variance.
Correct Rahul.
thank you
You're welcome Anurag.
Thanks a lot, I am a beginner.
Thanks Neeraj.
Nice subject
Thanks a lot :)
Thank you! Keep making these videos.
Will do Chitram. Your comments are my motivation.
excellent
Thanks Chris
Clearly Explained.
Thanks a lot for motivating me.
@@UnfoldDataScience Sir, can you share the notebook of this tutorial?
Can we do a level-2 meta model? Also, can we insert new training features into the meta model?
Hi Aman, thanks for your explanation! I have a question though - are regularization and ensembling the same? In the decision-trees case we use the same techniques of bagging and boosting, so if I'm regularizing am I implicitly ensembling, and vice versa?
Thank you!
Good Explanation.
Glad you liked it
Hi Aman, thanks for your explanation. I have a question regarding deep learning models: can we stack yolov4 (where I converted the .weight file into a .h5 file) and other CNN models like InceptionResnetV2+LSTM into one ensemble model for different classification with different data?
I also have the same question...
Thank you very much.
I just want to understand how many approaches there are when implementing stacked ensemble learning,
I mean, when we combine the base learners with the meta learner.
There can be many ways of implementing it, depending on how you write the code; however, the internal logic remains the same.
Hi. Thank you so much for the video. Can you please guide on how to merge 2 BERT models together. Thanks for the help!
Thanks Jyoti, will do
Thank you sir
So nice of you Sabeena.
While executing the for loop, there is an error message: "type error: KNN not iterable". How do I solve this?
Nice explanation. How can we do testing with the test dataset?
The same way as in normal ML.
Very good Aman
Thank you.
thanks
You're most welcome
Hi Aman, thanks for the video. I have one question.
When we have finished training the stacked model and we now have a test sample, will the test sample go through all the base learners + the meta model (i.e., SVM, random forest, the Gaussian model, and logistic regression as the meta model), or will we feed the test sample only to the meta model (i.e., logistic regression in our case)?
Also have this question in mind. Why is no one answering this?
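For what it's worth, in the scikit-learn implementation the test sample first goes through every base learner and only their predictions reach the meta model; here is a minimal sketch (model choices are illustrative):

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

stack = StackingClassifier(
    estimators=[("svm", SVC()), ("rf", RandomForestClassifier(random_state=0))],
    final_estimator=LogisticRegression(),
)
stack.fit(X_train, y_train)

# predict() runs every base learner on X_test, then feeds their outputs
# to the meta model (logistic regression) for the final prediction.
print(stack.predict(X_test)[:5])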
Thank you for the video.
Can this approach be useful for semantic segmentation purposes?
For example, the ensemble consists of UNet, Deeplab and FCN,
with FCN as the meta classifier.
Is it going to get a better result?
Yes, we can try that, I am not 100% sure it will work though.
thank you a lot!
You're welcome raphael.
I am unable to locate this ipynb file in your Google Drive. Please guide.
I think this file is missing; I will try to find it and place it there, though I suspect it may be on my old laptop and difficult to recover.
Sir, which ensemble technique should we choose, and when should we choose it? How do we decide that?
Bagging and boosting are good. The decision will depend on available resources, data size, etc.
@@UnfoldDataScience Nobody has made a single video on this on UA-cam. You should definitely make a video on this topic!!!
Hi, I am very much interested in ML concepts and am trying to build a career in this, but I can see there are lots of mathematical derivations when trying to learn any new concept, and there are also so many libraries; it's quite difficult to get acquainted with all of these. Can you please guide me on how to actually learn all this so that it can be understood well?
Hi Rinky,
Please watch my machine learning playlist once. Tell me if it boosts your confidence:
ua-cam.com/video/8PFt4Jin7B0/v-deo.html
Is it possible that a blending model sometimes has lower accuracy than the initial models?
Possible.
Sir, you could have explained stacking a bit more clearly... Blending was good..🙏🙏
Thank you for the feedback :) and for watching too :)
finished coding
Good day! May I request a link for a copy of your code, sir? Thank you
Getting an error with this code:
# creating stacking classifier with above models
stackingclf = StackingClassifier(classifiers=[myclf1, myclf2, myclf3], meta_classifier=mylr)
The code below runs without error:
# creating stacking classifier with above models
stackingclf = StackingClassifier(estimators=[myclf1, myclf2, myclf3], final_estimator=mylr)
Maybe this argument ("meta_classifier") is not being accepted due to a version issue.
Thanks a lot @anugati saved my time!
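For anyone hitting the same error: the two argument styles belong to different libraries, so which one works depends on which StackingClassifier was imported. A hedged side-by-side (the base models below are placeholders, not necessarily the video's):

from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier

myclf1 = KNeighborsClassifier()
myclf2 = RandomForestClassifier(random_state=0)
myclf3 = GaussianNB()
mylr = LogisticRegression()

# mlxtend's StackingClassifier takes classifiers= and meta_classifier=
from mlxtend.classifier import StackingClassifier as MlxtendStacking
stackingclf_mlx = MlxtendStacking(classifiers=[myclf1, myclf2, myclf3],
                                  meta_classifier=mylr)

# scikit-learn's StackingClassifier takes estimators= and final_estimator=,
# and the estimators must be (name, estimator) tuples, not bare models.
from sklearn.ensemble import StackingClassifier as SklearnStacking
stackingclf_skl = SklearnStacking(
    estimators=[("knn", myclf1), ("rf", myclf2), ("nb", myclf3)],
    final_estimator=mylr)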
Sir... Please explain briefly... I'm not understanding 😟😟
@5:48, at this time you said, "and this training data goes to another model, called the meta model."
The way you point with your finger and what you say there is not clear to me. These details are very important for me, so I go into the depth of each and every word along with the action. Please kindly sort out my query.
Also, what is the training model here after dividing the 75 records into 80-20%?
If it is 80% (as I understand), then why didn't you mention it? I'm confused.
I can share my understanding with you. First, we divide the 100 examples into 75 training and 25 test examples. Then we divide the 75 training examples into 80% training and 20% validation, i.e., 60 training and 15 validation examples. After that, we train the different base models on these 60 training examples and make predictions on the 15 validation examples. The predictions on these 15 examples become the input to our meta-model. Now we train the meta-model and test our accuracy on the initial 25 test examples. In short, this is blending. When we instead follow a K-fold approach on the 75 training examples, so that the train/validation roles rotate and every training example gets an out-of-fold prediction, it is called stacking. Hope it helps. Happy learning.
Thanks Mohammad and Shivansh for discussion.
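A small sketch of that 75 / 60-15-25 blending recipe in code (synthetic data and placeholder models; the numbers follow the comment above):

import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))                  # 100 examples, as in the comment
y = (X[:, 0] + X[:, 1] > 0).astype(int)

# 100 -> 75 train / 25 test, then 75 -> 60 train / 15 validation (an 80/20 split).
X_tr, X_test, y_tr, y_test = train_test_split(X, y, test_size=25, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X_tr, y_tr, test_size=0.2, random_state=0)

base_models = [SVC(), GaussianNB()]
for m in base_models:
    m.fit(X_train, y_train)                    # base models see only the 60 examples

# Predictions on the 15 validation examples become the meta model's features.
meta_X = np.column_stack([m.predict(X_val) for m in base_models])
meta_model = LogisticRegression().fit(meta_X, y_val)

# At test time the 25 held-out examples go through the base models, then the meta model.
test_meta_X = np.column_stack([m.predict(X_test) for m in base_models])
print("blended accuracy:", meta_model.score(test_meta_X, y_test))

For stacking, the 60/15 split would be replaced by K-fold out-of-fold predictions over all 75 training examples.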
Please add subtitles.
Iris is not binary classification; it has more than 2 classes in the target variable.
Yes, correct, it has three categories. Did I say 2? Thanks for pointing it out.
@@UnfoldDataScience You didn't say 2, but you said iris is a binary classification dataset.....
Code for this one?
I'm searching for it; however, there is a possibility it may be on my old laptop. I will try to find it and upload it.
Good job
Thanks Sesha.