Thank you so much for the video, but I have a question: in ADASYN, do we give the farthest instance a higher chance of being sampled in order to avoid overfitting? If yes, what if it is a noisy one?
Great video sir! Thanks a ton. Just had a very fundamental doubt.
The SMOTE algorithm seems to be creating a linearly dependent set of new data points. Shouldn't this mean that these points add no new information for the ML algorithm to learn from? Thanks in advance for your response.
Interesting question. It is surely not possible to add completely new information; we can only enhance what is already there. SMOTE just tries to add some more points, often around the boundary.
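To make the point in this reply concrete, here is a minimal sketch of the interpolation step, assuming points are plain Python lists. The helper name `smote_point` is made up for illustration and is not a library function. Each synthetic point is a weighted average of a minority point and one of its minority neighbors, so it always lies on the line segment between them. That is why it fills in the region between known samples rather than adding independent information.

```python
import random


def smote_point(x, neighbor, rng):
    """Return a synthetic point on the segment between x and a minority neighbor."""
    gap = rng.random()  # random fraction in [0, 1)
    # Move from x toward the neighbor by the same fraction in every coordinate.
    return [xi + gap * (ni - xi) for xi, ni in zip(x, neighbor)]


rng = random.Random(0)   # fixed seed so the run is repeatable
x = [1.0, 2.0]           # a minority point (illustrative values)
neighbor = [3.0, 6.0]    # one of its minority-class neighbors
synthetic = smote_point(x, neighbor, rng)
print(synthetic)
```

Whatever the random fraction turns out to be, the printed point stays between `x` and `neighbor` in both coordinates.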
Nice session sir❤️, can you share some more information (via links or videos) about the ADASYN concept?
Yes Debjit, I went through that part a little quickly. Please see towardsdatascience.com/class-imbalance-smote-borderline-smote-adasyn-6e36c78d804
Well explained sir. Just a few questions to ask.
1. How many synthetic points should be selected between the points of the original minority class?
2. Why do bridge points create a problem in SMOTE?
1. It depends on how much balance you want to create. In the example given, if you have 10 minority points and you want to create 20 synthetic points, you choose two neighbors for each point.
2. Bridges are a problem because these points are more similar to the majority class.
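The arithmetic in answer 1 can be sketched as follows. This is a simplified illustration that follows the reply's description (one synthetic point per chosen neighbor), not a full SMOTE implementation. The helper names `k_nearest` and `oversample` and the random toy data are assumptions made for the example. Neighbors are searched only among the minority points.

```python
import math
import random


def k_nearest(point, candidates, k):
    """Return the k candidates closest to point, excluding point itself."""
    others = [c for c in candidates if c is not point]
    return sorted(others, key=lambda c: math.dist(point, c))[:k]


def oversample(minority, k, rng):
    """Create one synthetic point per (minority point, minority neighbor) pair."""
    synthetic = []
    for x in minority:
        for nb in k_nearest(x, minority, k):
            gap = rng.random()  # random position along the segment
            synthetic.append([a + gap * (b - a) for a, b in zip(x, nb)])
    return synthetic


rng = random.Random(42)
minority = [[rng.random(), rng.random()] for _ in range(10)]  # 10 toy minority points
new_points = oversample(minority, k=2, rng=rng)
print(len(new_points))  # 10 points x 2 neighbors = 20
```

Changing `k` changes how many synthetic points are produced, which is how the amount of balancing is controlled in this simplified version.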
Excellent video Sir. It has been systematically presented. The video starts from the discussion about imbalanced class problem to solutions using undersampling, oversampling along with its limitations. After that SMOTE is explained and how it has been improved using Borderline SMOTE. Finally, the concept of ADASYN is presented with an example.
Based on the video, I have one question: while selecting the nearest neighbours of a minority class point in SMOTE, do we check which class the neighbours belong to?
The neighbors need to be taken from the minority observations.
In SMOTE, on what basis do we identify a point from the minority class in step 1?
Has anyone had a case where SMOTE made an ML model's performance even worse?
This session creates a clear concept on class imbalance... Well explained... Thank You
Thanks a lot pallavi
Regarding ADASYN: why are more samples created for a sample that has a high ri? If we create more samples for it, we will kind of reduce the gap between the classes, and it is risky to create samples in this area because its nearest neighbours are from the majority class.
Hi Rebeen, thanks for watching the video till the end. You are right, but that's a risk you will have to take, because if you create samples from a dense region of the minority class, it will not add any knowledge there.
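The trade-off in this exchange can be seen in a small sketch of the ADASYN weighting, assuming ri is the fraction of majority points among the k nearest neighbors of minority point i. The function name `adasyn_counts`, the toy coordinates, and the simple rounding are assumptions for illustration only. Points in a dense minority region get ri = 0 and no new samples, while a minority point surrounded by the majority class gets all of them, which is exactly the risk raised above if that point is noise.

```python
import math


def adasyn_counts(minority, majority, k, total_new):
    """Split total_new synthetic samples across minority points in proportion to ri."""
    labeled = [(p, 0) for p in minority] + [(p, 1) for p in majority]  # 1 = majority
    ratios = []
    for x in minority:
        others = [item for item in labeled if item[0] is not x]
        nearest = sorted(others, key=lambda item: math.dist(x, item[0]))[:k]
        # ri: share of majority-class points among the k nearest neighbors
        ratios.append(sum(label for _, label in nearest) / k)
    total = sum(ratios)
    if total == 0:
        return [0] * len(minority)  # no minority point is near the majority class
    # Normalize ri so the shares sum to 1, then scale by the requested total.
    return [round(total_new * r / total) for r in ratios]


# Four minority points in a dense cluster, plus one sitting inside the majority region.
minority = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [0.1, 0.1], [2.0, 2.0]]
majority = [[2.1, 2.0], [2.0, 2.1], [1.9, 2.0], [2.2, 2.2], [3.0, 3.0]]
counts = adasyn_counts(minority, majority, k=3, total_new=10)
print(counts)  # [0, 0, 0, 0, 10]
```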
Just used SMOTE for a project recently. Nice session and thanks for the share🙏🏻
Thanks. The imbalanced-learn library, which works alongside scikit-learn, lets you use Borderline SMOTE too.
Thank you sir.
Thanks a lot
I need the material.