Hey Bhavesh! When max_features=10, what is the difference between selecting the best attribute/feature from 10 versus from all 50 features? In both cases the feature is selected based on information gain or Gini gain.
When the best feature is selected from all 50 features, the same topmost feature is available at every split in the subtrees, so the strongest feature dominates. With max_features=10, the tree randomly draws 10 features for each split and uses the best feature among only those 10 to split that internal node, so different splits can end up using different features.
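The difference described above can be sketched with scikit-learn; the synthetic dataset and parameter values here are illustrative assumptions, not from the video:

```python
# Sketch: full-search splits vs. random-subset splits via max_features.
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic data with 50 features, 10 of them informative (assumed setup).
X, y = make_classification(n_samples=500, n_features=50,
                           n_informative=10, random_state=0)

# Default: every split considers all 50 features, so the single best
# feature by Gini gain is always a candidate.
full = DecisionTreeClassifier(random_state=0).fit(X, y)

# max_features=10: each split draws a random subset of 10 features and
# picks the best feature only within that subset.
subset = DecisionTreeClassifier(max_features=10, random_state=0).fit(X, y)

print(full.get_depth(), subset.get_depth())
```

The subset-restricted tree typically needs more (or different) splits because the globally best feature is not always available at a given node.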
Nice video, but it would have been even more interesting with pictorial explanations.
Very useful, but it would be better with more figures and images to explain what each parameter means.
I'm badly stuck here; please explain with a practical implementation.
Do you have a similar presentation for the hyperparameters of random forest and XGBoost?
Does a highly skewed feature affect the AUC of a decision tree classifier? And do I have to remove outliers? I have a feature where 80% of the values are 0 and the maximum is 13; I have tried log and sqrt transformations but it's still highly skewed.
Try binning the feature, or reduce the skew by capping outliers based on the IQR.
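A minimal sketch of both suggestions, assuming a pandas Series shaped like the commenter's feature (80% zeros, long right tail); the synthetic data and bin edges are illustrative:

```python
# Sketch: binning vs. outlier capping for a zero-inflated, skewed feature.
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# ~80% zeros, the rest drawn from a right-skewed exponential (assumed data).
s = pd.Series(np.where(rng.random(1000) < 0.8, 0.0,
                       rng.exponential(2.0, 1000)))

# Option 1: binning into a few ordinal buckets (edges chosen by hand here).
binned = pd.cut(s, bins=[-0.01, 0.0, 1.0, 5.0, s.max()], labels=False)

# Option 2: cap outliers. Note that with 80% zeros, Q3 is 0, so the
# classic Q3 + 1.5*IQR fence collapses to 0; capping at a high quantile
# is a more robust alternative for this kind of feature.
capped = s.clip(upper=s.quantile(0.99))
```

Also worth noting: axis-aligned tree splits depend only on the ordering of values, so monotone transforms like log/sqrt don't change a single decision tree's splits; binning or capping changes the feature more substantively.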
Could you do a similar video with LightGBM instead of decision trees, where many more hyperparameters come into the picture?
Sure!
It's really helpful!
Thank you so much for the video, really appreciate it
I'm glad you liked the video :)
What aspects should we consider to decide which hyperparameters to tune in our decision tree classifier?
Love the information you shared, sir!
Can you please make a video on pre-pruning and post-pruning in random forests?
Hello!
Do you have any advice about hyperparameters to use in a random forest regressor?
Can you show it on a dataset?
sure, I'll create a video on it soon!
Can you explain CP (complexity parameter) values in decision trees?
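CP is the complexity parameter from R's rpart; the closest scikit-learn analogue is `ccp_alpha`, used for minimal cost-complexity pruning. A small sketch (the dataset and alpha choice are illustrative assumptions):

```python
# Sketch: cost-complexity pruning in scikit-learn, the analogue of
# rpart's CP value in R.
from sklearn.datasets import load_breast_cancer
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)

# cost_complexity_pruning_path lists the effective alphas at which
# subtrees get pruned away; a larger alpha yields a smaller tree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)

unpruned = DecisionTreeClassifier(random_state=0).fit(X, y)
# Picking a large alpha from the path prunes the tree heavily.
pruned = DecisionTreeClassifier(ccp_alpha=path.ccp_alphas[-2],
                                random_state=0).fit(X, y)

print(unpruned.tree_.node_count, pruned.tree_.node_count)
```

In practice you would cross-validate over the candidate alphas from the path rather than pick one by hand.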
Insightful video...thank you
You are welcome 😊
Thanks, bro. Greetings 🇨🇱
Nice explanation!
Glad it was helpful!
Nice one!!
Really helpful!
Glad it was helpful!