Dear All, if you are looking for feature engineering materials, please check my feature engineering playlist, all videos are available. Happy Learning!
if you don't mind will u reopen the link or provide your writen codes on github with link
@@yash20december all materials are available in feature engineering playlist
Thank you sir.
is there something more you provide for the paid ones. please let me know.
Sir, can you please send me the file with all the feature engineering techniques? It would be very helpful to me. My email ID is
ara007kumar@gmail.com
What a coincidence, today is also Independence Day. This really surprised me: I was following your YouTube videos and suddenly you greeted us, and for a moment it put a smile on my face. Happy Independence Day.
you are the best, greetings from an ecuadorian studying in Portugal.
There was doubt from so long about this that when there are more than 100 types of value then how to do encoding which is clear today thank you sir 🙏🙏
Hi
bro can you send me the material
Thank you so much, respected sir. A lot of love for you from Pakistan. This video was very helpful. We are looking forward to seeing other playlists like this from you. Once again, thanks.
Just started watching your videos. You explain the concepts in a simple manner.Thanks
No Words for education. Many Thanks and wishes for futures.
you saved my day with mean encoding
Hi Krish, It's the best video I have ever seen. Crystal clear.
Sir, please share the link once again. I saw your video and it is very helpful for students like me. I want to know more about feature engineering.
Thank you for making such an amazing lecture. Waiting for the feature engineering link.
Hi
Krish your way of explanation is just amazing....Thanks for these amazing videos and yes please share zip file
Hi
Hi
Still the best video out there. I think other content dont know what a practitioner of DS needs at 2:30 am .... :p
We need mentor like you... Great job👍
Krish Sir the way you explain is easy to understand. Please reopen the form. Thanks 🙂
Wao thank you soo much, sir you explained soo well. whenever I face any doubts your video saves my day.. God bless u .. Happy Learning
Hi krish, nice way to collect the data free of cost.
thanks a lot, this thing can't be explained better than how you explained it.
I just became Fan of your ML knowledge.
Hi
Please re-open the form for feature engineering techniques. Thank you.
Yes sir please re-open the form
Excited to learn the coding part too Sir.
Thank you for putting the time and efforts to create this video, also all other videos. Very helpful.!
Thanks, sir, for listening to my request to create a video on mean encoding. I am really enjoying your videos and have learned a lot from them. Please continue to create such awesome videos.
Hi
Sir u r doing really great and I think under your guidance I will become a good data scientist soon...please help me sir
you are doing a wonderful job Kris...👏👏
I liked the Mean Encoding technique and Target-guided encoding. We are preserving the normality of the data as well as not increasing the dimensions.
Thanks Krish Bhai..I have learned a lot from your videos
Thanks man! Great content The Lord bless you with more understanding and help you to know Him better and better
Very useful information provided by u sir. Thank you.
Thank you so much for sharing your knowledge with us
Hi Krish, just a suggestion: could you start the same channel in Hindi? It would be more helpful to Indian students who live in small cities and are not very familiar with English lectures. I hope you understand my request. I'm a regular viewer and respect your effort and knowledge. Good luck.
Great help Krish... Thanks for your video man
Vishal Shukla. could you please share this docs with me on dolly.shukla7860@gmail.com
Hi
Hey krish, nice video as usual... Filled the form and thanks for making motivational and additional support videos for encouragement. Kudos
Hi bro, could you please send me featuring document pls?
@@SanthoshKumar-dk8vs you can fork it from either my GitHub account or Krish's. Check Krish's video description for his GitHub link; you'll find everything there.
Hello bro, can you share zip file, bcz I watched it today so not able to fill form as you know.
Kaushalshivam2018@gmail.com
hi bro this is sarath..
I am a data scientist aspirant can you share me feature engineering notes..
mail id : sarath20994@gmail.com
Hello sir...your way of teaching is really incredible.
I have only been studying from your lectures for the past week, which is why I was unable to fill in the form to get the materials you prepared for them...
So if possible please enable the form link again...
I came across this video today and i like to learn more feature engineering
"if you don't mind will u reopen the link sir"
Yes pleaseeee
Yes please sir reopen
Yes, it's very much needed now
yes please sir
It's on the GitHub
Sir, I have been working in data science for a long time but want to go through all your playlists, as I have already covered some of them. I need your notes on feature engineering, so could you provide them now? I would be very thankful to you for this kindness.
Best wishes more love for you from Pakistan.
Hi
Excellent Explanation Sir, Thanks a lot
You are the best sir.
Hi sir,
I want the feature engineering doc. Can you please open the link for the form?
Waiting for your response
Hello I want the feature engineering document👏👋👋👋👋👋 Just came across this video please
@@MM-vx8go its available on his github
Akhil Kasare where please
Akhil Kasare this is my email.. mmaxwell265@gmail.com
What's the github username
Amazing explanation sir 🙏🙏
so happy I found your channel...wooh amazing lecture
Please send me the zip file with respect to feature engineering
thank you sir
will definitely join your channel.
Thanks sir for all these free contents! :p
Can you send the zip file to me.
arifmollick8578@gmail.com
Thank U so much Sir for such Huge help....
The video is quite informative and easy to understand. I really loved the video :)
Sir , Thankyou for this wonderful lecture , please share the study material
Thanks alot for sharing such a absolutely amazing knowledgeable video...
Nice information about feature engineering. Thanks a lot
can you plz send it to me
Hi @Krish, can you please share the Feature engineering materials if possible. Your videos are really impressive.
excellent job Boss. really helpful
Guys plz if u don't like his videos then leave it, but don't do dislike 🙏
Great Video
plz give demo also
Hi Krish , I really liked the way you are teaching, could you please share the feature engineering study material?
Thank you, sir! Your videos make things easy to understand. Could you please reopen the link where we could get the feature engineering materials? That would be great.
By assigning a higher number to categories on the basis of how often they occur in a given class (say, class 1 here), are you not introducing bias into the dataset? (Target-guided ordinal encoding)
Hi Krish, I am not able to fill the form. Its removed. Can you please upload that
Same here
same here
Where did Krish upload the form? Can you share the link to it?
@@sravanijammula573 Krish shared the form when he uploaded the video. It is old now, so I think he removed it. I was also not able to fill in the form, as I saw the video very late.
Thanks, Vishal, for the update. If you find out where it is, just post it here.
Clearly Explained, Thankyou!
Hi krish,
I started seeing your videos now and want the feature engg doc. Can you please open the link for the form?
Waiting for your response.
22:38 Sir, how do we find the output column, and how are its values assigned as 0 or 1?
Hello, this may be late, but I'll try to answer. Basically, the output column is the target column. The example used in the video is binary classification, so there are only two classes, which are assigned 0 and 1. If your target is not numeric, you can convert it yourself using pandas, with the assign function if I'm not mistaken.
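To make the reply above concrete, here is a minimal sketch of converting a non-numeric binary target to 0 and 1 with pandas. It uses `Series.map` rather than `assign` (either would work), and the column names and the "yes"/"no" labels are made up for illustration; they are not from the video.

```python
import pandas as pd

# Hypothetical data: column names and class labels are assumptions.
df = pd.DataFrame({
    "feature": ["A", "B", "A", "C"],
    "output": ["yes", "no", "yes", "no"],
})

# Map the two class labels to integers; which label becomes 1 is your choice.
df["output"] = df["output"].map({"no": 0, "yes": 1})

print(df)
```

After this step the output column is numeric, so per-category means of the target can be computed for mean or target-guided encoding.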
@@dionricky thanks buddy
thank you sir from tamil
@16:18 you say to use 'one-hot encoding with multi-category' for an ensemble technique. But at the beginning of the video you explained that ensemble techniques do not require feature scaling. Can you please clarify?
Hey Krish! I have a few questions about encoding:
1. Suppose I have a feature with 1000 different categories that I need to convert to an integer or float. How should I do that? I can't use one-hot encoding, as it might create 999 columns. The dataset also has only 1100 records/rows, so even with the “one-hot encoding with multiple categories” method, the most frequent categories will have very few occurrences. How do we handle such cases?
2. In ordinal encoding, why do ranks need to be assigned to the categorical labels? Couldn't we instead give each categorical value some random unique number without ranking them, for example PhD as 1, BE as 2, Masters as 4 and Stats as 3?
3. Regarding “label encoding”, how are the ranks decided? If PhD needs to be given a higher rank, say ‘4’, how can a library know that it should get the higher rank? Or do we need to set it manually in the library code?
Do you have any ideas how to tackle this issue ?
For the first point, you should not apply one-hot encoding; instead, you can go ahead with mean encoding.
In label encoding for ordinal categories, the categories are assigned ranks. In this case PhD should have the highest rank or label. This tells the ML algorithm that we are giving higher importance to PhD.
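On question 3 above, one common way to control the ranks is to write the mapping yourself. The sketch below assumes a hypothetical `education` column and a rank order chosen by hand; neither is from the video. As far as I know, scikit-learn's `LabelEncoder` assigns integers by sorted label order, so it would not know that PhD should rank highest.

```python
import pandas as pd

# Hypothetical column and categories, for illustration only.
df = pd.DataFrame({"education": ["BE", "PhD", "Masters", "BE"]})

# The ranks are set manually; the order here is an assumption.
education_rank = {"BE": 1, "Masters": 2, "PhD": 3}

df["education_encoded"] = df["education"].map(education_rank)

print(df)
```

Any category missing from the dictionary would come out as NaN, which is a handy way to catch labels you forgot to rank.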
Krish Naik Thanks a lot, Krish 👍🏻😊 And thanks a ton for your awesome content; I am learning new things. Waiting for part 2 of the series 😊
Really good one Krish
In mean encoding, if the feature values are replaced by the mean values, the number of values in the pincode column is still the same, right? Then what's the point of doing mean encoding?
1st to view, 2nd to like, 1st to comment.
Can someone please share Krish's feature engineering doc? I missed filling in the form.
Did you get the material? If yes, can you share it?
Can anyone clarify, at 18:10, how to find the mean here? Are we adding all the '0' and '1' values corresponding to A and dividing the total by the number of occurrences of A?
Yes, but I am not sure of it.
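The calculation described in the question (sum of the 0/1 targets for a category divided by that category's count) can be checked with a small sketch. The data below is invented for illustration and is not the table from the video.

```python
import pandas as pd

# Made-up example data: a categorical feature and a binary target.
df = pd.DataFrame({
    "feature": ["A", "A", "A", "A", "B", "B", "C", "C", "C"],
    "output":  [1, 1, 1, 0, 1, 0, 0, 0, 1],
})

# Manual calculation for A: sum of targets divided by number of occurrences.
is_a = df["feature"] == "A"
mean_a_manual = df.loc[is_a, "output"].sum() / is_a.sum()

# The same calculation for every category at once.
means = df.groupby("feature")["output"].mean()

print(mean_a_manual)
print(means)
```

For a 0/1 target this mean is simply the fraction of rows in that category that belong to class 1.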
Hi @krish Naik, how can I get the zip file of all the feature engineering techniques? Kindly help.
Please share with me too. Thanks
You are awesome sir 🙏
Hi
Hi Krish, I am confused by your explanation. My doubts are: you said that for target-guided ordinal encoding you assign the rank based on mean values. Then how does it matter whether the category is nominal or ordinal, since the ranks are assigned based on mean values and not on the inherent rank/order of the variable itself? Also, for label encoding, does the number actually mean anything? It isn't like price or sales, where the number holds significance, so won't the result be junk/unusable? And if it is usable, how do you interpret the result?
If the categorical feature is ordinal, then we can assign labels to it. Here the category with the highest target mean is assigned the highest label value. If the categorical feature is nominal, then we cannot assign labels.
This is because in label encoding the number does mean something. Different numbers teach the model to make different predictions. For example, a value of 4 (PhD) in the salary prediction example results in a prediction of a higher salary, whereas a value of 1 (BCom) results in a prediction of a lower salary. This happens because, in the training data, entries with PhD as the educational attribute have a higher salary in the target column. This is generally useful when we do not know the exact ranking: we find the correlation of the categorical feature with the target and then rank the categories according to the mean of the values observed in the target.
If the categorical feature is nominal, we do not want the algorithm to learn more from some categories than from others. Hence, using one-hot encoding, we set the values to 1 and 0.
As for mean encoding of nominal categorical features, we essentially map a relationship between the categories and the target. Applied to the example of pin codes, some pin codes may be associated with higher salaries (assuming we are trying to predict salary again). Hence the mean for that pin code will be higher. We simply put the mean value in place of that category. So when the model learns, it will know that a data point with a high value in the pin code feature should predict a higher salary.
Regards!
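A minimal sketch of the two encodings discussed in this reply, assuming an invented `pincode` column with a binary `output` target (the names and values are hypothetical). Mean encoding replaces each category with its target mean; target-guided ordinal encoding replaces it with the rank of that mean.

```python
import pandas as pd

# Made-up data for illustration only.
df = pd.DataFrame({
    "pincode": ["P1", "P1", "P2", "P2", "P2", "P3", "P3"],
    "output":  [1, 1, 0, 1, 0, 0, 0],
})

# Target mean per category.
means = df.groupby("pincode")["output"].mean()

# Mean encoding: put the mean in place of the category.
df["pincode_mean"] = df["pincode"].map(means)

# Target-guided ordinal encoding: rank categories by ascending mean,
# so the category with the highest mean gets the highest label.
ordered = means.sort_values(kind="stable").index
rank = {category: i for i, category in enumerate(ordered, start=1)}
df["pincode_rank"] = df["pincode"].map(rank)

print(df)
```

If two categories had the same mean, the stable sort would keep them in index order and they would still get different ranks, so you may want to define your own tie-break. With mean encoding, tied categories simply receive the same value.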
Could you please upload the form again?
Thanks in advance :)
Sir, please open the form entries so we can get the zip file for feature engineering.
Hi sir, how would target-guided encoding and mean encoding work in the case of a regression problem?
How would we calculate the mean if the output is a multi-class classification? In that case, shall we take 0, 1, 2 as the output?
In the example you have taken 0 and 1, where we can do the calculation. What if there are more than 2 output classes?
If you could attach the notebooks in a link, it would be easier, I guess, than sending personal emails. Just a suggestion.
Hi
Hi Krish,
At 20:23, the labels for A - 0 and A - 1 will be different based on the mean, right?
For example, the mean will be calculated this way, right?
A - 1 => 0.73
B - 1 => 0.6
C - 1 => 0.4
A - 0 => 0.5
B - 0 => 0.35
C - 0 => 0.36
Then the ordering of the feature will be as below, right?
A - 1 >> B - 1 >> A - 0 >> C - 1 >> C - 0 >> B - 0
@krish Naik I have a doubt. For mean encoding and target-guided encoding we need labels for encoding, but how would we encode the data at test time?
Hi krish, thanks so much for shedding light on this topic of Feature Engineering. I'm at Beginner Level of learning DS/ML and I really fell in love with your way of teaching these techniques. I would really love to get that document on FE you mentioned about in this video. I tried to drop my details via the google form but I see it's closed. Kindly assist please. Thanks in advance!
Hi Krish,
We cannot perform mean or target encoding on test data because we don't have the target column in test data. So how can we deal with a situation where we have a variable with many levels in it?
I am talking about hackathons, where we generally don't have the target variable; it is what we have to predict.
I would appreciate your help.
You already got the ordinal number or the float value for each category from the training data, so you don't need to compute it again on the test data. You simply reuse it.
You might already know this.
But I am answering if someone else has this doubt.
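The reply above can be sketched as follows: learn the per-category means on the training data only, then look them up for the test rows. The `city` column, the data, and the fallback to the overall training mean for categories never seen in training are my own assumptions, not something stated in the video.

```python
import pandas as pd

# Made-up training data (with target) and test data (without target).
train = pd.DataFrame({
    "city":   ["X", "X", "Y", "Y", "Z"],
    "output": [1, 0, 1, 1, 0],
})
test = pd.DataFrame({"city": ["Y", "X", "W"]})

# Learn the encoding from the training set only.
mapping = train.groupby("city")["output"].mean()

# Assumed fallback for categories that never appeared in training.
global_mean = train["output"].mean()

# Reuse the learned mapping on the test set; unseen "W" becomes NaN,
# which is then filled with the fallback value.
test["city_encoded"] = test["city"].map(mapping).fillna(global_mean)

print(test)
```

Keeping the mapping as a separate object also means the same values can be applied later to any new data.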
Hi everyone, is deleting one dummy variable column done automatically by one-hot encoding, or should it be done manually?
Hi Krish, the Google response link is not active. How can I get the material?
Hi
I already joined as a member
Please do a session work on the dython package and setting categories in it
Sir, could you please tell us why the theory of computation is actually used and what the applications of this subject are? Please make a video on that.
31 dislike for what? Teaching you free of cost with market standard!!
One should provide the link of better videos,if they dislike anything. 🥇
Sir, I want the feature engineering doc. Can you please open the link again?
Hi Krish, I recently started following your videos and they have been very helpful. Could you please provide me with the materials related to feature engineering? Thanks in advance.
Sorry for the stupid question: what's the output? I mean, you encode before you apply any ML algorithm, and at that point you just have the dataset. What kind of output do you mean here? Thanks.
In target-guided encoding, if the means of two categories are the same, how do we assign the numbers? As both have the same mean, how do we decide which one gets the higher number?
It's very helpful, sir please reopen the form link...
Dear Sir, in one-hot encoding with multiple categories we take only the top 10 categories and apply one-hot encoding to them. What about the other categories, given that we are dissolving the column completely to apply one-hot encoding?
Hi
Thanks Krish!!!! :)
Hi sir.
I have started to learn ML from your channel only. Thank you for your knowledge that you are sharing with us.
I also have one request: can I get the feature engineering zip file now? I am really interested in ML.
Sir, pincode is a non-categorical variable. Then why do we encode it?
Thanks a lot for the clear explanation. Can you please reopen the Google form?
How do we perform mean and target encoding for a regression problem?
Use the groupby function to create a data frame containing the mean value of a feature with respect to the id or name (whichever you want the value for), then use the merge function to add this mean-value data frame to your original data frame (the one you will use for training your model).
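Here is a minimal sketch of the groupby-then-merge approach described above, with a continuous value as in a regression problem. The column names `id`, `value` and `value_mean` and the numbers are invented for illustration.

```python
import pandas as pd

# Made-up data: a grouping key and a continuous column.
df = pd.DataFrame({
    "id":    ["a", "a", "b", "b", "b"],
    "value": [10.0, 20.0, 1.0, 2.0, 3.0],
})

# Step 1: data frame with the mean value per id.
mean_df = (
    df.groupby("id", as_index=False)["value"]
      .mean()
      .rename(columns={"value": "value_mean"})
)

# Step 2: merge the means back onto the original rows.
merged = df.merge(mean_df, on="id", how="left")

print(merged)
```

A left merge keeps every original row, so each row simply gains the mean for its own id.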
Dear Krish,
I cannot find a tutorial on one-hot encoding using PostgreSQL. Can you show the psql syntax for it?
I read in the scikit-learn documentation that label encoding should be applied only to the output column.
Krish, what should we do when it's a classification problem and we have a pin code column in our dataset?
Hi Krish, I have started liking your channel so much. Hats off for the great service you are doing for aspiring and already experienced data scientists. The form URL you shared is no longer available. Could you please share the material via Google Drive or reactivate the form?
Hey Krish, can you please share the ZIP file on feature engineering that you mentioned in the video, as I am unable to open the Google URL link? It would be very helpful if you could send me the file.
For ordinal data, how does label encoding work? I have some confusion here, because in the case of ordinal data the value's weight matters, doesn't it? Can you please explain in a bit more detail?
For one-hot encoding with multiple categories, columns are created only for the top categories. Then what happens to the records with the remaining categories? Do they have 0 in all columns?
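The behaviour asked about here can be seen in a small sketch of one-hot encoding restricted to the most frequent categories. The column name, the data and the choice of the top 2 categories are assumptions for illustration; the video reportedly uses the top 10.

```python
import pandas as pd

# Made-up data: A and B are the most frequent categories.
df = pd.DataFrame({"cat": ["A", "A", "A", "B", "B", "C", "D"]})

top_k = 2  # hypothetical choice
top = df["cat"].value_counts().head(top_k).index.tolist()

# One 0/1 column per top category; all other categories get no column.
for label in top:
    df[f"cat_{label}"] = (df["cat"] == label).astype(int)

encoded = df.drop(columns="cat")

print(encoded)
```

In this sketch the rows with C and D end up with 0 in every new column, so all the remaining categories collapse into a single implicit "other" group.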
Thanks for sharing, the video is helpful!
My doubt is about mean encoding.
What if two values in one feature get the same mean?
In one-hot encoding for multiple categories, if the top 20 categories all have the same count, what should be done?