(Part 1) Using ColumnTransformer to Make Your Machine Learning Workflow Easy | Machine Learning
- Published 10 Feb 2025
- In this tutorial, we'll look at ColumnTransformer, a powerful data preprocessing tool that makes the machine learning workflow super simple.
ColumnTransformer can be used in conjunction with Pipelines and GridSearchCV to let the search itself pick the best parameters for the best-performing model.
In this tutorial, we'll go through all the nitty-gritty of ColumnTransformer and discuss when, how, and where to use it.
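The combination described above can be sketched like this. This is a minimal illustration, not the video's actual code: the toy dataset and most column names are made up (hours_per_week is borrowed from the comments below).

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Made-up toy data for illustration only
df = pd.DataFrame({
    "age": [25, 32, 47, 51, 38, 29],
    "hours_per_week": [40, 50, 38, 45, 60, 35],
    "workclass": ["private", "gov", "private", "self", "gov", "private"],
    "income": [0, 1, 0, 1, 1, 0],
})
X, y = df.drop(columns="income"), df["income"]

# Different preprocessing per column group
ct = ColumnTransformer(
    transformers=[
        ("num", StandardScaler(), ["age", "hours_per_week"]),
        ("cat", OneHotEncoder(handle_unknown="ignore"), ["workclass"]),
    ],
    remainder="drop",
)

# Preprocessing and model chained into one estimator
pipe = Pipeline([("prep", ct), ("clf", LogisticRegression())])

# GridSearchCV picks the best hyperparameters over the whole pipeline
grid = GridSearchCV(pipe, {"clf__C": [0.1, 1.0, 10.0]}, cv=2)
grid.fit(X, y)
print(grid.best_params_)
```

Because the ColumnTransformer sits inside the Pipeline, the grid search refits the preprocessing on each fold, avoiding leakage between train and validation splits.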
I've uploaded all the relevant code and datasets used here (and in all other tutorials, for that matter) on my GitHub page, which is accessible here:
Link:
github.com/rac...
If you like my content, please don't forget to like this video and subscribe to my channel!
If you have any questions about any of the content here, please feel free to comment below and I'll be happy to assist in whatever capacity possible.
Thank you!
Just what I needed!
This tutorial is awesome!!
well explained!!!
Please keep this work up,
I hope your channel grows rapidly
This is amazing..please keep making videos..don't stop !
Haha! Thanks!
I have seen a lot of YouTube channels which are very good and have a lot of content, but bro, your channel conquers them all. Please do more videos on other fields of Machine Learning and Deep Learning. Thanks and my respect to you, bro.
Thanks man! Appreciate it :)
This video helped me so much. Keep up the awesome work!
Thanks! I'm glad it helped!
how lovely :-) thank you so very much
very nicely explained in a very smooth manner.......thank you so much sir..
Thanks Amol! Sir mat bolo 😅
Great video bro...
Great content!
Thanks Nikita, I'm glad it helped! :)
I got an immense and deep understanding of how I can make life easier with sklearn's ColumnTransformer. Thank you so much for the video.
Could you kindly comment on how to get back the column names of the original dataframe once encoding is done?
Hi, Dipanwita, I'm glad it helped!
Getting back the column names is a little tricky, but possible nonetheless. Every column's info can be extracted from the "transformers_" attribute of our 'ct' ColumnTransformer object (in the video).
In 'ct', RobustScaler is the first transformer, hence the first 6 columns in the output belong to it. To extract those names, we'd do something like:
a = ct.transformers_[0][2]
Next in 'ct' is OneHotEncoder, hence to get its generated column names, we'd do:
b = ct.transformers_[1][1].get_feature_names_out() (get_feature_names() in older sklearn versions)
We're dropping the remaining columns, hence "a + list(b)" gives us the full list of column names in the correct output order.
In our case, remainder was "drop", but if it were "passthrough", those columns would sit at the very end of the output dataframe.
To get them, we'd do:
c = ct.transformers_[2][2], and this 'c' now contains the index positions of the passed-through columns in the original dataframe. In our case, that's index 3, so df.columns[3] is the passthrough column, and we can append it to the "a + list(b)" list (if and only if remainder was "passthrough").
hope it helps!
Nice course man well done. Well explained everything thanks for such good content.
very good video
This is such a great video. I'm just sad you didn't end it with fitting and training a model after transforming, as that's where I have problems. Is there another video of yours where you do that? I'd really appreciate it. Thank you
Thanks! I do have a couple of end-to-end project videos where I've fitted models after transforming. Hope they help!
great tutorial!
Thanks Martin! Appreciate that!
@@rachittoshniwal You are welcome. Do you have something on "feature importance"? If not a tutorial maybe some web page that you could recommend? I'd appreciate that very much.
@@martinbielke8301 I'll definitely make one on feature importances. But for now, you can have a look at these excellent links:
mljar.com/blog/feature-importance-in-random-forest/
machinelearningmastery.com/calculate-feature-importance-with-python/
Thank you Rachit for sharing such great content. I am new to machine learning; could you do a video going from applying ColumnTransformer to categorical values all the way to using them with linear regression and other algorithms/models?
Hi, I've done a similar video here: ua-cam.com/video/wXQRLpDF-ms/v-deo.html
hope it helps!
@@rachittoshniwal Will check out, thank you very much!
Can we perform this feature engineering before the train-test split, or is it mandatory to do it after the split?
Sir, with ColumnTransformer can we use ordinal encoding, label encoding, and one-hot encoding? Could you please explain? Thank you
Can't thank you enough for the knowledge imparted. Kudos!!! A suggestion: I'm looking at a variable which needs imputation before one-hot encoding. Can I perform both steps in a single ColumnTransformer, or should there be multiple column transformers, later combined using the Pipeline functionality? Please help
Thanks man! Appreciate that!
I cover exactly this in this video:
ua-cam.com/video/a6o9ies85eM/v-deo.html
Have a look , and hope it helps!
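As a quick sketch of the idea (my own made-up example, not the linked video's code): imputation and one-hot encoding can be chained in a Pipeline, and that Pipeline used as a single transformer inside one ColumnTransformer.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder

# Made-up data: one categorical column with a missing value
df = pd.DataFrame({"city": ["NY", np.nan, "LA", "NY"]})

# Impute first, then encode, as one chained transformer
cat_pipe = Pipeline([
    ("impute", SimpleImputer(strategy="most_frequent")),
    ("ohe", OneHotEncoder()),
])

# The whole Pipeline acts as a single transformer in the ColumnTransformer
ct = ColumnTransformer([("cat", cat_pipe, ["city"])])
out = ct.fit_transform(df)
print(out.shape)  # (4, 2): the NaN is imputed as 'NY', then encoded
```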
A question: why are we not using the CT for hours_per_week?
I just wanted to demonstrate how we can exclude some columns from the transformations and pass them unfiltered. No other reason really.
How can I get the names of the columns back? :(
Please help!
Getting back the column names is a little tricky, but possible nonetheless. Every column's info can be extracted from the "transformers_" attribute of our 'ct' ColumnTransformer object (in the video).
In 'ct', RobustScaler is the first transformer, hence the first 6 columns in the output belong to it. To extract those names, we'd do something like:
a = ct.transformers_[0][2]
Next in 'ct' is OneHotEncoder, hence to get its generated column names, we'd do:
b = ct.transformers_[1][1].get_feature_names_out() (get_feature_names() in older sklearn versions)
We're dropping the remaining columns, hence "a + list(b)" gives us the full list of column names in the correct output order.
In our case, remainder was "drop", but if it were "passthrough", those columns would sit at the very end of the output dataframe.
To get them, we'd do:
c = ct.transformers_[2][2], and this 'c' now contains the index positions of the passed-through columns in the original dataframe. In our case, that's index 3, so df.columns[3] is the passthrough column, and we can append it to the "a + list(b)" list (if and only if remainder was "passthrough").
hope it helps!
Any way to reach you?
Sure, here! www.linkedin.com/in/rachit-toshniwal
o = OneHotEncoder(drop='first')  # this will drop one category from each feature
Yes, it will.
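A quick illustration of the drop='first' behavior, with toy data made up for this example: the first category (alphabetically) of each feature loses its dummy column, which avoids perfectly collinear columns.

```python
import numpy as np
from sklearn.preprocessing import OneHotEncoder

# Made-up single categorical feature with three categories
X = np.array([["red"], ["green"], ["blue"], ["green"]])

full = OneHotEncoder().fit_transform(X).toarray()
dropped = OneHotEncoder(drop="first").fit_transform(X).toarray()

print(full.shape)     # (4, 3): one column per category
print(dropped.shape)  # (4, 2): first category ('blue') dropped
```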