Data Scaling in Neural Network | Feature Scaling in ANN | End to End Deep Learning Course

  • Published 1 Jul 2024
  • Data scaling is a recommended pre-processing step when working with deep learning neural networks. Data scaling can be achieved by normalizing or standardizing real-valued input and output variables.
    Code - colab.research.google.com/dri...
    Video on Standardization: • Feature Scaling - Stan...
    Video on Normalization: • Feature Scaling - Norm...
    ============================
    Do you want to learn from me?
    Check my affordable mentorship program at : learnwith.campusx.in
    ============================
    📱 Grow with us:
    CampusX' LinkedIn: / campusx-official
    CampusX on Instagram for daily tips: / campusx.official
    My LinkedIn: / nitish-singh-03412789
    Discord: / discord
    👍If you find this video helpful, consider giving it a thumbs up and subscribing for more educational videos on data science!
    💭Share your thoughts, experiences, or questions in the comments below. I love hearing from you!
    ⌚Time Stamps⌚
    00:00 - Intro
    03:15 - Code Demo/Feature Scaling
    06:00 - Intuition
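The normalization and standardization mentioned in the description can be sketched with scikit-learn (a minimal illustration on made-up numbers; the key point is that the scalers are fit on the training data only and then reused on the test data):

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

X_train = np.array([[50.0, 0.1], [20.0, 0.4], [80.0, 0.9]])
X_test = np.array([[60.0, 0.5]])

# Normalization: rescale each feature to the [0, 1] range.
norm = MinMaxScaler()
X_train_norm = norm.fit_transform(X_train)  # fit on training data only
X_test_norm = norm.transform(X_test)        # reuse the training min/max

# Standardization: zero mean, unit variance per feature.
std = StandardScaler()
X_train_std = std.fit_transform(X_train)
X_test_std = std.transform(X_test)

print(X_train_norm.min(axis=0), X_train_norm.max(axis=0))  # [0. 0.] [1. 1.]
```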

COMMENTS • 41

  • @maanuiitd
    @maanuiitd 2 months ago +1

    Thank you so much for this exemplary video. Just wanted to add:
    1. Normalization: it can be used when the data does not contain outliers. Scikit-Learn provides a transformer called MinMaxScaler for this. It has a feature_range hyperparameter that lets you change the range if, for some reason, you don't want 0-1 (e.g., neural networks work best with zero-mean inputs, so a range of -1 to 1 is preferable).
    2. Standardization: unlike min-max scaling, standardization does not restrict values to a specific range. However, standardization is much less affected by outliers.
    3. Data with a heavy tail: When a feature’s distribution has a heavy tail (i.e., when values far from the mean are not exponentially rare), both min-max scaling and standardization will squash most values into a small range. Models generally don’t like this at all. So before you scale the feature, you should first transform it to shrink the heavy tail, and if possible to make the distribution roughly symmetrical. For example, a common way to do this for positive features with a heavy tail to the right is to replace the feature with its square root (or raise the feature to a power between 0 and 1). If the feature has a really long and heavy tail, such as a power law distribution, then replacing the feature with its logarithm may help. Another approach to handle heavy-tailed features consists in bucketizing the feature. This means chopping its distribution into roughly equal-sized buckets, and replacing each feature value with the index of the bucket it belongs to.
    4. Data with multiple peaks: when a feature has a multimodal distribution (i.e., with two or more clear peaks, called modes), it can also be helpful to bucketize it, but this time treating the bucket IDs as categories rather than as numerical values. This means the bucket indices must be encoded, for example using a OneHotEncoder (so you usually don't want to use too many buckets). This approach allows the model to more easily learn different rules for different ranges of the feature value. Another approach to transforming multimodal distributions is to add, for each of the modes (at least the main ones), a feature representing the similarity between the feature value and that mode. The similarity measure is typically computed using a radial basis function (RBF): any function that depends only on the distance between the input value and a fixed point. The most commonly used RBF is the Gaussian RBF, whose output value decays exponentially as the input value moves away from the fixed point.
    Reference: Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 3rd Edition, by Aurélien Géron
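The transforms described in points 3 and 4 above can be sketched in scikit-learn (a minimal illustration on synthetic log-normal data; the bin count, the fixed point 1.0, and the gamma value are arbitrary choices for the example):

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer
from sklearn.metrics.pairwise import rbf_kernel

rng = np.random.default_rng(0)
# A heavy-tailed positive feature (log-normal distribution).
x = rng.lognormal(mean=0.0, sigma=1.0, size=(1000, 1))

# Shrink the heavy tail before scaling (square root or log).
x_log = np.log(x)

# Bucketize: replace each value with the index of its bucket.
disc = KBinsDiscretizer(n_bins=5, encode="ordinal", strategy="quantile")
x_bucket = disc.fit_transform(x)

# Gaussian RBF similarity between each value and a fixed point (a mode at 1.0).
sim = rbf_kernel(x, [[1.0]], gamma=0.1)
```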

  • @CODEToGetHer-rq2nf
    @CODEToGetHer-rq2nf 7 months ago +7

    God level teacher ❤️🤌🏻

  • @VarunSharma-ym2ns
    @VarunSharma-ym2ns 2 years ago +5

    Nicely explained in Hindi, easy to understand... Helpful for interviews.

  • @nayeem634
    @nayeem634 2 years ago +13

    Sir, kindly add the KNN and SVM algorithms to your 100 Days of ML playlist so it becomes even more complete.
    And a request to continue the deep learning series regularly.

    • @sidindian1982
      @sidindian1982 1 year ago +1

      Yes, KNN is pending, but for SVM you can find his older videos on the concepts and code - I guess they're about 2 years old... just check.

  • @pravinshende.DataScientist
    @pravinshende.DataScientist 2 years ago +4

    Hi sir, even before watching your video I am happy, because I know that after watching it I will gain some new insight. So I am thankful even before watching, because you have created such a moment for us!

  • @SurajitDas-gk1uv
    @SurajitDas-gk1uv 1 year ago +3

    Very good tutorial. Thank you very much for such a good tutorial.

  • @narendraparmar1631
    @narendraparmar1631 4 months ago +1

    Thanks for this easy explanation

  • @bhupendersharma0428
    @bhupendersharma0428 1 year ago +3

    Amazing explanation, sir

  • @sujithsaikalakonda4863
    @sujithsaikalakonda4863 8 months ago +2

    very well explained sir.

  • @nationfirst-worldaffairs6410
    @nationfirst-worldaffairs6410 2 years ago +4

    Well done teaching sir .....🤗🤗

  • @nishitaverma8805
    @nishitaverma8805 2 years ago +2

    Waiting for the next video in the playlist

  • @ParthivShah
    @ParthivShah 2 months ago +1

    Thank You Sir.

  • @zkhan2023
    @zkhan2023 2 years ago +2

    Guys, subscribe and like so that sir stays motivated too.

  • @roshankumargupta46
    @roshankumargupta46 2 years ago +4

    Your videos and content quality are very good, going well beyond interview-oriented material. Thanks.
    Any plans for projects on deep learning?

  • @zkhan2023
    @zkhan2023 2 years ago +2

    good content sir jee

  • @barunkaushik7015
    @barunkaushik7015 1 year ago +1

    Superb

  • @shanu9494
    @shanu9494 1 year ago +2

    Sir, please continue Deep Learning and NLP playlists

  • @rb4754
    @rb4754 25 days ago

    Well explained... Amazing!!!!!!!!!

  • @narendravarma4363
    @narendravarma4363 3 months ago

    I am doing PCOS ultrasound image classification (infected vs. not infected). I collected the dataset from Kaggle. During training I rescaled the images. Do I need to rescale the test data as well?

  • @MohiuddinShojib
    @MohiuddinShojib 1 year ago +2

    If I have date-type data in my features, do I need to scale it as well?

  • @akshaykanavaje474
    @akshaykanavaje474 2 years ago +1

    thank u sir

  • @sandipansarkar9211
    @sandipansarkar9211 1 year ago +2

    finished watching

  • @bhushanpatel3902
    @bhushanpatel3902 1 year ago +1

    tooo good.....

  • @yashjain6372
    @yashjain6372 1 year ago +2

    best

  • @user-wp7zk9ee5z
    @user-wp7zk9ee5z 6 months ago

    Great video sir!
    One correction: after applying StandardScaler the values don't necessarily range between -1 and 1. Even in your code, after applying the scaling technique there are values greater than 1 and less than -1.
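The commenter is right: standardization only guarantees zero mean and unit variance, not a bounded range. A minimal sketch with made-up numbers (any point far enough from the mean gets |z| > 1):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

X = np.array([[1.0], [2.0], [3.0], [4.0], [100.0]])  # one extreme value
z = StandardScaler().fit_transform(X)

print(z.ravel())  # the extreme value's z-score is well above 1
```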

  • @AmanSahu-vj9uu
    @AmanSahu-vj9uu 1 year ago +1

    I have a question sir.
    Should we scale the dummy variables obtained when encoding categorical variables in the dataset, specifically for Artificial Neural Network models?
    Some say that dummy variables should not be scaled, since their values (0 and 1) already lie between -3 and 3, and scaling them makes the variables lose their interpretation.
    But some say that for ANNs specifically it is absolutely necessary to scale all the features, including the encoded categorical variables.
    I'm confused as to what is right and why.
    I'm hoping to know your remarks on this.

    • @maanuiitd
      @maanuiitd 2 months ago

      IMO, you can scale other variables to match the scale of the dummy variables.

  • @jabed.akhtar
    @jabed.akhtar 1 year ago +1

    Sir, could you please share this entire OneNote notebook?

  • @surajprusty6904
    @surajprusty6904 3 months ago

    Here we can see the values actually range from about -2 to +2 instead of -1 to +1. Why is that?

  • @DarkShadow00972
    @DarkShadow00972 5 months ago

    Hi sir, how can I get your OneNote notes for DL?

  • @znyd.
    @znyd. 1 month ago

    💛

  • @kindaeasy9797
    @kindaeasy9797 2 days ago

    Niceee

  • @bollarapuphanindra1264
    @bollarapuphanindra1264 2 years ago +1

    Sir, please provide the link to the neural network performance Word file that you created. Thank you.

    • @campusx-official
      @campusx-official  2 years ago +1

      docs.google.com/document/d/1YjuG6xj10ALmltbEqmxu1wOBWMIV5BU2DmVP05xlZ3c/edit?usp=drivesdk

  • @ajaykulkarni576
    @ajaykulkarni576 2 years ago +1

    what digital pen are you using?

  • @mr.deep.
    @mr.deep. 2 years ago +3

    Sir, when will the video on activation functions come out?

  • @ShubhamVerma-wf3vc
    @ShubhamVerma-wf3vc 2 years ago +1

    Bhaiya, when will the video on interview preparation come out?