Should You Scale Your Data ??? : Data Science Concepts

  • Published 1 Nov 2024

COMMENTS • 19

  • @greenbean2177
    @greenbean2177 3 years ago +4

    All of your videos that I've seen are super clear, thanks.
    I'd add that scaling does affect linear regression results when we use regularization. Suppose two features x1 and x2 are equally important and both are measured in centimetres; then the weights w1 and w2 are expected to be equal as well. If x2 is instead measured in metres, then w2 would be 100 times larger, and regularization would penalize that large weight, so we may obtain a different result.
    Scaling is also useful when we optimize with gradient descent: scaled data helps gradient descent find the optimum faster.
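The regularization point above can be checked with a small sketch. It assumes a two-feature ridge regression without an intercept, solved in closed form; the data, the function name `ridge_two_features`, and the penalty value `lam = 10.0` are made up for illustration.

```python
def ridge_two_features(X, y, lam):
    """Closed-form ridge regression for two features, no intercept.

    Solves (X^T X + lam * I) w = X^T y for the 2x2 case.
    """
    a = sum(x1 * x1 for x1, _ in X) + lam
    b = sum(x1 * x2 for x1, x2 in X)
    d = sum(x2 * x2 for _, x2 in X) + lam
    e = sum(x1 * t for (x1, _), t in zip(X, y))
    f = sum(x2 * t for (_, x2), t in zip(X, y))
    det = a * d - b * b
    return (d * e - b * f) / det, (a * f - b * e) / det


# Hypothetical data: both features in centimetres, y = x1 + x2 exactly.
X_cm = [(100, 200), (150, 120), (180, 160), (130, 190)]
y = [x1 + x2 for x1, x2 in X_cm]

# Same data, but x2 converted to metres.
X_mixed = [(x1, x2 / 100) for x1, x2 in X_cm]

for lam in (0.0, 10.0):
    w_cm = ridge_two_features(X_cm, y, lam)
    w_mixed = ridge_two_features(X_mixed, y, lam)
    print(f"lam={lam}: cm weights={w_cm}, mixed-unit weights={w_mixed}")
```

With `lam = 0` the two fits are equivalent (w2 only changes by the factor of 100). With `lam > 0` the metre-scaled weight is shrunk far more than the centimetre one, so the fitted model changes with the unit.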

  • @darshansolanki5535
    @darshansolanki5535 3 years ago +11

    Best teaching skills, man!

    • @ritvikmath
      @ritvikmath 3 years ago

      I appreciate that!

    • @111dogger
      @111dogger 3 years ago

      @ritvikmath Hey ritvikmath, do you have a LinkedIn profile?

    • @ksrajavel
      @ksrajavel 3 months ago

      @111dogger He does, and you can follow him, but it seems you can't send a connection request. His connection count was as low as 100 and not above 200. He keeps his circle small.

  • @sachinrathi7814
    @sachinrathi7814 3 years ago +3

    Thank you for such a clear explanation of the concepts. Love your teaching skills.

    • @xxshogunflames
      @xxshogunflames 3 years ago

      Soooo much! I have been presented with the interpretability topic before, but it didn't really hit home. After this video I feel like a new world has opened, haha.

  • @ccuuttww
    @ccuuttww 3 years ago +3

    For a value on a fixed interval, such as RGB color or pH, we normalize.
    For an unbounded value, such as temperature or a dimension, we standardize.
    One reason we scale data: nearly all regularizers are sensitive to feature scale.
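A rough sketch of the two options in this comment, assuming "normalize" means min-max scaling onto [0, 1] with known bounds and "standardize" means the z-score. The helper names and sample values are hypothetical.

```python
from statistics import mean, pstdev


def min_max(values, lo, hi):
    """Normalize to [0, 1] using known bounds, e.g. 0 and 255 for an RGB channel."""
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Shift to zero mean and scale to unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values]


red_channel = [0, 51, 255]                   # bounded: the range is known in advance
temperatures = [-5.0, 12.0, 21.0, 30.0]      # no natural upper or lower bound

print(min_max(red_channel, 0, 255))
print(standardize(temperatures))
```

Min-max scaling needs the bounds up front, which is why it fits fixed-interval values; the z-score only needs the sample mean and standard deviation.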

  • @jeremy8144
    @jeremy8144 11 months ago

    Thank you

  • @alihindustani3374
    @alihindustani3374 1 year ago +1

    I have a doubt. Please help me understand this: if my features are not all in the same unit, say one input variable is daily step count and another is daily average touch events and so on, does scaling make sense?
    Or another example: say I have two input variables, one is distance in km and the other is weight in grams. Would scaling be the right approach? I don't think so.

  • @aliylukmansyahnasution7994
    @aliylukmansyahnasution7994 1 year ago +1

    nice vid!

  • @Nishant8185
    @Nishant8185 3 years ago +1

    How would you recommend handling categorical variables: OHE (one-hot encoding) or impact encoding? Would scaling make sense, say in a NN where scaling speeds things up?

    • @sachinrathi7814
      @sachinrathi7814 3 years ago

      OHE definitely makes a huge impact when the cardinality of your variable is high.
      Use other methods such as target encoding; there are a lot of them.

    • @Nishant8185
      @Nishant8185 3 years ago

      @sachinrathi7814 Agreed. I refined my question; it's more along the lines of scaling categorical variables.

  • @JainmiahSk
    @JainmiahSk 3 years ago +1

    Please use YAML for ML and do a video.

  • @sukursukur3617
    @sukursukur3617 3 years ago

    By shifting (x_i - μ) and scaling ((x_i - μ)/σ) a dataset, which gives the z-score, one loses the original numeric value of each element, but in return gains the ability to infer descriptive characteristics of the dataset.
    Is this true?
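One way to probe this question is a small sketch, assuming the population standard deviation is used and that μ and σ are kept alongside the z-scores. The names `zscore` and `inverse_zscore` and the sample data are hypothetical.

```python
from statistics import mean, pstdev


def zscore(values):
    """Return the z-scores together with the mean and standard deviation used."""
    mu, sigma = mean(values), pstdev(values)
    return [(v - mu) / sigma for v in values], mu, sigma


def inverse_zscore(z, mu, sigma):
    """Undo the z-score transform given the stored mean and standard deviation."""
    return [zi * sigma + mu for zi in z]


data = [12.0, 15.0, 9.0, 20.0]
z, mu, sigma = zscore(data)
restored = inverse_zscore(z, mu, sigma)

print(z)         # relative position of each element, in standard deviations
print(restored)  # original values, up to floating-point rounding
```

Under these assumptions the z-scores alone no longer show the original magnitudes, but nothing is lost as long as μ and σ are stored, because the transform is an invertible shift and rescale.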

  • @kdhlkjhdlk
    @kdhlkjhdlk 3 years ago

    Uniformly scaling data for KNN is not always (or even often) correct. It is often wrong to assume that every feature is equally important.
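A minimal sketch of this point, assuming the features are already on a common scale and that importance is expressed as hypothetical per-feature weights inside a Euclidean distance. The points, weights, and function names are made up for illustration.

```python
import math


def weighted_distance(p, q, weights):
    """Euclidean distance with a per-feature importance weight."""
    return math.sqrt(sum(w * (a - b) ** 2 for a, b, w in zip(p, q, weights)))


def nearest(query, points, weights):
    """Name of the stored point closest to the query under the given weights."""
    return min(points, key=lambda name: weighted_distance(points[name], query, weights))


# Hypothetical, already-scaled feature vectors.
points = {"A": (1.0, 0.0), "B": (0.0, 1.2)}
query = (0.0, 0.0)

print(nearest(query, points, (1.0, 1.0)))  # features treated as equally important
print(nearest(query, points, (3.0, 1.0)))  # first feature assumed three times as important
```

With equal weights the nearest neighbour is A; once the first feature is weighted more heavily it becomes B, so uniform scaling quietly encodes an equal-importance assumption.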

  • @owonubijobsunday4764
    @owonubijobsunday4764 1 year ago

    🎉🎉❤❤❤😊😊😊😊😊
    Thanks a lot