How to remove outliers in Python? | For multiple columns | Step by step ♥

  • Published 8 Jul 2024
  • In this video, I demonstrated how to detect, extract, and remove outliers for multiple columns in Python, step by step. Enjoy ♥

COMMENTS • 60

  • @jingyiwang5113
    @jingyiwang5113 1 year ago +3

    I am really grateful for this video. I am doing research with my professor. And this is really an essential skill for me to conduct research with him. Thank you so much! I do appreciate your wisdom!

  • @amandacorreia2625
    @amandacorreia2625 2 years ago +11

    Your voice, the music and the explanation: everything is amazing! Thanks a lot ♥

  • @Mandelbrot567
    @Mandelbrot567 2 years ago

    This video is excellent. I tried the method on another data set and it worked a treat.

  • @rajendranayak8018
    @rajendranayak8018 2 years ago +6

    Dear Eigen B, Please upload videos on machine learning & higher stats. I found this video, which helps me a lot. Your way of teaching is good.

  • @robertaraujo347
    @robertaraujo347 2 years ago +1

    I loved watching this video! It goes straight to the main point, your explanation was very clear and you took your time so that no detail was left out. At the beginning I wasn't sure I should watch your video because it lasts 13 minutes and I don't like videos longer than 5 minutes xd, but I'm leaving happy because I've understood this topic and will now be able to apply it in future data cleaning.

  • @Christopher-xr9kq
    @Christopher-xr9kq 1 year ago +1

    Wow. Watched entire video. So peaceful. good job!!!!

  • @eugenevlaxos92
    @eugenevlaxos92 3 years ago +1

    thank you so much you saved my data mining project

  • @jesusparra9840
    @jesusparra9840 1 year ago

    Excellent video. I searched for quite a while, and you explained everything really well

  • @jamesjulius7726
    @jamesjulius7726 1 year ago

    Excellent explanation and pace! So calm, I will never forget this part #removing outliers

  • @jayanthimallela8842
    @jayanthimallela8842 1 year ago

    This video really helped me a lot with outliers. I'm thankful to you for the very clean and decent explanation. Please do more videos on machine learning. Thanks a lot

  • @zarynooi5669
    @zarynooi5669 2 years ago

    Thank you! Very helpful!

  • @chandrasm009
    @chandrasm009 2 years ago +1

    Thanks a lot, Eigen B. It's really helpful.

  • @mihirthakkar6902
    @mihirthakkar6902 3 years ago

    Very nicely explained. Great work. Thanks.

  • @josiahadesola
    @josiahadesola 2 years ago

    Awesome... Thanks! I love the method of teaching and the background music

  • @nurulfadillah1248
    @nurulfadillah1248 9 months ago

    this really helps me, thank you so much!

  • @123arskas
    @123arskas 1 year ago

    Nice work. Liked the simplicity and the soothing voice + music.

  • @t.farias9336
    @t.farias9336 3 years ago

    thanks, you helped me a lot!

  • @christopherfreyre744
    @christopherfreyre744 1 year ago

    This is amazing thanks for sharing and such a lovely explanation

  • @selsabillekkaf5724
    @selsabillekkaf5724 9 months ago

    Everything is amazing! More than helpful. Thank you

  • @peopleonemillion4283
    @peopleonemillion4283 2 years ago

    Thank you!!!! you are amazing

  • @bobochibi
    @bobochibi 2 years ago

    Thank you! Your video was really helpful for me :)

  • @stephanie_ong
    @stephanie_ong 2 years ago

    Thank you so much!

  • @ft_smile
    @ft_smile 3 years ago

    I wish I could show you how thankful I am
    🙏🙏🙏🙏🙏🙏🙏🙏🙏🙏🙏🙏

  • @IngenieriaEstructural7
    @IngenieriaEstructural7 1 year ago

    You're a genius, you helped me a lot

  • @manojnaik8720
    @manojnaik8720 2 years ago

    Sweet voice....Nicely explained.... Thanks

  • @rokeyasiddiqua9375
    @rokeyasiddiqua9375 2 years ago

    Great tutorial

  • @chiranjeebmahanta1215
    @chiranjeebmahanta1215 1 year ago

    Thanks a lot!

  • @surajsalunkhe2348
    @surajsalunkhe2348 1 year ago

    Thanks for the help

  • @sudhanshusingh5594
    @sudhanshusingh5594 3 years ago

    Thank you so much... really, thank you

  • @priyanshugupta2104
    @priyanshugupta2104 1 year ago

    You taught this very well, sister

  • @divina.glitch
    @divina.glitch 2 years ago

    Thanks!

  • @hizokadarkwolf
    @hizokadarkwolf 2 years ago

    I was doing something similar, with no results... Guess what: I used & instead of | when finding the lower and upper bounds. Thanks a lot for making this video!
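
The mix-up described in this comment is easy to reproduce. Below is a minimal sketch, assuming the bounds come from the common 1.5 × IQR rule (the comments only mention quantiles and lower/upper bounds, so the exact rule is an assumption). The column name `feature1` and the data are made up for illustration.

```python
import pandas as pd

# Made-up data: 100 is the only obvious outlier.
df = pd.DataFrame({"feature1": [10, 12, 11, 13, 12, 11, 100]})

# Assumed rule: fences at 1.5 * IQR beyond the quartiles.
q1 = df["feature1"].quantile(0.25)
q3 = df["feature1"].quantile(0.75)
iqr = q3 - q1
lower_bound = q1 - 1.5 * iqr
upper_bound = q3 + 1.5 * iqr

# A value cannot be below the lower bound AND above the upper bound,
# so the & mask never selects anything.
mask_and = (df["feature1"] < lower_bound) & (df["feature1"] > upper_bound)

# An outlier is below the lower bound OR above the upper bound.
mask_or = (df["feature1"] < lower_bound) | (df["feature1"] > upper_bound)

outlier_index = df.index[mask_or].tolist()
```

With `&` the mask is all `False`, which matches the "no results" symptom; with `|` the row holding 100 is flagged.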

  • @cse048harshkumawat6
    @cse048harshkumawat6 2 years ago

    Is there any way to replace those outlier rows with upper_bound or lower_bound? Please help.
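
One possible way to do what this comment asks is to cap the values instead of dropping rows. The sketch below assumes 1.5 × IQR bounds; the helper name `cap_outliers`, its parameter `k`, and the sample data are hypothetical and not taken from the video. It relies on `Series.clip`, which sets anything below `lower` or above `upper` to the nearest bound.

```python
import pandas as pd

def cap_outliers(df, columns, k=1.5):
    """Return a copy of df with values outside the IQR fences set to the nearest fence.

    Hypothetical helper; the k * IQR rule is an assumption.
    """
    capped = df.copy()
    for col in columns:
        q1 = capped[col].quantile(0.25)
        q3 = capped[col].quantile(0.75)
        iqr = q3 - q1
        capped[col] = capped[col].clip(lower=q1 - k * iqr, upper=q3 + k * iqr)
    return capped

# Made-up data for illustration.
df = pd.DataFrame({"feature1": [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0]})
capped = cap_outliers(df, ["feature1"])
```

No rows are removed: the value 100 simply becomes the upper bound, and the original `df` is left unchanged.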

  • @user-cc3eu4ng7f
    @user-cc3eu4ng7f 1 year ago

    Thanks, great video. I have a question: if I have about 446 features, how can I deal with them as in your example? I tried storing the features in a variable X and then using your code, but it did not work. Any help, please?
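
For a wide table like the one in this question, typing every column name into a loop is impractical. A hedged sketch of one alternative follows: compute the bounds for all numeric columns at once and compare column by column. The function name `drop_outlier_rows`, the 1.5 × IQR rule, and the random stand-in data are all assumptions; the commenter's actual `X` is not shown.

```python
import numpy as np
import pandas as pd

def drop_outlier_rows(df, k=1.5):
    """Drop rows that hold an IQR outlier in any numeric column (assumed rule)."""
    numeric = df.select_dtypes(include="number")
    q1 = numeric.quantile(0.25)          # one value per column
    q3 = numeric.quantile(0.75)
    iqr = q3 - q1
    lower = q1 - k * iqr
    upper = q3 + k * iqr
    # Compare each column with its own bounds (aligned on column labels).
    is_outlier = numeric.lt(lower, axis="columns") | numeric.gt(upper, axis="columns")
    return df[~is_outlier.any(axis="columns")]

# Made-up stand-in for a wide table: 446 random columns, one planted outlier.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(200, 446)))
X.iloc[5, 17] = 1_000.0
cleaned = drop_outlier_rows(X)
```

A row is dropped as soon as a single column flags it, so with hundreds of columns a large share of rows may disappear. It is worth checking `len(cleaned)` against `len(X)` before relying on the result.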

  • @MrYnitram
    @MrYnitram 2 years ago +1

    great video! One question though: what if you only wanted to drop the outlier values and not the whole row in which the outlier is found?

    • @prashantshrivastava01
      @prashantshrivastava01 1 year ago

      Not possible... but you can replace the outliers with NaN, though again, there's no point in doing that

    • @jayanthimallela8842
      @jayanthimallela8842 1 year ago

      It won't work like that; we can't remove only the outlier, we can only remove the entire row.
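
The NaN option mentioned in this thread can be sketched as follows. A table cell cannot be deleted on its own, but it can be blanked out so the rest of the row survives. The column names, the data, and the 1.5 × IQR rule are assumptions made for illustration.

```python
import pandas as pd

# Made-up data: only feature1 contains an outlier (100).
df = pd.DataFrame({
    "feature1": [10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0],
    "feature2": [5.0, 6.0, 5.5, 6.5, 6.0, 5.0, 5.5],
})

col = "feature1"
q1 = df[col].quantile(0.25)
q3 = df[col].quantile(0.75)
iqr = q3 - q1  # assumed 1.5 * IQR rule below
is_outlier = (df[col] < q1 - 1.5 * iqr) | (df[col] > q3 + 1.5 * iqr)

# Blank out only the flagged cells; every row and the other column stay.
masked = df.copy()
masked[col] = masked[col].mask(is_outlier)
```

Whether this is useful depends on the next step: many pandas aggregations skip NaN by default, while most modelling libraries need the gaps filled or dropped first.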

  • @souravsinghbhandari9699
    @souravsinghbhandari9699 1 year ago

    I used the same technique on my dataset, but outliers still persist. Any suggestions on what to do?
    I tried rerunning the loop; it removed some outliers, but that reduced the original dataset I was working on.
    Does anyone have any better suggestions?

  • @shahadewadh606
    @shahadewadh606 1 year ago

    ❤❤❤❤

  • @oipseismic7621
    @oipseismic7621 2 years ago +2

    I tried this code and it doesn't work. It shows: "Can only compare identically-labeled Series objects"

  • @gebremedhnmehari8451
    @gebremedhnmehari8451 2 years ago

    How can we determine the value of the quantile?

  • @manish17788
    @manish17788 2 years ago

    What if the data has no outliers? In that case, will we lose a little data? How do we know whether outlier removal is needed in a big dataset?

  • @trangdtt30
    @trangdtt30 3 years ago

    Hi. I got an error, "Name 'dt' is not defined", when I ran cell [9]. Can you help me?

  • @kyleroach2581
    @kyleroach2581 3 months ago

    This should be titled Pandas ASMR

  • @jenirex1944
    @jenirex1944 2 years ago

    What will be the output of In[8]? Can anyone explain?

  • @AhmedDaoud2
    @AhmedDaoud2 1 year ago

    Thanks, can I get the test.csv file?

  • @PulkitKumar-fd8rb
    @PulkitKumar-fd8rb 1 year ago

    Instead of removing, how can we impute median values?
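
A sketch of median imputation, under the same assumed 1.5 × IQR rule. The helper name `impute_outliers_with_median` and the sample series are hypothetical. `Series.mask(cond, other)` replaces the values where `cond` is true with `other`.

```python
import pandas as pd

def impute_outliers_with_median(series, k=1.5):
    """Replace IQR outliers with the median of the remaining values.

    Hypothetical helper; the k * IQR rule is an assumption.
    """
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    iqr = q3 - q1
    is_outlier = (series < q1 - k * iqr) | (series > q3 + k * iqr)
    # Median of the values that were not flagged (a design choice).
    median = series[~is_outlier].median()
    return series.mask(is_outlier, median)

# Made-up data for illustration.
s = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0])
imputed = impute_outliers_with_median(s)
```

The median is taken over the unflagged values so the outliers do not influence their own replacement. Swapping `.median()` for `.mean()` would give mean imputation instead.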

  • @KP-oi4ee
    @KP-oi4ee 2 years ago +2

    index_list = []
    for feature in ['feature1', 'feature2']:
        index_list.extend(outliers(data, feature))
    index_list = []
    -----> For this I am getting an error: "Boolean array expected for the condition, not float64".
    How can I fix it?

    • @aartiahluwalia4104
      @aartiahluwalia4104 2 years ago

      index_list = []
      for feature in ['feature1', 'feature2']:
          index_list.extend(outliers(data, feature))
      index_list = [] --> you seem to have created index_list twice, so change this line to
      index_list
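
The loop discussed in this thread can be sketched end to end as below. The `outliers` function is not shown anywhere on this page, so the body here is an assumed reconstruction based on the 1.5 × IQR rule; `data` and its columns are made up. Whether the repeated `index_list = []` is also the cause of the reported error cannot be confirmed from the snippet alone.

```python
import pandas as pd

def outliers(df, ft):
    """Return the index labels of IQR outliers in column ft (assumed definition)."""
    q1 = df[ft].quantile(0.25)
    q3 = df[ft].quantile(0.75)
    iqr = q3 - q1
    lower_bound = q1 - 1.5 * iqr
    upper_bound = q3 + 1.5 * iqr
    return df.index[(df[ft] < lower_bound) | (df[ft] > upper_bound)].tolist()

# Made-up data: row 6 is an outlier in both columns, row 2 only in feature2.
data = pd.DataFrame({
    "feature1": [10, 12, 11, 13, 12, 11, 100],
    "feature2": [5, 6, -50, 6, 6, 5, 70],
})

index_list = []  # create the list once, before the loop
for feature in ["feature1", "feature2"]:
    index_list.extend(outliers(data, feature))
# Do not reset index_list here, or the collected labels are lost.

# A row can be flagged by several columns, so de-duplicate before dropping.
cleaned = data.drop(index=sorted(set(index_list)))
```

Ending the cell with a bare `index_list`, as the reply suggests, just displays the collected labels in a notebook.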

  • @jorgeeg2668
    @jorgeeg2668 2 years ago

    I don't understand English, but I understood the video :D

  • @VishwasPatki
    @VishwasPatki 2 years ago

    Error: TypeError: Cannot perform 'ror_' with a dtyped [float64] array and scalar of type [bool]
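
One plausible cause of this message, offered as a guess since the failing code is not shown, is a missing pair of parentheses around the comparisons. In Python, `|` binds more tightly than `<` and `>`, so an unparenthesized condition tries to "or" a float bound with a float64 column. The bounds and data below are made up.

```python
import pandas as pd

s = pd.Series([10.0, 12.0, 11.0, 13.0, 12.0, 11.0, 100.0])
lower_bound, upper_bound = 8.75, 14.75  # made-up bounds for illustration

# Without parentheses Python evaluates `lower_bound | s` first, because |
# binds more tightly than < and >. That is an "or" between a float and a
# float64 Series, which pandas rejects:
#     s < lower_bound | s > upper_bound

# Wrapping each comparison makes | combine two boolean Series.
mask = (s < lower_bound) | (s > upper_bound)
```

The same precedence rule applies to `&` and `~`, so each comparison in a combined pandas condition needs its own parentheses.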

  • @9881847751
    @9881847751 2 years ago +1

    What is ft here?

  • @modhua4497
    @modhua4497 2 years ago

    Could you share your code? Thanks

  • @aravinthanseenu1237
    @aravinthanseenu1237 1 year ago

    Dear Eigen B,
    Instead of removing the outliers, could you kindly show how to code replacing them with the mean value of the respective column?

  • @AlAhlyLy
    @AlAhlyLy 2 months ago

    Hello, I wrote your code and nothing happened. Thank you for the video anyway

  • @jamesjayanth7926
    @jamesjayanth7926 1 year ago

    Define outliers error is coming

  • @Master_of_Chess_Shorts
    @Master_of_Chess_Shorts 1 year ago

    Great coding, but the operation should be column-wise, not row-wise: by using the index you are removing possibly valid adjacent values. Imagine a large dataset with 500 columns...

  • @WEMELONs
    @WEMELONs 9 months ago

    where vids mazafaka