Outlier detection techniques (Python) | how to avoid outliers without deleting them

  • Published 28 Dec 2020
  • We will discuss outlier detection techniques in data mining and ways to treat outliers effectively using the interquartile range.
    We will also discuss how to handle outliers without deleting them.
    What is an outlier?
    In simple terms, an outlier is an unusual observation that stands out completely from the rest of the observations and can significantly change statistics such as the sample mean. We will plot Q-Q plots and histograms to visualize outliers.
    Because of outliers, our analysis and understanding of the data can be completely different from reality, giving an incorrect or false representation.
    For example, take the salaries of 5 individuals:
    10000,12000,9500,8800,1000000
    We can see that the salary of the 5th individual is far higher than the rest. If we include it, we would conclude that the mean salary is 208,060, which is far more than four of the five individuals earn.
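The effect on the mean can be checked directly. The following is a minimal sketch using only Python's standard `statistics` module; the variable names are illustrative and are not taken from the video.

```python
import statistics

# The five salaries from the example; the last one is the suspected outlier.
salaries = [10000, 12000, 9500, 8800, 1000000]

# Mean and median with the outlier included.
mean_with = statistics.mean(salaries)
median_with = statistics.median(salaries)

# The same statistics with the last salary left out, for comparison only.
mean_without = statistics.mean(salaries[:-1])
median_without = statistics.median(salaries[:-1])

print("mean with outlier:", mean_with)
print("mean without outlier:", mean_without)
print("median with outlier:", median_with)
print("median without outlier:", median_without)
```

The mean moves from 10,075 to 208,060 when the fifth salary is included, while the median only moves from 9,750 to 10,000. This is why a single extreme value can misrepresent the group when the mean is used.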
    There are multiple statistical approaches for detecting outliers, such as z-scores and proximity-based models, but for this demonstration we will use a more convenient and widely followed approach and identify them using histograms, box plots, etc.
    In this demo we will follow the IQR approach to filter and deal with outliers. The lower limit for any observation is Q1 - 1.5 * IQR and the upper limit is Q3 + 1.5 * IQR.
    These terms are defined as follows:
    - Q1 = 25th percentile
    - Q3 = 75th percentile
    - IQR = Q3 - Q1
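The limits defined above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code from the video: the helper names `iqr_limits` and `cap_outliers` are hypothetical, the quartiles use the "inclusive" method of the standard `statistics.quantiles` function (other quartile conventions give slightly different limits), and "handling outliers without deleting them" is assumed to mean capping values at the two limits.

```python
import statistics


def iqr_limits(values, k=1.5):
    """Return (lower, upper) limits: Q1 - k * IQR and Q3 + k * IQR."""
    # Assumption: "inclusive" quartiles (linear interpolation between data points).
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    return q1 - k * iqr, q3 + k * iqr


def cap_outliers(values, k=1.5):
    """Replace values outside the IQR limits with the nearest limit."""
    lower, upper = iqr_limits(values, k)
    return [min(max(v, lower), upper) for v in values]


salaries = [10000, 12000, 9500, 8800, 1000000]

lower, upper = iqr_limits(salaries)

# Detection: keep every row, just flag the values outside the limits.
outliers = [s for s in salaries if s < lower or s > upper]

# Treatment: cap instead of delete, so the number of rows is unchanged.
capped = cap_outliers(salaries)

print("limits:", lower, upper)
print("outliers:", outliers)
print("capped:", capped)
```

With these five salaries, Q1 is 9,500 and Q3 is 12,000, so the limits are 5,750 and 15,750. Only the fifth salary falls outside them, and capping replaces it with 15,750 while leaving the other four rows untouched.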

COMMENTS • 23

  • @dcadventures1730 1 year ago +1

    Such a blessing! Been dealing with outliers for the past week and I created duplicates instead of treating them! Thank you!

  • @saschaFlow 3 years ago +2

    excellent overview. thank you :) how did you do the headlines green? :D it must be a markdown feature I'm not yet familiar with. Thank you again.

  • 2 years ago +1

    Excellent! Love the explanation - so clear and detailed

  • @ibrahimyusuf8845 2 years ago

    Thanks a lot. I have been having a lot of difficulty dealing with outliers: almost all of my dataset contains outliers, and using drop/delete techniques removed virtually all of the rows in my dataset, but with the capping methods in this tutorial it seems the problem will be solved. I don't know if I can get the code for this tutorial from you. Thanks once again.

  • @enricoroselino7557 1 year ago

    How do you color the markdown green? It looks cool!

  • @factorealhindi3140 3 years ago +2

    Very well illustrated.

  • @khawlah3567 3 years ago +1

    Thank you, this is what I am looking for

    • @codersdigest1466 3 years ago

      I am really glad to know that it has been helpful for you.

  • @fatemehpakzad3328 2 years ago

    For categorical features, how can we find and remove outliers?

  • @saswatpradhan5549 3 years ago +1

    why is it showing invalid syntax near variable?

  • @milliesadie486 2 years ago

    Thank you bhai, it really helps 🥺🥺🥺🥺🥺🥺

  • @ramakdixit8648 2 years ago +1

    Excellent

  • @shivu.sonwane4429 3 years ago +2

    Siddhesh 👍

  • @BitterTruth24 11 months ago

    Can you also make the code available?

  • @shivu.sonwane4429 3 years ago +2

    Very well explained 🥳

  • @manideep4917 3 months ago

    Github link for code please