Tutorial 25- Probability Density function and CDF- EDA-Data Science

  • Published 28 Nov 2019
  • Please join as a member in my channel to get additional benefits like materials in Data Science, live streaming for Members and many more
    / @krishnaik06
    If you are looking for the best online course in Data Science with placement assistance, apply for the appliedAI Course:
    www.appliedaicourse.com
    Connect with me here:
    Twitter: / krishnaik06
    Facebook: / krishnaik06
    instagram: / krishnaik06

COMMENTS • 175

  • @nvsyashwanth918
    @nvsyashwanth918 3 years ago +2

    The way you explain concepts is amazing.

  • @srujankumar637
    @srujankumar637 3 years ago +2

    you are an awesome teacher; seems like completely dwelling over the concepts... simply i take a bow

  • @hadishaaben3665
    @hadishaaben3665 2 years ago +2

    Man you do not know how much i learned from you , your explanation is AWESOME

  • @mischievousmaster
    @mischievousmaster 3 years ago +15

    Krish loves the word PARTICULAR a ton! 😁

  • @howtotipsandtricks4381
    @howtotipsandtricks4381 1 year ago +3

    Institution of Data science claims that they have good content for freshers too, but it is not there; I always have to come to your channel to clarify many topics that I could never learn from there.
    You have a great skill for teaching😊

  • @Ankurkumar14680
    @Ankurkumar14680 4 years ago +7

    Thanks for sharing, sir. I always admire your teaching style, knowledge and helping nature. A small clarification: in the normal distribution plot, the values on the y-axis do not tell us the area under the curve. If they did, the y-value corresponding to the mean on the x-axis would always be 0.5, but that is not the case. The y-value is actually the gradient of the CDF (graph).

  • @VVV-wx3ui
    @VVV-wx3ui 4 years ago +2

    Thanks Krish for sharing your knowledge. Please keep it going.

  • @user-ed3uq8ic9b
    @user-ed3uq8ic9b 10 months ago +1

    you are much better than many online courses in market ...thank you please keep going.

  • @saurabhtripathi62
    @saurabhtripathi62 4 years ago +2

    thanks your series is great , u made this very easy.

  • @amiysrivastava1444
    @amiysrivastava1444 4 years ago +3

    best explanation of cdf when compared with other youtube videos.

  • @VVV-wx3ui
    @VVV-wx3ui 4 years ago +2

    Simply explained. Good going Krish.

  • @jhondhelpago1638
    @jhondhelpago1638 10 months ago

    He explained two topics in less than 10 minutes, yet its so clear and informative

  • @tensorthug6802
    @tensorthug6802 3 years ago +7

    The y-axis of the PDF is the gradient of the CDF: the higher the gradient, the higher the density at that particular point. The y-axis of the CDF is the percentage of the population at or below a particular point.

    • @MuhammadAbdullah-lr7sd
      @MuhammadAbdullah-lr7sd 2 years ago +1

      Yes, that is what I'm thinking but in the video it creates some confusion.

    • @t.saishodhanrao9519
      @t.saishodhanrao9519 2 years ago

      This comment should be pinned. The video creates a lot of confusion for those who don't know about this.
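
The point made in this thread can be checked numerically. The sketch below is only an illustration: the weight distribution (mean 70 kg, standard deviation 15 kg) is an assumed example, not data from the video. It estimates the slope of the CDF with a finite difference and compares it with the PDF value at the same point.

```python
from statistics import NormalDist

# Assumed example distribution of weights in kg (not taken from the video).
weights = NormalDist(mu=70, sigma=15)

def cdf_slope(dist, x, h=1e-4):
    """Central-difference estimate of the gradient of the CDF at x."""
    return (dist.cdf(x + h) - dist.cdf(x - h)) / (2 * h)

# The PDF value and the CDF slope should agree at every point.
for x in (40, 70, 100):
    print(x, round(weights.pdf(x), 6), round(cdf_slope(weights, x), 6))

# At the mean the CDF is 0.5, but the PDF is just the density there.
cdf_at_mean = weights.cdf(70)
pdf_at_mean = weights.pdf(70)
```

Here `cdf_at_mean` is 0.5 while `pdf_at_mean` is far smaller, which is why the y-axis of the bell curve cannot be read as "percentage of data below this point".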

  • @TheAl217
    @TheAl217 3 years ago

    Thank you for clarifying these functions.

  • @cherubyGreens
    @cherubyGreens 3 years ago

    Feeling amazing with Krish Naik!

  • @im_tanmay_g
    @im_tanmay_g 10 months ago

    Most simplest and non-confusing video on PDF & CDF. Thank you for the same.

  • @enchanted_swiftie
    @enchanted_swiftie 2 years ago

    But sir, when I plot KDE plots with seaborn, I often get values on the y-axis greater than 1. What would the interpretation be then? Or are KDE plots different from density curves?
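
A small sketch of why a density axis can show values above 1. The y-axis of a density curve is probability per unit of x, not probability, so a tightly clustered variable can have a peak well above 1 while the total area stays 1. The distribution below (standard deviation 0.1) is an assumed example chosen to make the effect visible.

```python
from statistics import NormalDist

# Assumed example: data clustered very tightly around 0.
narrow = NormalDist(mu=0, sigma=0.1)

# Height of the curve at its peak: a density, not a probability.
peak_density = narrow.pdf(0)

# Riemann sum of the density over [-1, 1], i.e. 10 standard deviations each side.
step = 0.001
area = sum(narrow.pdf(-1 + i * step) * step for i in range(2001))

print(round(peak_density, 3), round(area, 6))
```

The peak is close to 4, yet the area under the curve is still 1, so only areas (not heights) should be read as probabilities.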

  • @oliullah.mahmud
    @oliullah.mahmud 2 years ago

    Thank you. I like your teaching style!

  • @mohammedabdulahmed8808
    @mohammedabdulahmed8808 2 years ago

    Simply amazing explanation 😍😍
    Thanks alot and keep doing sir!!!

  • @livebiochemistry
    @livebiochemistry 2 years ago

    One of best video of PDF and CDF..thanks sir

  • @tridipbhowmik2356
    @tridipbhowmik2356 4 years ago +3

    Sir, i have a problem regarding installation of anaconda. After installation when i launch jupyter notebook, on the top of the bar it shows kernel error as a result of which the code doesn't run. So, what should i do to overcome this error?

  • @MScFabianoBriao
    @MScFabianoBriao 3 years ago +1

    Buenos!
    Do you have videos of real cases showing inferential statistics to test (validate) models?

  • @jaheerkalanthar816
    @jaheerkalanthar816 2 years ago

    Thanks for the video sir, I learned lot of things in this video

  • @ibrahimahmethan586
    @ibrahimahmethan586 4 years ago

    thank u so much . god bless u

  • @mahalerahulm
    @mahalerahulm 4 years ago

    Excellent !! Very nice explanation.

  • @Ks-oj6tc
    @Ks-oj6tc 3 years ago

    Well explained, Thanks a lot Krish.

  • @ahmedel-bahnihi346
    @ahmedel-bahnihi346 4 years ago +42

    When you explained the PDF, you said it is the area under the curve up to that point. I think this is the CDF, not the PDF. Thanks a lot for your effort and videos.

    • @dharunsainath322
      @dharunsainath322 3 years ago

      that is correct...he is probably talking about CDF

    • @madhuprasath6193
      @madhuprasath6193 3 years ago +2

      A query,then how do you interpret a pdf?

    • @dattamalpote2005
      @dattamalpote2005 3 years ago +1

      i think krishna sir had explain it right.

    • @mitultandon5227
      @mitultandon5227 3 years ago +10

      @@madhuprasath6193 It basically gives you the probability of that point. PDF would answer a question like these:- What would be the chance of weight of a person to be 90kg?. Answer to this as per the above graph in the video would be "only 25% chance ( or 0.25 probability )". Basically PDF tells us the exact probability for every point.

    • @srujankumar637
      @srujankumar637 3 years ago +1

      The area under the PDF curve over a particular range (a definite integral) gives the probability for that range; the total area must be unity.
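
One way to reconcile the replies in this thread: for a continuous variable, a probability belongs to a range and equals the area under the PDF over that range, which is the same as a difference of two CDF values. The sketch below uses an assumed weight distribution (mean 70 kg, standard deviation 15 kg) and a simple midpoint-rule integral; both helper functions are illustrative names, not anything from the video.

```python
from statistics import NormalDist

# Assumed example distribution of weights in kg.
weights = NormalDist(mu=70, sigma=15)

def prob_between(dist, low, high):
    """P(low < X <= high) as a difference of two CDF values."""
    return dist.cdf(high) - dist.cdf(low)

def integrate_pdf(dist, low, high, steps=10_000):
    """Midpoint-rule approximation of the area under the PDF on [low, high]."""
    width = (high - low) / steps
    return sum(dist.pdf(low + (i + 0.5) * width) for i in range(steps)) * width

# Probability of a weight within one standard deviation of the mean.
p_cdf = prob_between(weights, 55, 85)
p_area = integrate_pdf(weights, 55, 85)
print(round(p_cdf, 4), round(p_area, 4))
```

Both routes give about 0.68, so the CDF can be seen as the running area under the PDF.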

  • @aditisrivastava7079
    @aditisrivastava7079 4 years ago

    Thanks for the wonderful video. I will simply add that through the PDF we can find the probability for a point or a range, whereas the CDF tells us the "less than" probability.

  • @johan-mattias
    @johan-mattias 1 year ago

    learning about machine learning for google spreadsheets, this helped understand the CDF so much thank!

  • @TariqueMahmud313
    @TariqueMahmud313 2 years ago

    Too many clear concepts in just 7 minutes !!! Thanks man!!

    • @chaos8514
      @chaos8514 1 year ago

      hello are you also trying to learn data analysis?

  • @sudhirBhalekar007attaboy
    @sudhirBhalekar007attaboy 4 years ago +1

    Why do we calculate CDF as PDF is already giving you % of distribution for required data analysis, through this CDF, we are getting added (C.values) but what is the significance of this concept?

  • @sapnilpatel1645
    @sapnilpatel1645 1 year ago

    Amazing video sir. Thank you so much.

  • @bandhammanikanta1664
    @bandhammanikanta1664 4 years ago +3

    Thanks for this video Krish.
    We will be very happy to see atleast a single reply to any comments in youtube as well as issues in his github.

    • @krishnaik06
      @krishnaik06 4 years ago +3

      I usually see the comments and make a note on it to create videos...github videos will be coming up soon

    • @bandhammanikanta1664
      @bandhammanikanta1664 4 years ago

      @@krishnaik06 Thank you Krishna. Waiting to see your updates.

    • @himanshubansal2701
      @himanshubansal2701 3 years ago

      @@krishnaik06 Sir, what is the benefit of the CDF over the PDF? Because we will be analysing the same percentage with the PDF also.

  • @AJ-et3vf
    @AJ-et3vf 2 years ago

    awesome video sir! thank you!

  • @ratulghosh3849
    @ratulghosh3849 4 years ago

    Good going Sir keep up the good work :)

  • @Vidi_111
    @Vidi_111 2 years ago

    Thank you sir .. the way you explain is so easy to understand...

  • @Nikhil-jj7xf
    @Nikhil-jj7xf 4 years ago +3

    Krish, please provide your online course registration link.

  • @menakask6050
    @menakask6050 1 year ago

    Hi krish, kindly let me know pls explain how do you say at point 130 in X axis with 90% in distribution in Y axis is "less than" since the CDF is straight it is increasing and you are mentioning less than 130kg is there in 90% of the dataset. How do you predict it is less or high using CDF?

  • @santamsaha9415
    @santamsaha9415 3 years ago

    this video is a soul saver

  • @baijuthomas3716
    @baijuthomas3716 3 years ago

    so is CDF simply a another representation of a PDF , just wondering at what point do you decide to use a CDF over a PDF if it simply represents data in a different way ? it seems like the CDF always locks the results between 1-0 . what if the returns go -Negative how would you represent that ? Thanks as always .

  • @muhammadyasirbutt3631
    @muhammadyasirbutt3631 4 years ago

    Very well done, brother. Your teaching is great.

  • @saddamshaikh9285
    @saddamshaikh9285 4 years ago +5

    Sir, please make a video on the Naive Bayes algorithm.

  • @mukundsudharsan1294
    @mukundsudharsan1294 4 years ago +5

    In 3.08 you mentioned that the y-axis in the normal distribution represents the % of distribution below that point. If that statement holds true then shouldn't the graph be continuously increasing and it would be cdf? So what does the y-axis indicate for the normal distribution falling in the right half? Please correct me if I am mistaken, but would like to understand this better.

    • @kantafcb1
      @kantafcb1 3 years ago

      y axis shows the %age distribution of intervals

  • @sagarkhande4412
    @sagarkhande4412 3 years ago

    Ty sir your video was really helpful..👍

  • @adityapathania3618
    @adityapathania3618 3 years ago

    Don't you think the cumulative total will go above 1 by the 3rd or 4th value itself, as the probabilities are getting added?
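
A quick sketch of why the running total stays at or below 1. The values being added are each bin's share of the whole dataset, and those shares sum to exactly 1, so the cumulative sum can only climb towards 1. The bin counts below are made-up numbers for illustration.

```python
from itertools import accumulate

# Hypothetical number of people in each weight bin (kg); 50 people in total.
bin_counts = {"40-60": 5, "60-80": 20, "80-100": 15, "100-120": 8, "120-140": 2}

total = sum(bin_counts.values())
# Share of the dataset in each bin (these play the role of the PDF heights).
proportions = [count / total for count in bin_counts.values()]
# Running total of the shares (this plays the role of the CDF).
cumulative = list(accumulate(proportions))

for label, share, running in zip(bin_counts, proportions, cumulative):
    print(f"{label:>8}  share={share:.2f}  cumulative={running:.2f}")
```

The last cumulative value is 1 (up to floating-point rounding), and it would only exceed 1 if raw counts were added instead of shares.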

  • @KalyanGk0
    @KalyanGk0 3 years ago +1

    Great explanation krish😊 .please make video on practical implementation of this concepts using python.

  • @navinofficial5439
    @navinofficial5439 4 months ago

    Crystal Clear!

  • @rakhijha8911
    @rakhijha8911 4 years ago +1

    Simply amazing i read so many articles on cdf but everyone was calculating the value no one explained it so well can I plzz connect you on LinkedIn

  • @phalgunaa2157
    @phalgunaa2157 1 year ago +1

    Your explanation is too good bro

  • @NR_Tutorials
    @NR_Tutorials 4 years ago

    nice videos thanks krish naik sir

  • @hariprasanth7568
    @hariprasanth7568 4 years ago

    sir PDF it using KDF but for CDF which is function running background?

  • @nehasaroha2505
    @nehasaroha2505 4 years ago +3

    Very informative video....can you suggest some simpler kaggle datasets on which we can perform EDA using PDF,CDF and multivariate analysis. I have already done on iris, Titanic and Haberman's dataset, but was thinking about getting more practice.

  • @DataAI_junction
    @DataAI_junction 1 day ago

    thank you for the video

  • @shobitjain7836
    @shobitjain7836 4 years ago

    Can you suggest how can I plot a CDF using distplot in seaborn?
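
As a library-independent starting point, the empirical CDF can be computed by hand: sort the values and give the i-th smallest one the height i/n. The sample weights below are made-up numbers, and `ecdf` is an illustrative helper name, not a seaborn function.

```python
def ecdf(values):
    """Return the sorted values and the step heights of the empirical CDF.

    With distinct values, the height at each one is the fraction of
    observations that are less than or equal to it.
    """
    ordered = sorted(values)
    n = len(ordered)
    heights = [(i + 1) / n for i in range(n)]
    return ordered, heights

# Hypothetical sample of weights in kg.
sample_weights = [62, 71, 55, 90, 68, 77, 84, 59, 73, 66]
xs, ys = ecdf(sample_weights)

for x, y in zip(xs, ys):
    print(x, y)
```

The pairs `(xs, ys)` can be drawn as a step plot, for example with matplotlib's `plt.step(xs, ys, where="post")`. Recent seaborn versions also provide `seaborn.ecdfplot`, which draws the same kind of curve directly and may be a better fit than `distplot`, which is deprecated.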

  • @tsaurav18
    @tsaurav18 1 year ago

    thankyou sir.

  • @kpratik5551000
    @kpratik5551000 4 years ago

    Very good explanation.

  • @brayansereno4249
    @brayansereno4249 3 years ago

    Hi Krish, thank you so much, I speak Spanish but I understand you, I really need an explanation of this topic and I don't found in Spanish, you're great bro

  • @RishikeshGangaDarshan
    @RishikeshGangaDarshan 3 years ago

    If the data does not follow a Gaussian distribution, will the PDF or CDF still work or not?
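
PDF and CDF are general concepts and are not tied to the Gaussian. As a sketch, here is an exponential distribution, which is heavily skewed; the rate of 0.5 is an arbitrary value picked for illustration.

```python
import math

RATE = 0.5  # assumed rate parameter, chosen only for illustration

def exp_pdf(x, rate=RATE):
    """Density of the exponential distribution (zero for negative x)."""
    return rate * math.exp(-rate * x) if x >= 0 else 0.0

def exp_cdf(x, rate=RATE):
    """P(X <= x) for the exponential distribution."""
    return 1 - math.exp(-rate * x) if x >= 0 else 0.0

# The median is the point where the CDF reaches 0.5.
median = math.log(2) / RATE

for x in (0, 1, median, 5, 20):
    print(round(x, 3), round(exp_pdf(x), 4), round(exp_cdf(x), 4))
```

The CDF still rises from 0 to 1 and the PDF is still its slope; only the shapes differ from the bell curve.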

  • @bluestar2253
    @bluestar2253 1 year ago

    Excellent explanation of PDF and CDF

  • @Thanusree234
    @Thanusree234 1 year ago

    Sir, are exploratory data analysis and statistics the same subject? Please reply 🙏

  • @Emotekofficial
    @Emotekofficial 3 years ago +1

    As far as I know CDF is Cumulative Distribution function. It can also be calculated for Probability Mass Function. But you can say in this Scenario as Cumulative Distribution function of given Probability Density Function.

  • @sharathchandrakarnati4615
    @sharathchandrakarnati4615 2 years ago

    We can use logistics regression right based upon cdf ?

  • @vikasrajput1957
    @vikasrajput1957 4 years ago +2

    Hi Sir,
    where can we study the mathematics behind all of the ML algorithms

    • @akshatgarg6635
      @akshatgarg6635 3 years ago +1

      NPTEL its the best. start with that if you want to go deeper then probably start with advanced statistics

  • @ahmed96616
    @ahmed96616 4 years ago

    Excellent !

  • @magedrefat1658
    @magedrefat1658 1 year ago

    Sir, your explanation is amazing ^_^

  • @schuf1738
    @schuf1738 1 year ago

    Thank you !!

  • @rakeshenjapuri3143
    @rakeshenjapuri3143 3 years ago

    how will calculate the above 60% in pdf and how will take the cumulative percentage in cdf give with mathematical explanation sir

  • @shadiyapp5552
    @shadiyapp5552 1 year ago

    Thank you sir ♥️

  • @ujjwalmv9697
    @ujjwalmv9697 1 year ago

    what if cdf over a period of time is not being constant and hits 90 degree and then goes constant, why is that straightline coming in cdf?

  • @ankitchakraborty1126
    @ankitchakraborty1126 1 year ago

    Hi sir. Thanks for sharing awesome content like this. I have one question: can we calculate percentiles and the median from the CDF?
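
A percentile is the CDF read backwards: instead of asking "what fraction lies below x?", you ask "which x has this fraction below it?". A minimal sketch using Python's standard-library `NormalDist`, with an assumed weight distribution (mean 70 kg, standard deviation 15 kg) that is not taken from the video:

```python
from statistics import NormalDist

# Assumed example distribution of weights in kg.
weights = NormalDist(mu=70, sigma=15)

# inv_cdf(p) returns the x for which cdf(x) == p.
median = weights.inv_cdf(0.50)   # 50th percentile
p90 = weights.inv_cdf(0.90)      # 90th percentile

print(round(median, 2), round(p90, 2))
print(round(weights.cdf(p90), 2))  # reading the CDF at p90 gives back 0.9
```

For a symmetric distribution like this one the median equals the mean; on a CDF plot the same lookup is done by drawing a horizontal line at the chosen percentage and reading off the x-value where it meets the curve.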

  • @lahari1512
    @lahari1512 4 years ago

    Hi Krish, your videos are really helping me to learn machine learning very easily. Can you please upload SVM and XGBoost videos? It's a request.

  • @BalaguruGupta
    @BalaguruGupta 4 years ago +2

    I've commented on your other videos also, there I understood the love you had on the technology. From this video I understood that, actually you're so much passionate on teaching sir. The way you explained PDF and CDF is really amazing sir. Thank you so much. :)

  • @smritiprasad6529
    @smritiprasad6529 4 years ago +1

    Hello sir
    In some videos values in y-axis of pdf is referred as probability for a point on x-axis . I am bit confused can you please explain.

    • @indrasenareddyadulla8490
      @indrasenareddyadulla8490 4 years ago

      Really it's an awesome explanation.

    • @karthikvijayasarathi89
      @karthikvijayasarathi89 3 years ago

      No, one cannot define probability at a single point x for a continuous random variable. It should always be a range
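
The reply above can be illustrated by shrinking a range around a single point and watching the probability fall towards zero, even though the density at that point stays positive. The distribution and the point (90 kg) are assumed values for illustration only.

```python
from statistics import NormalDist

# Assumed example distribution of weights in kg.
weights = NormalDist(mu=70, sigma=15)
point = 90

# Probability of landing within +/- half_width of the point.
half_widths = (5, 1, 0.1, 0.001)
probabilities = [
    weights.cdf(point + w) - weights.cdf(point - w) for w in half_widths
]

for w, p in zip(half_widths, probabilities):
    print(f"90 +/- {w}: {p:.6f}")

# The density at the point itself does not shrink.
density_at_point = weights.pdf(point)
print(round(density_at_point, 5))
```

So the y-value of the PDF at 90 is a density, and it turns into a probability only after being multiplied by (or integrated over) a width on the x-axis.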

  • @kaushikvankadkar8430
    @kaushikvankadkar8430 4 years ago

    Why is PDF differentiated and how area under the graph gives us probability..

  • @ranjanpal7217
    @ranjanpal7217 1 year ago

    Amazing...plz make a video on how to determine the distribution of a dataset using Python.

  • @jaychauhan2933
    @jaychauhan2933 4 years ago +1

    Can you please show the practical implement of PDF and CDF

  • @roshankumargupta9978
    @roshankumargupta9978 4 years ago +1

    Thanks for the video Krish. But I do have a doubt that in a long run.. how could be it will be beneficial that this much percentage of data lies upto certain threshold?

  • @ArchnaVijay
    @ArchnaVijay 3 years ago

    amazing video

  • @pushkarsaini7653
    @pushkarsaini7653 4 years ago +2

    sir may u please make video on hypothesis.

  • @aimenbaig6201
    @aimenbaig6201 3 years ago

    you are the best

  • @karthikvijayasarathi89
    @karthikvijayasarathi89 3 years ago +3

    Small correction: a probability cannot go over 1, but a probability density function can.
    Correct me if I am wrong.

  • @skviknesh
    @skviknesh 3 years ago

    What is the formula for smoothing the histogram?
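
Assuming the smoothing in question is kernel density estimation (KDE) with a Gaussian kernel, the estimate at a point x is the average of small bell curves centred on each observation: f(x) = (1 / (n·h)) · Σ K((x − xᵢ) / h), where K is the standard normal density and h is the bandwidth. The sketch below implements that formula directly; the sample data and the bandwidth of 5 are made-up values, and `gaussian_kde` here is a hand-written illustrative helper, not a library call.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a function estimating the density of `data` with a Gaussian kernel."""
    n = len(data)
    norm = 1 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian bump per observation, then normalise.
        return norm * sum(
            math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data
        )

    return density

# Hypothetical sample of weights in kg and an assumed bandwidth.
sample_weights = [62, 71, 55, 90, 68, 77, 84, 59, 73, 66]
density = gaussian_kde(sample_weights, bandwidth=5.0)

# Riemann sum over 20..130 kg to confirm the smoothed curve encloses area 1.
step = 0.1
area = sum(density(20 + i * step) * step for i in range(1101))
print(round(density(70), 4), round(area, 6))
```

A larger bandwidth gives a smoother, flatter curve and a smaller one follows the individual observations more closely; the area under the curve stays 1 either way.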

  • @lokeshchoraria6559
    @lokeshchoraria6559 4 years ago

    best explanation

  • @equbalmustafa
    @equbalmustafa 4 years ago +2

    Nice one

  • @valor36az
    @valor36az 1 year ago

    Nice explanation

  • @anto1756
    @anto1756 3 years ago

    Nice 😁 could u do a comparison about survival function, inf and all other methods please

  • @amitmaurya6179
    @amitmaurya6179 2 years ago

    If you are only smoothing the curve, why does the count change to a percentage of the distribution?

  • @vidyasurbhi3084
    @vidyasurbhi3084 1 year ago

    Very nice 👌.

  • @phaninanugonda3807
    @phaninanugonda3807 3 years ago

    Can pdf be greater than 1 in Z distribution

  • @0505Arjun
    @0505Arjun 4 years ago +3

    What scenarios we will use CDF and PDF in machine learning..?

    • @gauravsaini728
      @gauravsaini728 4 years ago +3

      I have the same query. Can you please give a domain-specific example and show how PDF and CDF curves help the data scientist to take certain decisions?

    • @alphonseinbaraj7602
      @alphonseinbaraj7602 4 years ago

      yeah ..same query me too

    • @rajatchaturvedi7379
      @rajatchaturvedi7379 4 years ago

      I don't know its application in ML yet, since I have started learning only recently, but one use of it is in EDA, which is exploratory data analysis. Exploratory means you don't know anything about the dataset beforehand. Your task is to extract some basic yet critical information about the dataset before implementing ML algorithms. It's important to get an idea about the dataset. PDF, CDF and various plots such as 2D plots, pair plots, etc. are some aspects of EDA. There is other stuff as well. You can google it to understand more.

  • @wealth_developer_researcher
    @wealth_developer_researcher 3 years ago

    Amazing :)

  • @muhammadsaqib2961
    @muhammadsaqib2961 4 years ago

    Good explanation

  • @pravalikamucherla139
    @pravalikamucherla139 4 years ago +1

    Hi
    You have used only one feature, weight, to determine the PDF and CDF. Can we do it with two or more features?

    • @PriyaAmar848
      @PriyaAmar848 3 years ago

      Great question, expecting answer from any DS enthusiast

    • @kantafcb1
      @kantafcb1 3 years ago

      no, its univariate analysis

  • @tejas8211
    @tejas8211 3 years ago

    Sir the analysis which we do with CDF, we can do the same with z-score. Am I right?

    • @buddhadebbhattacharyaarchi1510
      @buddhadebbhattacharyaarchi1510 3 years ago +1

      CDF is a general concept, applying to all sorts of distributions. Z-score is limited to normal distributions.

  • @PriyaAmar848
    @PriyaAmar848 3 years ago

    Why do I need to use this distribution? For which kinds of data is it helpful? We also have the uniform, binomial and Poisson distributions; where should these be used? I would appreciate it if practical examples were included. Great explanation with graphs. Keep up the enthusiasm.

  • @nightowl1596
    @nightowl1596 1 year ago

    elite explanation, gg

  • @kabyabasu
    @kabyabasu 3 years ago

    Krish is the God of data science

  • @alexv1602
    @alexv1602 1 month ago

    Your god for me now😮

  • @SsK-mh5ds
    @SsK-mh5ds 3 years ago

    Sir , how can I code it?

  • @umesh789s
    @umesh789s 4 years ago

    can you please explain about T-score and T distribution