Matplotlib Tutorial (Part 2): Bar Charts and Analyzing Data from CSVs

  • Published 2 Jan 2025

COMMENTS •

  • @Sharmapawan98
    @Sharmapawan98 5 years ago +332

    Who needs python docs when you have such an amazing teacher

  • @coreyms
    @coreyms 5 years ago +120

    I hope everyone finds this video helpful. The next video of the series will be posted tomorrow at the same time. The next video will cover how to create pie charts.
    I'd like to thank Brilliant for sponsoring this series. If you'd like to check them out then you can sign up with this link and get 20% off your premium subscription:
    brilliant.org/cms

    • @dhananjaykansal8097
      @dhananjaykansal8097 5 years ago

      As usual lovely!!!!!!!

    • @tamasasztalos7484
      @tamasasztalos7484 5 years ago

      It's a great tutorial; the only thing I missed was how to add the total values on top of each bar (which can be trickier for a stacked bar chart).
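One possible way to get value labels on top of bars is `Axes.bar_label`, available in Matplotlib 3.4 and newer. The sketch below is not from the video; the sample numbers are made up and the headless backend is only there so the snippet runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
from matplotlib import pyplot as plt

# Made-up sample values, not the data used in the video
ages_x = [25, 26, 27, 28]
dev_y = [38496, 42000, 46752, 49320]

fig, ax = plt.subplots()
bars = ax.bar(ages_x, dev_y, label="All Devs")

# bar_label writes each bar's height just above the bar
labels = ax.bar_label(bars, padding=3)

ax.legend()
fig.tight_layout()
# plt.show()  # uncomment to display the chart interactively
```

For a stacked chart, the same call can be made once per `ax.bar(...)` container, with `label_type="center"` if the labels should sit inside each segment.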

    • @ishanpand3y
      @ishanpand3y 4 years ago

      Thank you, sir, for providing top-class tutorials for free.

    • @JuniorDIEKA
      @JuniorDIEKA 4 years ago

      Hello Corey!
      Could you please advise:
      1. How did you clean the data in the "LanguageWorkedWith" column so that you could generate such clean data?
      2. After I split it and saved it to another CSV file, separate from the main one, this is the output: [(" 'JavaScript'", 53020), (" 'HTML/CSS'", 39761), (" 'Java'", 29863), ("['Bash/Shell/PowerShell'", 28340), (" 'SQL']", 28178), (" 'Python'", 26185), (" 'PHP'", 20394), (" 'SQL'", 19094), (" 'TypeScript']", 16091), ("['HTML/CSS'", 15322)]
      [Finished in 33.6s]
      3. Given this output, how can I get the exact count of occurrences for each language? It doesn't seem to be doing that.
      Thank you,
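The stray brackets and quotes in that output (for example "['HTML/CSS'" and " 'SQL']") suggest, though this is only a guess, that each cell held the text of a Python list and was then split on commas. Under that assumption, a minimal sketch is to parse each cell back into a real list with `ast.literal_eval` before counting. The sample cells below are hypothetical.

```python
import ast
from collections import Counter

# Hypothetical cells, as they might look after a Python list was written to a CSV column
saved_cells = [
    "['JavaScript', 'HTML/CSS', 'SQL']",
    "['Python', 'SQL']",
    "['HTML/CSS', 'JavaScript']",
]

language_counter = Counter()
for cell in saved_cells:
    # literal_eval turns the text back into a list of clean strings
    language_counter.update(ast.literal_eval(cell))

print(language_counter.most_common(3))
```

With clean keys, "SQL" is counted once as a single language instead of being split across " 'SQL'" and " 'SQL']".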

    • @JoshKonoff1
      @JoshKonoff1 3 years ago

      Where is the CSV for this? I don't see it in the description. Thank you!

  • @ishanpand3y
    @ishanpand3y 4 years ago +78

    In case you don't know, the shortcut for 8:13 in Jupyter Notebook is *Ctrl + left mouse click* on the different lines, one by one. You can then type on several lines at the same time.

  • @Katira-KR7
    @Katira-KR7 4 years ago +90

    This series is much better than the courses on Udemy I paid for. Thank you very much.

  • @apoorvwatsky
    @apoorvwatsky 5 years ago +81

    23:40 here's that one liner if anybody's interested. Personally, I like this more.
    languages, popularity = map(list, zip(*language_counter.most_common(15)))

    • @costasvas341
      @costasvas341 5 years ago +1

      Really nice! Could you please explain what the "*" symbol does?
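As a short illustration of the question above: in a function call, `*` unpacks a sequence into separate positional arguments. The counts below are illustrative values only.

```python
# Illustrative (language, count) pairs, shaped like the output of most_common()
pairs = [("JavaScript", 59219), ("HTML/CSS", 55466), ("SQL", 47544)]

# zip(*pairs) is the same call as zip(pairs[0], pairs[1], pairs[2])
unpacked = list(zip(*pairs))
spelled_out = list(zip(pairs[0], pairs[1], pairs[2]))

# zip then pairs up the first items together and the second items together
languages, popularity = unpacked
print(languages)   # all the names
print(popularity)  # all the counts
```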

    • @paklong2556
      @paklong2556 5 years ago

      nice

    • @jg9193
      @jg9193 4 years ago

      Or just: list(zip(*language_counter.most_common(15))). Map is unnecessary as list() automatically maps over an Iterable

    • @corben3348
      @corben3348 4 years ago +3

      @jg9193 But if you don't use map(list, iterable), then languages and popularity will be tuples, so you cannot use reverse() for the rest of the tutorial. Or, without map: languages, popularity = [list(e) for e in zip(*language_counter.most_common(15))]

    • @jg9193
      @jg9193 4 years ago +2

      @corben3348 Fair point, I didn't think of that. That said, he could just do languages[::-1] instead of languages.reverse() to reverse a tuple.
      Then again, using list() would not even be necessary if he did that.
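The two options discussed in this thread can be compared side by side. This is a small sketch with toy counts, not the survey data.

```python
from collections import Counter

language_counter = Counter({"JavaScript": 3, "Python": 2, "Java": 1})  # toy counts

# zip() yields tuples, and tuples have no in-place reverse() method
languages, popularity = zip(*language_counter.most_common(3))
has_reverse = hasattr(languages, "reverse")

# Option 1: slicing returns a reversed copy and works on tuples
languages_reversed = languages[::-1]

# Option 2: convert each tuple to a list, then reverse in place
lang_list, pop_list = map(list, zip(*language_counter.most_common(3)))
lang_list.reverse()
pop_list.reverse()
```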

  • @fangshizhu9383
    @fangshizhu9383 3 years ago +1

    At 8:12, when you selected multiple locations and typed the same code on multiple lines simultaneously, my world just expanded!

  • @TheShubham67
    @TheShubham67 4 years ago +10

    This series, along with the pandas one, has taken my skills to a new level.

  • @bmwmhamam
    @bmwmhamam 5 years ago +7

    Nobody teaches like you. You are the best. Amazing delivery of information, truly useful tutorials. Thank you so much.

  • @sunshadow9704
    @sunshadow9704 2 years ago +3

    Corey, you are a great teacher. You have a rare ability to explain things calmly. I much appreciate your efforts.

  • @Ghasakable
    @Ghasakable 5 years ago +17

    Man, you are awesome. Everything I have learned about Python started from your channel. I wish you all the best and every success, as you make everyone happy. Keep up the excellent work; we all rely heavily on you.

    • @coreyms
      @coreyms 5 years ago +5

      Thanks! That's very kind of you.

  • @shadowmasked7188
    @shadowmasked7188 1 year ago +1

    Thank you very much bro, Greetings from Azerbaijan.

  • @abhishek_raj
    @abhishek_raj 3 years ago +1

    Right from reading data from a csv file to plotting it, you helped a lot of people.

  • @borhansiddiki7079
    @borhansiddiki7079 5 years ago +2

    I think your videos are more understandable than the rest of the YouTube channels.

  • @miosz952
    @miosz952 4 years ago +4

    The great thing about your tutorials is that, besides the main topic, you learn a lot of useful tricks, modules, etc.

  • @KienDoanTrung168
    @KienDoanTrung168 5 years ago +5

    such a great Python instructor with an angelic voice. Thank you so much 😊

  • @dalanxd
    @dalanxd 4 years ago

    Corey Schafer saves my life once again...
    Deep gratitude for your work, man!

  • @luiggitello8546
    @luiggitello8546 1 year ago

    This is the best content on YouTube, thank you so much.

  • @shuklarahul17
    @shuklarahul17 4 years ago +1

    As you mentioned, zip can also be used:
    language = cnt.most_common(10)
    language.reverse()
    language_X, language_Y = list(zip(*language))
    plt.barh(language_X, language_Y)

  • @djadamkent
    @djadamkent 5 years ago +14

    Another great video, thank you. A pandas series of videos would be awesome!

  • @lalu225
    @lalu225 5 years ago +5

    Excellent tutorial Corey! Real life stuff and practical, including the use of Counter. It's important to show these data preparation steps. Very helpful indeed, thank you.

  • @storiesshubham4145
    @storiesshubham4145 2 years ago +1

    I can't express how amazing this video is. What a great teacher you are. 🔥🔥

  • @SahilKhan-rv9xb
    @SahilKhan-rv9xb 4 years ago +9

    For those wondering how to obtain the CSV file: once you've clicked on it and you see all of the data in your web browser, just right-click and choose "Save as".

  • @mohammedismail308
    @mohammedismail308 5 years ago

    Thanks a lot, Corey. Your videos really are an endless treasure.
    Here is a way to plot bar charts for more than one dataset on the same plot without needing NumPy. Just use the built-in map function:
    width = 0.25 #Width of bar
    plt.bar(list(map(lambda x: x-width/2, age_x)), salaries1, color = 'k', width = width)
    plt.bar(list(map(lambda x: x+width/2, age_x)), salaries2, color = 'r', width = width)

  • @dgh25
    @dgh25 1 year ago

    Your videos are just sprinkled with little golden nuggets! I love it ❤

  • @introduction_official6547
    @introduction_official6547 9 months ago +1

    Very informative video, good job Mr Corey

  • @nicholasmaloof8378
    @nicholasmaloof8378 5 years ago +4

    2 weeks later and still not a single dislike on this video

  • @dhssb999
    @dhssb999 1 year ago

    best matplotlib tutorial ever!

  • @androkranjcevic1988
    @androkranjcevic1988 4 years ago

    Really nice work over here, the most important man on youtube for me.

  • @brumarul7481
    @brumarul7481 4 years ago +1

    This is pure gold.

  • @ItzSenaCrazy
    @ItzSenaCrazy 5 years ago +3

    What I really like is your videos, Corey. I can learn Python and English ;D
    Thanks!!

  • @MagnusAnand
    @MagnusAnand 2 years ago

    I can't believe we need this hack to make a bar chart.
    Great video.

  • @58_yesilgul
    @58_yesilgul 1 year ago

    What a perfect lesson, fast and insightful pieces of knowledge...

  • @ahmedskasmani
    @ahmedskasmani 5 years ago +4

    Amazing content, Corey. The way you simplify the material and explain it is awesome, many thanks. Could you please also do a video showing your setup and how you make videos? Thanks!!!

  • @brucegwon
    @brucegwon 4 years ago

    This is the most fantastic lecture on using Python with pandas I've ever seen!!!!!!!!!!!!!!
    Xie Xie!!!

  • @akunnaemeka395
    @akunnaemeka395 2 years ago

    thank you Brilliant for supporting Corey

  • @pratikarai8115
    @pratikarai8115 4 years ago

    Your explanation is awesome...thank you so much ...A great teacher for a lifetime...

  • @Anon282828
    @Anon282828 2 years ago

    thank you for always showing the clear code before abbreviating

  • @gamengine1176
    @gamengine1176 4 years ago

    Very helpful video. The pandas method is much simpler and easier to understand. Thanks Corey!

  • @dhairyaoza5422
    @dhairyaoza5422 3 years ago

    Thank you so much, sir. Really glad I found your playlist and didn't waste time on other platforms.

  • @ericfricke4512
    @ericfricke4512 4 years ago +2

    Programming is so fun.

  • @LashaGoch
    @LashaGoch 4 years ago +1

    This is gold! Thank you very much for doing this. You have an incredible talent for explaining complicated stuff in an easy manner. Keep up the good work :)))

  • @micheliwrmg
    @micheliwrmg 1 year ago

    Sad fact: if you want to open a CSV file in PyCharm, you have to pay for PyCharm Professional (~$230) :(
    By the way, you are the best teacher I've ever seen.

  • @LONNiETOWN
    @LONNiETOWN 2 years ago

    29:31 does it create a list of the variables? What if I want to remove or edit (like pop() command) an element, can I use list commands?

  • @SandeepChaudhary-vx9zy
    @SandeepChaudhary-vx9zy 4 years ago +1

    Great explanation...thanks a lot Corey sir

  • @markkennedy9767
    @markkennedy9767 1 year ago

    Thanks for this. Great lesson. As you say, creating multiple bars seems extraordinarily hacky. I would have thought this would be easily dealt with by a plotting library

  • @redferne01
    @redferne01 5 years ago +3

    Thank you for your work. I enjoy every lesson.

  • @muzaianghanem5644
    @muzaianghanem5644 4 years ago +1

    That's true......you are an amazing teacher. This was very helpful

  • @KC-rl8ub
    @KC-rl8ub 5 years ago +8

    hi Corey....god bless you

  • @rahil1575
    @rahil1575 2 years ago

    You are a lifesaver for people like me.

  • @SM-vu6fm
    @SM-vu6fm 2 years ago

    Counter() is the best thing I learned today

  • @franklinlima2571
    @franklinlima2571 4 years ago +1

    Great video! Thank you man

  • @Linshark
    @Linshark 3 years ago

    I just came across this series of videos. They are extremely good :-)

  • @Thedevineforce
    @Thedevineforce 5 years ago +1

    @Corey Schafer, I came up with the function below, which handles the bar widths for multiple bar plots by itself. Just in case anybody wants to use it:
    import numpy as np
    from matplotlib import pyplot as plt

    ages_x = np.asarray([25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35])
    count = 5  # number of bar groups drawn at each x position
    width = 0.8/count
    def width_cal(position):
        # Build the list of offsets, centred on zero, then pick one by position
        shift = np.array([])
        if count < 2:
            return ages_x
        if count % 2 == 0:
            for i in range(1, count, 2):
                shift = np.append(shift, (width/2 * i))
            shift = np.sort(np.append(shift, np.negative(shift)))
        else:
            for i in range(0, count, 2):
                shift = np.append(shift, (width/2 * i))
            shift = np.unique(np.sort(np.append(shift, np.negative(shift))))
        shift = np.around(shift, decimals=3)
        return ages_x + shift[position]
    # dev_y is the salary list defined in the video
    plt.bar(width_cal(0), dev_y, width=width, color='#444444', label="All Devs")
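The same centred offsets can probably be computed without the even/odd branches. The sketch below is an alternative, not the commenter's code; `bar_positions` and `total_width` are hypothetical names introduced here.

```python
import numpy as np

def bar_positions(x_values, position, count, total_width=0.8):
    """Return x positions for bar group number `position` out of `count` groups."""
    width = total_width / count
    # Offsets are centred on zero: for count=3 they are -width, 0, +width
    offsets = (np.arange(count) - (count - 1) / 2) * width
    return np.asarray(x_values) + offsets[position]

ages_x = [25, 26, 27]
left = bar_positions(ages_x, 0, count=2)   # first of two groups, shifted left
right = bar_positions(ages_x, 1, count=2)  # second of two groups, shifted right
```

Each result can be passed as the first argument to `plt.bar`, with `width=0.8 / count`.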

  • @KumarGauravhi
    @KumarGauravhi 4 years ago

    Hi Corey, thank you for the wonderful session. I am stuck at this point with the last example:
    import csv
    import numpy as np
    import pandas as pd
    from collections import Counter
    from matplotlib import pyplot as plt
    plt.style.use("fivethirtyeight")
    data = pd.read_csv('data.csv')
    ids = data['Responder_id']
    lang_responses = data['LanguagesWorkedWith']
    language_counter = Counter()
    for response in lang_responses:
        language_counter.update(response.split(';'))
    languages = []
    popularity = []
    for item in language_counter.most_common(15):
        languages.append(item[0])
        popularity.append(item[1])
    languages.reverse()
    popularity.reverse()
    plt.barh(languages, popularity)
    plt.title("Most Popular Languages")
    # plt.ylabel("Programming Languages")
    plt.xlabel("Number of People Who Use")
    plt.tight_layout()
    plt.show()
    ### I am getting an error like AttributeError: 'float' object has no attribute 'split'. Please explain.
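A likely cause of that error, assuming the commenter's CSV has empty cells in the LanguagesWorkedWith column: pandas reads a missing value as NaN, which is a float and has no split() method. A minimal sketch of one fix is to skip missing answers with `dropna()`. The tiny DataFrame below is a stand-in for data.csv.

```python
import pandas as pd
from collections import Counter

# Tiny stand-in for data.csv; the NaN row mimics a respondent who left the question blank
data = pd.DataFrame({
    "Responder_id": [1, 2, 3],
    "LanguagesWorkedWith": ["Python;SQL", float("nan"), "Python;JavaScript"],
})

language_counter = Counter()
# dropna() skips missing answers, so only real strings reach split()
for response in data["LanguagesWorkedWith"].dropna():
    language_counter.update(response.split(";"))

print(language_counter.most_common(3))
```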

  • @eliesawan9513
    @eliesawan9513 4 years ago +1

    You are amazing. Waiting for your data science (ML, AI) course... THANKS A LOT!

  • @guyindisguise
    @guyindisguise 4 years ago +2

    At 9:30 you correct the numbers of the x-axis with plt.xticks()
    Couldn't we just have circumvented that problem by saying
    x_indexes = np.array(ages_x)
    instead of
    x_indexes = np.arange(len(ages_x))
    Since that would have given us an array with the original numbers that we could add/subtract the width to/from?
    Is there any benefit to the plt.xticks() solution (other than seeing how xticks work)?

    • @johannesherbert4943
      @johannesherbert4943 4 years ago

      I thought the exact same thing, why is he making his life more complicated than necessary? There is no problem with adding/subtracting offsets directly from the ages np.array, it just works. It makes it less hacky, too.
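The approach proposed in this thread can be sketched as follows, with made-up salary values and a headless backend so it runs without a display. Offsetting the ages directly works because they are numeric and evenly spaced. An index array made with np.arange, as in the video, also covers x values that are strings or unevenly spaced.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
from matplotlib import pyplot as plt

ages_x = np.array([25, 26, 27, 28])
dev_y = [38496, 42000, 46752, 49320]  # made-up values
py_y = [45372, 48876, 53850, 57287]   # made-up values
width = 0.4

fig, ax = plt.subplots()
# Shift each series by half a bar width on either side of the real age
ax.bar(ages_x - width / 2, dev_y, width=width, label="All Devs")
ax.bar(ages_x + width / 2, py_y, width=width, label="Python")
ax.legend()
# plt.show()  # uncomment to display the chart interactively
```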

  • @akashdeepchauhan5
    @akashdeepchauhan5 3 years ago

    You're making machine learning interesting, thank you

  • @luiscesar_agais
    @luiscesar_agais 3 years ago

    Your explanations are very nice. Congratulations.

  • @rajivswargiary1536
    @rajivswargiary1536 5 years ago +1

    Great tutorial sir

  • @DeepakKumar-uz4xy
    @DeepakKumar-uz4xy 5 years ago +1

    Thank you, professor. Love from India. You know what? I don't like reading documentation now that I've seen your videos.

  • @abhishek_raj
    @abhishek_raj 3 years ago

    You explain things really well, kudos!

  • @manosmakris8308
    @manosmakris8308 2 years ago +1

    You can also do this to get the languages and popularity lists:
    languages = list(map(lambda x: x[0], language_counter.most_common(15)))
    print(languages)
    popularity = list(map(lambda x: x[1], language_counter.most_common(15)))
    print(popularity)

  • @sandy73rocks
    @sandy73rocks 4 years ago

    At 6:11, even without replacing "ages_x" with "x_indexes", and without applying plt.xticks(ticks=x_indexes, labels=ages_x) at 9:38, I get the same result, provided ages_x is converted from a list to an np.ndarray. With that approach we don't even need xticks().
    Can we do that instead of what is shown, or am I missing a point?

  • @rnytpl
    @rnytpl 4 years ago

    Thank you man, appreciate the effort and time you've put in creating such amazing content as these.

  • @alexanderten5497
    @alexanderten5497 5 years ago +2

    Thank you very much. It's a great tutorial, as always.

  • @tongliu1076
    @tongliu1076 5 years ago

    Great video as always! Really helpful for detailed explanation.

  • @minghaotao6259
    @minghaotao6259 5 years ago +2

    Thank you for sharing your knowledge!

  • @oscar.kiamba
    @oscar.kiamba 2 years ago

    The best on YouTube. 👏

  • @edcoughlan5742
    @edcoughlan5742 5 years ago +5

    These videos are great! Coming from R (and ggplot) I was a tad skeptical that Python could emulate R when it came to data viz, but I stand corrected.

  • @VishalSharma-rn7mt
    @VishalSharma-rn7mt 4 years ago +1

    Great, amazing video

  • @fourdaysdead
    @fourdaysdead 5 years ago +1

    thank you very much, very clear and straight to the point!

  • @DidaKusAlex
    @DidaKusAlex 2 years ago

    great tutorial! the best!! thanks for teaching us!

  • @hayetchekired462
    @hayetchekired462 4 years ago

    great instructor

  • @shazkingdom1702
    @shazkingdom1702 5 years ago +2

    This is the best Corey; Thank you very much from my 🧠 and ❣

  • @rakeshmali1727
    @rakeshmali1727 10 months ago

    at 9:11, since we adjusted the offsets for different bars, why did the X-axis still show 0, 1, 2... only? shouldn't it have shown (0-0.5), 0, 0+0.5?

  • @randiarisman2419
    @randiarisman2419 5 years ago

    Another great video from you, Corey. Thank you, you make my day every day!!

  • @ondereren5003
    @ondereren5003 4 years ago

    Amazing video !

  • @adirbarak5256
    @adirbarak5256 4 years ago

    To unpack counter.most_common(x), you can use:
    for a, b in counter.most_common(x): or for a, b in counter.items():
    Both yield (key, count) tuples, so the data is already "zipped",
    meaning you can unpack both values as you iterate (a is tuple[0], b is tuple[1]).
    I hope it helps you, yeah, you out there.
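A small sketch of the tuple unpacking described above, with toy counts. One difference to keep in mind: most_common() returns pairs sorted from highest to lowest count, while items() yields every pair in insertion order.

```python
from collections import Counter

# Toy counts, inserted in an order that is deliberately not sorted by count
language_counter = Counter({"Go": 1, "JavaScript": 5, "Python": 4, "Java": 2})

languages = []
popularity = []
# Each item is a (language, count) tuple, unpacked directly in the for statement
for language, count in language_counter.most_common(3):
    languages.append(language)
    popularity.append(count)

# items() keeps insertion order, so the first pair here is the least common language
first_item = next(iter(language_counter.items()))
```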

  • @emmanueljimawo5595
    @emmanueljimawo5595 5 years ago +1

    Great videos. I'm so grateful...

  • @lillyclive2641
    @lillyclive2641 4 years ago +1

    Such a great help, thank you so much!

  • @kerimabdul2263
    @kerimabdul2263 4 years ago +1

    great video.

  • @thebuggser2752
    @thebuggser2752 1 year ago

    Another great video. Thanks!!

  • @MrEoex
    @MrEoex 3 years ago

    I don't understand why we needed NumPy. What is its purpose? (4:40)
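One way to see what NumPy adds here: a plain Python list cannot be shifted by a number, while a NumPy array applies arithmetic to every element, which is what the side-by-side bar offsets rely on. A minimal sketch with illustrative values:

```python
import numpy as np

ages_x = [25, 26, 27]
width = 0.25

# A plain list does not support subtracting a number
try:
    ages_x - width
    list_supports_subtraction = True
except TypeError:
    list_supports_subtraction = False

# np.arange gives an array of indexes 0, 1, 2, and the subtraction applies to each element
x_indexes = np.arange(len(ages_x))
shifted = x_indexes - width
print(shifted)
```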

  • @frankconte2457
    @frankconte2457 5 years ago +3

    Another great tutorial. Thank you. However, using a Jupyter Notebook, I am having a problem with plt.bar and plt.barh. The error I receive is "unsupported operand type(s) for -: 'str' and 'float'".

  • @Lucas-wn5wm
    @Lucas-wn5wm 2 years ago

    I just found the Python legend. Thank god.

  • @IamKudos
    @IamKudos 1 year ago

    At 8:20, how did you place cursors on 3 lines at the same time and write on them simultaneously? That's so handy.

    • @ZizouZ5
      @ZizouZ5 1 year ago

      hold ctrl-shift (mac) or ctrl-alt (windows)

  • @FerdinandCoding
    @FerdinandCoding 4 years ago +1

    thank you for python tutorial

  • @AbubakerMahmoudshangab
    @AbubakerMahmoudshangab 2 years ago

    Corey. Million thanks bro

  • @PaoloCondo
    @PaoloCondo 2 years ago

    Thank you for the series of videos! :)

  • @Coney_island23
    @Coney_island23 2 years ago

    Thank you!!!! You are an excellent teacher.

  • @mamathakavety6529
    @mamathakavety6529 2 years ago +1

    Please do a tutorial on numpy as well, it would be super helpful, by the way awesome content😁

  • @giuseppeceravolo93
    @giuseppeceravolo93 5 years ago

    Thank you so much for your hard work! You are a great teacher and your video tutorials are a valuable resource :)

  • @gurukirans266
    @gurukirans266 5 years ago +1

    Thanks a lot, sir 😃

  • @chandansarkar1123
    @chandansarkar1123 5 years ago +4

    We cannot thank you enough. Still, thanks a ton, Corey.
    I have an interesting observation at 9:48. In the plt.xticks(...) call, when I use the ticks and labels keywords, it gives me an AttributeError. It works when I pass the arguments without using keywords. Perhaps it has something to do with my Matplotlib version...

    • @nikhiledu7556
      @nikhiledu7556 5 years ago

      Same happened with me

    • @asas-jf5iz
      @asas-jf5iz 4 years ago

      Yes, some old versions of Matplotlib have this problem.

    • @kabongontumba9492
      @kabongontumba9492 4 years ago

      Thank you guy, I had the same problem
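The workaround mentioned in this thread, passing the tick positions and labels positionally, can be sketched as below. The bar heights are made up, the headless backend is only there so the snippet runs without a display, and whether a given old Matplotlib release needs the positional form is an assumption based on the reports above.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import numpy as np
from matplotlib import pyplot as plt

ages_x = [25, 26, 27]
x_indexes = np.arange(len(ages_x))

plt.bar(x_indexes, [1, 2, 3])  # made-up bar heights
# Positional form: tick locations first, then the labels to show at those locations
locs, labels = plt.xticks(x_indexes, ages_x)
```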

  • @rotrose7531
    @rotrose7531 3 years ago

    Thank you very much. Please, please come back!

  • @michaelren2821
    @michaelren2821 4 years ago

    great tutorial, thanks

  • @chahineatallah2636
    @chahineatallah2636 9 months ago

    Great video. In my Jupyter notebook, the bar plots are drawn on separate plots, not on the same plot, although I am following the same steps. Am I missing something?

  • @jsceo
    @jsceo 5 years ago +7

    that feel when I paused tutorial to figure out how to extract languages and popularity from language_counter and later it turns out that you've done that exactly in the same way, lol

  • @rahulp1985
    @rahulp1985 4 years ago +2

    How can I have the percentage values listed along the Y-axis with the language names, as in the plot on the Stack Overflow website (shown towards the end of the video)?
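One possible approach, sketched with toy counts and a hypothetical respondent total, is to build each y-axis label as a string that already contains the percentage, then use those strings in place of the bare language names.

```python
from collections import Counter

language_counter = Counter({"JavaScript": 6, "Python": 3, "Java": 1})  # toy counts
total_respondents = 8  # hypothetical number of survey rows

labels = []
popularity = []
for language, count in language_counter.most_common(3):
    pct = count / total_respondents * 100
    labels.append(f"{language} ({pct:.1f}%)")
    popularity.append(count)

print(labels)
```

The `labels` list can then be passed as the first argument to `plt.barh(labels, popularity)`.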

  • @shreddersengupta7384
    @shreddersengupta7384 4 years ago

    We can also use the dictionary's keys() and values() to get the x and y axes: x_axis = list(dict.keys())
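A minimal sketch of that idea with toy counts: converting the most_common() result to a dict keeps the sorted order (dicts preserve insertion order in Python 3.7 and newer), so keys() and values() line up. The variable name `top` is chosen here to avoid shadowing the built-in `dict`.

```python
from collections import Counter

language_counter = Counter({"SQL": 3, "Python": 7, "Java": 5})  # toy counts

# dict() over the (language, count) pairs keeps them in most-common order
top = dict(language_counter.most_common(2))
x_axis = list(top.keys())
y_axis = list(top.values())
```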