Boxplots in R with ggplot and geom_boxplot() [R- Graph Gallery Tutorial]

  • Published 7 Jan 2025

COMMENTS • 45

  • @TheDataDigest
    @TheDataDigest 4 months ago +3

    If you want to download the R code from this video, you can do so here in my free Skool community:
    www.skool.com/data-analysis-with-r-6607/classroom/daa88316?md=1213bcd67e104eb88b1a58073b5445cb

  • @mocabeentrill
    @mocabeentrill 1 year ago +2

    WOW! What a comprehensive tutorial. Thank you very much.

    • @TheDataDigest
      @TheDataDigest 1 year ago

      I am glad that you liked it and left such a nice comment.

  • @aquarianfog
    @aquarianfog 2 years ago +5

    This has been the most helpful video while making figures for my dissertation!! Thank you

    • @TheDataDigest
      @TheDataDigest 2 years ago

      Glad to hear that, Katie :) Thanks for sharing the compliment by leaving a comment. What field is your thesis in, if I may ask? My background is in biochemistry and evolutionary biology.

  • @rachitsingh98
    @rachitsingh98 3 years ago +2

    Thank you very much, this was really helpful. A lot of useful information packed into a short video and explained clearly as well. Wonderful 👌🏽

    • @TheDataDigest
      @TheDataDigest 3 years ago

      Thanks for the kind words Rachit. Glad you liked it and found it helpful. 😊

  • @nicolastovar8121
    @nicolastovar8121 2 years ago +2

    Thanks man :3 I'm from Colombia and your videos are amazing!

    • @TheDataDigest
      @TheDataDigest 2 years ago

      Hi Nicolás, thank you for the comment! I am glad you like my content so far. ^_^
      I love how YouTube "brings together" people from all around the world.

  • @pascal3327
    @pascal3327 2 years ago +2

    You are a genius. Thank you so much for this amazing lecture.

    • @TheDataDigest
      @TheDataDigest 2 years ago

      Thanks for the compliment. Glad you enjoyed the content!

  • @TheDataDigest
    @TheDataDigest 3 years ago +7

    Below is the code I used for the thumbnail (overlay of boxplot over density plots):
    red

    • @giulianabeltramone8383
      @giulianabeltramone8383 3 years ago +1

      Thank you so much! I will try them right away! Thank you for sharing!!

    • @bernardrobenson5071
      @bernardrobenson5071 1 year ago +1

      Month  SDSM  CHIRPS  GCMs
      Jan    4     0       16
      Feb    1     2.3     28
      Mar    16.8  17      13
      Apr    28    25      89
      May    57    55      98
      Jun    27    23      42
      Jul    17    15      74
      Aug    79    70      130
      Sep    24    20      39
      Oct    19    14      9.3
      Nov    5     0       21
      Dec    8     2       19.5
      This is my dataset. I wanted to group it the way you did so well, but I could not make it work. I want to build three variables corresponding to each month. I would be thankful if you have time to help.

    • @TheDataDigest
      @TheDataDigest 1 year ago +1

      @bernardrobenson5071 If you have that data as a data.frame (you might want to use str() to check that), I recommend making a bar chart and using facet_wrap to show the three variables and their changes over time. Please try out the code below:
      # you might have to turn Month into a factor for proper order
      # (assumed: levels = month.abb, matching the Jan..Dec labels in the data)
      data$Month <- factor(data$Month, levels = month.abb)
      data %>% pivot_longer(-Month) %>%
        ggplot(aes(x = Month, y = value, fill = name)) +
        geom_col() +
        facet_wrap(~name, ncol = 1)
      # if you really need a boxplot, I can only see it as dots per variable for each month, then please try this:
      data %>% pivot_longer(-Month) %>%
        ggplot(aes(x = name, y = value, fill = name)) +
        geom_boxplot() +
        geom_jitter()
      I will soon have an email address for subscriber questions. I will post it here next week in case you have further questions.
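The reshaping step from the reply above can be sketched as a self-contained script. The data frame name `precip` is an assumption; the values are the twelve rows posted in the question. This is only a rough illustration of the pivot_longer() approach, not the code from the video.

```r
library(tidyr)
library(ggplot2)

# Values as posted in the question; the name `precip` is an assumption
precip <- data.frame(
  Month  = month.abb,
  SDSM   = c(4, 1, 16.8, 28, 57, 27, 17, 79, 24, 19, 5, 8),
  CHIRPS = c(0, 2.3, 17, 25, 55, 23, 15, 70, 20, 14, 0, 2),
  GCMs   = c(16, 28, 13, 89, 98, 42, 74, 130, 39, 9.3, 21, 19.5)
)

# Keep calendar order instead of alphabetical order
precip$Month <- factor(precip$Month, levels = month.abb)

# Wide -> long: one row per Month/variable pair,
# with the default column names `name` and `value`
precip_long <- pivot_longer(precip, -Month)

# One box per variable, each summarising its 12 monthly values
p_box <- ggplot(precip_long, aes(x = name, y = value, fill = name)) +
  geom_boxplot() +
  geom_jitter(width = 0.15)
print(p_box)
```

After the pivot, `precip_long` has the three columns Month, name and value, which are the ones referenced in aes().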

    • @bernardrobenson5071
      @bernardrobenson5071 1 year ago +1

      @TheDataDigest Thank you sincerely for the quick answer and assistance. I want to build a boxplot. I tried several times, and the boxplots for each month look like a flat, tiny line. I was wondering why. In your tutorial you described grouped plots through variety, treatment, and value. I did almost the same and it mostly worked, but I could not get it right. Maybe I am making a mistake with repeating values. Also, the boxplot only covered three months, and the result for those three months was repeated for the other months. Thanks again; you are a great scientist, and you deserve all appreciation and respect for your rapid responses and your efforts to help.
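One likely explanation for the flat lines, assuming the data holds a single value per month and method as in the table posted earlier: a boxplot of one observation has identical quartiles, so the box collapses to a line. A quick check is to count the observations in each group before plotting. The names `precip_long`, `Method` and `Value` are placeholders, and the three rows are taken from that table.

```r
library(tidyr)
library(dplyr)

# First three rows of the posted table; object and column names are assumptions
precip <- data.frame(
  Month  = c("Jan", "Feb", "Mar"),
  SDSM   = c(4, 1, 16.8),
  CHIRPS = c(0, 2.3, 17),
  GCMs   = c(16, 28, 13)
)

precip_long <- pivot_longer(precip, -Month,
                            names_to = "Method", values_to = "Value")

# Number of observations behind each box when x = Month and fill = Method
group_sizes <- count(precip_long, Month, Method)
print(group_sizes)

# For a single observation all five boxplot statistics coincide
single_value_stats <- boxplot.stats(4)$stats
print(single_value_stats)
```

Boxes with visible height need several values per group, for example one value per year for each month and method.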

    • @bernardrobenson5071
      @bernardrobenson5071 1 year ago

      @TheDataDigest Thanks for your help. I did it this way and it works. The code is shown below:
      # box_width, alpha_value, desired_order, desired_labels: defined elsewhere (not shown)
      ggplot(data_range_long, aes(x = Month, y = Value, fill = Method)) +
        geom_boxplot(width = box_width, alpha = alpha_value) +
        scale_fill_manual(values = c("CHIRPS" = "blue", "SDSM" = "green", "GCMs" = "red")) +
        scale_x_discrete(limits = month.abb) +
        ylab("Precipitation averages (mm)") +
        xlab("Month") +
        theme_bw() +
        theme(legend.position = "top") +
        theme(legend.title = element_blank()) +
        theme(axis.title.y = element_text(size = 16),
              axis.title.x = element_text(size = 16)) +
        theme(legend.text = element_text(size = 14)) +
        scale_colour_manual(values = desired_order, labels = desired_labels) +
        theme(plot.margin = margin(0.3, 0.6, 0.5, 0.5, "cm")) +
        theme(axis.text.x = element_text(size = 10)) +
        theme(axis.text.y = element_text(size = 12))

  • @giulianabeltramone8383
    @giulianabeltramone8383 3 years ago +3

    I wish I could have seen this video before presenting my thesis! Thank you very much!
    I was wondering how you add the density plots under the boxplots?

    • @TheDataDigest
      @TheDataDigest 3 years ago +1

      Congratulations on presenting and finishing your thesis. I bet it went well even without some of these plots. In R you can add different plot types (geoms) on top of each other once the aes(x, y, color, fill) mapping has been done in ggplot(). Then you can do "+ geom_density() + geom_boxplot()". But let me post the code in a separate comment at the top.
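A minimal sketch of that layering idea, using the built-in iris data as a stand-in. This is not the thumbnail code from the video; the vertical position (y = -0.25) and the box width are arbitrary choices made for this illustration, and the horizontal boxplot relies on the orientation argument available in ggplot2 3.3.0 and later.

```r
library(ggplot2)

# The mapping set in ggplot() is inherited by every layer added with `+`
p_overlay <- ggplot(iris, aes(x = Sepal.Length, fill = Species)) +
  geom_density(alpha = 0.4) +
  # Horizontal boxplots placed just below the density curves;
  # y = -0.25 and width = 0.3 are arbitrary values for this sketch
  geom_boxplot(aes(y = -0.25), width = 0.3, orientation = "y", alpha = 0.7)
print(p_overlay)
```

Layers are drawn in the order they are added, so the boxplots sit on top of the densities wherever they overlap.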

  • @binhomosta4593
    @binhomosta4593 3 years ago +1

    Great tutorial! Thanks a lot.

    • @TheDataDigest
      @TheDataDigest 3 years ago +1

      Thanks for the comment. Glad it was helpful for you.

  • @erichideki4994
    @erichideki4994 1 year ago +1

    But how can I avoid duplicated circles on outliers? For example, a single data point shows both a red circle and a black one. Thank you, such a useful video!

    • @TheDataDigest
      @TheDataDigest 1 year ago

      You can hide the outlier points within the geom function:
      geom_boxplot(outlier.shape = NA)
      Thanks for leaving a comment. Glad you like the videos.
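As a rough illustration of why the duplicates appear and how outlier.shape = NA avoids them: geom_boxplot() draws outliers itself, and a point layer such as geom_jitter() draws every observation again. The mpg dataset and the red colour are stand-ins chosen for this sketch, not the data from the video.

```r
library(ggplot2)

p_points <- ggplot(mpg, aes(x = class, y = hwy)) +
  # The boxplot layer no longer draws its own outlier points
  geom_boxplot(outlier.shape = NA) +
  # Every observation, outliers included, is drawn exactly once here
  geom_jitter(width = 0.2, height = 0, colour = "red", alpha = 0.6)
print(p_points)
```

Note that outlier.shape = NA only hides the points in the boxplot layer; the whiskers and box are still computed from all the data.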

  • @bkarim7349
    @bkarim7349 2 years ago +1

    thanks, very very useful

    • @TheDataDigest
      @TheDataDigest 2 years ago

      Thanks for leaving a comment. That's the goal with these videos. I learn a lot about different ways to plot data and enjoy teaching others along the way.

  • @martastaff9186
    @martastaff9186 1 year ago +1

    Hi! I seem to be struggling to produce the boxplots per row using the ggplot. My dataframe consists of 10k values in each row that I need to visualise as an individual boxplot. Any suggestions?

    • @TheDataDigest
      @TheDataDigest 1 year ago

      Hi Marta, I think the fastest way to help you is for you to send me a subset (or the whole) of your data together with the R code that you have tried so far. An image of a boxplot that shows what you want it to look like would also be useful. You could email that to: question@thedatadigest.email
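Without seeing the actual data, one common approach to a boxplot per row is to give each row an identifier and reshape to long format. The sketch below uses a hypothetical stand-in (5 rows of 100 random values rather than 10k); `wide`, `row_id` and `long` are names invented for this illustration.

```r
library(tidyr)
library(ggplot2)

set.seed(1)
# Hypothetical stand-in for the real data: 5 rows with 100 values each
wide <- as.data.frame(matrix(rnorm(5 * 100), nrow = 5))
wide$row_id <- factor(seq_len(nrow(wide)))

# Long format: one row per (row_id, value) pair
long <- pivot_longer(wide, -row_id, names_to = "column", values_to = "value")

# One box per original row
p_rows <- ggplot(long, aes(x = row_id, y = value)) +
  geom_boxplot()
print(p_rows)
```

ggplot2 expects one observation per row of the data frame, which is why the wide layout has to be pivoted first.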

  • @bernardrobenson5071
    @bernardrobenson5071 2 years ago +1

    Thanks for this great tutorial. However, I have tried to follow your steps, but I am still having difficulty building and grouping a boxplot for three columns of data vs. months. I would appreciate it if you could help.

    • @TheDataDigest
      @TheDataDigest 1 year ago +1

      The issue might come from the data structure. Do you have 3 columns: one for category, one for month, and one for the actual data, with repeating months? Then you can stack or dodge the categories and have month on the x-axis. Feel free to post the code that gave you error messages.

    • @bernardrobenson5071
      @bernardrobenson5071 1 year ago +1

      @TheDataDigest Thanks for the reply. My data consist of a column of month names and three columns of numbers (integers). Each column represents different climate data: ground observations, CMIP5, and CMIP6. I appreciate your help.

    • @TheDataDigest
      @TheDataDigest 1 year ago +1

      @bernardrobenson5071 Can you give this code a try:
      library(tidyverse)
      # `example` is a data frame with a month column plus three numeric columns
      example %>% pivot_longer(-month) %>%
        ggplot(aes(x = month, y = value, fill = name)) +
        geom_col(position = "dodge")
      Alternatively, you can use geom_col(position = "stack") at the end.
      The pivot_longer function is the crucial step that turns the data into long format. Then you have month, name and value, which you can use within the aes() mapping in ggplot().
      Let me know if that helped.
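To make the long format concrete, here is a small sketch with made-up illustrative values. The column names `obs`, `cmip5` and `cmip6` are assumptions based on the description of the data, not the real dataset.

```r
library(tidyr)
library(ggplot2)

# Made-up values for illustration only; column names are assumptions
example <- data.frame(
  month = factor(c("Jan", "Feb", "Mar"), levels = c("Jan", "Feb", "Mar")),
  obs   = c(10, 12, 15),
  cmip5 = c(11, 14, 13),
  cmip6 = c(9, 13, 16)
)

# Long format: the three value columns become `name` (source) and `value`
example_long <- pivot_longer(example, -month)
print(example_long)

# Bars side by side; position = "stack" would pile them on top of each other
p_dodge <- ggplot(example_long, aes(x = month, y = value, fill = name)) +
  geom_col(position = "dodge")
print(p_dodge)
```

Each month now appears three times in `example_long`, once per source column, which is the repeating structure the fill aesthetic needs.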

    • @bernardrobenson5071
      @bernardrobenson5071 1 year ago

      @TheDataDigest Thanks for the quick answer. I will run the code and see the result. Thanks again.

    • @bernardrobenson5071
      @bernardrobenson5071 1 year ago +1

      @TheDataDigest I think this code is for a bar plot. I am looking for boxplot code, if you can help.

  • @DmitryPonomareF
    @DmitryPonomareF 2 years ago

    wow, super! Thanks!

  •  3 years ago +1

    Thanks 👏👏

  • @kyleevalencia1827
    @kyleevalencia1827 3 years ago +1

    Can you make a video about ggmatrix?

    • @TheDataDigest
      @TheDataDigest 3 years ago

      Hi Kylee, most definitely. I have planned to make a video about different ways to arrange plots in R (facet_wrap, gridExtra etc.) Thanks for pointing me towards ggmatrix!

  • @zainabpirbhai1660
    @zainabpirbhai1660 3 years ago +1

    Can you help me with R?

    • @TheDataDigest
      @TheDataDigest 3 years ago

      Hi, I hope these visualization tutorials with ggplot() are already a good start in helping you with R.

  • @WahranRai
    @WahranRai 2 years ago +2

    Don't use the pipe %>%; keep your code easy to understand for everybody, even Python programmers etc.
    ggplot(data = data, aes(x = names...)) is better: all the needed info is encapsulated inside the function (data and attributes).
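For comparison, the two styles can be written side by side. This is only a sketch on the built-in iris data; both calls pass the same data frame to ggplot(), so the choice is about readability rather than results.

```r
library(dplyr)    # provides the %>% pipe
library(ggplot2)

# Piped style: the data frame is passed in as the first argument
p_piped <- iris %>%
  ggplot(aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()

# Explicit style: data and mapping are both named inside the call
p_plain <- ggplot(data = iris, aes(x = Species, y = Sepal.Length)) +
  geom_boxplot()

# Check whether both plot objects carry the same data
same_data <- identical(p_piped$data, p_plain$data)
print(same_data)
```

The pipe becomes more useful when reshaping steps such as pivot_longer() come before the plot, since it avoids naming intermediate objects.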