Handling NA in R | is.na, na.omit & na.rm Functions for Missing Values

Statistics Globe

Published Dec 3, 2024

COMMENTS • 203

@careenevans 2 years ago ⁺³
I have been following your tutorials for a couple of days now. I want to say thank you, they are truly direct and straight to the point. I wish that you would offer consultation to students even if you decide to charge a price on it. Because sometimes one might get stuck and not know what to do.
@StatisticsGlobe 2 years ago
Thanks a lot for the very kind feedback Careen, glad you find the tutorials useful! :) In case you have any questions, you may post them to the Statistics Globe Facebook group: facebook.com/groups/statisticsglobe Regards, Joachim
@ezhankhan1035 1 year ago ⁺²
Directly answered what I was looking for - Thank you!
I have used 'drop_na()' as opposed to 'na.omit()' for the most part, but it is always good to know alternative ways of doing things.
@matthias.statisticsglobe 1 year ago
That's great to hear! Thank you very much for the feedback, Ezhan.
@shambo9807 10 months ago ⁺¹
Very clear and succinct. All the info I needed clearly explained. 👍🏾
@Ifeanyi.StatisticsGlobe 10 months ago
Thanks for the kind words, Shambo!
@WaayoArag. 1 year ago ⁺¹
Thank you for the good lesson; explained very clearly.
@matthias.statisticsglobe 1 year ago ⁺¹
Thank you very much for the feedback! Great to hear you like the video/explanations!
@jababnamgay6366 4 years ago ⁺¹
Thank you so much. Easiest method to remove NAs.
@StatisticsGlobe 4 years ago
Thank you for the kind words Jabab!
@anthonyfernandezgonzalez8262 3 years ago ⁺¹
Love it, thank you one more time dude! Love the way you prepare your lessons, because they are really short, focus on a specific context, and give us multiple solutions for a scenario. That's the way it should be.
@StatisticsGlobe 3 years ago
Awesome, thank you very much for the very nice feedback Anthony! :)
@mycountryfarm 8 months ago ⁺¹
Awesome content! Very well demonstrated!
@StatisticsGlobe 8 months ago
Thank you so much, glad you liked it!
@arunbioinfo1100 3 years ago ⁺¹
Excellent, Joachim, perfectly explained.
@StatisticsGlobe 3 years ago
Glad you liked it Arun, thank you for the kind words! :)
@shirinisabekova5504 4 years ago ⁺¹
Thank you so much for the well-explained video. Keep on posting them please. You are doing a great job!
@StatisticsGlobe 4 years ago ⁺¹
Wow, thanks a lot Shirin! More videos to come! :)
@shirinisabekova5504 4 years ago ⁺¹
@@StatisticsGlobe I am very excited about it!
@eapen4irm 3 years ago ⁺¹
Your videos are amazing and easy to understand! Thank you!!!
@StatisticsGlobe 3 years ago
Thanks a lot for the nice comment Eapen! Glad you like them!
@multitaskprueba1 4 years ago
Excellent explanation! You are a fantastic teacher!
@StatisticsGlobe 4 years ago
Thanks a lot for the awesome compliment! :)
@DavidKaranjamdavis 3 years ago ⁺¹
Informative and well explained.
@StatisticsGlobe 3 years ago
Thank you David, glad you think so!
@jababnamgay6366 4 years ago ⁺¹
Very simple to follow, sir.
@StatisticsGlobe 4 years ago
Glad to hear that Jabab, thanks for letting me know :)
@dominiquebarrette9621 4 years ago ⁺¹
Bravo! So well explained! Thank you
@StatisticsGlobe 4 years ago
Glad you enjoyed it Dominique!
@roshnyabraham7941 2 years ago ⁺¹
Thank you so much! You have been such a good help.
@matthias.statisticsglobe 2 years ago
That's great to hear Roshny! Thanks a lot for your support!
@mugangakivumbi 4 years ago ⁺¹
Thank you, the tutorial was very helpful.
@StatisticsGlobe 4 years ago
Thanks again Ronald! :)
@fostkangben 3 years ago ⁺¹
Thanks for this video
@StatisticsGlobe 3 years ago
Most welcome Kangben! :)
@claytontherrien7583 2 years ago ⁺¹
Thank you very much!
@StatisticsGlobe 2 years ago
You're very welcome Clayton!
@malakasamaraweera6736 5 years ago ⁺¹
Hi, your video demonstration is very useful. Keep it up!
@StatisticsGlobe 5 years ago
Hi, thanks a lot for the positive feedback. Nice to hear that you like the videos!
@sameenabanu5931 4 years ago ⁺¹
Thanks, it is very informative.
@StatisticsGlobe 4 years ago
Thank you Sameena, glad you liked it!
@atthoriqpp 1 year ago ⁺²
Hello, I have a question!
Should you always remove missing values in a dataset (especially for public data)? Or do we need to consider the proportion of missing data, the missing value type (MCAR, MAR, NMAR), and the skewness of the data?
I'm really struggling with this particular issue (not the technique, but the judgement whether to remove missing values or not). Please shed some light on this, and thanks!
@cansustatisticsglobe 1 year ago ⁺²
Hello Atthoriq,
Removing missing data is generally a bad idea unless the missingness is MCAR (missing completely at random). This tutorial only discusses some functions for removing missing values, not the concept. Handling missing data is a HUGE topic on its own. These tutorials of ours might be a starting point: statisticsglobe.com/missing-data/, statisticsglobe.com/missing-data-imputation-statistics/
Regards,
Cansu
@atthoriqpp 1 year ago ⁺²
@@cansustatisticsglobe Thank you. I'll check the article now.
@atifdai313 2 years ago
Excellent work
@StatisticsGlobe 2 years ago ⁺¹
Many thanks Atif! :)
@mdiqbal7168 2 years ago ⁺¹
Tabulated value and calculated value in t-test normal distribution by plot in R programming
@cansustatisticsglobe 1 year ago
Hello,
Thank you for your comment. Do you mean that you would like to see a tutorial on this topic? Is there something specific that you would like to know about tabulated value and calculated value in t-test normal distribution by plot in R programming?
Regards,
Cansu
@negijivlogs4626 4 years ago ⁺¹
Thanks for this video.
@StatisticsGlobe 4 years ago
You are welcome Vivek! :)
@tmitra001 3 years ago ⁺¹
I like all your videos.
@StatisticsGlobe 3 years ago
This is great to hear! Thanks for the wonderful feedback Tamoghna! :)
@aloysduistermaat7046 3 years ago ⁺²
How does this work the other way round? For example, I want all values in my data frame to become NA if they are below 0.4. Thank you!
@yannickpichardo5520 3 years ago ⁺¹
You can use df[df < 0.4] = NA
@StatisticsGlobe 3 years ago
Thanks Yannick, that would have been my recommendation as well :)
@aloysduistermaat7046 3 years ago ⁺¹
Thanks guys! It worked
@StatisticsGlobe 3 years ago
Great to hear!
@aloysduistermaat7046 3 years ago ⁺¹
@@StatisticsGlobe To elaborate on my question from earlier: how do you remove all values between -0.4 and 0.4? I tried 'data[data -0.4]
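A minimal sketch of both requests in this thread, assuming a purely numeric data frame. The data and the names `df`, `df_low` and `df_mid` are made up for illustration, and the second variant is only one possible reading of "between -0.4 and 0.4" (both bounds excluded):

```r
# Hypothetical example data; the column names are made up for illustration.
df <- data.frame(a = c(0.1, 0.5, -0.7, 0.9),
                 b = c(-0.2, 0.3, 0.8, -0.5))

# Variant 1: set every value below 0.4 to NA (the approach from the reply above).
df_low <- df
df_low[df_low < 0.4] <- NA

# Variant 2: set every value strictly between -0.4 and 0.4 to NA.
# The two conditions are combined element-wise with &.
df_mid <- df
df_mid[df_mid > -0.4 & df_mid < 0.4] <- NA
```

Working on copies keeps the original `df` intact, so the two variants can be compared side by side.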
@azad2546421 3 years ago ⁺¹
Sir, on your statisticsglobe website, where do we start? As a beginner to R, I'd like to know where to start. Thanks
@StatisticsGlobe 3 years ago ⁺¹
Hey Azad, thank you for your comment! Unfortunately my tutorials do not follow a clear order. I have planned to publish a huge overview on R programming soon, in which I will structure all tutorials. I hope I'll find the time for it soon. Regards, Joachim
@azad2546421 3 years ago ⁺¹
@@StatisticsGlobe OK Sir. Till then, I will try to watch the videos as best as I can. Thank you very much for all your work.
@StatisticsGlobe 3 years ago
You are very welcome Azad! Let me know in the comments in case you have any questions :)
@organ1181 1 year ago ⁺¹
How to deal with missing data for a categorical variable, please?
@cansustatisticsglobe 1 year ago
Hello,
If you assume that the missingness in your data is MAR, see statisticsglobe.com/missing-data/. You can use multiple imputation (maybe the most preferred method under MAR) to impute your values. You can check the documentation of the mice() function: www.rdocumentation.org/packages/mice/versions/3.16.0/topics/mice, to see what methods are applicable for ordered or unordered categorical variable imputation. Scroll down the page to the Details section.
Alternatively, you can do listwise deletion like in the tutorial above, yet this has some drawbacks. See the Listwise Deletion tutorial: statisticsglobe.com/listwise-deletion-missing-data/ for the details.
Best,
Cansu
@lahirukudaligamage13 10 months ago
YESSSSS THANK YOUUUUU
@Ifeanyi.StatisticsGlobe 10 months ago
You're welcome Lahirukudaligamage. We are happy you found the tutorial helpful!
@jayw6886 2 years ago ⁺¹
Hello, great videos, thanks! Question: if I wanted to get the NA values in a separate subset instead of omitting or removing them, what can I do?
@StatisticsGlobe 2 years ago
You are welcome Jay, glad you like it! :) Regarding your question, please have a look at the following R code:
airquality_NA
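The code in the reply above is cut off on this page. One possible way to collect the incomplete rows in their own subset, not necessarily what the reply contained, is to negate `complete.cases()`. The object name `airquality_NA` is taken from the reply; `airquality_complete` is added here for comparison:

```r
# airquality ships with base R (package datasets).
data(airquality)

# Rows that contain at least one NA.
airquality_NA <- airquality[!complete.cases(airquality), ]

# Rows without any NA, for comparison.
airquality_complete <- airquality[complete.cases(airquality), ]

nrow(airquality_NA)        # number of incomplete rows
nrow(airquality_complete)  # number of complete rows
```

The two subsets together contain every row of the original data exactly once.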
@lsjenny2198 2 years ago ⁺²
I am trying to use ggscatter but I have many NAs in the y column and no correlation coefficient appears. Is there any way of ignoring these NAs or changing them to "0"? Please help me, thank you.
@StatisticsGlobe 2 years ago ⁺¹
Hey Jenny, I have never used ggscatter, but you may replace NA values by 0 as shown here: statisticsglobe.com/r-replace-na-with-0/
@lsjenny2198 2 years ago ⁺¹
@@StatisticsGlobe Thank you, I fixed it
@StatisticsGlobe 2 years ago
Glad you found a solution!
@siddheshgosavi3552 4 years ago ⁺¹
thank you so much ❤❤❤
@StatisticsGlobe 4 years ago
Always welcome Siddhesh! :)
@hezzia4427 4 years ago ⁺¹
I was looking for how to work with the missing data, not how to remove the entire row that has an NA; the other columns of each row containing an NA still hold values.
@StatisticsGlobe 4 years ago
Hi Hezzi, in this case you should have a look at missing data imputation. For example, you may have a look at this tutorial: statisticsglobe.com/predictive-mean-matching-imputation-method/ Regards, Joachim
@Jay19876 4 years ago ⁺¹
Can you just remove NAs from a specific column within a data set? For example, if I have a column such as "wind chill" which has a lot of blanks when it's not cold outside, I don't want to erase all of that data from the data set if I am looking at another column/vector of interest. Thanks!
@StatisticsGlobe 4 years ago ⁺¹
Hey Jay, you may impute your missing values. This depends a lot on the content of your variable though. You may have a look at this tutorial for more information: statisticsglobe.com/missing-data-imputation-statistics/ I hope that helps! Joachim
@frankjr3787 4 years ago ⁺¹
Thank you very much for this video (just subscribed). How do you remove "NA" from a data set that has no numeric values? Say I just had two columns (Name and Hair Color) and some of the hair colors were NA. How would I omit those?
@StatisticsGlobe 4 years ago
Hey Frank, Thanks for subscribing! :) The class of your variables does not matter, you can apply the functions shown in this video the same way. If it doesn't work, you could check if your NA values are real NA values or if they are "NA" character strings. In this case, you could replace the "NA" by real NA as shown in the following example code:
data
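The example code of the reply is truncated here. A minimal sketch of the idea it describes, using a made-up two-column data frame (the column names `name` and `hair_color` are assumptions), could look like this:

```r
# Hypothetical data: "NA" in hair_color is a character string, not a real NA.
data <- data.frame(name = c("Ann", "Bob", "Cleo"),
                   hair_color = c("brown", "NA", "black"),
                   stringsAsFactors = FALSE)

n_missing_before <- sum(is.na(data))  # 0: the string "NA" is not detected as missing

data[data == "NA"] <- NA              # turn the strings into real missing values
data_new <- na.omit(data)             # now the incomplete row can be dropped
```

After the replacement, `is.na()`, `na.omit()` and `complete.cases()` treat the value like any other missing value.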
@lavinaarora3697 4 years ago ⁺¹
After omitting the NAs, the number of rows still shows the number from the original data set, though I can see that the number of rows in the data after omitting them is 111. Which code can I use to get this 111, as nrow() gives me the original number?
@StatisticsGlobe 4 years ago
Hi Lavina, So you want to rename the rownames of the new data frame to be equal to the number of rows? Then you could use the following R code: rownames(data)
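The code at the end of the reply is cut off. As a hedged sketch of what renumbering the rows can look like (the object name `airquality_complete` is an assumption), note that `nrow()` only reports the reduced count once the result of `na.omit()` has been stored:

```r
data(airquality)
airquality_complete <- na.omit(airquality)

nrow(airquality_complete)            # 111 rows remain
tail(rownames(airquality_complete))  # row names still refer to the original rows

# Renumber the rows from 1 to nrow(airquality_complete).
rownames(airquality_complete) <- NULL
```

Setting the row names to `NULL` restores the default numbering 1, 2, 3, and so on.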
@TinaHelen 4 years ago ⁺¹
Thank you. Maybe you can even help me further... How can I exclude single missing values from cases when running a Confirmatory Factor Analysis, without deleting the whole cases? I think the "na.rm=TRUE" argument should be the right one, but it seems that this doesn't work with the CFA function (lavaan). When I do this, R still excludes the whole cases from the analysis. I would be so thankful if anyone could help me!
@StatisticsGlobe 4 years ago
Hey Tina, I recommend applying a missing data imputation method such as predictive mean matching. The following tutorial provides more info: statisticsglobe.com/predictive-mean-matching-imputation-method/ Regards, Joachim
@TinaHelen 4 years ago ⁺¹
Thank you so much for your fast answer and for the hint! I will definitely consider that option. So do you think it's not possible the way I wanted to do it (just exclude the values) in combination with the cfa function? Best regards :-)
@StatisticsGlobe 4 years ago
As far as I know, it is not possible. I'm not an expert for CFA though, so please double check somewhere else. In general: Imputation is almost always better than deletion methods, since otherwise your results are likely to have a (stronger) bias. Regards, Joachim
@oluwadolapobifarin105 5 years ago ⁺¹
Thanks
@StatisticsGlobe 5 years ago
@Oluwadolapo Bifarin You are welcome :)
@Michelle-mv1gg 3 years ago ⁺¹
How do you handle or replace NA values in a dataset where dates and other numeric information are missing?
@StatisticsGlobe 3 years ago ⁺¹
Hi Michelle, usually I try to replace missing values using missing data imputation methods. You can find more info here: statisticsglobe.com/missing-data-imputation-statistics/ Regards, Joachim
@Michelle-mv1gg 3 years ago ⁺¹
Thank you.
@StatisticsGlobe 3 years ago
You are very welcome!
@whitfieldlewis837 1 year ago ⁺¹
good stuff
@matthias.statisticsglobe 1 year ago
That's great to hear, Lewis! Thanks for the positive feedback!
@francesco8150 1 year ago ⁺¹
Hi, I'm trying to compute cov with two groups of values, but one has NAs and R doesn't allow me to remove them when I do the cov, and if I rewrite the two groups without NAs they differ in length, so cov can't be done. What can I do? ;(
@cansustatisticsglobe 1 year ago
Hello Francesco,
It is always better to check the documentation of the function. There, you can see if the function offers a handling method. See the documentation here: www.rdocumentation.org/packages/pbdDMAT/versions/0.5-1/topics/covariance
Best,
Cansu
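As an illustration of the kind of handling method a function's documentation can reveal: the base R `cov()` function (package stats) has a `use` argument. The sketch below assumes two made-up vectors of equal length, and it is an assumption that base `cov()` is the function the question refers to:

```r
# Hypothetical vectors of equal length; y contains a missing value.
x <- c(1, 2, 3, 4, 5)
y <- c(2, NA, 6, 8, 10)

cov_default <- cov(x, y)                           # NA, because of the missing value
cov_complete <- cov(x, y, use = "complete.obs")    # uses only pairs where both values are present
```

Because the incomplete pair is dropped as a whole, the two vectors stay aligned and never end up with different lengths.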
@lh4818 4 years ago ⁺¹
How can you make a new data frame that excludes all the NA values?
@StatisticsGlobe 4 years ago
Hey, please try the following R code: data_new
@tirthanandi6122 1 year ago ⁺¹
na.omit removes the whole row. What if I do not want to remove the whole row? Is there any way I can plot geom_line without omitting NAs, so that the plot ignores the points where there is an NA?
@cansustatisticsglobe 1 year ago
Hello Tirtha,
I think geom_line works as you wish by default. But if you want to avoid the gaps due to NA values, you can check our tutorial statisticsglobe.com/connect-lines-across-missing-values-ggplot2-line-plot-r. If the tutorial is not relevant to what you ask, please describe your wish in a bit more detail. Then I can try to find other solutions.
Regards,
Cansu
@tirthanandi6122 1 year ago ⁺¹
@@cansustatisticsglobe Hi, thank you so much for your reply. The tutorial that you showed is OK for one x, y pair. But I am looking at an x, y1, y2, y3 data frame. If a value is NA in y1, it is not necessarily NA in y2 and y3. If I want to plot geom_line for x-y1, x-y2, x-y3, what should I do?
@cansustatisticsglobe 1 year ago
@@tirthanandi6122 You are welcome. You can create new data columns for x-y1, x-y2, and x-y3 by simple data manipulation; then the data for x-y1 will be NA in some rows but not for x-y2 and x-y3. ggplot will ignore the missing values and there will be breaks in your lines (I assume you plot multiple lines). If this solution doesn't address the issue, please share your code with me and let me know what you want to change in the visual. I hope I can help then.
Regards,
Cansu
@sun27g 4 years ago ⁺¹
When you ran na.omit(airquality) before mean(airquality$Ozone), the rows with NAs were already deleted, giving you a complete numeric data set, so why is mean(airquality$Ozone) returning NA again?
@StatisticsGlobe 4 years ago ⁺⁴
Hey Aditya, na.omit(airquality) does not store the complete data set in a new data object. You may use this code to store the complete data set:
airquality_complete
@Paan-2.1 3 years ago
@@StatisticsGlobe How do I save this newly created data set as its own Rda file? :-)
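The code in the reply is cut off on this page. A minimal sketch of the point it makes, together with one way to address the follow-up question about saving, might look as follows. The file path is a placeholder chosen for illustration:

```r
data(airquality)

head(na.omit(airquality))   # shows complete rows, but airquality itself is unchanged
mean(airquality$Ozone)      # still NA

# Store the result in a new object and work with that object.
airquality_complete <- na.omit(airquality)
mean(airquality_complete$Ozone)

# Alternative: skip only the NAs of this one column.
mean(airquality$Ozone, na.rm = TRUE)

# Save the new data frame as an .rda file (placeholder path in the temp directory).
rda_path <- file.path(tempdir(), "airquality_complete.rda")
save(airquality_complete, file = rda_path)
```

The two means are based on different rows: `na.omit()` also drops rows whose NA sits in another column, while `na.rm = TRUE` only skips the missing Ozone values. A file written with `save()` can be read back with `load()`.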
@mosesyoung9318 3 months ago ⁺¹
What if there were character variables?
@StatisticsGlobe 3 months ago
Hey, most of these methods also work for character data.
@anandacharya9919 4 years ago ⁺¹
How to handle missing values in categorical variables is not mentioned??
@StatisticsGlobe 4 years ago
Hey Anand, Actually you can use the first three examples of the video also for categorical variables. Only the last example (taking the mean) is not applicable to categoricals. Regards, Joachim
@borknagarpopinga4089 4 years ago ⁺¹
How can I delete a certain row only if the number of NAs surpasses a certain threshold? E.g. when I have like 100 slope coefficients, but only one value is missing, it sounds a bit harsh to delete the whole row. How can I tell R to only delete the row if there are, let's say, more than 10 NAs?
@StatisticsGlobe 4 years ago ⁺¹
Hey Borknagar, the following R code should do the trick: data_new
@borknagarpopinga4089 4 years ago ⁺¹
@@StatisticsGlobe Worked perfectly, thx a lot. (Y)
@StatisticsGlobe 4 years ago
Nice to hear Borknagar, thanks for letting me know :)
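The code of the reply is truncated after `data_new`. One way to express such a threshold, shown here as a sketch with made-up data rather than as the original answer, is to count the NAs per row with `rowSums(is.na())`:

```r
# Hypothetical data with 12 columns; row 2 has 11 NAs, row 3 has one.
data <- as.data.frame(matrix(1:36, nrow = 3))
data[2, 1:11] <- NA
data[3, 5] <- NA

na_per_row <- rowSums(is.na(data))    # number of NAs in each row
data_new <- data[na_per_row <= 10, ]  # keep rows with at most 10 NAs
```

Only the row with more than 10 missing values is dropped; the row with a single NA is kept.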
@larissacury7714 2 years ago ⁺¹
What if I had two entries for each SUBJECT and I want to filter out both of their entries if one of their entries in another column is NA? PS: great video as always!
@StatisticsGlobe 2 years ago ⁺¹
Hey Larissa, thank you very much, glad you like the video! Regarding your question, please have a look at the following example code:
data
@larissacury7714 2 years ago ⁺¹
@@StatisticsGlobe Hi, thank you!! I went for a tidy solution, check it out: data %>%
group_by(SUBJECT) %>%
filter(all(!is.na(MYVARIABLE))) does that make sense?
@StatisticsGlobe 2 years ago
Hey, this is difficult to tell without seeing your actual data, but I think this should produce a different result than my code. Is there a specific reason why you would like to use the tidyverse instead of Base R?
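The Base R example in the reply is cut off, so the sketch below is not that code. It is one Base R way to drop every entry of a subject as soon as one of its entries is missing, which is also what the dplyr pipeline quoted in the thread expresses. The data values are made up:

```r
# Hypothetical data: two entries per subject, one entry of subject "B" is missing.
data <- data.frame(SUBJECT = c("A", "A", "B", "B", "C", "C"),
                   MYVARIABLE = c(1.2, 3.4, NA, 2.2, 0.7, 1.9))

# Subjects with at least one missing entry.
subjects_with_na <- unique(data$SUBJECT[is.na(data$MYVARIABLE)])

# Keep only subjects whose entries are all observed.
data_new <- data[!data$SUBJECT %in% subjects_with_na, ]
```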
@punchline9131 5 years ago ⁺¹
Do you also have a video on how I can handle this with the "listwise deletion" command?
@StatisticsGlobe 5 years ago
@Gummibärmann Listwise deletion is usually carried out in R with the complete.cases function. You can watch this video from minute 2:40: ua-cam.com/video/OVHIYAEAHLY/v-deo.html I have also published a tutorial on this on my homepage: statisticsglobe.com/listwise-deletion-missing-data/ Feel free to let me know whether the two links helped :) Regards, Joachim
@StatisticsGlobe 5 years ago
@Der Humanist Thanks for your follow-up question. It seems that your lecturer assigned the variable help a 1 whenever one of the other variables in df is NA. Did he perhaps then take a subset of df that contains only the observations with help = 0? Then that would be (in a more roundabout way) the same as using the complete.cases function. Without more detailed information, though, that is honestly hard for me to judge.
@StatisticsGlobe 5 years ago
@Der Humanist Glad I could help! Feel free to let me know in the comments if you have further questions :)
@sofiac4058 4 years ago ⁺¹
How can I remove NA values only if they are in a certain column?
@StatisticsGlobe 4 years ago
Hi Sofia, you may either apply a listwise deletion (see here: statisticsglobe.com/complete-cases-in-r-example/) or you may extract the column as a vector and then remove NAs (see here: statisticsglobe.com/convert-data-frame-column-to-a-vector-in-r). I hope that helps! Regards, Joachim
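Both options from the reply can be sketched with the airquality data, assuming Ozone is the column of interest. The object names `airquality_ozone` and `ozone_values` are made up for illustration:

```r
data(airquality)

# Option 1: drop only the rows where Ozone is NA; NAs in other columns are kept.
airquality_ozone <- airquality[!is.na(airquality$Ozone), ]

# Option 2: extract the column as a vector and remove its NAs.
ozone_values <- airquality$Ozone[!is.na(airquality$Ozone)]
```

Option 1 keeps the data frame structure, so rows with a missing value in another column (for example Solar.R) are still present afterwards.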
@Janine5748 4 years ago ⁺¹
Hey, maybe you can help me. At university we have a project and we need to remove all the NAs from our data, but the problem is I don't know how to remove NAs if they are "words" instead of "numbers". For example, you get the variable "house" with values "new house", "old house", "big house", "small house", and then there are also some NAs. I tried it with complete.cases but it didn't work, and also with "factor", so I decided to do it one by one, and the parts with numbers were easy.
@StatisticsGlobe 4 years ago ⁺¹
Hey Janine, Thanks for the comment. That's actually a very common problem. I suggest replacing the word-NAs with real NA values first. You can do that with the following code: data[data == "NA"]
@jenevavergara4125 4 years ago ⁺¹
How about if I only want to remove rows in which all values are NA?
@StatisticsGlobe 4 years ago ⁺¹
Hey Jeneva, thanks for the question. You can use the code shown in examples 5 and 6 of this tutorial: statisticsglobe.com/r-remove-data-frame-rows-with-some-or-all-na Regards, Joachim
@jenevavergara4125 4 years ago ⁺¹
Statistics Globe thank you very much, but I have another dilemma, as I need to keep the unique ID of the data for merging later. Is there a way to select only the columns in which NA values should be checked, so that only those rows will be deleted? Thank you very much for helping
@jenevavergara4125 4 years ago ⁺¹
Example: in my dataset I have the column names "ID" "A" "B" "C" "D", and I only want to delete the rows with NAs in columns A, B & C.
@StatisticsGlobe 4 years ago ⁺¹
@@jenevavergara4125 Is the following code working for you? data[rowSums(is.na(data[ , ! colnames(data) %in% "ID"])) == ncol(data[ , ! colnames(data) %in% "ID"]), ]
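As written, the expression in the reply selects the rows in which every non-ID column is NA. A hedged sketch of the deletion itself, restricted to columns A, B and C as in the question, could count the NAs in just those columns. The data values and the names `check_cols`, `data_not_all_na` and `data_no_na` are made up:

```r
# Hypothetical data with the column names from the question.
data <- data.frame(ID = 1:4,
                   A = c(1, NA, NA, 4),
                   B = c(2, NA, 5, NA),
                   C = c(3, NA, 6, 7),
                   D = c("w", "x", "y", "z"))

check_cols <- c("A", "B", "C")
n_na <- rowSums(is.na(data[, check_cols]))  # NAs per row, counted in A, B, C only

data_not_all_na <- data[n_na < length(check_cols), ]  # drop rows where A, B and C are all NA
data_no_na      <- data[n_na == 0, ]                  # drop rows where any of A, B, C is NA
```

In both results the ID column is carried along untouched, so the data can still be merged by ID later.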
@daphne_moo 4 years ago ⁺¹
Sir, I would like to mutate a column named Daily revenue, which is the sum of promotion_revenue and non_promotion_revenue. However, some rows contain NA in promotion_revenue but $30 in non_promotion_revenue. When I compute it, the mutated column (Daily Revenue) shows the daily revenue as NA, even if there is a number in one of the columns. I already applied na.rm = TRUE in the summarize code summarize(daily_revenue = sum(total_rev, na.rm = TRUE)), but it doesn't work.
@daphne_moo 4 years ago
I tried this, failed :(
mutate((total_rev = promo_revenue + non_promo_revenue), na.rm = TRUE) %>%
group_by(order_date) %>%
summarize(daily_revenue = sum(total_rev))
@daphne_moo 4 years ago
order_date  promo_revenue  non_promo_revenue  total_rev
2020-03-18  NA             14.90              NA
2020-03-18  42.47          10.85              53.32
@StatisticsGlobe 4 years ago ⁺¹
Hi Daphne, you may replace the NA values by 0 before taking the sum. You can find more information here: statisticsglobe.com/r-replace-na-with-0/
@daphne_moo 4 years ago ⁺¹
@@StatisticsGlobe thanks! Got it~
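A Base R sketch of the same idea, modelled on the two rows shown in the thread. The data frame name `orders` is made up, and `rowSums()` with `na.rm = TRUE` is used here as a stand-in for replacing the NAs by 0 first:

```r
# Hypothetical orders, modelled on the two rows shown above.
orders <- data.frame(order_date = as.Date(c("2020-03-18", "2020-03-18")),
                     promo_revenue = c(NA, 42.47),
                     non_promo_revenue = c(14.90, 10.85))

# Row-wise sum that skips NAs, which has the same effect as treating them as 0.
orders$total_rev <- rowSums(orders[, c("promo_revenue", "non_promo_revenue")],
                            na.rm = TRUE)

# Daily revenue per order date.
daily_revenue <- aggregate(total_rev ~ order_date, data = orders, FUN = sum)
```

The NA has to be handled where the two columns are added: `promo_revenue + non_promo_revenue` is NA as soon as one operand is NA, and `na.rm` is not an argument that `+` understands.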
@shanti3310 3 years ago ⁺¹
Hello,
How do I handle NaN in R?
@StatisticsGlobe 3 years ago
Hey Shanti, please have a look here: statisticsglobe.com/nan-in-r-is-nan-function
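As a short illustration of how NaN relates to NA in base R (the vector is made up): `is.na()` flags both, while `is.nan()` flags only NaN.

```r
x <- c(1, NA, 0 / 0, 4)   # 0 / 0 produces NaN

na_flags  <- is.na(x)     # TRUE for both NA and NaN
nan_flags <- is.nan(x)    # TRUE only for NaN

x_clean <- x[!is.na(x)]   # removes NA and NaN in one step
```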
@taruvingatakudzwa151 10 months ago
How do I merge two data sets A and B, where data set B is a small data set whose values have to replace certain cells in A?
@Ifeanyi.StatisticsGlobe 10 months ago
Hi Taruving. Maybe you could do something like this:
A
@hoax9784 1 year ago ⁺¹
And what do I do if it only shows other characters but not "NA", sir?
@cansustatisticsglobe 1 year ago
Hello,
I am not sure if I understood your question correctly. Are you asking what to do if the missing values are shown with other characters instead of NA?
Regards,
Cansu
@hoax9784 1 year ago ⁺¹
@@cansustatisticsglobe Yes, sir. In my data, missing values are shown by "?" instead of "NA". However, I have already found the solution by watching your other videos. Thanks a lot.
@cansustatisticsglobe 1 year ago ⁺¹
@@hoax9784 Perfect!
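The thread does not say which solution was used. One option, sketched here under the assumption that the data come from a CSV file, is the `na.strings` argument of `read.csv()`. The inline CSV text stands in for a real file:

```r
# Hypothetical CSV content in which "?" marks a missing value.
csv_text <- "age,city\n23,Rome\n?,Oslo\n31,?\n"

# Treat "?" as NA while importing.
data <- read.csv(text = csv_text, na.strings = "?")

n_missing <- sum(is.na(data))  # counts the former "?" entries
```

Declaring the placeholder at import also lets R read the `age` column as numeric, because the "?" no longer forces it to be character.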
@mohammadbasheer6192 5 years ago ⁺¹
Hi, can you write code to replace the missing value "NA" with the mean?
@StatisticsGlobe 5 years ago ⁺¹
Hi, you can use the following code: x[is.na(x)]
@mohammadbasheer6192 5 years ago
@@StatisticsGlobe Thank you, sir
@StatisticsGlobe 5 years ago ⁺¹
You are welcome :)
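The code in the reply above is cut off after `x[is.na(x)]`. A minimal sketch of mean replacement for a numeric vector, which may or may not match the original line, is:

```r
x <- c(4, NA, 7, 10, NA)   # hypothetical numeric vector

# Replace each NA by the mean of the observed values.
x[is.na(x)] <- mean(x, na.rm = TRUE)
```

The mean is computed from the observed values only (`na.rm = TRUE`), otherwise it would itself be NA.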
@mohammadbasheer6192 5 years ago
@@StatisticsGlobe Hello sir... could you please explain R functions and function components like function name, arguments, function body and return value to me? Or can you make a video on this topic?
Thanks
@StatisticsGlobe 5 years ago
@@mohammadbasheer6192 Do you mean functions that are already available in R or do you mean user-defined functions? If you want to learn more about already available functions, you could have a look here: statistical-programming.com/r-functions-list/ If you want to learn more about user-defined functions, you could have a look here: statistical-programming.com/r-return-value-from-function-example
@zeusbhattacharya3122 5 years ago ⁺¹
How do you save omitted data in Excel?
@StatisticsGlobe 5 years ago
Hi Zeus, you can find a detailed tutorial on exporting Excel files here: statisticsglobe.com/write-xlsx-xls-export-data-from-r-to-excel-file
Does this answer your question? Regards, Joachim
@ayeledesalegn5367 4 years ago
@@StatisticsGlobe ua-cam.com/video/G2ra7Ku3eGM/v-deo.html
@shaheryarshafi 4 years ago ⁺¹
Is it possible to change NAs in particular rows? I wrote this code: airquality[is.na(airquality[52:61, c(1, 2)])] = 7 but it is not working. Then I wrote this code: airquality[is.na(airquality[52:61, c(airquality$Ozone)])] = "Sherry" which is also not working.
@StatisticsGlobe 4 years ago
Hey Shehriyar, thanks for the question. You can use the following R codes:
airquality[52:61, c(1, 2)][is.na(airquality[52:61, c(1, 2)])] = 7
airquality[52:61, "Ozone"][is.na(airquality[52:61, "Ozone"])] = "Sherry"
Regards, Joachim
@mdiqbal7168 2 years ago ⁺¹
R programming for t-test two tail tabulated value in plot
@cansustatisticsglobe 1 year ago
Hello,
Thank you for your comment. Do you mean that you would like to see a tutorial on this topic? Is there something specific that you would like to know about R programming for t-test two-tail tabulated value in the plot?
Regards,
Cansu
@mariasaraiva9675 3 years ago ⁺¹
The problem is that, depending on the package, na.rm does not work. It seems that each package has its own way of handling NAs. This is stressful when you are used to SAS.
@StatisticsGlobe 3 years ago
Hey Maria, you can use na.omit to remove rows with NA values before applying other functions. Note that it is often better to impute missing values via missing data imputation techniques, but this depends on your specific data.
@mariasaraiva9675 3 years ago ⁺¹
@@StatisticsGlobe In epidemiology we "rarely" impute data, unless with multiple imputation after knowing very well what is going on with the data, that is, the sampling and understanding who the missings are. I know that for certain areas imputation is always recommended. Thanks.
@StatisticsGlobe 3 years ago
OK I see, I have no experience in this field myself :)
@manjunathroyal2133 4 years ago ⁺¹
When I try sum(is.na(data)) I am getting an error saying that argument y is missing.
@StatisticsGlobe 4 years ago
Hi Manjunath, could you provide an example of what your data looks like? Regards, Joachim
@shaheryarshafi 4 years ago ⁺¹
Maybe you need to use the data set name: if you have used data(airquality), then sum(is.na(airquality)), or any other name that you have used for your data.
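A short sketch of counting missing values with the name of the actual data object, here the airquality data as suggested in the last comment. The object names `na_total` and `na_per_column` are made up:

```r
data(airquality)

na_total <- sum(is.na(airquality))           # total number of NA values in the data frame
na_per_column <- colSums(is.na(airquality))  # NA count for each column
```

The per-column counts show where the missing values are concentrated, which helps when deciding whether row-wise deletion is acceptable.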
@Paan-2.1 3 years ago ⁺¹
@Statistics Globe Thank you very much for the great video. That really helped :) Unfortunately I still have a problem, and I really hope you can answer my question. Where do I put na.rm = TRUE in a more complex piece of code?
I always get an error message, and I suspect (based on internet research) that it has something to do with the NAs: Error in KhatriRao(sm, t(mm)) : (p
@StatisticsGlobe 3 years ago ⁺¹
Hello Paula, thank you very much for the kind words. I'm very glad you like my tutorials! :) You can find the answer to your question in the documentation of the lmer function. You can open it with the R code ?lmer. It says:
"na.action
a function that indicates what should happen when the data contain NAs. The default action (na.omit, inherited from the 'factory fresh' value of getOption("na.action")) strips any observations with any missing values in any variables."
In other words: the equivalent of the na.rm option is already activated automatically when you use the lmer function. Please note that this can also lead to risks in the data analysis and that you may need to impute your data. You can find more information here: statisticsglobe.com/missing-data/
Best regards, Joachim
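The `na.action` mechanism quoted above is not specific to lmer. As a stand-in that runs without extra packages, the sketch below uses base R's `lm()` on the airquality data; this substitution is an assumption made for illustration and says nothing about the KhatriRao error itself:

```r
data(airquality)

# With the default na.action (na.omit), incomplete rows are dropped silently.
fit <- lm(Ozone ~ Solar.R, data = airquality)
n_used <- nobs(fit)   # observations actually used in the fit

# na.exclude drops the same rows for fitting, but pads residuals with NA
# so that they line up with the original rows.
fit_excl <- lm(Ozone ~ Solar.R, data = airquality, na.action = na.exclude)
n_resid <- length(residuals(fit_excl))
```

Comparing `n_used` with `nrow(airquality)` shows how many rows the model dropped without any warning.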
@SumanGhosh-vn3tx 5 years ago ⁺¹
great
@StatisticsGlobe 5 years ago
@Suman Ghosh Thank you very much! :)
@durduozkarc6345 3 years ago ⁺¹
# 1. Load R packages
> library("quantstrat")
>
> # 2. Stock Instrument Initialization
>
> # 2.1. Initial Settings
> start.pf start.date end.date Sys.setenv(TZ='UTC')
> init.eq # 2.2. Data Downloading or Reading
>
> # Data Downloading
> getSymbols(Symbols='BMW',src='yahoo',from=start.date,to=end.date)
[1] "BMW"
Warning message:
BMW contains missing values. Some functions will not work if objects contain missing values in the middle of the series. Consider using na.omit(), na.approx(), na.fill(), etc to remove or replace them.
I don't want to see these errors. How should I fix it?
@StatisticsGlobe 3 years ago ⁺¹
Hey Durdu, it seems like your data contains missing values. You may remove these missing values using the na.omit function as explained here: statisticsglobe.com/na-omit-r-example/ Please note that removing NA values should be theoretically justified.
@victorresende3140 4 years ago ⁺¹
I LOVE YOU AAAAAAAAAA
@StatisticsGlobe 4 years ago
Haha thx ;)
@16kush 2 years ago ⁺¹
How to Undefined In place of NA?
@StatisticsGlobe 2 years ago
Hey Kush, could you please explain your question in some more detail? I don't understand what you would like to do. Regards, Joachim
@16kush 2 years ago ⁺¹
@@StatisticsGlobe Sorry for the inconvenience. I meant to ask: if I receive NA in some table, how can I replace it with a specific value of my choice, in all the cells?
@StatisticsGlobe 2 years ago
Hey Kush, I recommend using missing data imputation techniques for this: statisticsglobe.com/missing-data-imputation-statistics/
@Jonpaulim 4 years ago ⁺¹
Hi, can I ask a question please?
@StatisticsGlobe 4 years ago ⁺¹
Sure Jonathan, go ahead!
@Jonpaulim 4 years ago
@@StatisticsGlobe Thank you very much. Could I maybe send it to you by email or on another platform, as the question may be a little long, if you're happy to suggest one?
@Jonpaulim 4 years ago
@@StatisticsGlobe Can I ask: in R, I have two data sets with different numbers of rows and columns, and I want to merge them based on one of the columns in each data set. The first column in dataset1 has 3 values and the first column in dataset2 has 9 values, and each of the values in the first column of the first data set maps onto 3 values in the second data set. How do I do it?
@Jonpaulim 4 years ago
So, for example, the first column in dataset 1 has the values 1, 2, 3 and the first column in dataset 2 has the values 1a 1b 1c 2a 2b 2c 3a 3b 3c, and I want to merge the two based on the numbers. The first data set only has 3 rows and the second data set has 9 rows, and I want to merge them so I can perform functions on them. How do I do it? Thanks
@Jonpaulim 4 years ago
@@StatisticsGlobe Sorry for the long question. All this must be done with the R base package. Do let me know if you are able to help with this. Many thanks.
@Rhena 3 years ago ⁺¹
Could you record this again in German? :D
@StatisticsGlobe 3 years ago
Hey Rhena, I only upload English-language videos on this channel. But I am already planning to create a partly German-language website soon; I hope that will help! :) Best regards, Joachim
@eyadha1 4 years ago ⁺¹
Thanks. Very helpful
@StatisticsGlobe 4 years ago
Thanks for the comment Eyad :)
