I have learnt a lot from your videos. Thank you
You are welcome!
Thank you, Dr Rai
Welcome!
I got to know something more about data cleansing, thank you sir!!
Welcome!
Hi,
Many thanks, very clear as usual.
I have one suggestion about replacing by the mean:
vehicle$lh[vehicle$lh==0]
That's even better!
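For anyone following along, here is a minimal sketch of that idea, assuming the goal is to replace zero values of lh with the mean of the non-zero values. The small vehicle data frame is made up for illustration; only the column name lh comes from the comment above, and the right-hand side of the assignment is an assumption.

```r
# Hypothetical stand-in for the vehicle data; only the column name `lh` is from the thread
vehicle <- data.frame(lh = c(0, 12, 8, 0, 10))

# Mean of the non-zero values (assumption: zeros mark missing readings)
lh_mean <- mean(vehicle$lh[vehicle$lh != 0])

# Logical indexing on the left-hand side overwrites only the zero entries
vehicle$lh[vehicle$lh == 0] <- lh_mean

print(vehicle$lh)
```

Indexing on the left-hand side changes only the matching rows, so no loop or ifelse() call is needed.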
Thank you, sir. Please post videos on multiclass classification.
You can refer to this:
ua-cam.com/play/PL34t5iLfZddvv-L5iFFpd_P1jy_7ElWMG.html
Thank you very much, but I really need your assistance.
Let me know your question.
Great work. Congratulations.
Thank you! Cheers!
Thanks a lot, Dr. You have made the R language very easy for me to read. I have a question. I analysed income data that showed positive skew, so I did not fit a normal distribution to it. What can I do to fit a distribution? I want an interval estimate of the population income using sample data. I think the exponential distribution is suitable, but how do I fit it using R?
Thanks, Dr
You can try a log transformation. I would also suggest trying this:
ua-cam.com/video/_3xMSbIde2I/v-deo.html
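A rough sketch of both ideas in base R, for readers who want to try them. The income values are simulated, so the numbers mean nothing. Note that the back-transformed interval is for the geometric mean, not the arithmetic mean.

```r
set.seed(1)
# Simulated right-skewed "income" values (hypothetical data, lognormal by construction)
income <- rlnorm(200, meanlog = 10, sdlog = 0.8)

# Option 1: log transform, then a t-based 95% interval on the log scale
ci_log <- t.test(log(income))$conf.int

# Back-transforming gives an interval for the geometric mean
# (the median, if the data really are lognormal), not the arithmetic mean
ci_geo <- exp(ci_log)

# Option 2: exponential fit; the maximum-likelihood rate is 1 / sample mean
rate_hat <- 1 / mean(income)

print(ci_geo)
print(rate_hat)
```

Whether the exponential is a reasonable fit is a separate question. A histogram of the data with the fitted density drawn over it is a quick first check.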
Thanks Sir
Welcome
Sir, your videos have been very helpful for self-learning R. Always very clear. Thank you so much!
Could you please tell me whether there is a method to analyze and interpret how well our model works with testing data? Can we compare the mean of the outcome predicted by the model with the original outcome in the testing data, using a t-test?
You can plot the actual against the predicted values for the test data and obtain R-sq.
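A minimal sketch of that check, using the built-in mtcars data as a stand-in; the split size and the predictors are arbitrary choices for illustration. A t-test on the means is not used here because two series can share the same mean while individual predictions are far off.

```r
set.seed(42)
# mtcars ships with R; used here only as a stand-in dataset
idx   <- sample(nrow(mtcars), 22)      # roughly 70% of the 32 rows for training
train <- mtcars[idx, ]
test  <- mtcars[-idx, ]

fit  <- lm(mpg ~ wt + hp, data = train)
pred <- predict(fit, newdata = test)

# Actual vs predicted on the held-out rows; points near the 45-degree line are good
plot(test$mpg, pred, xlab = "Actual mpg", ylab = "Predicted mpg")
abline(0, 1)

# Test-set R-sq as 1 - SSE/SST (can be negative for a poor model)
sse <- sum((test$mpg - pred)^2)
sst <- sum((test$mpg - mean(test$mpg))^2)
r2_test <- 1 - sse / sst
print(r2_test)
```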
@@bkrai Thank you very much, Sir!
Is there a cut-off of R-sq value which is required to have a good agreement? I have read "R-sq value
Instead of a cutoff, you can use it as a benchmark. Let's say you run a model and get an R-sq of 0.65, then make changes to the model and get an R-sq of 0.74. Now you know that the changes to the model are yielding a positive outcome.
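The benchmark idea could look something like this, again with mtcars as a stand-in and two arbitrary models. In-sample R-sq cannot go down when a predictor is added, so the sketch also prints adjusted R-sq.

```r
# Baseline model and a modified model on the built-in mtcars data (stand-in dataset)
fit_base <- lm(mpg ~ wt, data = mtcars)
fit_new  <- lm(mpg ~ wt + hp, data = mtcars)

r2_base <- summary(fit_base)$r.squared
r2_new  <- summary(fit_new)$r.squared

# In-sample R-sq never decreases when a predictor is added,
# so adjusted R-sq is a fairer check
adj_base <- summary(fit_base)$adj.r.squared
adj_new  <- summary(fit_new)$adj.r.squared

print(c(base = r2_base, new = r2_new))
print(c(base = adj_base, new = adj_new))
```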
@@bkrai Thank you very much Sir....!
Welcome!
Great work Dr. Do you mind putting together some videos on how to analyze Liberty scale data in R? Thanks in advance
I meant Likert scale data, sorry about that
Great suggestion! I've added it to my list.
Can we use the same multiple linear regression procedure on time-series data to find the factors affecting a dependent variable? I have converted the raw data into differences, i.e. present value minus past value, to remove the autocorrelation that occurs in time series. The model variables will now be:
Change in production - dependent variable
Change in rainfall - independent variable
Change in temperature - independent variable
Change in area - independent variable
for 17 years. Am I on the right track? I followed the same approach in R that I used to create the multiple regression model for cross-sectional data.
You need time-series with regressors:
ua-cam.com/play/PL34t5iLfZdduRvHafEKM6vrDmfnlUfzAy.html
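One way to read "time-series with regressors" is a regression with autocorrelated errors, sketched below with stats::arima() and its xreg argument. This may not be the method used in the playlist. The data are simulated, and the series length and the AR(1) order are assumptions chosen for illustration.

```r
set.seed(7)
n <- 30  # simulated series length (hypothetical; the thread mentions 17 years)

# Simulated regressors and a response with AR(1) errors
rain <- rnorm(n)
temp <- rnorm(n)
area <- rnorm(n)
err  <- as.numeric(arima.sim(list(ar = 0.5), n = n))
production <- 2 + 1.5 * rain - 0.8 * temp + 0.5 * area + err

# Regression with AR(1) errors: xreg carries the regressors
X   <- cbind(rain = rain, temp = temp, area = area)
fit <- arima(production, order = c(1, 0, 0), xreg = X, method = "ML")

print(coef(fit))

# Residual autocorrelation check; a large p-value suggests none is left
print(Box.test(residuals(fit), lag = 5, type = "Ljung-Box"))
```

Modelling the autocorrelation in the error term keeps the variables on their original scale, so each coefficient reads as the effect of the level of rainfall, temperature, or area rather than of its year-to-year change.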
Sir, in a multiple regression model, should we keep only the significant independent variables before running other tests such as BP, DW, ad.test, BG and VIF to check that the linear model is good, or should we include all variables, both significant and insignificant, for the further process?
Please help.❤️
I would suggest checking for multicollinearity before removing non-significant variables.
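A small sketch of a multicollinearity check using variance inflation factors, computed by hand in base R. car::vif() is the usual packaged route. The predictors and the mtcars data are stand-ins for illustration only.

```r
# Stand-in predictors from the built-in mtcars data
predictors <- c("wt", "hp", "disp")

# VIF for predictor j is 1 / (1 - R2_j), where R2_j comes from
# regressing predictor j on the remaining predictors
vif_manual <- sapply(predictors, function(v) {
  others <- setdiff(predictors, v)
  r2 <- summary(lm(reformulate(others, response = v), data = mtcars))$r.squared
  1 / (1 - r2)
})

print(round(vif_manual, 2))
```

A high VIF means a predictor is largely explained by the others, which inflates its standard error and can make a useful variable look non-significant. Common rules of thumb flag values above 5 or 10.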
@@bkrai Thank you so much ❤️.
Thanks for your work!
Welcome!
Great job! Thanks!
Welcome!
Thank you for these videos, I really benefit from them. Can I ask a question? I was going through an example on Kaggle and the author used the dummyVars function. Could you explain how it works when applied to a dataset? Again, I really appreciate these lessons.
Thanks! Do you remember what method they were using?
@@bkrai I'm sorry I should've included the link to the sample code in my initial question: www.kaggle.com/virosky/the-only-way-to-handle-missing-values/notebook
I am not too sure what the function does when applied to a data frame, as done in the example I am referring to. The piece of code using the dummyVars function is towards the end of the "exploratory data analysis" section after opening the link I provided. Thank you for the reply.
They have used xgboost. It is one of the must-know methods in the top 10 link below:
ua-cam.com/play/PL34t5iLfZddsQ0NzMFszGduj3jE8UFm4O.html
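On the dummyVars part of the question: it turns categorical columns into 0/1 indicator columns, which matters because xgboost expects numeric input. The sketch below shows the same idea with base R's model.matrix() on a made-up data frame. The caret call in the closing comment is not run here and is not taken from the notebook.

```r
# Small made-up data frame with one categorical column
df <- data.frame(
  fuel  = factor(c("petrol", "diesel", "petrol", "electric")),
  miles = c(120, 300, 80, 150)
)

# "- 1" drops the intercept so every level of `fuel` gets its own 0/1 column
dummies <- model.matrix(~ fuel - 1, data = df)
encoded <- cbind(as.data.frame(dummies), miles = df$miles)

print(encoded)

# With caret installed, a similar result should come from:
#   dv <- caret::dummyVars(~ fuel, data = df)
#   predict(dv, newdata = df)
```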
@@bkrai Thank you very much Sir
Can we get the R files, sir?
Added a link in the description.
Whoever reads this, you'll be successful one day. Let's help grow this channel together for the future🤑❤
Thanks for your comments!