fit_transform is used on training data to learn parameters and transform it, while transform is used on new or unseen data to apply previously learned transformations without re-learning the parameters.
fit_transform() is used on the training data to learn the scaling or transformation parameters and then applies the same transformation to the training data. transform() is used on new data (e.g. test data) to apply the same transformation that was learned on the training data.
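To make the difference concrete, here is a minimal sketch using StandardScaler. The toy arrays are made up for illustration and are not from the video.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Toy data (made up): one feature, three training rows, two test rows.
X_train = np.array([[1.0], [2.0], [3.0]])
X_test = np.array([[2.0], [4.0]])

scaler = StandardScaler()

# fit_transform learns the mean and standard deviation from X_train,
# then scales X_train with them.
X_train_scaled = scaler.fit_transform(X_train)

# transform reuses the training mean and standard deviation;
# nothing is re-learned from X_test.
X_test_scaled = scaler.transform(X_test)

print(scaler.mean_)      # mean learned from the training data only
print(X_test_scaled)     # test rows expressed in training-set units
```

Because the test rows are scaled with the training statistics, the scaled test set will generally not have mean 0 itself, and that is expected.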
Thanks brother
thanks bro
I am unable to do the first step: load_boston is showing an error. Can you please help me?
@kartiknampalliwar8603 That dataset is not available in newer versions. You can use fetch_california_housing instead; it is a broadly similar kind of data.
@kartiknampalliwar8603 load_boston is no longer available. Use some other dataset, such as load_diabetes.
00:01 Practical implementation of linear regression
02:32 Explaining features and target in linear regression
05:22 Preparing data for linear regression
08:12 Understanding data normalization and standardization
11:00 Implementing linear regression using steps
13:08 Implementation of cross-validation for linear regression
15:47 Using negative mean squared error for model optimization
18:21 Verification is crucial for accurate predictions.
20:43 Understanding the practical implementation of linear regression and its key steps
23:11 Linear regression calculates the average change in one variable based on another
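The steps listed in the timestamps above might look roughly like the following. This is a sketch, not the video's actual notebook: it assumes load_diabetes (bundled with scikit-learn, so nothing is downloaded) in place of the removed Boston data, and the test size, cv=5 and variable names are illustrative choices. scoring="neg_mean_squared_error" is used because scikit-learn scorers follow a higher-is-better convention, so the MSE is reported with its sign flipped.

```python
import numpy as np
from sklearn.datasets import load_diabetes
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.preprocessing import StandardScaler

# Features (X) and target (y); load_diabetes stands in for the Boston data here.
X, y = load_diabetes(return_X_y=True)

# Hold out a test set; random_state only makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Standardize: learn the scaling on the training data, reuse it on the test data.
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

# Cross-validation on the training data; each score is a negated MSE.
regression = LinearRegression()
cv_scores = cross_val_score(
    regression, X_train, y_train, scoring="neg_mean_squared_error", cv=5
)
print(np.mean(cv_scores))

# Fit on the full training set and check the predictions on the test set.
regression.fit(X_train, y_train)
reg_pred = regression.predict(X_test)
print(mean_squared_error(y_test, reg_pred))
print(r2_score(y_test, reg_pred))
```

The numbers will differ from the ones in the video, since the dataset is different.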
scaler.transform(X_test) does not calculate a new mean and standard deviation from the test data; it scales the test data using the mean and standard deviation learned from the training data.
Sir, your effort is really wonderful and an inspiration. Please make a separate playlist for EDA and feature engineering. Lakhs of aspirants are waiting; please take this seriously.
fit_transform is used on the training dataset, which the linear regression model learns from; the test dataset is used only to check the model's accuracy.
fit_transform vs. transform: if we call fit_transform on the test data, the data we want to keep as a surprise is no longer unknown to our model, and we will not get a good estimate of how the model performs on the test (unseen) data, which is the ultimate goal of building a model with a machine learning algorithm.
Can you elaborate more. Please
Please make more videos on machine learning, and more practical videos too.
Thank you
Sir, please upload the videos in playlist order. There is a big gap between the theory videos and the practical implementation videos, and I did not understand much. R2 score, cross-validation, X_train: what are all these? None of this was covered in the theory.
Yo, brother
Same problem, brother: I understood the theory but not the implementation.
`load_boston` has been removed from scikit-learn since version 1.2.
The Boston housing prices dataset has an ethical problem: as
investigated in [1], the authors of this dataset engineered a
non-invertible variable "B" assuming that racial self-segregation had a
positive impact on house prices [2]. Furthermore the goal of the
research that led to the creation of this dataset was to study the
impact of air quality but it did not give adequate demonstration of the
validity of this assumption.
How was gradient descent implemented in this implementation? If it happened in the back end, where was the alpha value given? @krish please explain
Sir, sklearn has removed the dataset. The dataset is not being fetched.
Do I need to know sklearn before starting this playlist ?
The Boston housing dataset has been removed from scikit-learn. Is there any way to load it as a Bunch object?
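import numpy as np
import pandas as pd

# Original source of the Boston data; each record spans two physical lines in this file.
data_url = "http://lib.stat.cmu.edu/datasets/boston"
raw_df = pd.read_csv(data_url, sep=r"\s+", skiprows=22, header=None)
# Even rows hold 11 features; the first two values of each odd row complete the 13 features.
data = np.hstack([raw_df.values[::2, :], raw_df.values[1::2, :2]])
# The third value of each odd row is the target.
target = raw_df.values[1::2, 2]
Or install an older scikit-learn release that still ships load_boston, for example version 1.0.1:
pip install scikit-learn==1.0.1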
Nicely taught the algorithm..Thanks for making learning simple
great job sir jee
Sir, you explain so well. Please continue making videos in Hindi.
Great video, sir. Just one concern: why are we not checking VIF?
*** please create EDA and Feature engineering playlist in HINDI ***
Thank you sir 🙏
sklearn removed load_boston. Which dataset can I use to follow along?
fetch_california_housing
@krishnaikhindi Can we use fetch_california_housing?
@dnswm95 What is the conclusion to this question? Can anyone give an answer?
Very good Sir
Because we evaluate our model on the test dataset.
The Boston dataset has been removed from the scikit-learn library.
Which one do you use right now? I am facing the same problem, bro.
Same issue
@azharafridi9619
The Boston dataset has been removed from sklearn.
You can use the California housing dataset for the same linear regression work. It works.
Thank You
Please keep uploading videos in hindi
Sir, please make a video on L1 and L2 as well.
What is random_state=42 in the train_test_split call?
Can anyone please explain displot discussed in this video?
I used the same model on the fetch_california_housing dataset and the MSE I got is 0.5.
Me too, and the score is 0.34.
Same, and as mentioned by the other person, my score is 0.33.
Good video sir.
Sir, I can't understand anything. Should I learn NumPy and pandas for this?
Why are we using 'neg_mean_squared_error'?
Can you please share the link to the linear regression loss function video?
I did not get the same graph at the end; my spread is wider than the -10 to 10 range. What should I do? Please help.
Simply amazing ❤
What values of MSE, RMSE and R-squared should be considered in order to conclude that the model we built is accurate? Is there any range of values for MSE, RMSE and R-squared?
R2 should be greater than 0.70, that is, 70%.
How is it that you are predicting on x_test but calling your y_test as truth value?
The Boston dataset has been removed from sklearn.
Sir, when does the new data science batch start?
Great thank you
nice video sir
Sir, can this be explained in English? Hindi is somewhat difficult for me to understand.
I am watching your videos to apply the same approach to another multivariate problem. When I plotted the displot (with kind='kde'), the shape was similar but the range was on the order of 1e9. How can I decrease the error? Should I use tuning, or something else?
Sir, the Boston dataset is no longer available in the scikit-learn datasets, and I can't load it in Jupyter Notebook either. Can you please provide a solution for that?
You can use an alternative dataset such as California housing, or you can search for and save the Boston dataset and load it with pd.read_csv().
Please use "from sklearn.datasets import fetch_california_housing" as an alternative to Boston.
Linear regression is being done here, but why is more than one independent feature used? Can anybody tell me?
This video went completely over my head.
Where is the theory playlist? Can someone please attach the link in a reply to this comment?
Sir, I wasn't able to understand what y, the dependent variable, is. I mean, which column is being predicted?
Krish, which video software do you use for recording?
Sir, where is the accuracy in this?
But what are we predicting here? Can someone please explain what the values in "reg_pred" tell us? What is the difference between the values in the target array and the "reg_pred" values?
We are predicting the output feature, the house price. The independent features are in x_test, and the dependent feature (the actual values) is in y_test. After applying linear regression, the predicted values are in reg_pred. The error is the difference between the actual values and the predicted values, and MSE is the mean of the squares of those errors.
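As a small worked example of that error calculation, here is a sketch with made-up numbers. The y_test and reg_pred values below are illustrative only, not output from the video.

```python
import numpy as np

# Made-up values for illustration.
y_test = np.array([24.0, 30.5, 18.0, 21.0])    # actual target values
reg_pred = np.array([25.0, 28.5, 18.0, 23.0])  # values predicted by the model

# Error for each row: actual minus predicted.
errors = y_test - reg_pred

# MSE is the mean of the squared errors; RMSE is its square root.
mse = np.mean(errors ** 2)
rmse = np.sqrt(mse)

print(errors)  # [-1.  2.  0. -2.]
print(mse)     # 2.25
print(rmse)    # 1.5
```

RMSE is in the same units as the target, which often makes it easier to interpret than MSE.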
kind='kde' not showing that graph
subscribed
Sir, if my accuracy_score is 0.85, is my prediction model good or bad?
I can't understand anything...
Sir, I have built the linear regression model and computed the evaluation metrics. Now I additionally want to add one new row (instance) and check the performance on it. Can you please guide me on how to check the performance for that single row?
Bro, I couldn't understand this at that level. How can I understand these concepts when sir implements them directly?
How do we know whether a model is overfitted or stable?
If it fits the training data perfectly, or in simple terms if it memorizes the data instead of learning from it, the model is overfitting.
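One common way to spot this is to compare the score on the training data with the score on the test data. The sketch below is an illustration under assumptions: it uses synthetic data from make_regression, and it swaps in a DecisionTreeRegressor with no depth limit (not the linear regression from the video) because such a tree memorizes its training rows, which makes the gap easy to see.

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic noisy regression data (illustrative, not from the video).
X, y = make_regression(n_samples=200, n_features=5, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# An unconstrained tree can fit every training row exactly.
model = DecisionTreeRegressor(random_state=0).fit(X_train, y_train)

# score() returns R2; a large gap between the two suggests overfitting.
train_score = model.score(X_train, y_train)
test_score = model.score(X_test, y_test)
print(train_score, test_score)
```

A model whose training and test scores are close to each other is usually the more stable one, even if its training score is lower.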
Why is Y_train not standardized? Please answer.
Standardization is typically applied to the feature variables (X_train) rather than the target variable (Y_train) in machine learning.
too complex🙁
what is score?
Sir, please share the dataset as well.
load_boston is not working for us.
What should we do now?
Same problem. Did you find any solution?
@kakhanna3585 Hi Niharika, are you a data science student?
At first I thought he has only 38 machine learning videos
and that I would finish them very quickly.
But I still haven't gotten past video 2.
He is not teaching at a beginner level, and I have to write down every single thing sir says in the video.
If he at least dictated the definitions, it would be easier to understand.
It's difficult to understand.
Do you know of any other playlist for machine learning?
Use the fetch_california_housing function as an alternative to Boston.
load_boston has been removed.
Please use "from sklearn.datasets import fetch_california_housing" as an alternative to Boston.
Brother, the fetch one isn't working either.
Plz do videos in English🥲
If you're looking for videos in English you can refer to his other channel. You will find all the videos in the English language.
I got a score of "0.017460452225004253". Why did I get such a low score?
Same prob
Noice
Please don't make videos like this that don't work on anyone's laptop. I can't understand what you are doing. I sat down to study, my whole day was wasted, and nothing worked.
Sir, are you a student of JSPM Tathwade?
YouTube channel
The Boston housing dataset has been removed from scikit-learn. Is there any way to load it as a Bunch object?
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()
Use this: from sklearn.datasets import fetch_california_housing