How a forward test of such a prediction should flow:
Step 1: Download the CSV up to the current date.
Step 2: Delete the last 20% of the timestamps from the CSV, so if the data runs from 2000 to 2023, cut it off at around 2019, and save this as training_SP500.csv
Step 3: Do the opposite of step 2: cut off the first 80% and save the rest as future_Sp500.csv
Step 4: Train only on the data in training_SP500.csv
Step 5: Test the prediction against the missing 20% of the data.
Step 6: Plot the original, current S&P 500 data on the same chart.
And voilà, a chart on which you can clearly see whether those predictions were in the right direction!!
Nice vid, good education
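The steps above can be sketched in a few lines of pandas. This is only a minimal sketch, not the video author's code: the synthetic price series stands in for the downloaded CSV, and the `Close` column name is an assumption.

```python
import numpy as np
import pandas as pd

# Synthetic daily closes standing in for the downloaded CSV
# (assumption: the real file has a date index and a "Close" column).
dates = pd.date_range("2000-01-03", periods=1000, freq="B")
rng = np.random.default_rng(0)
data = pd.DataFrame(
    {"Close": 100 + rng.normal(0, 1, len(dates)).cumsum()}, index=dates
)

# Steps 2 and 3: split by position in time, never by random shuffling.
split = int(0.8 * len(data))
training = data.iloc[:split]  # first 80%, the only part the model may see
future = data.iloc[split:]    # last 20%, held back for the forward test

# Every training timestamp must come before every held-out timestamp.
assert training.index.max() < future.index.min()
print(len(training), len(future))
```

With a real file, `training.to_csv("training_SP500.csv")` and `future.to_csv("future_Sp500.csv")` would save the two pieces under the names used in the steps.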
So you’re just using 20% of available data to train? Seems like a whole lot of input is lost…
@@itssardine5351 Well, 20% of whatever you find adequate, of course
@@itssardine5351 If you know anything about AI, you generally use 20% as testing data and the rest as training data. Also, fitting the training data too closely can cause overfitting. This means the model can predict existing data with ease, but when it comes to new data, it will struggle to perform well.
I'm gonna be so for real, I don't know how to get the CSV file. It's like 2am and I'm so tired lol. Can anyone help? Would be appreciated
@@SPONGE2008 What do you mean, get "the" CSV file? Do you mean the one the video creator made (what is his name again)?
I use my own CSV, which is constructed by querying an API from a data provider. Of course I don't want to pay for data, so I do this multiple times for different time ranges and paste the results together again
This type of algo is useful just to predict the next time event. You have 70% accuracy, but you are just predicting the next day's price from the last day's price, multiple times
that is right mr future billionaire
@@rjsingh4255 lmao, just to share information bro 😂
so what? fake thing?
Use your prediction and check the return %: the results are very, very poor. The prediction is basically noise around a baseline of the open price. You can just plot open price vs. close price; they look similar to your prediction.
that's true but i wonder why it still got an accuracy of 70% on test data
@@AB_498 Because it feeds in the current open price and the volume, which is not available in actual trading. You can only get the volume after the market closes.
How do I get it to predict more
Actually that's an almost useless prediction, because we only know the volume once the close is known
Wouldn't it be more reasonable to use the previous day's price to predict the next day's price, in both training and testing?
Smart idea, but one day's data might not be enough to make a good prediction
That's what he did: the day N open price is equal to the day N-1 close price
@@joao_paiva Totally incorrect: closing_price != opening_price_next_day, google it!
Are you sure that it is not learning from the high and low prices of the current day that has not yet closed? How to make sure?
For the volume, you can only get it after the market closes. So feeding in the current day's volume while the market is open is not much use, as the volume keeps changing throughout the day.
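One way to avoid the look-ahead problem raised in this thread is to feed the model only values that were already known before the day being predicted. A minimal sketch, assuming a DataFrame with `Open`, `Close` and `Volume` columns (the column names and the tiny made-up table are assumptions, not the video's data):

```python
import pandas as pd

# Tiny made-up table; real data would come from a CSV or an API.
data = pd.DataFrame({
    "Open":   [100.0, 101.0, 102.5, 101.5, 103.0],
    "Close":  [101.0, 102.0, 101.0, 103.5, 104.0],
    "Volume": [1000, 1200, 900, 1500, 1100],
})

# Shift by one row so each row only carries the previous day's close
# and volume, both of which are known before today's session starts.
features = pd.DataFrame({
    "prev_close": data["Close"].shift(1),
    "prev_volume": data["Volume"].shift(1),
})
features["target"] = data["Close"]  # today's close is what we predict

# The first row has no previous day, so drop it.
features = features.dropna().reset_index(drop=True)
print(features)
```

Trained on features like these, a model can no longer lean on the same day's volume, so its score is closer to what it could achieve in live use.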
Hope you'll do a tutorial on deploying this and on predicting real-time prices.
Predicting real-time market prices presents significant challenges due to the need for up-to-date data. While it's theoretically possible to predict future prices, accessing real-time data is often limited by market closing times and costly data acquisition. APIs typically provide data once the market closes, hindering the ability to make real-time predictions. Moreover, obtaining live data usually incurs substantial expenses. Therefore, predicting prices for future points requires historical data for training models, making real-time prediction impractical in many cases.
In addition to data accessibility challenges, there are inherent complexities in modeling financial markets that further complicate real-time predictions. Financial markets are influenced by a multitude of factors, including economic indicators, geopolitical events, investor sentiment, and market psychology. These factors can lead to sudden fluctuations and volatility, making it difficult to accurately forecast prices in real time.
Furthermore, market dynamics are constantly evolving, with new information continuously being incorporated into prices. This makes it challenging for predictive models to adapt quickly enough to capture and react to these changes in real time.
Moreover, the presence of noise and randomness in financial data adds another layer of complexity. Despite best efforts to develop sophisticated predictive models, there is always a degree of uncertainty inherent in predicting future prices accurately.
Nice Video but not useful for real time prediction, volume information will not be available beforehand
Stopped this video when he included 99% of the data in the train_data. Makes no sense
Why? Explain it to me
I'm new to Python
Yeah it does. He is doing EXTRApolation, where the future is being predicted. Predicting more than one time step into the future in this case compounds the margin of error greatly. This means you'll want to train on all of the data up until the time period you are trying to predict. This is standard in time series. Even if you want to train on 70 percent to predict the remaining 30 percent, you do it as what's called a rolling forecast, which essentially retrains the model on all the data up to the point you're trying to predict, for every point within that original 30 percent.
In machine learning where time series is not involved, an INTERpolation problem, training the model on 70 percent and testing on the remaining 30 is a lot more common and makes sense.
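A rolling (walk-forward) forecast like the one described above could look roughly like this. It is only a sketch: the data is a synthetic random walk, and the "model" is a one-lag linear fit via `numpy.polyfit`, chosen to keep the example short rather than to match what the video uses.

```python
import numpy as np

rng = np.random.default_rng(1)
prices = 100 + rng.normal(0, 1, 300).cumsum()  # synthetic series (assumption)

start = int(0.7 * len(prices))  # the first 70% is the initial training window
predictions = []

for t in range(start, len(prices)):
    history = prices[:t]               # everything known before time t
    x, y = history[:-1], history[1:]   # yesterday's price -> today's price
    slope, intercept = np.polyfit(x, y, 1)  # refit on all data up to t
    # Predict one step ahead only, then move the window forward.
    predictions.append(slope * history[-1] + intercept)

predictions = np.array(predictions)
actual = prices[start:]
mae = np.mean(np.abs(predictions - actual))
print(f"one-step-ahead MAE over {len(actual)} points: {mae:.3f}")
```

Each prediction uses only data from before the point being predicted, so errors do not compound across the held-out 30 percent the way they would in a single multi-step forecast.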
Only predicting the next day (or time interval) for day traders.
I keep getting an error at "model.score"; it says model is not defined
You have to define the model first ===>
from xgboost import XGBRegressor
model = XGBRegressor()
model.fit(train_data[features], train_data[TARGET])
y_pred = model.predict(test_data[features])
# score() returns R^2 (the coefficient of determination), not accuracy
model.score(test_data[features], test_data[TARGET])
Where or how did you produce the input file for historical data? Where can it be obtained?
Same question
import pandas_datareader.data as web
# Required steps to set up Yahoo Finance
import yfinance as yfin
yfin.pdr_override()
aapl = web.get_data_yahoo('AAPL',
                          start='2019-01-01',
                          end='2020-01-01')
@@MeghanaHM
kaggle
This works only for one stock. Is there any way to design a neural network model that can be used to predict stock price of more than one stock?
No, there isn’t.
Yes, you can predict multidimensional stock returns with an NN by having one or more output nodes for each return series that you want to predict.
Just use threads and run the same program for many different stocks
@@AI_Vania No, there isn't. You can't predict; you're only building a measurement-based system that estimates risk and when to purchase and sell based on estimated risk.
@@Stopinvadingmyhardware What do you mean "you can't predict"? One can build a model to predict future prices or (maybe better) future price distributions. The NN doesn't necessarily make risk estimations or buy/sell decisions, those things may involve logic that is outside of the NN.
I followed you step by step, but my historical chart is a downtrend for some reason.
check the indexes of your dataset
typo?
train_data = data.iloc[:int(.99 * len(data)), :]
test_data = data.iloc[int(.99 * len(data)):, :]
I'm new to ML. The final plot only shows an orange line near the end. What does it mean, and where is the rest?
The dataset had the entire data, including the part at the end. However, he split the dataset into 2 parts: 1 big and 1 small (the one at the end). He fed the bigger chunk of data into the ML model and kept the smaller chunk aside. Once the model was trained, he asked it to predict the next prices, which he then compared to the smaller chunk of data he had not allowed the model to see.
great tutorials 👍👍👍👍👍
Hey, nice, can you do a tutorial on optimal execution in Python using Almgren-Chriss?
shame there's no code (just pitching for sign-up), I'll just type it in:)
Hi antonyhartley9586, could you please elaborate on your statement?
How do I predict more, i.e. not only the test set but, for example, 3 years forward?
This type of code is useful just to predict the next time event
In the last 3 years we had COVID and the war in Ukraine. No Python script could take that into account
easy money 😂
The way the program is explained is brief. How can we get CSV files like your SPY.csv, for example?
You can find and download it through a Google search
Very interesting! But, how could it be used in algotrading?
You can't; it's 30-60% accurate. You can't predict black swan events or news.
Thx and god bless 😊
Where can I get the dataset for this?
Hi,
You can get the data set for this at patreon.com/computerscience
yfinance
Can see you are a programmer but not a data scientist
Overfitting bruh
A question for everyone: what is this actually good for? xD Nothing, it doesn't give you any info
After all that wasted effort, the prediction accuracy is only 70%... 11:30!!!
It is totally useless wtf ??
I'm surprised this video is getting so many views and likes.
You made a huge mistake in your logic.
This is not the accuracy of your model. This is the coefficient of determination (R²), which does not play any useful role in building a trading strategy.
The coefficient of determination shows how much of the variation in the target variable your features can describe.
If you simply take the previous day's value as the prediction, with no model at all, your coefficient of determination will be around 0.99.
In short, it's not accurate, and what you did is useless.
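The point about the coefficient of determination is easy to check. In this sketch the prices are a synthetic random walk (an assumption, not real S&P 500 data), and the "prediction" is simply the previous day's value, with no model at all:

```python
import numpy as np

rng = np.random.default_rng(42)
prices = 100 + rng.normal(0, 1, 2000).cumsum()  # synthetic random walk


def r_squared(actual, predicted):
    """R^2 = 1 - SS_res / SS_tot, the quantity a regressor's score() reports."""
    ss_res = np.sum((actual - predicted) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 1 - ss_res / ss_tot


actual = prices[1:]
naive = prices[:-1]  # "predict" today's price with yesterday's price

# On price levels the no-model baseline looks excellent...
r2_levels = r_squared(actual, naive)

# ...but on daily changes the same forecast always predicts "no change",
# so it explains none of the variation a trader actually cares about.
actual_change = actual - naive
predicted_change = np.zeros_like(actual_change)
r2_changes = r_squared(actual_change, predicted_change)

print(f"R^2 on price levels:  {r2_levels:.4f}")
print(f"R^2 on daily changes: {r2_changes:.4f}")
```

The R² on levels comes out close to 1 while the R² on daily changes cannot exceed 0, which is why a high `model.score` on prices says little about whether the predictions are tradable.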
thanks bro