Congratulations for being very clear in your explanations and very honest in the model evaluation.
It would be nice, though, if you could be more conclusive and either show a case where an LSTM does help predict stock prices or state that it's not useful at all for predicting them.
This is amazing. I'm just starting out with testing my own model and your points are clarifying. Would love more videos on this.
I really enjoyed the clarity in your video. It would be interesting to see how this model performs with dollar bars. I am new to this area and have been reading that time-bar sampling has inherent limitations. Thank you for sharing your code; I'd like to dig deeper and this helps tremendously.
Thanks for the kind words, Paul. Applying dollar bars would be interesting indeed. I'll think about making a video about it.
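For readers unfamiliar with dollar bars, here is a minimal sketch of the idea, not the code from the video. Instead of closing a bar every fixed time interval, a bar closes once a fixed dollar amount has traded. The `Bar` class, the `dollar_bars` function, the sample trades, and the 10,000 threshold are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Bar:
    open: float
    high: float
    low: float
    close: float
    volume: float
    dollars: float


def dollar_bars(trades, threshold):
    """Group (price, size) trades into bars of at least `threshold` dollars traded."""
    bars = []
    prices, volume, dollars = [], 0.0, 0.0
    for price, size in trades:
        prices.append(price)
        volume += size
        dollars += price * size
        if dollars >= threshold:
            # Enough dollar value has traded: close the bar and start a new one.
            bars.append(Bar(prices[0], max(prices), min(prices), prices[-1], volume, dollars))
            prices, volume, dollars = [], 0.0, 0.0
    return bars  # a trailing partial bar is dropped in this sketch


# Hypothetical (price, size) trades, not real market data.
trades = [(100.0, 50), (101.0, 60), (99.0, 40), (100.0, 100), (102.0, 10)]
bars = dollar_bars(trades, threshold=10_000)
for bar in bars:
    print(bar)
```

Bars built this way arrive faster when trading is heavy and slower when it is quiet, which is the property usually cited in their favour over time bars.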
Stumbled upon your video today after learning about backpropagation. Really appreciate the clarity in your presentation and your honesty about the model's results. Sadly, a 53% success rate is no better than flipping a coin. Another factor that led many fund managers to give up on Machine Learning in recent years is market reflexivity. Quants are able to predict with high probability using an ensemble of algos, but once they place a trade, the prediction goes haywire because the trade itself meddles with the chart pattern. Stock price prediction is possibly the only ML endeavour in which the analyst "poisons" the data by acting on it. Weather prediction, on the other hand, a time series where I believe LSTMs are deployed, makes successful predictions because meteorologists are simply observing the atmosphere. Still early days, but I have seen publications about success predicting an emerging market, Vietnam, claiming that it was achieved through multiple inputs ranging from closing price to technical indicators. There is no evidence of people profiting from it, so it seems like just theory at this point in time... like you mentioned, these papers are meant to make the PhDs look good.
You raise many good points. The reflexivity of the market is what makes this challenge so hard. However, 53% accuracy would be more than enough to make billions. RenTec supposedly only achieves 50.75% accuracy on each trade.
Hi, thanks for the tutorial, very useful! I think we are all waiting for the version with multiple stocks.
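To put rough numbers on why a small edge can matter, here is a back-of-the-envelope sketch. It rests on simplifying assumptions that are not from the video: every trade wins or loses exactly one unit, trades are independent, and there are no transaction costs. The function name `edge_stats` and the 10,000-trade count are hypothetical.

```python
import math


def edge_stats(p, n_trades):
    """Expected profit and its spread for n independent +1/-1 bets won with probability p."""
    mean_per_trade = 2 * p - 1                   # p * (+1) + (1 - p) * (-1)
    std_per_trade = 2 * math.sqrt(p * (1 - p))   # standard deviation of a single +1/-1 bet
    total_mean = n_trades * mean_per_trade
    total_std = math.sqrt(n_trades) * std_per_trade
    return mean_per_trade, total_mean, total_std


for p in (0.53, 0.5075):
    per_trade, total, spread = edge_stats(p, 10_000)
    print(f"p={p}: edge per trade={per_trade:.4f}, "
          f"expected total={total:.1f} units, std of total={spread:.1f} units")
```

The expected total grows with the number of trades while its spread grows only with the square root, so under these assumptions a thin edge becomes reliable when it is applied to a very large number of trades. Real trading costs would eat into that edge.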
Noted
The videos you produce are super helpful for me, since I am working hard on understanding the math and logic behind neural networks. When you were talking about it not being suitable for predicting just the next day, it got me thinking. You could try to predict the price tomorrow and the day after with the same input data. If the price is down tomorrow, you do a close market order; otherwise you could buy at market open. I am curious what you would think of such a trading strategy and what issues might arise, for example it being harder to train. Also, wouldn't this model perform much better if you were to add loads of extra data like indicators and such? I am so happy I found you; you are super helpful on this journey with your videos!
Would using a transformer-based model be better? Would love a video on that topic.
I wonder whether choosing stocks with few or no options contracts for your model would remove some of the randomness.
Torch 2.0 isn't available anymore and changing it to torch 2.2 causes the program to crash.
Really very practical. A lot of people try to predict price instead of return, which might appear to give results but is an illusion and is not what we trade on. It sounds like you are not a big believer in the utility of these models, but I am curious as to how well they might do at predicting return over a week or a month. It seems counterintuitive that they would be more successful at this, but if we are looking for patterns that might work, then looking over a longer time period might actually eliminate some of the noise. In addition, I am thinking some of the other potential inputs like sentiment, interest rate futures prices, and equity option prices might then have more relevance as inputs. Thoughts?
Hi Chris, thanks for the kind words. Predicting price instead of returns never made any sense to me either.
I think you raise some good points on using longer timeframes. But there are also downsides to doing that. For one, you have to choose between using fewer data points (there are fewer weeks than days in a year) and using data that overlaps (Mon-Fri and Tue-Mon would share 4 days). I'm not sure there are any downsides to using overlapping data, but there might be. You could use longer histories of data, but then you risk using data that's not relevant anymore (the behaviour of market participants changes). There are no easy answers. That's one of the frustrating things about applying machine learning to finance.
As for using a variety of different inputs, I support the idea. But a word of caution from someone who's trodden that road before - extracting actionable insights from those inputs won't be easy.
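The trade-off between fewer data points and overlapping data can be sketched as follows. This is only an illustration: the function `window_returns` and the made-up price series are hypothetical, and a 5-day horizon stands in for "one trading week".

```python
def window_returns(closes, horizon, step):
    """Simple return over `horizon` days, sampled every `step` days."""
    return [
        closes[i + horizon] / closes[i] - 1
        for i in range(0, len(closes) - horizon, step)
    ]


# 21 made-up daily closing prices (about one trading month), not real data.
closes = [100.0 + i for i in range(21)]

non_overlapping = window_returns(closes, horizon=5, step=5)  # one sample per week
overlapping = window_returns(closes, horizon=5, step=1)      # one sample per day

print(len(non_overlapping), "non-overlapping weekly samples")
print(len(overlapping), "overlapping weekly samples")
```

Overlapping windows give many more samples from the same history, but each pair of neighbouring samples shares four of its five days, so the samples are far from independent.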
Great video! I appreciate how you delve into various aspects, unlike other videos that skim over crucial details. One aspect that caught my attention is the delay in prediction. Correct me if I'm wrong, but you utilize a sequence of returns up until a certain date, let's say February 16th in your example, to forecast the return on February 18th (calculated as the closing price on the 18th minus the closing price on the 17th, divided by the closing price on the 17th).
My concern is practicality. If the model predicts a +10% return, does it mean one should buy the stock at the closing price on the 17th? But is it feasible to execute a trade at the exact closing price on the 17th? I'm curious about how this works in real trading scenarios.
That's a great question. Yes, if the model predicts a +10% return, it would be best to buy at the closing price on the 17th - doing so would be most consistent with the model. You can actually execute at that price by using the market-on-close order type with your broker.
www.investopedia.com/terms/m/marketonclose.asp
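As a worked example of the return definition discussed in this thread, the sketch below computes the February 18th return from two closing prices. The prices are made up so that the result matches the +10% used in the question, and `simple_return` is a hypothetical helper, not a function from the video's code.

```python
def simple_return(prev_close, close):
    """Return earned by holding from the previous close to the current close."""
    return (close - prev_close) / prev_close


# Made-up closing prices, chosen only so the numbers are easy to follow.
closes = {"Feb 16": 49.0, "Feb 17": 50.0, "Feb 18": 55.0}

target = simple_return(closes["Feb 17"], closes["Feb 18"])
print(f"Return on Feb 18: {target:.2%}")
```

Because the return is measured from the close on the 17th, the position has to be entered at that close to capture it, which is why a market-on-close order is the natural fit.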
Hey Jin, do you think transformers can do a better job than LSTMs on stock prediction?
This is an extremely useful tutorial; very clear and info-rich. Two questions: The two-week time sequences you use seem to all be the same length. Could you please share any thoughts that you may have on: a) using varying-length time sequences, i.e. different history lengths, and b) using time sequences with varying history lengths initially, but that are subsequently padded with zeros so that we end up with the same fixed history length across all time sequences?
Thanks for the kind words! You can indeed use variable lengths. You'd want a clear rationale for having some inputs being longer than others, though. I'm not aware of a good rationale, but maybe you or others would have. You shouldn't need to pad inputs with 0s. LSTMs can handle variable lengths natively.
@@jinchoi-moneygeek Thanks again! I also appreciated how you demonstrated that (vanilla) LSTMs can predict multiple timesteps. In your experience, do you find multiple timestep predictions to be more (or less) reliable from a vanilla LSTM vs an Encoder-Decoder LSTM?
Could you please make a follow-up video or reference materials where you trained an LSTM using multiple stocks?
I'll consider making a video if I get more similar requests.
Please🥺 @@jinchoi-moneygeek
@@jinchoi-moneygeek I am just repeating the main part of the code for other stocks, keeping the other parts like the training and NN parts, and saving each one as a different model. Is there a more convenient way to go about it?
Why are the RMSE values so low?
Can you predict the low and high of the next day?
That's an interesting thought. I can try, but I don't know how accurate or actionable the predictions would be.
Could a mixture of experts be used with multiple LSTMs as experts for better results?
Good question. A mixture of experts generally only works when there's a diversity of opinions among the experts. So if some experts use other architectures, or if some LSTMs use different hyperparameters, the mixture could have sizeable benefits. But if all of the experts are the same, you might not get much of a benefit, if any.
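The point about diversity can be illustrated with a simplified stand-in for a mixture of experts: a plain equal-weight average of the experts' predictions, with no learned gating. Assume each expert's error has the same variance `sigma2` and every pair of experts has error correlation `rho`; these symbols and the function name are hypothetical and used only for this sketch.

```python
def ensemble_error_variance(sigma2, rho, n_experts):
    """Error variance of an equal-weight average of n equally correlated experts."""
    return sigma2 * (rho + (1 - rho) / n_experts)


for rho in (1.0, 0.9, 0.5, 0.0):
    variance = ensemble_error_variance(sigma2=1.0, rho=rho, n_experts=10)
    print(f"rho={rho}: ensemble error variance={variance:.3f}")
```

When the experts are identical (`rho` equal to 1), averaging them changes nothing. The less correlated their errors are, the more the average cancels them out, which is why different architectures or hyperparameters are what make combining models worthwhile.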
Great video. Thank you.
My pleasure!