Cointegration and pair trading in Python: ua-cam.com/video/jvZ0vuC9oJk/v-deo.html You can find the spreadsheets for this video and some additional materials here: drive.google.com/drive/folders/1sP40IW0p0w5IETCgo464uhDFfdyR6rh7 Please consider supporting NEDL on Patreon: www.patreon.com/NEDLeducation
Hi, thank you so much for this content. Would you be able to give me some insight as to which Excel version is best between 2016 and 2019 for importing web stock prices, etc.? I have some trouble doing it with Excel 2010, lol, guess I finally have to upgrade. I was thinking 2019, but then I read that they took away the option to import data directly from the web in 2019 and that 2016 was better for this. Any recommendation you can offer would be appreciated. Thanks again, I'm sending you a tip on Patreon right now because your content is perfect. I actually bought a course from quantsi with very similar content, and for some reason it is missing the videos for this same topic, yet no customer service has replied to me after two attempts in a two-week period. Then I stumbled on your content, how lucky is that!! :)
These videos are way too good to be free. I am on my way to watch and follow every single one. Would you be interested in making a beginner to intermediate series? Perhaps it's a boring topic but if there's one person who can make a tutorial like that, with relevant content and real life application examples it's you. I know i'd binge it. Thank you very much.
Hi Jorge, and thanks so much for the kind words! As a matter of fact, I have already got many introductory videos into several statistics, econometrics, and investment management topics. Less challenging videos are typically placed towards the beginning of the respective playlists by design :) Please check the respective playlists on the channel page. But there will definitely be more beginner-style videos to come! Hope it helps!
Good explanation. But as usual, this sample works with in-sample data. This means your slope, intercept, etc. are estimated on the entire series and then re-applied (together with the residual) to that same series. In real financial markets we must work with out-of-sample data: apply the statistics from the past to the current day, and repeat this for each day in the future. And that will give you a bit more trouble getting P&L trade over trade ;-) But thanks for taking the time to give the main idea behind cointegration.
Hi Rick, and glad you liked the video! Yes, you are absolutely right, for the direct application of the concept to trading, it would require the estimation of the parameters on historical data and then application of it to real-time data. However, applying out-of-sample testing here would lengthen the video substantially, I might do something along these lines in the future, thanks for the suggestion :)
6:50 slope 7:00 intercept 7:40 cointegration 8:19 X (stationary) 9:36 delta X 9:56 lagged X 10:16 coefficient 13:10 testing results 16:45 beta 19:19 t-stat characteristics and how to read them
Hello Savva! Another awesome video, but I noticed something. When you calculate the slope and intercept, you calculate them over some period of time (10 years in this case), and you then apply that calculation backwards from the beginning (so-called in-sample data). However, to test the model properly we would need out-of-sample data, right? To get as realistic a model as possible, wouldn't it be better to divide the sample into training data (roughly 75-80% of observations), treated as in-sample data, and testing data (20-25% of observations), treated as out-of-sample data? Is my way of thinking OK: apply exactly the same procedure, but divide the entire dataset into training and testing datasets?
love your videos good stuff indeed. Again the in sample issue is important. Otherwise we have "fitted" the data. We should use this strat only when the t-stat is statistically significant: this brings the question HOW MUCH data are needed (1 year, 5 years, 10 years). If the t-stat is not significant then we should not use this strat. Costs also are crucial: you have to pair the NOTIONAL value of both stocks at the end of the trading sessions because you subtract the returns of the 2 stocks hence you imply that at the open you have, say, $1mill vs $1 mill. There is therefore the cost of switching the position between the 2 names and the daily fettling on both sizes. Finally, for HOW LONG those t-stats are good and then start to decay. As we need a green light to start trading the pair, we need a red light to stop trading the pair. Thank you for all the great work you do.
Hi Sava, thank you for the great video. Besides the in-sample data issue raised in earlier comments, I believe you need to correct the way you compute the portfolio performance for the pair. In your spreadsheet, the pair return formula in column K assumes that you take both shares in equal sizes. In fact, you trade slope times stock 1 versus stock 2. The portfolio return in column K should be adjusted by multiplying the daily return of stock 1 by the slope before subtracting it from stock 2.
Hi Dmitry, and glad you liked the video! As for your questions, there are indeed various approaches to deriving a market-neutral position from a cointegrated pair, I opted to go with the simplest one here. Another possibility would be to weigh stocks in inverse proportion to their market betas. Using the slope from the long-run equilibrium equation would not make that much sense to multiply returns as then the model would advise you to take extra leverage if a stock undergoes a split :) If we are talking in terms of share positions, then yes, buying slope x stock 1 versus selling one stock 2 is correct, but it is already accounted for in the calculation where we use a zero-investment strategy (+100% in stock 1 and -100% in stock 2). Hope this helps!
Hi Savva! Thank you for sharing this invaluable strategy for free on UA-cam. Actually, I'm a big fan of your channel and I had subscribed to your channel since a year ago. One question pertaining to the pair trading strategy that you've explained. I see that you are computing the slope and the intercept according to all available historical data. And you are computing the cointegration and backtesting using the slope and the intercept that you've computed before as if we know the future data. Shouldn't we compute the slope and the intercept according to latest historical data only? (just like we are computing moving average) so the data should be moving according to the analyzed cointegration and backtest period cell.
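The rolling-window idea raised in this comment can be sketched in Python as follows. This is only a minimal illustration, not the method used in the video: the helper names `ols_fit` and `rolling_residuals`, the window length, and the toy prices are all assumptions made for the example. At each date the slope and intercept are estimated only on the preceding `window` observations, so the residual never uses future prices.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def rolling_residuals(x, y, window):
    """Residual of y[t] versus the equilibrium fitted on the `window` points before t."""
    residuals = []
    for t in range(window, len(x)):
        # Only observations strictly before t enter the regression.
        slope, intercept = ols_fit(x[t - window:t], y[t - window:t])
        residuals.append(y[t] - (intercept + slope * x[t]))
    return residuals


# Hypothetical toy prices that follow y = 1 + 2x exactly, so every residual is zero.
x_prices = [10.0, 11.0, 13.0, 12.0, 14.0, 15.0]
y_prices = [1.0 + 2.0 * p for p in x_prices]
rolling = rolling_residuals(x_prices, y_prices, window=3)
```

A shorter window adapts faster but gives noisier estimates; the right length is a judgement call that this sketch does not settle.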
Great video. Only part I am struggling with is to determine how much u invest in long and short positions. You say we have 100 USD, but for how much you buy google and short microsoft? Thank you for clarifying!
Hi Balazs, and glad you liked the video! In the simplest form of the strategy, you open a 100 USD worth of a long position of Google and 100 USD worth of a short position in Microsoft. There are also versions of the strategy with beta loading that some researchers and traders believe achieves more market neutrality, when you adjust these values by stocks' betas. I touch upon that in this video: ua-cam.com/video/x_xoq6eY85s/v-deo.html. Hope this helps!
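As a rough Python sketch of the dollar-neutral position described in this reply: equal notional amounts go long one stock and short the other, and the daily strategy return is simply the difference of the two daily returns. The function name `pair_return`, the sign convention (a positive residual means the first stock trades above its fair value), and the example numbers are assumptions for illustration.

```python
def pair_return(prev_residual, ret_stock, ret_other):
    """Daily return of a +100% / -100% position, as a fraction of one leg's notional."""
    if prev_residual > 0:
        # The stock looked overvalued yesterday: short it, go long the other stock.
        return ret_other - ret_stock
    # The stock looked undervalued (or fairly valued): long it, short the other stock.
    return ret_stock - ret_other


# Hypothetical daily returns: the stock gains 1%, the other stock gains 3%.
example = pair_return(prev_residual=5.0, ret_stock=0.01, ret_other=0.03)
```

The signal uses the previous day's residual so that the position is decided before the return it earns is observed.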
Hi Sava, nice information, thanks. I am having trouble finding the standard error, because the formula you mentioned calculates only the coefficient. Please help.
Hi Nitin, and glad you enjoyed the video! You can find the standard error for the regression using the STEYX formula if you are using the SLOPE and INTERCEPT approach. If you are interested in the standard errors of coefficients themselves, you can use LINEST, and they will appear in the second row of the LINEST template under the respective coefficients. You can select two rows instead of one to get standard errors in the second row, or use Enter to enforce the formula, and the whole template will spill automatically.
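For readers working outside Excel, the quantities mentioned in this reply can be reproduced with the textbook least-squares formulas. The sketch below is an illustration in Python with a hypothetical function name and toy data; it returns the slope and intercept, their standard errors (the second row of the LINEST output), and the standard error of the regression (what STEYX reports).

```python
import math


def regression_with_standard_errors(x, y):
    """Simple OLS of y on x with an intercept, plus the usual standard errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals with n - 2 degrees of freedom (two estimated parameters).
    ssr = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_regression = math.sqrt(ssr / (n - 2))
    se_slope = se_regression / math.sqrt(sxx)
    se_intercept = se_regression * math.sqrt(1.0 / n + mean_x ** 2 / sxx)
    return {
        "slope": slope,
        "intercept": intercept,
        "se_slope": se_slope,
        "se_intercept": se_intercept,
        "se_regression": se_regression,
    }


# Hypothetical toy data, for illustration only.
stats = regression_with_standard_errors([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 5.0, 8.0])
```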
This video has explained many challenging calculation tasks in Excel with great simplicity. One of the critical outputs of the ADF test seems to be the p-value, which is not calculated in Excel. It would be helpful if its calculation were included. Thanks, Nisha Patel
Hi Nisha, and thanks for the question! The p-value is quite challenging to calculate directly in Excel so I opted to use critical values instead. Might do it at some point in the future!
@@NEDLeducation Thanks for update. Request, if you can suggest how reliable are Critical Values compared to P-VALUE to take a decision to initiate trade. Thanks, Nisha
@@nishapatel8195 Hi again Nisha, these are equivalent: comparing the test statistic to, say, a 95% critical value is equivalent to comparing a p-value to 5%. Therefore, if a test statistic exceeds the 95% critical value in magnitude (for the Dickey-Fuller test, this means it is more negative than the critical value), it is by definition the same as the calculated p-value being less than 5%.
@@NEDLeducation Thanks for update. Will Back Test it. However, would request aid with P-VALUE calculation in near future. Once again Thanks... Thanks, Nisha Patel
Dear Sava, thanks for the great content, loved it. One quick question though: you use the actual time series values for the regression (slope & intercept), and then you use these values to calculate the cointegration and the "fair value" and base your trading strategy on it. Isn't that approach flawed? You use the slope & intercept to determine each day's cointegration, although in reality you would not have the slope & intercept available yet, because you only have past data. In reality you don't yet know how the values will be distributed over the next years, so how can you use a slope & intercept based on future price information that is not available at the point in time when you do your analysis? In other words: why didn't you estimate the slope & intercept on historical price data before calculating the first day's cointegration value, or apply a rolling average?
Hi Bond, and many thanks for the question! You are absolutely right, this is just an illustration that backtests a trading strategy. To be on a safer side, it makes sense to separate your data sample into a training set (that you use to estimate the parameters) and a test set (that you use to study how well a strategy based on pre-estimated parameters performs). It just would be overly complicated for a short video like that, but thanks so much for pointing this out. Hope it helps!
@@NEDLeducation Perfect, thanks a lot for the fast answer. I really like your video and everything you do, just fantastic work! I will also check out your other content and promote it to some colleagues. All the best from Munich, Germany.
@@NEDLeducation Hey Sava! Any chance you could elaborate how you would move forward with the training and test set? I'm looking for input on how to estimate the MA and after determine the parameters for the strategy. Loved the video btw!
@@OR4NG3Productions Hi Martin, and glad you liked the video! As for your question, you can roughly split the data 50/50, estimating parameters on the more distant data and simulating strategy performance on the more recent data, could be a decent first step solution.
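The 50/50 split suggested in this reply might look like the following in Python. This is a sketch under stated assumptions: `ols_fit`, `out_of_sample_residuals`, the `train_fraction` parameter, and the toy prices are hypothetical names and data, not taken from the video or its spreadsheet. The equilibrium is estimated on the earlier half only and then applied unchanged to the later half.

```python
def ols_fit(x, y):
    """Return (slope, intercept) of y = intercept + slope * x by ordinary least squares."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def out_of_sample_residuals(x, y, train_fraction=0.5):
    """Fit the equilibrium on the earlier part of the sample, apply it to the later part."""
    split = int(len(x) * train_fraction)
    slope, intercept = ols_fit(x[:split], y[:split])
    # Test-set residuals rely only on parameters known before the test period starts.
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x[split:], y[split:])]
    return slope, intercept, residuals


# Hypothetical toy prices, for illustration only (not real market data).
msft = [10.0, 11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]
goog = [21.0, 23.0, 25.0, 27.0, 30.0, 31.0, 33.0, 36.0]
slope, intercept, resid = out_of_sample_residuals(msft, goog)
```

The strategy would then be simulated on `resid` only, so its performance is not flattered by parameters that were fitted on the same dates.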
@@NEDLeducation hello Sava, great video. I've tested this on another pair of my choice using this method... Thanks for this kind of work.. could you please also do video as how to do this on a rolling over basis ?
Really an excellent video - thank you. What is the best way in Excel to find the coefficient of determination without an intercept, as you are doing, without using Excel's LINEST function?
Hi Eric, and glad you liked the video! The function you probably imply is RSQ - just like INTERCEPT for the constant and SLOPE for the coefficient, it can calculate the coefficient of determination for two arrays. Hope it helps!
@@NEDLeducation Thank you. I had tried that, but I get two different results using LINEST 0.0026 and RSQ 0.0012, which has meaningfully different results. Also, if I wanted to retain you for a short consulting project on this to make this analysis more complete. How could I contact you?
@@EricK-vu3sf Hi Eric, the RSQ coefficient gives you the R-squared for the simple linear regression while the LINEST can give you the RSQ for a multiple linear regression, this is probably the source of the discrepancy. If you would like to arrange a one-to-one tutorial with me for your project, this is a perk I provide to my Patrons, please check the link here: www.patreon.com/NEDLeducation Hope it helps!
Hi Sebastian, and glad you like the channel! I have actually got a Python tutorial on algorithmic pair trading strategies using cointegration, please check this out if you are interested: ua-cam.com/video/jvZ0vuC9oJk/v-deo.html
Hi Sava, thanks for sharing your knowledge. Can I apply the same method to the same index, for example the S&P 500 index against the E-mini S&P 500 contract with the next expiration? I trade in Brazil using the Neologica platform, and there I am able to subtract the E-mini contract from the main index; however, I was not able to replicate this in MetaTrader. Do you know how I can do that? Thanks in advance.
Hi, and glad the video helped! As for your question, yes, you can apply the logic of cointegration for such arbitrage, however in terms of data downloading from specific trading platforms, I cannot advise in that much detail. If I understand correctly, you want to arbitrage on S&P500 and S&P500 futures pair, for this you might want to check my video on the law of one price (ua-cam.com/video/xjikItI6zcU/v-deo.html) and futures arbitrage (ua-cam.com/video/8otkAibdC94/v-deo.html). Hope it helps!
good video, I have a problem, I did the same thing on EXCEL file but still don't figure out how you calculated coefficient and standard error at the same time
Hi Xuhui, and thanks for the question! I believe you refer to LINEST? The trick is to input "1" (or "TRUE") for "additional statistics" and select two rows for the output so standard errors are shown there.
Thanks Sava for the inspiring video. regarding the t critical values are they D-F Values? because they are not consistent with the values in the tables and with your examples in your Unit root file.
Hi Eduardo, and glad you liked the video! As for your question, the Dickey-Fuller test is applied to the difference between the price of Google and the "fair value" calculated using the dynamic equilibrium, which serves as a residual in this instance. Hope it helps!
@@NEDLeducation Thank you for your answer. One last question, please: When doing the regression, what is the best option: 1.- To do it with the slope and intercept, 2.- To do it only with slope 3.- To do it with none of both. Why?
@@eduardopossealvarez7717 It depends on what you are running the Dickey-Fuller test for. If it is for pair trading, the go-to option would be 3) as you want the dynamic equilibrium to be stable.
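To make option 3 concrete, here is a small Python sketch of a plain (non-augmented) Dickey-Fuller regression with neither intercept nor trend: the change in the residual is regressed on its lagged level, and the t-stat of that coefficient is the test statistic. The function name and toy series are assumptions for illustration, and no lagged differences are included, so this is a simplification rather than a full ADF test.

```python
import math


def dickey_fuller_tstat(series):
    """Regress delta X on lagged X without a constant; return (coefficient, t-stat)."""
    lagged = series[:-1]
    delta = [b - a for a, b in zip(series[:-1], series[1:])]
    sxx = sum(v * v for v in lagged)
    gamma = sum(l * d for l, d in zip(lagged, delta)) / sxx
    resid = [d - gamma * l for l, d in zip(lagged, delta)]
    # One estimated parameter, hence n - 1 degrees of freedom.
    s2 = sum(r * r for r in resid) / (len(delta) - 1)
    se = math.sqrt(s2 / sxx)
    return gamma, gamma / se


# Hypothetical toy series: one that keeps reverting towards zero, one that trends.
reverting = [1.0, -0.5, 0.4, -0.3, 0.2, -0.1]
trending = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
gamma_rev, t_rev = dickey_fuller_tstat(reverting)
gamma_tr, t_tr = dickey_fuller_tstat(trending)
```

A clearly negative t-stat points towards mean reversion, but it has to be compared with Dickey-Fuller critical values rather than Student t values.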
Hi Savva, your video is great. In order not to have to take short positions, is it possible to use two assets with an inverse relationship in the cointegration?
Hi Rodrigo, and happy you liked the video! Excellent question, while it is possible in theory to have two assets that are "negatively cointegrated", and then you can do pair trading the way you stated, in practice it can be hard to find such assets.
Hi, and glad you liked the video! Generally in cointegration strategies you consider separate pairs (so in your case it would be five separate models, with long-short decisions made for each pair individually). There are some techniques to estimate a similar equation in a panel setting (vector autoregression, for example), but it would be much trickier to implement and it is rarely used in practice for trading strategies as far as I know. Hope it helps!
Thank you very much for the video, very clear and efficient. Unless I am mistaken, there is quite a mistake in the strategy you are designing though. Indeed, it is not a beta-neutral one (or call it market-neutral) at all, since you are trading one vs one! You are then left with quite a big market exposure (in this case, you should initially trade 1 vs 7.15). If you do so, you'll find that the total return is affected quite significantly (positively or negatively, depending on luck...)
Hi Jonathan, and glad you liked the video! As for your question, yes, there are many developments that address this limitation of the simple pair trading strategy, and I have got a video on beta loading as one of such methods, check it out if you are interested: ua-cam.com/video/x_xoq6eY85s/v-deo.html
Hi Saba. This is really a great video. I’m just wondering when we use this cointegration pair trading strategy, how can we determine the entry and exit point?
Hi Jade, and glad you liked the video! The exit point is naturally when the pair crosses back and the dynamic equilibrium is restored. The entry point can be determined using the degree of cointegration (t-stat needs to be high enough in terms of magnitude), and profit opportunity (the pair needs to be far enough from equilibrium). Hope it helps!
@@NEDLeducation how to determine if the pair is far away from equilibrium? Is there a mathematical function to find it out in this video Sorry for this amateur question... I'm new to this
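One common way to quantify "far from equilibrium", which another commenter on this video also mentions, is a z-score of the residual. The Python sketch below is an illustration only and is not the rule used in the video: the function names, the reference sample, and the entry threshold of 2 standard deviations are all hypothetical choices.

```python
import statistics


def zscore(value, reference):
    """Distance of `value` from the mean of a reference sample, in standard deviations."""
    mean = statistics.fmean(reference)
    sd = statistics.stdev(reference)
    return (value - mean) / sd


def entry_signal(z, entry=2.0):
    """-1: short the spread, +1: long the spread, 0: stay out (threshold is an assumption)."""
    if z > entry:
        return -1  # residual unusually high: the first stock looks overvalued
    if z < -entry:
        return 1   # residual unusually low: the first stock looks undervalued
    return 0


# Hypothetical residuals from an earlier estimation period.
past_residuals = [-1.0, 0.0, 1.0]
z_today = zscore(2.5, past_residuals)
signal_today = entry_signal(z_today)
```

Taking the mean and standard deviation from past residuals only keeps the signal free of look-ahead bias; the exit would be when the z-score crosses back through zero.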
It seems the LINEST estimate based on OLS linear regression wasn't BLUE, and hence optimisation was necessary. Is it possible that using GLS regression would have done better?
Hi Sava, many thanks for the video, it was very helpful. I have just 1 question- how did you get the t critical values of 10%, 5% and 1% that I see in the Excel? I see the values of -1.616, -1.941 and -2.567 respectively. But I don’t know where these figures are from or how they were derived. Please can you clarify? Thank you very much.
Hi, and glad you liked the video! These are tau distribution critical values which are appropriate to use for this stationarity test but unfortunately hard to calculate, therefore you generally just look them up in a table.
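The table lookup described here amounts to a simple comparison, sketched below in Python. The three critical values are the ones quoted in the question above (-1.616, -1.941, -2.567 for 10%, 5% and 1%); the function name is a hypothetical choice, and the values apply only to the test specification used in the spreadsheet.

```python
# Critical values as quoted in the comment above (10%, 5%, 1%).
CRITICAL_VALUES = {"10%": -1.616, "5%": -1.941, "1%": -2.567}


def strictest_level_passed(t_stat):
    """Return the strictest level at which the unit root is rejected, or None."""
    passed = None
    for level in ("10%", "5%", "1%"):
        # Rejection requires the t-stat to be more negative than the critical value.
        if t_stat < CRITICAL_VALUES[level]:
            passed = level
    return passed


print(strictest_level_passed(-3.57))  # rejects at the 1% level
print(strictest_level_passed(2.2))    # positive t-stat: no rejection at any level
```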
Hi Savva, and thank you very much. It's a very useful video. But I don't understand the logic of optimizing beta for a better t-stat. If that is the case, we don't need to calculate beta beforehand with linear regression, since its value will be overwritten. I have programmed this in Amibroker, and the beta optimization is not improving the results in a real situation (only using information available at the moment). Another question concerns the t-test. If it fails during the life of the trade, do you recommend closing the spread or continuing with it? There are other approaches to trading in a non-continuous way that improve the solution, like a z-score of the residuals. Maybe I can share my results with you. Thank you.
Hi and glad you liked the video! The t-stat optimisation is executed because the OLS estimate of the "b" parameter will be biased (prices are autocorrelated). This does not mean that simple linear regression estimates can never be used; they will just be less reliable on average. In terms of optimal entry and exit, the strategy can be tweaked and modified in many ways: you can implement stop-losses, for instance, or check whether the pair breaks out of cointegration and close the trade if that happens. This is where finance becomes more of an art and a matter of judgement than a science. Hope it helps!
Hi David, I have just recorded a series of intro-level videos on T-tests, check these out if you are interested, starting from ua-cam.com/video/9C6oz_mADSE/v-deo.html. However, in unit root tests, the t-stat is not a Student t-stat used in regular t-tests but rather a Dickey-Fuller t-stat (or sometimes also called "tau-stat"). For a brief introduction to it, I would recommend economics.utoronto.ca/jfloyd/book/statabs.pdf Hope it helps!
Hi, and thanks for the question! A t-stat of -3.57 shows the combination is stationary meaning that the pair is cointegrated. As a rule of thumb, -2.5 is a pretty good result, -3 is a very good result, and anything below -3.5 is very reliable.
@@NEDLeducation I was checking co-integration for two Forex pairs... Major pairs.. they move in inverse directions... My t-stat value for those pairs was positive... 2.2 What does that mean? I couldn't find positive values for t-stats
I've double checked something... Based on the last 8 months, t-stat is -2.12 and based on the last three months it's 3.79 But still the month immediately after those three months resulted in losses 😂 what exactly am I doing Wrong ? Or is there any concept that I'm missing I'm presuming that even my returns generated from this method are mean reverting since they were extremely good in the last three months' data that I have used
Hi Anil, and glad you liked the video! As usual, from Google Drive link in the pinned comment, it is being constantly updated as more content is released :)
Hello! I've seen one practitioner explain that the way he uses cointegration is to sell/buy X amount of the dependent asset and buy/sell X*beta of the other, cointegrated asset. Is it correct to assume that we need a "beta neutral" trade?
Hi Guilherme, yes, this is one of the ways cointegration is implemented by practitioners to get a value of beta that is closer to zero. This procedure is more assumption-sensitive than the one I showed in the video (stocks' beta can change with time), and even the simplest strategy when you long the undervalued and short the overvalued stock as per the cointegrating equation achieves a beta that is very close to zero.
You need to trade $ neutral and maybe also beta neutral. You would potentially have some market risk bias if you don't hedge out the difference in the betas of each stock. The compounded return would potentially be either too long beta or too short beta, even when you are $ neutral. You may need to overlay market risk so that you have a $ neutral, beta neutral mean reversion strategy, and then your benchmark would not be index returns but potentially return over the risk-free rate.
Hi, and thanks for the comment! Generally, the opinions of different practitioners and academics vary widely over whether beta loading is attractive, and I have got a tutorial that implements various advanced techniques in pair trading, including beta neutrality: ua-cam.com/video/x_xoq6eY85s/v-deo.html
Hi Mithun, and thanks for the question! These are not conventional t-statistics but rather bootstrapped Dickey-Fuller t-statistics, and it is the easiest to just look them up in a table the old-fashioned way :)
Is this cointegration analysis done using rolling window of previous N prices? If not how are the results reliable, as the analysis assumes it know future prices correctly? If yes, which part of the formulas give the rolling window size? Appreciate any clarifications, if someone is still looking at this video and has subject matter expertise.
For calculating the strategy return, is it not right to multiply the MSFT return by the slope? Is this not correct: (F2>0 , J3-slope*I3 , slope*I3-J3) or (F2>0 , J3-7.15*I3 , 7.15*I3-J3)?
Sir, Have you tried pair trading forex using rsi7 ,rsi14,rsi30 (add them up for comparison) say on hourly chart & selling strong pair & buying weak pair--pairs have to be highly correlated(eg aud/usd and nzd/usd OR dow30 & sp500).One can do this on any correlating underlying stocks/commodities/futures/crypto/bonds. Trading on hourly charts there would be tons of opportunities all year around.
Hi Darshan, and thanks for the question! If the t-stat is either positive or negative but too small to pass the Dickey-Fuller significance threshold, the series cannot be treated as cointegrated, and the pair trading strategy would not work. Hope it helps!
No, you have to know which ones you need to trade, or you will be waiting for a profit that can take months or years to arrive, costing you lots of money in overnight holding fees.
Your explanation and teaching manner is perfect ! With Regards
You have a real knack for teaching. This video is superb. Thank you.
Thanks for the super video. You've simplified the concept of correlation and cointegration. Appreciate it.
@@NEDLeducation Thank you, i'll check those out!
@@NEDLeducation I would also be super interested in this follow up video!!
@@flanker6212 Hi, and thanks for the comment! Actually, a video on that has already been released, so check it out if you are interested :) ua-cam.com/video/jvZ0vuC9oJk/v-deo.html
Just.... WOW!!! 👏🏻
Useful info. Thx
Can’t thank you enough ❤️❤️🤗
Awesome video!
thank you
Hi - can you further explain how and why the negative coefficient indicates the lack of a unit root?
Dear Sava, first of all, thank you for this awesome channel and your amazing videos. You really help us. I would like to ask whether you could create some videos using EViews too. For instance, could you make a video on pair trading using EViews, showing us rolling regressions in EViews for this purpose?
Thank you again for your amazing content!
Hi Peter, and glad you are enjoying the channel! Yes, I do plan to return to EViews tutorials at some point in the future.
@@NEDLeducation Thank you so much Sava! This would be great for economic students like me. Thanks a lot!
Great video. The only part I am struggling with is determining how much you invest in the long and short positions. You say we have 100 USD, but for how much do you buy Google and short Microsoft? Thank you for clarifying!
Hi Balazs, and glad you liked the video! In the simplest form of the strategy, you open a long position worth 100 USD in Google and a short position worth 100 USD in Microsoft. There are also versions of the strategy with beta loading, where you adjust these values by the stocks' betas, which some researchers and traders believe achieves more market neutrality. I touch upon that in this video: ua-cam.com/video/x_xoq6eY85s/v-deo.html. Hope this helps!
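A minimal Python sketch of the arithmetic behind this reply. The function names and the sample returns are hypothetical and not taken from the spreadsheet; the only thing assumed from the thread is the zero-investment setup with equal dollar amounts on each leg.

```python
def pair_return(long_return: float, short_return: float) -> float:
    """Daily return of a zero-investment pair: +100% long leg, -100% short leg."""
    return long_return - short_return


def pair_pnl(notional: float, long_return: float, short_return: float) -> float:
    """Dollar P&L when `notional` is held long in one stock and short in the other."""
    return notional * pair_return(long_return, short_return)


# Hypothetical day: the long leg gains 2%, the short leg gains 0.5%,
# with 100 USD on each side.
pnl = pair_pnl(100.0, 0.02, 0.005)
print(round(pnl, 2))
```

The short leg's gain counts against the position, so the pair earns only the difference between the two returns.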
Is there a look ahead bias in this? Should one do a train test split to avoid it?
Hi Anthony, and thanks for the question! Exactly, to avoid this issue the cointegrating equation should be estimated before trading is executed. I address this in my Python tutorial on pair trading, check it out: ua-cam.com/video/jvZ0vuC9oJk/v-deo.html
Hi Sava
Nice information...thanks.
I am having trouble finding the standard error because the formula you mentioned calculates only the coefficient. Please help.
Hi Nitin, and glad you enjoyed the video! You can find the standard error of the regression using the STEYX formula if you are using the SLOPE and INTERCEPT approach. If you are interested in the standard errors of the coefficients themselves, you can use LINEST (with the additional statistics argument set to TRUE), and they will appear in the second row of the LINEST output under the respective coefficients. You can select two rows instead of one before entering it as an array formula to get the standard errors in the second row, or, in Excel versions that support dynamic arrays, simply press Enter and the whole output will spill automatically.
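For readers who want to cross-check the Excel output, here is a rough pure-Python sketch of the same quantities for a simple regression with an intercept. The function name and the sample data are made up for illustration; the formulas are the textbook OLS ones, which should correspond to SLOPE, INTERCEPT, STEYX and the second row of LINEST.

```python
import math


def ols_with_errors(x, y):
    """Simple OLS of y on x with an intercept, plus standard errors."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Sum of squared residuals and regression standard error (n - 2 degrees of freedom).
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    steyx = math.sqrt(sse / (n - 2))
    return {
        "slope": slope,
        "intercept": intercept,
        "steyx": steyx,
        "se_slope": steyx / math.sqrt(sxx),
        "se_intercept": steyx * math.sqrt(1 / n + mean_x ** 2 / sxx),
    }


# Made-up sample data, only to show the call.
result = ols_with_errors([1, 2, 3, 4, 5], [2, 4, 5, 4, 5])
print(result)
```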
This video has explained many challenging calculation tasks in excel with great simplicity.
One of the critical outputs of the ADF test seems to be the p-value, which is not calculated in Excel.
It would be helpful if its calculation were included.
Thanks
Nisha Patel
Hi Nisha, and thanks for the question! The p-value is quite challenging to calculate directly in Excel so I opted to use critical values instead. Might do it at some point in the future!
@@NEDLeducation Thanks for update.
Could you suggest how reliable critical values are compared to the p-value when deciding whether to initiate a trade?
Thanks,
Nisha
@@nishapatel8195 Hi again Nisha, these are equivalent: comparing the test statistic to, say, a 95% critical value is the same as comparing a p-value to 5%. Therefore, if a test statistic is more extreme than the 95% critical value, the calculated p-value is by definition less than 5%.
@@NEDLeducation Thanks for update.
Will Back Test it.
However, would request aid with P-VALUE calculation in near future.
Once again Thanks...
Thanks,
Nisha Patel
Dear Sava, thanks for the great content, loved it. One quick question though: you use actual time series values for the regression (slope & intercept), and then you use these values to calculate the cointegration and the "fair value" and base your trading strategy on it. Isn't that approach flawed, as you use the slope & intercept to determine each day's cointegration, although in reality you would not have the slope & intercept available yet because you only have past data? In reality you don't know yet how the values will be distributed in the next years, so how can you use a slope & intercept based on future price information that is not available at the point in time when you do your analysis? In other words: why didn't you estimate the slope & intercept on historical price data before calculating the first day's cointegration value, or apply a rolling average?
Hi Bond, and many thanks for the question! You are absolutely right, this is just an illustration that backtests a trading strategy. To be on a safer side, it makes sense to separate your data sample into a training set (that you use to estimate the parameters) and a test set (that you use to study how well a strategy based on pre-estimated parameters performs). It just would be overly complicated for a short video like that, but thanks so much for pointing this out. Hope it helps!
@@NEDLeducation Perfect - thanks a lot for the fast answer. I really like your video and everything you do - just fantastic work! Will also check out your other content and recommend it to some colleagues. All the best from Munich, Germany.
@@NEDLeducation Hey Sava! Any chance you could elaborate how you would move forward with the training and test set?
I'm looking for input on how to estimate the MA and after determine the parameters for the strategy. Loved the video btw!
@@OR4NG3Productions Hi Martin, and glad you liked the video! As for your question, you can roughly split the data 50/50, estimating parameters on the more distant data and simulating strategy performance on the more recent data. This could be a decent first-step solution.
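The 50/50 split described in this reply could be sketched roughly as follows in Python. All names and numbers here are hypothetical; the sketch assumes `x` is the explanatory price series (Microsoft in the video) and `y` the series being valued (Google), both ordered from oldest to newest.

```python
def fit_line(x, y):
    """OLS slope and intercept of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def out_of_sample_spread(x, y, train_fraction=0.5):
    """Estimate the line on the older part, then compute the spread on the newer part only."""
    cut = int(len(x) * train_fraction)
    slope, intercept = fit_line(x[:cut], y[:cut])
    spread = [yi - (intercept + slope * xi) for xi, yi in zip(x[cut:], y[cut:])]
    return slope, intercept, spread


# Made-up prices: the first four points lie exactly on y = 1 + 2x.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [3, 5, 7, 9, 11.5, 12.5, 15, 17.2]
slope, intercept, spread = out_of_sample_spread(x, y)
print(slope, intercept, spread)
```

Because the slope and intercept never see the second half of the data, any trading rule evaluated on `spread` uses only information that would have been available at the time.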
@@NEDLeducation Hello Sava, great video. I've tested this on another pair of my choice using this method... Thanks for this kind of work. Could you please also do a video on how to do this on a rolling basis?
Jesus, pure gold. What an insight...
Really an excellent video - thank you.
What is the best way in Excel to find the coefficient of determination without an intercept, as you are doing, without using Excel's LINEST function?
Hi Eric, and glad you liked the video! The function you probably mean is RSQ - just like INTERCEPT for the constant and SLOPE for the coefficient, it can calculate the coefficient of determination for two arrays. Hope it helps!
@@NEDLeducation Thank you. I had tried that, but I get two different results using LINEST 0.0026 and RSQ 0.0012, which has meaningfully different results.
Also, if I wanted to retain you for a short consulting project to make this analysis more complete, how could I contact you?
@@EricK-vu3sf Hi Eric, the RSQ coefficient gives you the R-squared for the simple linear regression while the LINEST can give you the RSQ for a multiple linear regression, this is probably the source of the discrepancy.
If you would like to arrange a one-to-one tutorial with me for your project, this is a perk I provide to my Patrons, please check the link here: www.patreon.com/NEDLeducation
Hope it helps!
@@NEDLeducation Hello! love your video. I need a little guidance, how can I set up a one-to-one tutorial?
I tried this link www.patreon.com/NEDLeducation but which one do I sign up for?
Amazing! You got yourself a new subscriber! Do you have any reading recommendations about pair trading strategies?
Hi Sebastian, and glad you like the channel! I have actually got a Python tutorial on algorithmic pair trading strategies using cointegration, please check this out if you are interested: ua-cam.com/video/jvZ0vuC9oJk/v-deo.html
Perfect
Hi Sava, thanks for sharing your knowledge. Can I apply the same method to the same index, for example the S&P 500 index against the E-mini S&P 500 with the next expiration? I trade in Brazil using the Neologica platform and I'm able to subtract the E-mini contract from the main index; however, I was not able to replicate this in MetaTrader. Do you know how I can do that?
Thanks in advance
Hi, and glad the video helped! As for your question, yes, you can apply the logic of cointegration for such arbitrage, however in terms of data downloading from specific trading platforms, I cannot advise in that much detail. If I understand correctly, you want to arbitrage on S&P500 and S&P500 futures pair, for this you might want to check my video on the law of one price (ua-cam.com/video/xjikItI6zcU/v-deo.html) and futures arbitrage (ua-cam.com/video/8otkAibdC94/v-deo.html). Hope it helps!
Good video. I have a problem: I did the same thing in my Excel file but still can't figure out how you calculated the coefficient and the standard error at the same time.
Hi Xuhui, and thanks for the question! I believe you refer to LINEST? The trick is to input "1" (or "TRUE") for "additional statistics" and select two rows for the output so standard errors are shown there.
@@NEDLeducation it refers to LINEST but I still don't get the standard error. MY excel is 2016 version, is it different version problem?
Hi xuhui, are you sure you used an array formula?
Thanks Sava for the inspiring video. regarding the t critical values are they D-F Values? because they are not consistent with the values in the tables and with your examples in your Unit root file.
Hi Sava. Great video. One question please: why didn't you apply the Dickey-Fuller test to the residuals of the regression? Thanks
Hi Eduardo, and glad you liked the video! As for your question, the Dickey-Fuller test is applied to the difference between the price of Google and the "fair value" calculated using the dynamic equilibrium, which serves as a residual in this instance. Hope it helps!
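As a rough illustration of what the test does with that residual, here is a pure-Python sketch of the basic Dickey-Fuller regression. It assumes the simplest form with no constant and no lagged differences, which may differ from the exact spreadsheet setup; the function name and the sample spread are hypothetical. The change in the spread is regressed on its previous level: a clearly negative coefficient means deviations tend to shrink back towards zero, which is what the absence of a unit root looks like.

```python
import math


def dickey_fuller_t(series):
    """Coefficient and t-stat of gamma in: delta_e[t] = gamma * e[t-1] + u[t]."""
    lagged = series[:-1]
    delta = [b - a for a, b in zip(series[:-1], series[1:])]
    sum_lag_sq = sum(v * v for v in lagged)
    gamma = sum(l * d for l, d in zip(lagged, delta)) / sum_lag_sq
    # Residual variance with one estimated parameter.
    sse = sum((d - gamma * l) ** 2 for l, d in zip(lagged, delta))
    dof = len(delta) - 1
    se_gamma = math.sqrt(sse / dof / sum_lag_sq)
    return gamma, gamma / se_gamma


# Hypothetical spread: price minus "fair value" on six consecutive days.
spread = [1.0, 0.5, 0.4, -0.2, 0.1, -0.3]
gamma, t_stat = dickey_fuller_t(spread)
print(gamma, t_stat)
```

The resulting t-stat is compared with Dickey-Fuller critical values rather than with the usual Student t table.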
@@NEDLeducation Thank you for your answer. One last question, please: when doing the regression, what is the best option: 1) to do it with the slope and intercept, 2) to do it with the slope only, or 3) to do it with neither? Why?
@@eduardopossealvarez7717 It depends on what you are running the Dickey-Fuller test for. If it is for pair trading, the go-to option would be 3) as you want the dynamic equilibrium to be stable.
@@NEDLeducation thank you
Hi Savva, your video is great. In order not to have to take short positions, is it possible to use two assets with an inverse relationship in the cointegration?
Hi Rodrigo, and happy you liked the video! Excellent question, while it is possible in theory to have two assets that are "negatively cointegrated", and then you can do pair trading the way you stated, in practice it can be hard to find such assets.
Thank you Savva!
Great video, in this spreadsheet you are using MSFT/GOOGLE, how would this work out for 5 spreads?
Hi, and glad you liked the video! Generally, in cointegration strategies you consider separate pairs (so in your case it would be five separate models, with long-short decisions made for each pair individually). There are some techniques to estimate a similar equation in a panel setting (vector autoregression, for example), but it would be much trickier to implement and is rarely used in practice for trading strategies as far as I know. Hope it helps!
Thank you very much for the video, very clear and efficient. Unless I am mistaken, there is quite a mistake in the strategy you are designing though. Indeed, it is not a beta-neutral one (or call it market-neutral) at all, since you are trading one vs one! You are then left with quite a big market exposure (in that case, initially you should trade 1 vs 7.15). If you do so, you'll find that the total return is affected quite significantly (positively or negatively, based on luck...)
Hi Jonathan, and glad you liked the video! As for your question, yes, there are many developments that address this limitation of the simple pair trading strategy, and I have got a video on beta loading as one of such methods, check it out if you are interested: ua-cam.com/video/x_xoq6eY85s/v-deo.html
Can I use it in the Indian market? I already trade in pairs.
Hi Saba. This is really a great video. I’m just wondering when we use this cointegration pair trading strategy, how can we determine the entry and exit point?
Hi Jade, and glad you liked the video! The exit point is naturally when the pair crosses back and the dynamic equilibrium is restored. The entry point can be determined using the degree of cointegration (t-stat needs to be high enough in terms of magnitude), and profit opportunity (the pair needs to be far enough from equilibrium). Hope it helps!
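One common way to make "far enough from equilibrium" concrete is to scale the spread by its standard deviation and act on a threshold. The sketch below is only an illustration of that idea in Python: the 2-standard-deviation entry level, the function name and the sample numbers are assumptions, not rules from the video, and `sd` is assumed to have been estimated on earlier data.

```python
def positions(spread, sd, entry=2.0):
    """+1 = long the spread, -1 = short the spread, 0 = flat, for each day."""
    position = 0
    result = []
    for value in spread:
        z = value / sd  # distance from equilibrium (zero) in standard deviations
        if position == 0:
            if z > entry:
                position = -1  # spread too high: short the overvalued leg
            elif z < -entry:
                position = 1  # spread too low: long the undervalued leg
        elif position == 1 and z >= 0:
            position = 0  # crossed back to equilibrium: exit
        elif position == -1 and z <= 0:
            position = 0
        result.append(position)
    return result


# Hypothetical spread values with an assumed standard deviation of 1.
signal = positions([0.5, 2.5, 1.0, -0.1, -3.0, -1.0, 0.2], sd=1.0)
print(signal)
```

Exits happen when the spread crosses zero, matching the description of the pair crossing back to its dynamic equilibrium.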
@@NEDLeducation how to determine if the pair is far away from equilibrium? Is there a mathematical function to find it out in this video
Sorry for this amateur question... I'm new to this
Hi Sava. I am aiming to set up a scalping strategy over Markov chains. Can you help me with this?
Thanks! Interesting!
Dear Sava, when I try to calculate the coefficient, it doesn't give me the standard error. The delimiter in my Excel is ;
thanks in advance
It seems the LINEST estimate based on OLS linear regression wasn't BLUE, and hence optimisation was necessary. Is it possible that using GLS regression would have done better?
Hi Sava, many thanks for the video, it was very helpful. I have just 1 question- how did you get the t critical values of 10%, 5% and 1% that I see in the Excel? I see the values of -1.616, -1.941 and -2.567 respectively. But I don’t know where these figures are from or how they were derived. Please can you clarify? Thank you very much.
Hi, and glad you liked the video! These are tau distribution critical values which are appropriate to use for this stationarity test but unfortunately hard to calculate, therefore you generally just look them up in a table.
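Once the values are taken from a table, the decision itself is a simple comparison. A small Python sketch, using the three critical values quoted in the question above; the function name is hypothetical:

```python
# Critical values as quoted from the spreadsheet, strictest level first.
CRITICAL_VALUES = [("1%", -2.567), ("5%", -1.941), ("10%", -1.616)]


def rejection_level(t_stat):
    """Strictest significance level at which the unit root is rejected, or None."""
    for level, critical in CRITICAL_VALUES:
        if t_stat < critical:  # the test is left-tailed: more negative is stronger
            return level
    return None


for t in (-3.57, -2.12, -1.7, 2.2):
    print(t, rejection_level(t))
```

A positive t-stat, or a negative one that is not below the 10% value, returns `None`, meaning the spread cannot be treated as stationary at these levels.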
@@NEDLeducation thanks for the reply. Do you have that table in your material files you linked?
Hi Savva, and thank you very much. It's a very useful video. But I don't understand the logic of optimizing beta for a better t-stat. If that is the case, we don't need to calculate beta beforehand by linear regression, since its value will be overwritten. I have programmed this in Amibroker, and the beta optimization does not improve the results in a real situation (using only the information available at the moment). Another question is regarding the t-test. If it fails during the life of the trade, do you recommend closing the spread or continuing with it? There are other approaches that trade in a non-continuous way and improve the solution, like a z-score of the residuals. Maybe I can share my results with you. Thank you.
Hi, and glad you liked the video! The t-stat optimisation is executed as the OLS estimate of the "b" parameter will be biased (prices are autocorrelated). This does not mean that simple linear regression estimates can never be used; they will just be less reliable on average. In terms of optimal entry and exit, the strategy can be tweaked and modified in many ways: you can implement stop-losses, for instance, or check whether the pair breaks out of cointegration and close the trade if that happens. This is where finance becomes more of an art and a matter of judgement than a science. Hope it helps!
is there a way to do it quicker i have like 50 pairs I have to check but this seems to be a very large task to do it manually.
hi Sava,
Do you know where i can Find information on t-stat ?
Hi David, I have just recorded a series of intro-level videos on T-tests, check these out if you are interested, starting from ua-cam.com/video/9C6oz_mADSE/v-deo.html. However, in unit root tests, the t-stat is not a Student t-stat used in regular t-tests but rather a Dickey-Fuller t-stat (or sometimes also called "tau-stat"). For a brief introduction to it, I would recommend economics.utoronto.ca/jfloyd/book/statabs.pdf
Hope it helps!
I was checking a pair and the t-stat was -3.57 something
What does that mean ?
Hi, and thanks for the question! A t-stat of -3.57 shows the combination is stationary meaning that the pair is cointegrated. As a rule of thumb, -2.5 is a pretty good result, -3 is a very good result, and anything below -3.5 is very reliable.
@@NEDLeducation I was checking co-integration for two Forex pairs... Major pairs.. they move in inverse directions... My t-stat value for those pairs was positive... 2.2
What does that mean? I couldn't find positive values for t-stats
I've double checked something... Based on the last 8 months, t-stat is -2.12 and based on the last three months it's 3.79
But still the month immediately after those three months resulted in losses 😂 what exactly am I doing Wrong ? Or is there any concept that I'm missing
I'm presuming that even my returns generated from this method are mean reverting since they were extremely good in the last three months' data that I have used
Where can I download the Excel sheets for this?
Hi Anil, and glad you liked the video! As usual, from Google Drive link in the pinned comment, it is being constantly updated as more content is released :)
Hello! I've seen one practitioner explain that the way he uses cointegration is to sell/buy X amount of the dependent asset and buy/sell X*beta of the other cointegrated asset. Is it correct to assume that we need a "beta neutral" trade?
Hi Guilherme, yes, this is one of the ways cointegration is implemented by practitioners to get a value of beta that is closer to zero. This procedure is more assumption-sensitive than the one I showed in the video (stocks' beta can change with time), and even the simplest strategy when you long the undervalued and short the overvalued stock as per the cointegrating equation achieves a beta that is very close to zero.
You need to trade $ neutral and may also need to be beta neutral. You would have some potential market risk bias if you don't hedge out the difference in the betas of each stock. The compounded return could be either too long beta or too short beta even when you are $ neutral. You may need to overlay market risk so you would have a $ neutral, beta neutral mean reversion strategy, and then your benchmark would not be index returns but potentially the return over the risk-free rate.
Hi, and thanks for the comment! Generally, the opinions of different practitioners and academics vary widely over whether beta loading is attractive, and I have got a tutorial that implements various advanced techniques in pair trading, including beta neutrality: ua-cam.com/video/x_xoq6eY85s/v-deo.html
Hi Saba,
Great video. Can you make a video on the Johansen test in Excel or Google Sheets? Looking forward to it.
How are you finding the "t critical values" 10%,5% and 1%.?
thx
Hi Mithun, and thanks for the question! These are not conventional t-statistics but rather bootstrapped Dickey-Fuller t-statistics, and it is the easiest to just look them up in a table the old-fashioned way :)
Is this the Dickey-Fuller test or the augmented Dickey-Fuller test?
Is this cointegration analysis done using a rolling window of the previous N prices? If not, how are the results reliable, as the analysis assumes it knows future prices correctly? If yes, which part of the formulas gives the rolling window size? I'd appreciate any clarifications, if someone is still looking at this video and has subject matter expertise.
For calculating the strategy return, is it not right to multiply the MSFT return by the slope? Is this not correct: (F2>0 , J3-slope*I3 , slope*I3-J3) or (F2>0 , J3-7.15*I3 , 7.15*I3-J3)?
Sir, Have you tried pair trading forex using rsi7 ,rsi14,rsi30 (add them up for comparison) say on hourly chart & selling strong pair & buying weak pair--pairs have to be highly correlated(eg aud/usd and nzd/usd OR dow30 & sp500).One can do this on any correlating underlying stocks/commodities/futures/crypto/bonds. Trading on hourly charts there would be tons of opportunities all year around.
Hi, I was testing. Need help. What value should I look for if pair trading strategy doesn't qualify. Is it positive t-stat?
Hi Darshan, and thanks for the question! If the t-stat is either positive or negative but too small to pass the Dickey-Fuller significance threshold, the series cannot be treated as cointegrated, and the pair trading strategy would not work. Hope it helps!
@@NEDLeducation thank you so much . I like NEDL content. Very detailed. All the best
Hi sir, can you also help me calculate p-values for stock x and stock y, please?
🤘 Respect
No, you have to know which ones you need to trade, or you will be waiting for a profit that can take months or years, costing you lots of money in overnight holding costs.
You are all wasting your time. I'll smoke cointegration. You have to know how the markets seek liquidity... plain and simple.