Do stock returns follow random walks? - Runs test (Excel)

  • Published 24 May 2020
  • Do stock returns follow random walks? And how might one test if they do? In 1940, Wald and Wolfowitz developed the first and perhaps the simplest market efficiency test, the runs test, which is based on the logic of a coin toss series and runs of consecutive heads or tails. Today we discuss the mathematical and statistical intuition behind the runs test and apply it to see if S&P 500 returns do indeed follow a random walk.
    Don't forget to subscribe to NEDL and give this video a thumbs up for more videos in Finance!
    Please consider supporting NEDL on Patreon: / nedleducation

COMMENTS • 59

  • @NEDLeducation
    @NEDLeducation  4 years ago +1

    You can find the spreadsheets for this video and some additional materials here: drive.google.com/drive/folders/1sP40IW0p0w5IETCgo464uhDFfdyR6rh7
    Please consider supporting NEDL on Patreon: www.patreon.com/NEDLeducation

    • @user-oh9uf3ht3o
      @user-oh9uf3ht3o 3 years ago +2

      Great video! Could you make a video in Russian based on the material "Modern Probability Theory For Stock Traders", available on the site:
      stock_tradingninja_com ?

  • @alexandershaindlin9495
    @alexandershaindlin9495 1 year ago +1

    When he says "best platform around", it's the biggest understatement of the century.

  • @sohonyaidia
    @sohonyaidia 4 months ago

    Thank you!! I really needed this help for my thesis! :)

  • @user-sf1hl2ti5e
    @user-sf1hl2ti5e 2 years ago +1

    Thank you very much for this video.

  • @ivanklful
    @ivanklful 2 years ago

    Another awesome video, Savva! A question (questions😀) here: 1. If we determine that the returns of an asset follow a random walk, does it mean that any modelling to predict future returns is pointless? 2. If the p-value is, let's say, 10% (which would mean that this is the lowest level of significance at which we can reject the null hypothesis), does that mean that returns are following a random walk, or that we cannot comfortably reject the null hypothesis of a random walk? I mean, what p-value would be low enough to reject the null?

  • @daciovillarreal4742
    @daciovillarreal4742 3 years ago +3

    Excellent video, I really enjoyed it. A question: would the correct formula in G16 be "=1 - NORM.S.DIST(ABS(G15),1)"? Thanks.

    • @NEDLeducation
      @NEDLeducation  3 years ago +6

      Hi Dacio and glad you liked the video! As for your question, yes, if you have got a negative z-stat, it is correct to apply the ABS function to it.

  • @nas9140
    @nas9140 3 years ago +1

    Thank you for the video! I have a question: does this test work if we use log-price instead of the return rate?
    Thank you!

    • @NEDLeducation
      @NEDLeducation  3 years ago

      Hi Nas, and glad you liked the video! Yes, this test would work with log returns as well.

  • @martivalex9521
    @martivalex9521 2 years ago +1

    Hi Savva, if total runs are fewer than expected runs, then the z-stat is negative. So does a more negative z-stat mean the series is closer to a random walk? Thank you very much, as always great video!

    • @NEDLeducation
      @NEDLeducation  2 years ago +1

      Hi, and glad you liked the video! The interpretation is that the closer the z-stat is to zero, the closer the time series is to a random walk. A positive z-stat means that there are more runs than expected, which in turn implies "oscillating" behaviour or negative autocorrelation, and a negative z-stat means there are fewer runs than expected, implying "trending" behaviour or positive autocorrelation. Hope this helps!
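The sign interpretation above can be illustrated with a minimal Python sketch of the runs test. It is not the spreadsheet from the video: the function name `runs_test` is hypothetical, the moment formulas are the standard Wald-Wolfowitz ones, and zero returns are counted as positive, which is an assumption based on the discussion in this thread.

```python
import math


def runs_test(returns):
    """Return (total runs, expected runs, z-stat) for a return series."""
    # Classify each day as positive (True) or negative (False).
    # Assumption: zero returns are treated as positive days.
    signs = [r >= 0 for r in returns]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    n = n_pos + n_neg
    # A new run starts every time the sign differs from the previous day.
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    # Mean and variance of the number of runs under independence.
    expected = 2 * n_pos * n_neg / n + 1
    variance = 2 * n_pos * n_neg * (2 * n_pos * n_neg - n) / (n ** 2 * (n - 1))
    z = (runs - expected) / math.sqrt(variance)
    return runs, expected, z


# An alternating ("oscillating") series has more runs than expected: z > 0.
print(runs_test([1, -1, 1, -1, 1, -1, 1, -1]))
# A "trending" series has fewer runs than expected: z < 0.
print(runs_test([1, 1, 1, 1, -1, -1, -1, -1]))
```

The sketch needs at least one positive and one negative observation, otherwise the variance is zero and the z-stat is undefined.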

  • @user-ez5ju4cr2c
    @user-ez5ju4cr2c 4 years ago +2

    Thanks so much for the video. I have a small problem with your explanation. The numbers of positive runs and negative runs must always be either equal or differ by one, since each run has to be followed by a run of the opposite sign. So it is not actually a coincidence that the run counts are the same. Since the number of positive days is larger, I would speculate that there is more clustering in the positive days than in the negative days, if my understanding is right. Savva, what do you think?

    • @NEDLeducation
      @NEDLeducation  4 years ago +1

      Hi Hayo, and many thanks for the feedback! As for your question, yes, your interpretation is exactly right: we are not really concerned with the number of positive or negative runs separately, but with the total number of runs and, as you correctly said, the degree of clustering. Great job!
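The point about run counts can be checked with a short Python sketch. The helper name `count_runs_by_sign` and the sample returns are made up for illustration, and zero returns are again assumed to count as positive.

```python
from itertools import groupby


def count_runs_by_sign(returns):
    """Return (number of positive runs, number of negative runs)."""
    signs = [r >= 0 for r in returns]
    # groupby collapses each stretch of identical consecutive signs into one run.
    run_signs = [sign for sign, _ in groupby(signs)]
    pos_runs = sum(run_signs)
    neg_runs = len(run_signs) - pos_runs
    return pos_runs, neg_runs


# Hypothetical returns: 6 positive days and 2 negative days.
sample = [0.5, 0.2, -0.1, 0.3, 0.4, 0.1, -0.2, 0.6]
pos_runs, neg_runs = count_runs_by_sign(sample)
# Runs alternate in sign, so the two counts can differ by at most one.
print(pos_runs, neg_runs)
# With similar run counts, the sign with more days has the longer average run.
print(6 / pos_runs, 2 / neg_runs)
```

Because runs must alternate, a surplus of positive days shows up as longer positive runs rather than as more positive runs.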

  • @anirudhpokharel1489
    @anirudhpokharel1489 4 years ago +1

    Very informative videos. Can you provide information on finding the RFR for an emerging market where the government bond market (both domestic-currency and USD-denominated) is very small and illiquid?

    • @NEDLeducation
      @NEDLeducation  4 years ago +1

      Hi Anirudh and many thanks for the question! For such markets the risk-free rate can be assumed equal to the US government bond yield (as local investors in such a situation have an incentive to consider US debt; this is the assumption many platforms like Bloomberg make), or alternatively you can consider deposit rates (particularly if you are interested in local currency rates). These can be obtained from the World Bank, for example: data.worldbank.org/indicator/FR.INR.DPST
      A deposit rate is particularly useful as a risk-free rate in countries where large "too big to fail" banks exist and the risk of bank runs is minimal. Hope it helps!

    • @anirudhpokharel1489
      @anirudhpokharel1489 4 years ago

      @@NEDLeducation I have been using CDS spreads to compute the RFR for emerging markets, and fixed deposit rates for countries without CDS spreads and USD-denominated bonds. Thank you.

  • @henchr1051
    @henchr1051 3 years ago +1

    Thanks for the detailed description. However, I have one question. You arrive at the result that a z-score of 1.56 has a p-value of 0.0593. But as price changes can be both negative and positive, would it not be more correct to state it as 0.0593 * 2, or 0.11876?
    NORM.S.DIST in Excel for 1.56 gives 0.94062 for the CDF, so 0.94062 - 0.5 = 0.44062 is the area to the right of the centre of the distribution; for both sides it is double that, or 0.88124, and everything outside those two borders is 0.11876. Am I wrong here?
    I suppose your null hypothesis is that the data is random and, as the old rule goes, "if P is low, the null has to go". The assumed breakpoint is alpha = 0.05, so whether you are right or I am right, the p-value exceeds alpha and the null is not rejected, i.e. your data are randomly distributed.
    When counting the days via the yield you include zero as a positive day, but it would be more correct to exclude zeros entirely. A zero is just a prolongation of market time with no effect on it; we could do without the zero moves and still obtain the same result. Just imagine the extreme case of a lot of zeros: you would end up with a lot of positive days, and that would affect your expected runs calculation.
    Another way to evaluate the dataset could be to compare the z-score with the expected oscillation of a simple random walk, which is sqrt(N) and is in fact the same as 2 standard deviations if you dig a bit deeper into that. So 1.56 is less than the expected oscillation of a random walk and thus within its parameters, and I can conclude the data is random.
    At 1258 days you get an expected oscillation of sqrt(1258) = 35.468 and your difference is 28, so it is within the expected oscillation of a random walk.
    This tells me that the probability statement in your conclusion is not really useful. It is instead a yes-or-no question of whether the data is statistically random, whereas "not much random" is an impossible statement.
    My next issue is the drift in stocks and how it affects the market in general. Would it mean that a stock tends to go up and have more positive than negative days because of this drift? I think it would, so the number of positive runs will dominate because it contains this compensation for drift, and that does not necessarily mean the stock does not follow a random walk. But I have not found any way to normalize the runs and eliminate the influence of drift.
    Looking forward to your comments. It is an interesting subject you bring to the surface, and it is nice to see someone with an interest in it.
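The effect of zero returns raised in this comment can be sketched in Python. This is only an illustration under stated assumptions: `sign_counts` and its `drop_zeros` flag are hypothetical names, and the sample data is invented, not taken from the video's spreadsheet.

```python
def sign_counts(returns, drop_zeros=False):
    """Return (positive days, negative days, total runs)."""
    if drop_zeros:
        # Variant suggested in the comment: ignore flat days entirely.
        returns = [r for r in returns if r != 0]
    if not returns:
        return 0, 0, 0
    # Default behaviour: a zero return is counted as a positive day.
    signs = [r >= 0 for r in returns]
    n_pos = sum(signs)
    n_neg = len(signs) - n_pos
    runs = 1 + sum(1 for a, b in zip(signs, signs[1:]) if a != b)
    return n_pos, n_neg, runs


# Hypothetical series with three flat days.
sample = [0.1, 0.0, 0.0, -0.2, 0.0, 0.3]
print(sign_counts(sample))                   # zeros counted as positive days
print(sign_counts(sample, drop_zeros=True))  # zeros excluded
```

The two calls give different positive-day counts, which in turn changes the expected number of runs that the test compares against.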

  • @anmolpreet8959
    @anmolpreet8959 2 years ago +1

    Hi NEDL, would it be possible to detect when a series is not following a random walk? Just curious whether any studies exist on this.

    • @NEDLeducation
      @NEDLeducation  2 years ago

      Hi Anmol, and many thanks for the question! I have got several videos on very common time-series market efficiency tests that seek to distinguish random walks from dependent patterns, including simple (this video) and multivariate runs test, variance ratio test (basic and more advanced generalisations), BDSL test, and Hurst exponent. Please check the Mathematical Finance playlist, and there is a series of videos on such random walk tests. For good applied studies on the topic, I would suggest Borges (2010) or Urquhart and McGroarty (2016). Alternatively, you can check the Behavioural Finance playlist for more subtle tests of market efficiency such as calendar anomalies tests, and tests for herding detection on financial markets. Hope it helps!

  • @isabelledurbeck7289
    @isabelledurbeck7289 2 years ago +2

    Hi, which article may I cite for the formulae as per spreadsheet?

    • @NEDLeducation
      @NEDLeducation  2 years ago

      Hi Isabelle, and thanks for the question! The origin of the test is Wald and Wolfowitz (1940). For applications to market efficiency, see Borges (2010) and Urquhart and Hudson (2013).

  • @Narceus55
    @Narceus55 2 years ago +1

    Hi! Thank you for the video, it was very interesting. I applied this to Decentraland returns and got a z-stat of 0,68 and a p-value of 25%. Should I then consider that there is a very high chance that Decentraland is efficient? Thank you.

    • @NEDLeducation
      @NEDLeducation  2 years ago

      Hi, and thanks for the question! Yes, this is indeed the correct interpretation of test results.

  • @raffaeleberti3293
    @raffaeleberti3293 3 years ago +1

    Hi! I used the runs test on BTC returns and got a negative z-score (-5) and a 100% p-value! This also seems totally inconsistent with the p-value I got from the VR test, which shows a 0,00 p-value! What did I do wrong?

    • @NEDLeducation
      @NEDLeducation  3 years ago

      Hi Raffaele, and thanks for your question! If your z-stat is negative for the runs test, you can apply the two-tailed z-test using the 2*(1 - NORM.S.DIST(ABS(z-stat),1)) formula. This will give a consistent result, that is, Bitcoin returns are positively autocorrelated (persistent), and that is why the variance ratio is higher than expected from a random walk (positive and significant z-stat), while the number of runs is lower than expected from a random walk (negative and significant z-stat). Hope it helps!
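The Excel formula in this reply can be mirrored in Python using only the standard library. The helper names below are hypothetical; `norm_cdf` plays the role of NORM.S.DIST(z, 1), computed from the complementary error function.

```python
import math


def norm_cdf(z):
    """Standard normal CDF, the counterpart of Excel's NORM.S.DIST(z, 1)."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def p_value_one_tailed(z):
    """1 - NORM.S.DIST(ABS(z), 1)"""
    return 1 - norm_cdf(abs(z))


def p_value_two_tailed(z):
    """2 * (1 - NORM.S.DIST(ABS(z), 1))"""
    return 2 * (1 - norm_cdf(abs(z)))


# Without ABS, a strongly negative z-stat gives a p-value close to 100%.
print(1 - norm_cdf(-5))
# With ABS, the same z-stat is highly significant.
print(p_value_two_tailed(-5))
# The z-stat of 1.56 discussed earlier in the thread, one- and two-tailed.
print(p_value_one_tailed(1.56), p_value_two_tailed(1.56))
```

Taking the absolute value makes the p-value symmetric in the sign of the z-stat, so fewer-than-expected and more-than-expected runs are treated alike.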

  • @edisons.671
    @edisons.671 3 years ago +2

    Hi sir, could you elaborate on how to derive the equation that calculates the expected or mean number of runs?

    • @NEDLeducation
      @NEDLeducation  3 years ago +1

      Hi Edison and many thanks for the question! Long story short, it is derived using a combinatorial summation formula given independent data, and then the deviation from what would be expected if data was independent is used for formal hypothesis testing. For full derivation, you can check the folder with papers in Google Drive (pinned comment), the paper that originally derived runs test is Wald and Wolfowitz (1940). Hope it helps!
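For reference, the standard moments of the number of runs under independence can be written out as follows. The notation is an assumption made here rather than the spreadsheet's own: $n_+$ and $n_-$ are the numbers of positive and negative days, $n = n_+ + n_-$, and $R$ is the total number of runs.

```latex
\mathbb{E}[R] = \frac{2\, n_+ n_-}{n} + 1,
\qquad
\operatorname{Var}[R] = \frac{2\, n_+ n_- \,(2\, n_+ n_- - n)}{n^{2}\,(n - 1)},
\qquad
z = \frac{R - \mathbb{E}[R]}{\sqrt{\operatorname{Var}[R]}}
```

With equal numbers of positive and negative days, the mean reduces to $n/2 + 1$, which matches the coin-toss intuition that the sign changes on roughly every second day.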

    • @edisons.671
      @edisons.671 3 years ago +1

      @@NEDLeducation Hi I checked the paper, but it says that it omits the tedious calculations of the derivation of the mean equation ^'

    • @NEDLeducation
      @NEDLeducation  3 years ago +2

      @@edisons.671 Hi Edison, finding a paper where the proof would be explained in detail was very challenging, the best I managed to find is the following: onlinelibrary.wiley.com/doi/pdf/10.1111/j.1469-1809.1939.tb02193.x Wald and Wolfowitz themselves actually stated that this proof can be used to illustrate their own argument to a great extent.

    • @edisons.671
      @edisons.671 3 years ago

      @@NEDLeducation Thank you so much! I find this paper much more helpful for statistics beginners!

  • @thezorrinofromgemail6978
    @thezorrinofromgemail6978 3 years ago +2

    Hi Savva.
    Can you make a video on the risk parity portfolio? There is nothing serious on YouTube about it.
    Preferably in Excel and without Solver (to make it transposable to C# / VBA).
    Thanks in advance

    • @NEDLeducation
      @NEDLeducation  3 years ago +1

      Hi, and thanks for the suggestion! I have the video on risk parity in the pipeline already and it will be live within a week. Hope this helps!

  • @nikosje
    @nikosje 2 years ago +1

    Super

  • @TheChainiz
    @TheChainiz 2 years ago +1

    Great explanation, very clear and to the point.
    I am considering replicating your test here in Forex, crypto, commodities and also with intraday data.
    So if I get this right, the lower the p-value, the lower the chance that each of those markets is efficient, and the more probable that technical analysis may actually work on that market.
    Also, it would be interesting to try different periods. During normal times, markets should be more random; during crises, as you said, returns start to be more correlated.
    This all makes sense with the common advice "If there is no trade, then don't trade", since it is most probable that at that time we are dealing with random, and therefore unbeatable, markets.
    Thanks

    • @NEDLeducation
      @NEDLeducation  2 years ago

      Hi Carlos, and thanks so much for the comment! Yes, the logic behind the test is exactly as you have put it. You can test the code for currencies using YahooFinance tickers of the form "EURUSD=X", and data on most popular commodities such as oil or gold is also available through the same package. For intraday data, YahooFinance only stores a handful of most recent trading days but this can be enough for such tests and can be retrieved using yf.Ticker().history feature, specifying the period of interest.

  • @abhinnaacharya8051
    @abhinnaacharya8051 3 years ago +2

    I got a negative z-stat of -1.45 and a p-value of 0.92781. Does this mean it's highly likely that there is a random walk?

    • @NEDLeducation
      @NEDLeducation  3 years ago +1

      Hi Abhinna, and thanks for the question. In case of a negative z-stat, you can apply the normal distribution function to the absolute value of the z-stat. It will give you a p-value of around 7.2%, which is insignificant at 5% but significant at 10%, meaning there is some evidence rejecting the random walk but not as substantial as for higher z-stats. Negative z-stats for runs test imply there are fewer runs than expected, meaning that the series is persistent, while a positive z-stat would signal a mean-reverting series. Hope it helps!

  • @shivamazad2712
    @shivamazad2712 1 year ago +1

    At 8:14, you said that to get the signs we have to subtract the current day's value from the previous day's, but you're actually doing the opposite. So which should I follow?

    • @NEDLeducation
      @NEDLeducation  1 year ago +1

      Awkward choice of language perhaps. You can proceed with what is shown in the Excel file.

    • @shivamazad2712
      @shivamazad2712 1 year ago

      @@NEDLeducation Thank you Sir

  • @TimothyMbuga
    @TimothyMbuga 1 year ago

    Hi Savva, I need some guidance with my coursework.

  • @javalemcgee4723
    @javalemcgee4723 3 years ago +1

    Hi, I wondered if this test can be used as a test to detect chaos? Thank you.

    • @NEDLeducation
      @NEDLeducation  3 years ago

      Hi JaVale, and thanks for the question! In finance studies, chaos is quite an ambiguous term that can be defined differently. If you think of it broadly as non-linear dependence, then the BDSL test is the most common tool to detect so-called "deterministic chaos". I believe you have already watched this, but here is the link just in case: ua-cam.com/video/Dvz7Ia84YC8/v-deo.html Hope it helps!

    • @javalemcgee4723
      @javalemcgee4723 3 years ago +1

      Hi, and thanks for your reply, I highly appreciate it. Do you know any other methods that you have done, that can detect chaos in finance?

    • @NEDLeducation
      @NEDLeducation  3 years ago

      @@javalemcgee4723 Hi again! You might also look into the approximate entropy (ApEn) model and other entropy measurement techniques, they are quite close in nature to BDSL but not as commonly known. I actually plan to investigate these concepts quite a bit in further videos over the summer. Hope it helps!

    • @javalemcgee4723
      @javalemcgee4723 3 years ago +1

      @@NEDLeducation Thanks for that! Also I wanted to know what my results may suggest (for m=2):
      n=42
      epsilon=3.24%
      c1=70.07%
      k=52.92%
      c2=47.68%
      c12=68.41%
      difference=0.88%
      variance=0.00586
      stddev: 1.20%
      z-stat: 0.734
      p-value: 23.16%
      Thank you!

    • @NEDLeducation
      @NEDLeducation  3 years ago +1

      @@javalemcgee4723 The results suggest the data is independent (p-value is higher than conventional significance levels of 10%, 5%, and 1%). So it seems there is no deterministic chaos in your data.

  • @Kondio317
    @Kondio317 3 years ago +1

    Hey,
    as far as I understood, there is a ~95% chance that the S&P 500 does not follow a random walk, which is why the EMH is not really true. In that case, why would the Black-Scholes formula be considered good, as it incorporates a random walk? I also noticed that a lot of master's programmes teach a lot of stochastic calculus. What would be the reason for that?
    Liked the video and would be thankful if you can reply 🙌 Sorry if I am asking or saying nonsense.

    • @NEDLeducation
      @NEDLeducation  3 years ago +1

      Hi Borislav, and thanks for the question! Generally, the non-normality and inefficiency of financial markets are quite well-documented, so models that assume either of the two or both work quite poorly in practice. Nevertheless, they are extremely good starting points in thinking about relevant topics (option pricing, for example) and appreciating the beauty of mathematical modelling, and are necessary to grasp before considering more complicated and less pretty models that work better. Hope it helps!

  • @yujunliu8581
    @yujunliu8581 1 year ago

    I have a question. The author said that there are fewer negative days than positive days, yet the number of positive and negative runs is the same, so we can speculate that negative returns have a little more clustering (around 11 minutes into the video). However, since negative days are fewer, shouldn't this indicate that the average length of negative runs is shorter, and that it is the positive days that are clustering?

  • @user-wr4yl7tx3w
    @user-wr4yl7tx3w 1 year ago

    Not sure about others but I prefer Jupyter Notebooks or Colab using Python or Julia.

  • @shivamazad2712
    @shivamazad2712 1 year ago +1

    What if expected runs are higher than total runs?

    • @NEDLeducation
      @NEDLeducation  1 year ago +1

      Hi Shivam, and thanks for the question! This would imply that the series is persistent (trending), as the signs of the returns alternate less often than they would in a random walk.

    • @shivamazad2712
      @shivamazad2712 1 year ago

      @@NEDLeducation Thank you Sir

  • @drm4047
    @drm4047 3 years ago +1

    This assumes that the random walk is Normally distributed. We already know this assumption is false. It does not test that it could be a random walk based on some other probability distribution function.

    • @NEDLeducation
      @NEDLeducation  3 years ago +1

      Hi and many thanks for the comment, really enjoyed the maths content on your channel by the way! Overall, runs test is a non-parametric test, so it can detect random walk violations even without the normality assumption (it can be seen as a rough analogy of testing equality of medians instead of the equality of means in non-normally distributed data). Variance ratio test is generally more assumption-sensitive (mainly along the lines of heteroskedasticity).