Bollinger Bands revisited
A continuation of Standard Deviation ... of Prices and Returns.

Bollinger Bands: Introduction

Years ago, when I first ran across Bollinger Bands, I thought they were pretty neat ... stock prices bouncing between two curves, the Upper and Lower boundaries of the "band" and ...
>Remind me. Bollinger bands?
We look at the last n stock prices P1, P2, ... Pn (where we include Pn, today's price, and where P0 is the price n days ago) and we calculate their average, Pav, and their Standard Deviation, SD:

      Pav = (1/n)(P1 + P2 + ... + Pn) = (1/n) Σ Pk
      SD^2 = (1/n) [ (P1 - Pav)^2 + (P2 - Pav)^2 + ... + (Pn - Pav)^2 ] = (1/n) Σ Pk^2 - Pav^2
(See SD stuff.)

Then, each day, we plot the two points:

[a]       U = Pav + k SD
[b]       L = Pav - k SD
These points trace out two curves and we see the current stock price bounce between the two curves, U and L, as in Figure 1.
>So what are n and k?
You can pick anything, but we'll choose n = 20 days and k = 2 standard deviations.
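Here's a minimal sketch of the band calculation, just to make [a] and [b] concrete. The function name and the sample prices are made up for illustration, and the SD divides by n, as in the formula above.

```python
import math

def bollinger_bands(prices, n=20, k=2.0):
    """Return a list of (Pav, L, U) tuples, one for each day that has n prices available."""
    bands = []
    for end in range(n, len(prices) + 1):
        window = prices[end - n:end]            # the last n prices, including "today"
        pav = sum(window) / n                   # Pav = (1/n) * sum of Pk
        # SD^2 = (1/n) * sum of Pk^2 - Pav^2   (dividing by n, as in the text)
        var = sum(p * p for p in window) / n - pav * pav
        sd = math.sqrt(max(var, 0.0))           # guard against tiny negative round-off
        bands.append((pav, pav - k * sd, pav + k * sd))
    return bands

# Hypothetical prices, for illustration only
prices = [10.0 + 0.1 * (i % 7) for i in range(30)]
bands = bollinger_bands(prices)
pav, lower, upper = bands[-1]
print(f"Pav = {pav:.3f}   L = {lower:.3f}   U = {upper:.3f}")
```

With 30 prices and n = 20 there are 11 days for which a full window exists, so 11 points on each curve.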
>So you buy at L and sell at U?
I didn't say that!

Figure 1
>So what are you saying?
I just want to look at Bollinger Bands, again, because although one often calculates the SD (or Volatility) of stock returns, it's strange to see the SD of stock prices and ...
>As in Bolli bands?
Yes, as in Bollinger Bands. We did, at one time, try to find a relationship between the statistical properties of returns and of prices, here. What we want to do now is investigate WHY one would expect stock prices to oscillate between U and L.

When we consider the SD of daily returns, we often assume they have a Normal distribution ... in which case it's unlikely that returns will lie too far from the Mean return. In fact, we would expect most returns to lie within 2 Standard Deviations of the Mean return: if they were Normally distributed, the probability that a return lies within two SDs of the Mean is about 95%. However, if we consider prices, do we also expect them to lie (mostly) within 2 Standard Deviations of the Mean price Pav?
>That'd be like choosing k = 2, eh?
Exactly! When today's price is larger than U or smaller than L, then it's outside that band of 2 SDs on either side of Pav ... so we might expect tomorrow's price to return to the band. That says something about tomorrow's price, eh?
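The two-SD figure for a Normal distribution is easy to check. A minimal sketch (the function name is ours): for a Normal variable, the probability of lying within k SDs of the Mean is erf(k/√2).

```python
import math

def prob_within_k_sd(k):
    """Probability that a Normally distributed variable lies within k SDs of its Mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"k = {k}: {100 * prob_within_k_sd(k):.2f}%")
```

Note that this says nothing (yet) about prices. It only holds for a variable that really is Normally distributed.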
>And the last n = 20 prices are Normally distributed?
What do you think?
>Huh? You're asking me?
That was a rhetorical question.
>Can I just go to the final result ... huh?
Sure. Just skip down to the Magic Formula.

The Distribution of Stock Prices

Suppose that, over the last n days, the daily Gain Factors are g1, g2, g3, ... gn.
>Gain Factors?
Yes, if a stock price goes from $P to $Pg in a day, then g is the Gain Factor for that day.
For example, g = 1.056 corresponds to a 5.6% daily return.

Then n successive daily stock prices (after the starting price of $P0) are P0g1, P0g1g2, P0g1g2g3, ... P0g1g2g3...gn
... the last being today's stock price.
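In code, the prices are just a running product of the Gain Factors. A small sketch, with a made-up starting price and made-up Gain Factors:

```python
def prices_from_gains(p0, gains):
    """Return [P1, ..., Pn] where Pk = P0 * g1 * g2 * ... * gk."""
    prices = []
    price = p0
    for g in gains:
        price *= g              # each day's price is yesterday's price times today's g
        prices.append(price)
    return prices

# Hypothetical numbers, for illustration only
p0 = 100.0
gains = [1.056, 0.98, 1.01]
prices = prices_from_gains(p0, gains)
print(prices)                   # the last entry is "today's" price, P0 * Gn
```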

So here's the question:
What's the distribution of the numbers Gn = g1g2g3...gn ??

We have the following:
      Results
  1. The price n days ago is given as P0.
  2. The daily Gain Factors, g, have Mean[g] = M and Variance[g] = Var = S^2.
          These are determined from historical data, and the g's are assumed to be independent!
  3. The n-day Gain Factor Gn = Pn / P0 = g1g2g3...gn has:
          M_n = Mean[Gn] = Mean[g1]Mean[g2]...Mean[gn] = M^n   (for independent g's, the Mean of a Product = the Product of the Means)
          S_n^2 = Variance[Gn] = (M^2 + S^2)^n - M^(2n)     (since Variance[Gn] = Mean[Gn^2] - Mean[Gn]^2 and Mean[g^2] = M^2 + S^2)
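These two formulas need only independence, not any particular distribution, so they can be checked exactly on a toy example. The sketch below (function names are ours) compares the formulas with a brute-force enumeration for a hypothetical Gain Factor that is equally likely to be 0.9 or 1.1.

```python
import itertools
import math

def n_day_moments(M, S, n):
    """Mean and Variance of Gn = g1*g2*...*gn for independent g's
    with Mean[g] = M and Variance[g] = S^2."""
    mean_n = M ** n
    var_n = (M * M + S * S) ** n - M ** (2 * n)
    return mean_n, var_n

def brute_force_moments(values, n):
    """Exact Mean and Variance of the product of n independent draws,
    each equally likely to be any entry of `values`."""
    products = [math.prod(combo) for combo in itertools.product(values, repeat=n)]
    mean = sum(products) / len(products)
    var = sum(p * p for p in products) / len(products) - mean * mean
    return mean, var

# Hypothetical two-point Gain Factor: down 10% or up 10%, equally likely
values = [0.9, 1.1]
M = sum(values) / len(values)
S = math.sqrt(sum(v * v for v in values) / len(values) - M * M)

formula = n_day_moments(M, S, 5)
exact = brute_force_moments(values, 5)
print("formula:", formula)
print("exact:  ", exact)
```

The two lines should agree to within round-off.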

Okay, now we'll assume that the g's are Lognormally distributed.
>Lognormal? I thought you wanted Normal?
Well, it's common practice to consider daily gains (or, in our case, Gain Factors) to be Lognormally distributed.
Besides, it makes the math easier.  

In any case, if g has a Lognormal distribution, then g = e^y where y = log(g) has a Normal distribution.
That's the definition of Lognormal!

Further, if we let yk = log(gk), then log(Gn) = log(g1g2g3...gn) = log(g1) + log(g2) + ... + log(gn) = y1 + y2 + ... +yn.

Since the y's are independent Normally distributed numbers, their sum is also Normally distributed.
That makes log(Gn) Normally distributed hence Gn itself is Lognormally distributed.
Remember what it means to say that F(x) is the cumulative distribution for a variable Y:
It means that the probability that a randomly chosen Y is less than some x is F(x) (as in Figure 2).

So, for any x and n random g's, what is the probability that Gn = g1g2g3...gn < x ?
That requires that log(Gn) < log(x).


Figure 2
But, as we've said, Gn is Lognormally distributed, so Y = log(Gn) is Normally distributed.

Suppose we call N[u,Mean,SD] the cumulative Normal distribution function with prescribed mean and Standard Deviation.

Then log(Gn) has a cumulative distribution described by N[u,Mean,SD]
where Mean and SD are the Mean and Standard Deviation of log(Gn).

>We know the mean and SD of Gn. That's results #3 ... but what about log(Gn)?
Good question. In fact, for Lognormally distributed Gn there's a relation between the Means and Standard Deviations ... like so:

If M_n and S_n are the Mean and Standard Deviation of Gn (from Results #3), then the Mean, μ, and Standard Deviation, σ, of the logarithm are given by:
      μ = Mean[log(Gn)] = log(M_n) - (1/2)σ^2
      σ^2 = Variance[log(Gn)] = log(1 + S_n^2 / M_n^2)
      ... assuming that Gn is Lognormally distributed.

Since we now have labels for the Mean and SD of log(Gn), we can write the cumulative distribution for log(Gn) as: N[u, μ, σ]
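Here's that conversion as a small sketch (function and variable names are ours). It takes the Mean and SD of a Lognormal variable and returns the Mean and SD of its logarithm, then goes back the other way using the standard Lognormal moment formulas as a sanity check.

```python
import math

def log_moments(mean_G, sd_G):
    """Mean and SD of log(G), given the Mean and SD of a Lognormally distributed G."""
    var_log = math.log(1.0 + (sd_G / mean_G) ** 2)
    mean_log = math.log(mean_G) - 0.5 * var_log
    return mean_log, math.sqrt(var_log)

# Hypothetical Mean and SD of an n-day Gain Factor
mean_G, sd_G = 1.05, 0.08
mean_log, sd_log = log_moments(mean_G, sd_G)

# Round trip: standard formulas for the Mean and Variance of a Lognormal variable
back_mean = math.exp(mean_log + 0.5 * sd_log ** 2)
back_var = (math.exp(sd_log ** 2) - 1.0) * math.exp(2.0 * mean_log + sd_log ** 2)
print(mean_log, sd_log)
print(back_mean, math.sqrt(back_var))   # should reproduce mean_G and sd_G
```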

Probability that Today's Price lies within some interval centred on the n-day Mean: Pav

Notice that, if P0 = $1.00, then the numbers G1, G2, G3, etc., are just the subsequent stock prices.
We'll assume that's the case.
>Huh? What's the case?
That P0 = $1.00 so the products Gk = g1g2...gk are the stock prices. We'll stick P0 in our formulas ... later.

Okay, we have that today's price Pn = g1g2...gn is Lognormally distributed with a known Mean and Variance as given in Results #3.

Now we ask:
If the random variable G has a Lognormal distribution with given Mean and Variance,
what is the probability that G < x ... for a given number x ??

>You're asking me?
That was a rhetorical question. Now pay attention. We've been here before.

  • If G < x then log(G) < log(x)
  • Since G is Lognormal then log(G) is Normal
  • The distribution of log(G) is then described by N[u,Mean,SD]
        the Normal cumulative distribution function
        and Mean and Standard Deviation are the Mean and Standard Deviation of log(G) ... not of G itself!
  • The probability that log(G) < log(x) is then N[log(x),Mean,SD]
  • But log(G) < log(x) is the same condition as G < x so the probability is the same:   N[log(x),Mean,SD]

>What about our stock prices?
Yes, of course. I'm sure you've recognized our G. It's today's stock price Gn ... assuming the starting price was P0 = $1.00, n days ago.
In fact, we know the Mean and SD to use in this formula: μ and σ, the Mean and SD of log(Gn).
In other words: The probability that G < x is N[log(x), μ, σ]
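The bulleted argument above boils down to one line of code. A minimal sketch, using the error function for the Normal cumulative distribution (the names normal_cdf, prob_G_below, mean_log and sd_log are ours, and the numbers are made up):

```python
import math

def normal_cdf(u, mean, sd):
    """N[u, Mean, SD]: probability that a Normal variable is less than u."""
    return 0.5 * (1.0 + math.erf((u - mean) / (sd * math.sqrt(2.0))))

def prob_G_below(x, mean_log, sd_log):
    """Probability that a Lognormal G is less than x.
    mean_log and sd_log are the Mean and SD of log(G) ... not of G itself."""
    if x <= 0.0:
        return 0.0                      # a Lognormal variable is always positive
    return normal_cdf(math.log(x), mean_log, sd_log)

# Hypothetical Mean and SD of log(G)
mean_log, sd_log = 0.01, 0.07
for x in (0.9, 1.0, 1.1):
    print(x, prob_G_below(x, mean_log, sd_log))
```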

>So the chances of being in that Bollinger band is ... what?
If A is the probability of being less than U and B is the probability of being less than L, then ...
>It's B - A, eh?
Actually, it's A - B as in: N[log(U), μ, σ] - N[log(L), μ, σ] (μ and σ being the Mean and SD of log(Gn)).

Note:

  • Remember that we're talking about the probability that the n-day Gain Factor lies between two numbers.
  • Don't confuse Gain Factors with daily returns.
  • In fact, a Gain Factor is 1 + (daily return).
  • The Mean of the Gain Factors, that's M, is "1" greater than the mean of the daily returns.
  • If the Mean of the daily returns is 0.0123 (that's 1.23%), then M = 1.0123.
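To make the distinction concrete, here's a sketch that pulls the daily Gain Factors out of a price series and computes their Mean and SD. The function name and the prices are made up, and the SD divides by the number of Gain Factors (an assumption, chosen to match the SD formula used for the bands).

```python
import math

def gain_factor_stats(prices):
    """Mean M and Standard Deviation S of the daily Gain Factors g = P(today) / P(yesterday)."""
    gains = [today / yesterday for yesterday, today in zip(prices, prices[1:])]
    M = sum(gains) / len(gains)
    S = math.sqrt(sum((g - M) ** 2 for g in gains) / len(gains))
    return M, S

# Hypothetical prices, for illustration only
prices = [100.0, 110.0, 99.0]
M, S = gain_factor_stats(prices)
print("Mean Gain Factor M =", M)
print("Mean daily return  =", M - 1.0)   # the Gain Factor is 1 + (daily return)
print("S =", S)                           # same SD whether you use g or the return g - 1
```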

>When are you going to insert some other starting price ... P0?
Other than $1.00? Right now.
The numbers U and L given in [a] and [b] assume an arbitrary P0 value.
To generate the appropriate numbers for the case P0 = $1.00, we'd divide each of U and L by P0.
Assume we've divided U and L by P0. We'll call these U' = U/P0 and L' = L/P0, okay?
Now we're talking about the case where P0 = $1.00 (as we did above).
As we've seen above, the probability that G = Pn/P0 < U' is N[log(U'), μ, σ], with μ and σ the Mean and SD of log(Gn).
But U' = U/P0 so Pn/P0 < U' is the same condition as Pn < U.
If we then want the probability that Pn lies within the Bollinger Band for arbitrary starting Price, then ...
>Why don't you just give the result, okay?

Here's our final result:


Magic Formula

Assuming n random daily Gain Factors which are Lognormally
distributed with Mean = M and Standard Deviation = S then
the probability that the price, Pn, will lie between L and U
is given by:

      Prob[L < Pn < U] = N[log(U/P0), μ, σ] - N[log(L/P0), μ, σ]
      where Pn is the stock price at the end of the n day period
      P0 is the stock price at the start of the n day period
      N[x, Mean, SD] is the Normal cumulative distribution function
      M_n = M^n   (the Mean of the n-day Gain Factor)
      S_n^2 = (M^2 + S^2)^n - M^(2n)   (the Variance of the n-day Gain Factor)
      μ = log(M_n) - (1/2)σ^2   (the Mean of the log of the n-day Gain Factor)
      σ^2 = log(1 + S_n^2 / M_n^2)   (the Variance of the log of the n-day Gain Factor)
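
Here's the Magic Formula as a short sketch, assuming (as above) independent, Lognormally distributed daily Gain Factors. The function names and the sample inputs are ours, not from any spreadsheet.

```python
import math

def normal_cdf(u, mean, sd):
    """N[u, Mean, SD]: the Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf((u - mean) / (sd * math.sqrt(2.0))))

def band_probability(L, U, P0, M, S, n):
    """Prob[L < Pn < U], given the starting price P0 and the Mean M and SD S
    of the daily Gain Factors. Requires 0 < L < U and S > 0."""
    mean_n = M ** n                                   # Mean of the n-day Gain Factor
    var_n = (M * M + S * S) ** n - M ** (2 * n)       # Variance of the n-day Gain Factor
    var_log = math.log(1.0 + var_n / mean_n ** 2)     # Variance of log(Gn)
    mean_log = math.log(mean_n) - 0.5 * var_log       # Mean of log(Gn)
    sd_log = math.sqrt(var_log)
    return (normal_cdf(math.log(U / P0), mean_log, sd_log)
            - normal_cdf(math.log(L / P0), mean_log, sd_log))

# Hypothetical inputs: a $30 stock, looking 20 days ahead
p = band_probability(L=31.0, U=33.0, P0=30.0, M=1.0005, S=0.015, n=20)
print(f"Prob[31 < Pn < 33] = {100 * p:.1f}%")
```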


>There are a lot of M's and S's!
Sure.


Remember: the probability (in the Magic Formula ) is really the probability that the Gain Factor (over n days) lies between L/P0 and U/P0.
That's the same as:   Prob[L/P0 < Pn/P0 < U/P0].
If, for example, both L and U are less than the starting Price P0, then you're asking for the probability that, after n random gains, the price Pn has dropped to some range of lower values. Similarly, if both L and U are greater than P0 and ...



>Do you have an example?
Okay, we'll consider GE over the past n = 20 days and ...
>Doesn't that Magic Formula work for the next n days?
Okay let's start with today's GE price (that's our P0).
We'll look 20 days into the future to get a probability that the stock price will lie between L and U.
But we have to decide what historical data we use to calculate that Mean and Standard Deviation ... like maybe the past 150 days or maybe ...

>Don't you have a spreadsheet?
Yeah, it gives a picture ... like Figure 3.

>That 27% probability ... do you really believe that?
Of course! Don't I offer a money-back guarantee on my spreadsheets?


Figure 3



>Is it any good ... that probability?
Okay, here's what we'll do:

  • We'll use a Mean and Standard Deviation based upon the past D days (example: D = 150 days).
  • We'll start a year ago (that's December, 2002) and look at the price of GE stock at that time: That's our P0.
  • We then look ahead 20 days and look at the stock price. That's Pn for n = 20.
  • Then we see if the stock price Pn is $1.00 to $3.00 higher than P0. That is: L = P0 + 1 and U = P0 + 3.
  • We repeat this for every day from Dec/02 to Dec/03.
  • Then we see how many times the stock price did lie in the prescribed range L < Pn < U, 20 days later.
  • Was it the percent suggested by our Magic Formula ?
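
The steps above can be sketched in code. This is only an illustration under stated assumptions: the prices are synthetic (randomly generated, not GE data), the function names are ours, and since the Magic Formula gives a different percent each day, the sketch averages those predictions over the test period (that averaging is our choice).

```python
import math
import random

def normal_cdf(u, mean, sd):
    return 0.5 * (1.0 + math.erf((u - mean) / (sd * math.sqrt(2.0))))

def band_probability(L, U, P0, M, S, n):
    """Magic Formula: Prob[L < Pn < U] for Lognormal daily Gain Factors."""
    mean_n = M ** n
    var_n = (M * M + S * S) ** n - M ** (2 * n)
    var_log = math.log(1.0 + var_n / mean_n ** 2)
    mean_log = math.log(mean_n) - 0.5 * var_log
    sd_log = math.sqrt(var_log)
    return (normal_cdf(math.log(U / P0), mean_log, sd_log)
            - normal_cdf(math.log(L / P0), mean_log, sd_log))

def backtest(prices, D=150, n=20, lo=1.0, hi=3.0):
    """Return (average predicted probability, actual fraction of hits)
    for the range L = P0 + lo, U = P0 + hi, looking n days ahead."""
    predicted, hits, trials = 0.0, 0, 0
    for t in range(D, len(prices) - n):
        window = prices[t - D:t + 1]                  # D + 1 prices give D daily Gain Factors
        gains = [b / a for a, b in zip(window, window[1:])]
        M = sum(gains) / D
        S = math.sqrt(sum((g - M) ** 2 for g in gains) / D)
        P0, Pn = prices[t], prices[t + n]
        L, U = P0 + lo, P0 + hi
        predicted += band_probability(L, U, P0, M, S, n)
        hits += 1 if L < Pn < U else 0
        trials += 1
    return predicted / trials, hits / trials

# Synthetic prices (NOT real data): Lognormal daily Gain Factors, starting at $30
random.seed(1)
prices = [30.0]
for _ in range(420):
    prices.append(prices[-1] * random.lognormvariate(0.0, 0.015))

predicted, actual = backtest(prices, D=150, n=20, lo=1.0, hi=3.0)
print(f"Magic Formula (average): {100 * predicted:.1f}%   Actual: {100 * actual:.1f}%")
```

Changing D changes only the predicted percent. The actual percent depends only on the prices in the test period.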

>So U and L will always be $1 and $3 higher than the starting price, right?
Right, and n will stay fixed at 20 days.
>But the Magic Formula percent will depend upon your D, right?
Yes, that determines M and S ... so we'll do this for various values of D.
>But the actual percent won't change, right?
No. It depends only upon the past year's daily prices. They won't change.
>And?
Here's the result for various values of D:


Predicting n = 20 days ahead: L = P0 + 1 and U = P0 + 3

>So you're checking to see if the price, 20 days hence, is between $1 and $3 higher than the current price.
Well, we're seeing what the Magic Formula says and what the actual result was, over the past year.

Here's another result for different U and L and n:


Predicting n = 30 days ahead: L = P0 + 2 and U = P0 + 5

>That's looking 30 days into the future to see if the price has increased by between $2 and $5.
Yes.

>Some are pretty lousy, eh?
Ya win some, ya lose some ...