Bollinger Bands

Bollinger Bands revisited

A continuation of Standard Deviation ... of Prices and Returns.

Bollinger Bands: Introduction

Years ago, when I first ran across Bollinger Bands, I thought they were pretty neat ... stock prices bouncing between two curves, the Upper and Lower boundaries of the "band" and ...
>Remind me. Bollinger bands?
We look at the last n stock prices P₁, P₂, ... P_n (where we include P_n, today's price, and where P₀ is the price n days ago) and we calculate their average, P_av, and their Standard Deviation, SD:

P_av = (1/n)(P₁+ P₂+ ... +P_n) = (1/n)ΣP_k
SD² = (1/n) [ (P₁-P_av)²+ (P₂-P_av)²+ ... + (P_n-P_av)² ] = (1/n)ΣP_k² - P_av²
(See SD stuff.)

Then, each day, we plot the two points:

[a] U = P_av + k SD
[b] L = P_av - k SD
These points trace out two curves and we see the current stock price bounce between the two curves, U and L, as in Figure 1.
>So what are n and k?
You can pick anything, but we'll choose n = 20 days and k = 2 standard deviations.
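Here's a minimal sketch of that calculation in Python, assuming the prices sit in a plain list. The function name bollinger_bands and the sample prices are made up for illustration. The SD divides by n, as in the formula for SD² above.

```python
import math

def bollinger_bands(prices, n=20, k=2.0):
    """Return (P_av, U, L) for each day that has n prices available."""
    bands = []
    for end in range(n, len(prices) + 1):
        window = prices[end - n:end]      # the last n prices, today's included
        p_av = sum(window) / n
        # Population variance (divide by n), matching the SD² formula.
        variance = sum((p - p_av) ** 2 for p in window) / n
        sd = math.sqrt(variance)
        bands.append((p_av, p_av + k * sd, p_av - k * sd))
    return bands

# Made-up prices and a short window, purely for illustration.
prices = [10.0, 12.0, 11.0, 13.0, 12.0, 14.0]
bands = bollinger_bands(prices, n=4, k=2.0)
for p_av, upper, lower in bands:
    print(f"P_av={p_av:.3f}  U={upper:.3f}  L={lower:.3f}")
```

With 6 prices and n = 4 there are only 3 days with a full window, so you get 3 points on each curve.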
>So you buy at L and sell at U?
I didn't say that!
Figure 1
>So what are you saying?
I just want to look at Bollinger Bands, again, because although one often calculates the SD (or Volatility) of stock returns, it's strange to see the SD of stock prices and ...
>As in Bolli bands?
Yes, as in Bollinger Bands. We did, at one time, try to find a relationship between the statistical properties of returns and of prices, here. What we want to do now is investigate WHY one would expect stock prices to oscillate between U and L.

When we consider the SD of daily returns, we often assume they have a Normal distribution ... in which case it's unlikely that returns will lie too far from the Mean return. In fact, we would expect most returns to lie within 2 Standard Deviations of the Mean return: if they were Normally distributed, the probability that the returns lie within two SDs of the Mean is about 95%. However, if we consider prices, do we also expect them to lie (mostly) within 2 Standard Deviations of the Mean price P_av?
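If you'd like to check the Normal case for yourself, here's a small sketch using Python's standard library. The helper name prob_within is just for illustration.

```python
from statistics import NormalDist

def prob_within(k):
    """Probability that a Normal variable lies within k SDs of its Mean."""
    std_normal = NormalDist(mu=0.0, sigma=1.0)
    return std_normal.cdf(k) - std_normal.cdf(-k)

for k in (1, 2, 3):
    print(f"within {k} SD: {prob_within(k):.4f}")
```

The answer doesn't depend on the Mean or the SD, only on k, which is why a standard Normal is enough here.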
>That'd be like choosing k = 2, eh?
Exactly! When today's price is larger than U or smaller than L, then it's outside that 2 SD band centred on P_av ... so we might expect tomorrow's price to return to the band. That says something about tomorrow's price, eh?
>And the last n = 20 prices are Normally distributed?
What do you think?
>Huh? You're asking me?
That was a rhetorical question.
>Can I just go to the final result ... huh?
Sure. Just click here

The Distribution of Stock Prices

Suppose that, over the last n days, the daily Gain Factors are g₁, g₂, g₃, ... g_n.
>Gain Factors?
Yes, if a stock price goes from $P to $Pg in a day, then g is the Gain Factor for that day.
For example, g = 1.056 corresponds to a 5.6% daily return.

Then n successive daily stock prices (after the starting price of $P₀) are P₀g₁, P₀g₁g₂, P₀g₁g₂g₃, ... P₀g₁g₂g₃...g_n
... the last being today's stock price.
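To make the Gain Factor bookkeeping concrete, here's a short sketch. The function names and the prices are invented for illustration; the first day's gain is chosen to match the 5.6% example.

```python
from itertools import accumulate
from operator import mul

def gain_factors(prices):
    """Daily Gain Factors g_k = P_k / P_(k-1) from a list of daily prices."""
    return [today / yesterday for yesterday, today in zip(prices, prices[1:])]

def prices_from_gains(p0, gains):
    """Rebuild P_1 ... P_n as P0*g1, P0*g1*g2, ... (P0 itself is not included)."""
    return [p0 * G for G in accumulate(gains, mul)]

prices = [50.0, 52.8, 51.0, 53.55]   # made-up prices; P0 = 50.0
g = gain_factors(prices)
rebuilt = prices_from_gains(prices[0], g)
print(g)        # first entry is 52.8 / 50.0, a 5.6% daily return
print(rebuilt)  # same as prices[1:], up to rounding
```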

So here's the question:
What's the distribution of the numbers G_n = g₁g₂g₃...g_n ??

We have the following:
Results

The price n days ago is given as P₀.
The daily Gain Factors, g, have Mean[g] = M and Variance[g] = Var = S².
These are determined from historical data and are assumed to be independent!
The n-day Gain Factor G_n = P_n / P₀ = g₁g₂g₃...g_n has:
M_n = Mean[G_n] = Mean[g₁]Mean[g₂]...Mean[g_n] = Mⁿ (the Mean of a Product = the Product of the Means, for independent factors)
S_n² = Variance[G_n] = (M²+S²)ⁿ - M²ⁿ See this
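Here's a sketch that computes the n-day Mean and Variance from the daily ones and then does a rough simulation check. The daily figures are made up, and drawing each g from a Normal distribution is an assumption of this sketch only: the two formulas need independence, not any particular distribution.

```python
import random
from statistics import fmean, pvariance

def n_day_moments(m, s, n):
    """Mean and Variance of G_n = g1*g2*...*gn for independent g's,
    each with Mean m and Standard Deviation s."""
    mean_n = m ** n
    var_n = (m * m + s * s) ** n - m ** (2 * n)
    return mean_n, var_n

# Illustrative (made-up) daily figures: 0.1% mean daily return, 2% daily SD.
m, s, n = 1.001, 0.02, 20
mean_n, var_n = n_day_moments(m, s, n)

# Rough Monte Carlo check with a fixed seed.
rng = random.Random(0)
samples = []
for _ in range(20_000):
    G = 1.0
    for _ in range(n):
        G *= rng.gauss(m, s)
    samples.append(G)

print("Mean:    ", mean_n, "vs simulated", fmean(samples))
print("Variance:", var_n, "vs simulated", pvariance(samples))
```

The simulated numbers should land close to the formulas, though not exactly on them, since it's only a sample.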

Okay, now we'll assume that the g's are Lognormally distributed.
>Lognormal? I thought you wanted Normal?
Well, it's common practice to consider daily gains (or, in our case, Gain Factors) to be Lognormally distributed.
Besides, it makes the math easier.

In any case, if g has a Lognormal distribution, then g = e^y where y = log(g) has a Normal distribution.
That's the definition of Lognormal!

Further, if we let y_k = log(g_k), then log(G_n) = log(g₁g₂g₃...g_n) = log(g₁) + log(g₂) + ... + log(g_n) = y₁ + y₂ + ... +y_n.

Since the y's are independent Normally distributed numbers, their sum is also Normally distributed.
That makes log(G_n) Normally distributed hence G_n itself is Lognormally distributed.
Remember what it means to say that F(x) is the cumulative distribution for a variable Y:
It means that the probability that a randomly chosen Y is less than some x is F(x) (as in Figure 2).
So, for any x and n random g's, what is the probability that G_n = g₁g₂g₃...g_n < x ?
That requires that log(G_n) < log(x).

Figure 2
But, as we've said, G_n is Lognormally distributed, so Y = log(G_n) is Normally distributed.

Suppose we call N[u,Mean,SD] the cumulative Normal distribution function with prescribed mean and Standard Deviation.

Then log(G_n) has a cumulative distribution described by N[u,Mean,SD]
where Mean and SD are the Mean and Standard Deviation of log(G_n).

>We know the mean and SD of G_n. That's results #3 ... but what about log(G_n)?
Good question. In fact, for Lognormally distributed G_n there's a relation between the Means and Standard Deviations ... like so:

If M_n and S_n are the Mean and Standard Deviation of G_n, then the Mean, μ, and Standard Deviation, σ, of the logarithm are:
μ = Mean[log(G_n)] = log(M_n) - (1/2)σ²
σ² = Variance[log(G_n)] = log(1 + S_n² / M_n²)
... assuming that G_n is Lognormally distributed.
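A quick way to convince yourself of that relation is to go there and back again. The sketch below converts the Mean and SD of G to the Mean and SD of log(G), then uses the standard Lognormal formulas to convert back. The function names and the input values are invented for illustration.

```python
import math

def log_moments(mean_g, sd_g):
    """Mean and SD of log(G), given the Mean and SD of a Lognormal G."""
    var_log = math.log(1.0 + (sd_g / mean_g) ** 2)
    mean_log = math.log(mean_g) - 0.5 * var_log
    return mean_log, math.sqrt(var_log)

def lognormal_moments(mean_log, sd_log):
    """The reverse direction: Mean and SD of G from those of log(G)."""
    mean_g = math.exp(mean_log + 0.5 * sd_log ** 2)
    sd_g = mean_g * math.sqrt(math.exp(sd_log ** 2) - 1.0)
    return mean_g, sd_g

mean_log, sd_log = log_moments(1.02, 0.09)   # made-up n-day Mean and SD
back = lognormal_moments(mean_log, sd_log)
print(mean_log, sd_log)
print(back)   # recovers 1.02 and 0.09, up to rounding
```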

Since we now have labels for the Mean and SD of log(G_n), call them μ and σ, we can write the cumulative distribution for log(G_n) as: N[u,μ,σ]

Probability that Today's Price lies within some interval centred on the n-day Mean: P_av

Notice that, if P₀ = $1.00, then the numbers G₁, G₂, G₃, etc., are just the subsequent stock prices.
We'll assume that's the case.
>Huh? What's the case?
That P₀ = $1.00 so the products G_k = g₁g₂...g_k are the stock prices. We'll stick P₀ in our formulas ... later.

Okay, we have that today's price P_n = g₁g₂...g_n is Lognormally distributed with a known Mean and Variance as given in Results #5.

Now we ask:
If the random variable G has a Lognormal distribution with given Mean and Variance,
what is the probability that G < x ... for a given number x ??

>You're asking me?
That was a rhetorical question. Now pay attention. We've been here before.

If G < x then log(G) < log(x)
Since G is Lognormal then log(G) is Normal
The distribution of log(G) is then described by N[u,Mean,SD]
the Normal cumulative distribution function
and Mean and Standard Deviation are the Mean and Standard Deviation of log(G) ... not of G itself!
The probability that log(G) < log(x) is then N[log(x),Mean,SD]
But log(G) < log(x) is the same condition as G < x so the probability is the same: N[log(x),Mean,SD]
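Those steps boil down to one line of code. Here's a sketch using Python's statistics.NormalDist as the N[...] function; the name prob_less_than and the numbers are just for illustration.

```python
import math
from statistics import NormalDist

def prob_less_than(x, mean_log, sd_log):
    """P(G < x) for a Lognormal G, where mean_log and sd_log are the
    Mean and SD of log(G) ... not of G itself."""
    return NormalDist(mu=mean_log, sigma=sd_log).cdf(math.log(x))

mean_log, sd_log = 0.015, 0.09           # made-up values for log(G)
median = math.exp(mean_log)
p = prob_less_than(median, mean_log, sd_log)
print(p)   # one half: log(G) is symmetric about its Mean
```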

>What about our stock prices?
Yes, of course. I'm sure you've recognized our G. It's today's stock price G_n ... assuming the starting price was P₀ = $1.00, n days ago.
In fact, we know the Mean and SD to use in this formula: μ and σ, the Mean and Standard Deviation of log(G_n).
In other words: The probability that G < x is N[log(x),μ,σ]

>So the chances of being in that Bollinger band is ... what?
If A is the probability of being less than U and B is the probability of being less than L, then ...
>It's B - A, eh?
Actually, it's A - B as in: N[log(U), μ, σ] - N[log(L), μ, σ] (μ and σ being the Mean and SD of log(G_n))

Note:

Remember that we're talking about the probability that the n-day Gain Factor lies between two numbers.
Don't confuse Gain Factors with daily returns.
In fact, a Gain Factor is 1 + (daily return).
The Mean of the Gain Factors, that's M, is "1" greater than the mean of the daily returns.
If the Mean of the daily returns is 0.0123 (that's 1.23%), then M = 1.0123.
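A tiny sketch of that note, with made-up daily returns chosen so their Mean is 1.23%. Adding 1 shifts the Mean by 1 but leaves the Standard Deviation alone.

```python
from statistics import fmean, pstdev

returns = [0.012, -0.008, 0.031, 0.0142]   # made-up daily returns
gains = [1.0 + r for r in returns]         # Gain Factor = 1 + (daily return)

print(fmean(returns), fmean(gains))        # the Means differ by 1
print(pstdev(returns), pstdev(gains))      # the SDs agree, up to rounding
```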

>When are you going to insert some other starting price ... P₀?
Other than $1.00? Right now.
The numbers U and L given in [a] and [b] assume an arbitrary P₀ value.
To generate the appropriate numbers for the case P₀ = $1.00, we'd divide each of U and L by P₀.
Assume we've divided U and L by P₀. We'll call these U' = U/P₀ and L' = L/P₀, okay?
Now we're talking about the case where P₀ = $1.00 (as we did above).
As we've seen above, the probability that G = P_n/P₀ < U' is N[log(U'), μ, σ] (μ and σ being the Mean and SD of log(G_n))
But U' = U/P₀ so P_n/P₀ < U' is the same condition as P_n < U.
If we then want the probability that P lies within the Bollinger Band for arbitrary starting Price, then ...
>Why don't you just give the result, okay?

Here's our final result:

Magic Formula

Assuming n random daily Gain Factors which are Lognormally
distributed with Mean = M and Standard Deviation = S then
the probability that the price, P_n, will lie between L and U
is given by:
Prob[L < P_n < U] = N[log(U/P₀), μ, σ] - N[log(L/P₀), μ, σ]
where P_n is the stock price at the end of the n day period
P₀ is the stock price at the start of the n day period
N[x, Mean, SD] is the Normal cumulative distribution function
M_n = Mⁿ (Mean of the n-day Gain Factor)
S_n² = (M²+S²)ⁿ - M²ⁿ (Variance of the n-day Gain Factor)
μ = log(M_n) - (1/2)σ² (Mean of the logarithm)
σ² = log(1 + S_n² / M_n²) (Variance of the logarithm)
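Here's the whole Magic Formula as a sketch in Python, going from the daily Mean and SD to the n-day Mean and Variance, then to the Mean and SD of the logarithm, then to the probability. The function name band_probability and every input number are made up for illustration.

```python
import math
from statistics import NormalDist

def band_probability(p0, lower, upper, m, s, n):
    """Prob[lower < P_n < upper], given the starting price p0 and the
    Mean m and Standard Deviation s of the daily Gain Factors."""
    # n-day Gain Factor: Mean and Variance.
    mean_n = m ** n
    var_n = (m * m + s * s) ** n - m ** (2 * n)
    # Logarithm of the n-day Gain Factor (Lognormal assumption).
    var_log = math.log(1.0 + var_n / mean_n ** 2)
    mean_log = math.log(mean_n) - 0.5 * var_log
    dist = NormalDist(mu=mean_log, sigma=math.sqrt(var_log))
    return dist.cdf(math.log(upper / p0)) - dist.cdf(math.log(lower / p0))

# Made-up inputs: a $30 stock, 0.1% mean daily return, 2% daily SD, 20 days.
p = band_probability(p0=30.0, lower=31.0, upper=33.0, m=1.001, s=0.02, n=20)
print(f"Prob[31 < P_n < 33] = {p:.3f}")
```

Both lower and upper must be positive, since we take their logarithms.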

>There are a lot of Means and Standard Deviations!
Sure.

We start with the Mean and Standard Deviation of daily Gain Factors

then to the Mean and Standard Deviation over n days

then, assuming a Lognormal distribution, the Mean and Standard Deviation of the logarithm

Remember: the probability (in the Magic Formula ) is really the probability that the Gain Factor (over n days) lies between L/P₀ and U/P₀.
That's the same as: Prob[L/P₀ < P_n/P₀ < U/P₀].
If, for example, both L and U are less than the starting Price P₀, then you're asking for the probability that, after n random gains, the price P_n has dropped to some range of lower values. Similarly, if both L and U are greater than P₀ and ...

>Do you have an example?
Okay, we'll consider GE over the past n = 20 days and ...
>Doesn't that Magic Formula work for the next n days?
Okay let's start with today's GE price (that's our P₀).
We'll look 20 days into the future to get a probability that the stock price will lie between L and U.
But we have to decide what historical data we use to calculate that Mean and Standard Deviation ... like maybe the past 150 days or maybe ...
>Don't you have a spreadsheet?
Yeah, it gives a picture ... like Figure 3.
>That 27% probability ... do you really believe that?
Of course! Don't I offer a money-back guarantee on my spreadsheets?

Figure 3

To download the .ZIPd spreadsheet, RIGHT-click on Figure 3 and Save Target ...

>Is it any good ... that probability?
Okay, here's what we'll do:

We'll use a Mean and Standard Deviation based upon the past D days (example: D = 150 days).
We'll start a year ago (that's December, 2002) and look at the price of GE stock at that time: That's our P₀.
We then look ahead 20 days and look at the stock price. That's P_n for n = 20.
Then we see if the stock price P_n is $1.00 to $3.00 higher than P₀. That is: L = P₀ + 1 and U = P₀ + 3.
We repeat this for every day from Dec/02 to Dec/03.
Then we see how many times the stock price did lie in the prescribed range L < P_n < U, 20 days later.
Was it the percent suggested by our Magic Formula ?
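The procedure above can be sketched like so. This is not the spreadsheet: the prices are a synthetic random walk (not GE), the function names are invented, and using the population SD of the past D gain factors is an assumption. It compares the average Magic Formula probability with the fraction of days the price actually landed in the range.

```python
import math
import random
from statistics import NormalDist, fmean, pstdev

def band_probability(p0, lower, upper, m, s, n):
    """Magic Formula: Prob[lower < P_n < upper] under the Lognormal assumption."""
    mean_n = m ** n
    var_n = (m * m + s * s) ** n - m ** (2 * n)
    var_log = math.log(1.0 + var_n / mean_n ** 2)
    mean_log = math.log(mean_n) - 0.5 * var_log
    dist = NormalDist(mu=mean_log, sigma=math.sqrt(var_log))
    return dist.cdf(math.log(upper / p0)) - dist.cdf(math.log(lower / p0))

def backtest(prices, d, n, lo_offset, hi_offset):
    """Return (average predicted probability, actual hit fraction)."""
    predicted, hits = [], []
    for t in range(d, len(prices) - n):
        history = prices[t - d:t + 1]            # d+1 prices give d gain factors
        gains = [b / a for a, b in zip(history, history[1:])]
        m, s = fmean(gains), pstdev(gains)       # population SD: an assumption
        p0 = prices[t]
        lower, upper = p0 + lo_offset, p0 + hi_offset
        predicted.append(band_probability(p0, lower, upper, m, s, n))
        hits.append(lower < prices[t + n] < upper)   # where did it land, n days later?
    return fmean(predicted), sum(hits) / len(hits)

# Synthetic prices (NOT real data): a seeded random walk starting at $30.
rng = random.Random(1)
prices = [30.0]
for _ in range(400):
    prices.append(prices[-1] * rng.lognormvariate(0.0005, 0.015))

predicted_frac, actual_frac = backtest(prices, d=150, n=20, lo_offset=1.0, hi_offset=3.0)
print(f"Magic Formula (average): {predicted_frac:.1%}   actual: {actual_frac:.1%}")
```

Change d to see how the predicted percent moves while the price history stays the same.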

>So U and L will always be $1 and $3 higher than the starting price, right?
Right, and n will stay fixed at 20 days.
>But the Magic Formula percent will depend upon your D, right?
Yes, that determines M and S ... so we'll do this for various values of D.
>But the actual percent won't change, right?
No. It depends only upon the past year's daily prices. They won't change.
>And?
Here's the result for various values of D:

Predicting n = 20 days ahead: L = P₀ + 1 and U = P₀ + 3
>So you're checking to see if the price, 20 days hence, is between $1 and $3 higher than the current price.
Well, we're seeing what the Magic Formula says and what the actual result was, over the past year.

Here's another result for different U and L and n:

Predicting n = 30 days ahead: L = P₀ + 2 and U = P₀ + 5
>That's looking 30 days into the future to see if the price has increased by between $2 and $5.
Yes.

>Some are pretty lousy, eh?
Ya win some, ya lose some ...