Standard Deviation: some comments on Volatility and Risk
We consider, at some point in time, the various monthly gains in our portfolio over the past N months ... like so (where the number of occurrences of each gain, out of N = 200, is plotted):

The spread about the average (or mean) gain is measured by the Standard Deviation:
$$SD^2 = \frac{1}{N}\sum_{k=1}^{N} (G_k - A)^2$$

where there are N portfolio gains, $G_k$ (k = 1 to N), and $A = \frac{1}{N}\sum G_k$ is their Average (or Mean). SD is the Root Mean Square Deviation between the gains and their average. It can also be computed like so:

$$SD^2 = \frac{1}{N}\sum G_k^2 - \left\{\frac{1}{N}\sum G_k\right\}^2$$

i.e. $SD^2$ is the difference between the average of the squares and the square of the average.

Fig. 1 A 10% monthly gain is unusual, but what the heck. It looks like a Normal Distribution, but that's an accident!

To prove this we first drop the subscripts (for sanitary reasons) and write:
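$$SD^2 = \frac{1}{N}\sum (G - A)^2 = \frac{1}{N}\sum \{G^2 - 2GA + A^2\} = \frac{1}{N}\sum G^2 - 2A\,\frac{1}{N}\sum G + \frac{1}{N}\sum A^2$$
$$= \frac{1}{N}\sum G^2 - 2A \cdot A + \frac{1}{N} N A^2 = \frac{1}{N}\sum G^2 - A^2$$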
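As a quick numerical check of the identity just proved, here is a minimal Python sketch. The function names and the sample gains are made up for illustration; they are not taken from any real portfolio.

```python
import math

def sd_from_deviations(gains):
    """SD as the root mean square deviation from the average."""
    n = len(gains)
    avg = sum(gains) / n
    return math.sqrt(sum((g - avg) ** 2 for g in gains) / n)

def sd_from_squares(gains):
    """SD as sqrt(average of the squares - square of the average)."""
    n = len(gains)
    avg = sum(gains) / n
    avg_of_squares = sum(g * g for g in gains) / n
    return math.sqrt(avg_of_squares - avg ** 2)

# Hypothetical monthly gains, in percent (illustration only).
gains = [2.0, -1.0, 4.0, 0.5, -3.0, 6.0]

sd_a = sd_from_deviations(gains)
sd_b = sd_from_squares(gains)
print(sd_a, sd_b)  # the two values agree up to floating-point rounding
```

Both functions divide by N (not N - 1), to match the definition used in the text.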
>Why the "squares" of the deviations from the Mean?
It's convenient, mathematically speaking. We could, however, use other measures of how far the returns are from their Mean. For example, we could pick the largest deviation magnitude, or the average of the deviation magnitudes:
$$\frac{1}{N}\sum |G_k - A|$$
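These measures are easy to compute side by side. The sketch below uses a hypothetical helper, `deviation_measures`, and invented returns. For any set of numbers the average deviation magnitude can't exceed the SD, and the SD can't exceed the largest deviation magnitude.

```python
def deviation_measures(gains):
    """Return the Mean, SD, maximum deviation and average deviation magnitude."""
    n = len(gains)
    avg = sum(gains) / n
    abs_devs = [abs(g - avg) for g in gains]
    sd = (sum(d * d for d in abs_devs) / n) ** 0.5
    return {
        "mean": avg,
        "sd": sd,
        "max_dev": max(abs_devs),
        "avg_dev": sum(abs_devs) / n,
    }

# Hypothetical annual returns, in percent (illustration only).
returns = [10.0, -5.0, 20.0, 15.0, -10.0]

m = deviation_measures(returns)
print(m)
```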

>Example?
For the S&P 500, if we consider the annual returns from Jan 1 to Dec 31, starting in Jan/50 (and ending Jan/00), we'd get:
Mean = 10.1%
Standard Deviation = 15.8%
Maximum Deviation magnitude = 39.8%     in 1952
Average Deviation magnitude = 13.1%
>I like the last guy.
Pay attention.
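Suppose each $G_k$ is increased (decreased) by a factor λ (with λ > 0). Then:

$$\text{new SD} = \left\{\frac{1}{N}\sum (\lambda G_k)^2 - \left\{\frac{1}{N}\sum \lambda G_k\right\}^2\right\}^{1/2} = \lambda\left\{\frac{1}{N}\sum G_k^2 - \left\{\frac{1}{N}\sum G_k\right\}^2\right\}^{1/2} = \lambda\,SD$$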

Conclusion? SD also increases (decreases) by the factor λ. (If all gains double, then SD will double.)
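The scaling rule can be confirmed with a few lines of Python. The gains below are invented, and `statistics.pstdev` is the standard-library population SD (the divide-by-N version used here).

```python
import statistics

# Hypothetical monthly gains, in percent (illustration only).
gains = [1.5, -2.0, 3.0, 0.5, -0.5, 4.0]
lam = 2.0  # the multiplier λ, assumed positive

scaled = [lam * g for g in gains]

sd_before = statistics.pstdev(gains)
sd_after = statistics.pstdev(scaled)

# The ratio of the two SDs should equal λ, up to rounding.
print(sd_after / sd_before)
```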

Fig. 2 & 3 Distribution of 200 Normally Distributed monthly returns where each has doubled
Now, suppose the individual gains are changed by ADDING a constant C (rather than multiplying by a constant λ).
Here's a picture of two stocks whose returns differ by a constant:

We have, as the new average:

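$$\frac{1}{N}\sum (G + C) = \frac{1}{N}\sum G + \frac{1}{N}\sum C = \frac{1}{N}\sum G + \frac{1}{N} N C = A + C$$
hence the square of the new Standard Deviation is:
$$\frac{1}{N}\sum \{(G + C) - (A + C)\}^2 = \frac{1}{N}\sum \{G - A\}^2 = SD^2$$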
In other words, the SD is unchanged by the addition of a constant. (Uh ... did I mention that we dropped the subscripts again?)
These results are independent of the type of statistical distribution: Normal, logNormal, MickeyMouse, etc.

One important consequence of these results is that if we have a collection of numbers, say $\{G_k\}$,
with Mean = 0 and SD = 1, then the collection $\{\lambda G_k + C\}$ (with λ > 0) will have Mean = C and Standard Deviation = λ.
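This is how you'd manufacture a set of returns with any Mean and SD you like. The sketch below is one way to do it; `standardize` and `rescale` are hypothetical helper names and the starting numbers are arbitrary.

```python
import statistics

def standardize(values):
    """Shift and scale the values so they have Mean 0 and SD 1."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]

def rescale(standardized, target_mean, target_sd):
    """Apply G -> target_sd * G + target_mean (the λG + C of the text)."""
    return [target_sd * z + target_mean for z in standardized]

raw = [3.0, -1.0, 4.0, 1.0, -5.0, 9.0, 2.0, -6.0]  # arbitrary numbers
z = standardize(raw)
gains = rescale(z, target_mean=1.0, target_sd=4.0)

print(statistics.mean(gains), statistics.pstdev(gains))  # close to 1.0 and 4.0
```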

Aah, but what if the individual gains do NOT increase by the same factor?

Write:

$$SD^2 = \frac{1}{N}\sum G_k^2 - \left\{\frac{1}{N}\sum G_k\right\}^2$$

and consider the effect of modifying a single gain, namely $G_i$.
Compute $d/dG_i$ of each side (and cancel the common factor of 2) to get:

$$SD\,\frac{dSD}{dG_i} = \frac{1}{N}\left\{G_i - \frac{1}{N}\sum G_k\right\} = \frac{1}{N}(G_i - A)$$

and this is positive (or negative), implying an increase (or decrease) in SD, if the gain $G_i$ is greater (or less) than the average of all the gains, $A = \frac{1}{N}\sum G_k$.

     Increases in those gains which are LESS than the average gain, will cause the SD to decrease.
     Decreases in those gains which are GREATER than the average gain, will also cause the SD to decrease.
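The derivative formula can be checked against a finite-difference estimate. This is only a sketch: the function names are made up, the gains are invented, and the step size h is an arbitrary small number.

```python
import statistics

def sd_sensitivity(gains, i):
    """Formula from the text, solved for dSD/dG_i = (G_i - A) / (N * SD)."""
    n = len(gains)
    avg = sum(gains) / n
    return (gains[i] - avg) / (n * statistics.pstdev(gains))

def sd_sensitivity_numeric(gains, i, h=1e-6):
    """Central finite-difference estimate of dSD/dG_i."""
    up = list(gains)
    up[i] += h
    down = list(gains)
    down[i] -= h
    return (statistics.pstdev(up) - statistics.pstdev(down)) / (2 * h)

# Hypothetical gains, in percent; their average is 1.0.
gains = [4.0, -2.0, 1.0, 7.0, -5.0]

for i, g in enumerate(gains):
    # Positive for gains above the average, negative for gains below it.
    print(g, sd_sensitivity(gains, i), sd_sensitivity_numeric(gains, i))
```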

That's sort of obvious since SD measures the spread of gains about the average. Increasing the smaller gains and/or decreasing the larger gains will reduce this spread, hence the SD. Indeed, for a Normal Distribution (the infamous "Bell Curve"), about 2/3 of the returns lie between Mean - SD and Mean + SD. Stare at the above graphs and convince yourself that this is true. (For the first graph, this range is from 10% - 20% = - 10% to 10% + 20% = 30%.) When SD decreases, these 2/3 crowd closer to the Mean.
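The "about 2/3" figure can also be checked by simulation rather than by staring. The sketch below assumes Normally distributed returns with Mean 10% and SD 20% (the numbers quoted for the first graph); the sample size and the seed are arbitrary choices.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

# Simulated returns, in percent: Normal with Mean 10 and SD 20 (an assumption).
returns = [random.gauss(10.0, 20.0) for _ in range(100_000)]

mean = sum(returns) / len(returns)
sd = statistics.pstdev(returns)

# Count the returns lying between Mean - SD and Mean + SD.
inside = sum(1 for r in returns if mean - sd <= r <= mean + sd)
fraction = inside / len(returns)

print(round(fraction, 3))  # roughly 0.68 for a Normal distribution
```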


Now, it's reasonable to measure the volatility of the gains in terms of their spread: widely varying gains means high volatility, right? So, investment gurus DEFINE volatility as the Standard Deviation. Who can argue with that?

Now suppose MY monthly gains are all increased by some positive constant C compared to YOUR monthly gains. (Say C = 30%, that'd be nice.) As we've seen, SD, hence the volatility, doesn't change. Our graph just gets shifted to the right by an amount C. Same spread, same SD, same volatility. But wouldn't you say that the risk has decreased, for my portfolio? After all, my gains are now 30% higher than yours. Less risky, right? Alas, the investment community says the risk is the same for both our portfolios because they normally DEFINE risk as the Standard Deviation (and not as the risk of a loss).

For this association of "risk" with "SD", see 1 and 2 and 3 and ... and n. Even William Sharpe defines the Return per unit of Risk as some Return divided by the Standard Deviation!
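Here is a small sketch of the shifted-gains situation. The gains are invented, and `return_per_unit_risk` is a hypothetical, simplified stand-in for the "Return divided by Standard Deviation" idea, not anybody's official formula.

```python
import statistics

def return_per_unit_risk(gains):
    """Average gain divided by SD (a simplified illustration only)."""
    return statistics.mean(gains) / statistics.pstdev(gains)

# Hypothetical monthly gains, in percent.
your_gains = [2.0, -3.0, 5.0, -1.0, 4.0, -4.0]
C = 30.0
my_gains = [g + C for g in your_gains]  # every gain increased by C

sd_yours = statistics.pstdev(your_gains)
sd_mine = statistics.pstdev(my_gains)

# Number of losing months in each portfolio.
your_losses = sum(1 for g in your_gains if g < 0)
my_losses = sum(1 for g in my_gains if g < 0)

print(sd_yours, sd_mine)       # same SD, so same "risk" by the usual definition
print(your_losses, my_losses)  # but very different numbers of losing months
print(return_per_unit_risk(your_gains), return_per_unit_risk(my_gains))
```

With these made-up numbers, the SD can't tell the two portfolios apart, but the count of losing months and the return per unit of risk both can.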


Fig. 4
Now suppose MY monthly gains are all increased by some positive multiplier F > 1 compared to YOUR monthly gains. If your monthly gains are all positive, then F = 2 would be nice. That'd make my gains twice the size of yours.
As we've seen, SD, hence the volatility, changes by the same factor. All the gains in MY distribution graph get shifted to the right by a factor F, the spread increases by that factor and the SD (and the volatility) change by the same factor. (The height of my graph decreases 'cause the number of returns hasn't changed, so there are fewer in each of the intervals along the horizontal axis - because of the increased spread.) Because of this increased spread, there is a greater probability of Returns far from the mean; that's the extended tail of the distribution chart. For some distributions, this extended tail means a greater chance of disastrous Returns! (See Kurtosis)
See Fig. 3, compared to Fig. 2, or, better still, here's a closeup of the tails:

Fig. 5a The tail of the distribution

Fig. 5

But, if all YOUR gains were positive, wouldn't you say that the risk has decreased, for my portfolio? After all, my monthly gains are now higher than yours by the factor F. Less risky, right? Alas, the investment community says the risk is higher for my portfolio because (have we said this before?) they usually DEFINE risk as the Standard Deviation ... and that's increased!
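The multiplied-gains case can be sketched the same way. The numbers are invented (all positive, as in the argument above) and F = 2 is just the example value from the text.

```python
import statistics

# Hypothetical monthly gains, in percent, all positive.
your_gains = [1.0, 3.0, 0.5, 2.0, 4.0, 1.5]
F = 2.0
my_gains = [F * g for g in your_gains]  # every gain multiplied by F

sd_yours = statistics.pstdev(your_gains)
sd_mine = statistics.pstdev(my_gains)

# Worst month in each portfolio.
worst_yours = min(your_gains)
worst_mine = min(my_gains)

# Average gain divided by SD, for each portfolio.
ratio_yours = statistics.mean(your_gains) / sd_yours
ratio_mine = statistics.mean(my_gains) / sd_mine

print(sd_mine / sd_yours)       # SD (the "risk") has grown by the factor F
print(worst_yours, worst_mine)  # yet my worst month beats your worst month
print(ratio_yours, ratio_mine)  # and average gain / SD is unchanged
```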

Fig. 6
One other thing: the charts above look like Normal Distributions. Well ...uh, they are, but it's only so I could get pretty pictures. The real, live monthly returns for, say, the S&P 500 look sorta Normal ... but that's not important for what we're discussing.

Fig. 7
Oh yeah ... one other thing: We often (?) hear that the Standard Deviation increases as the time interval increases. In fact, a common statement is that SD varies as the square root of the time: SQRT(time).

This follows (mathematically speaking) from Einstein's 1905 analysis of a random walk (or Brownian Motion) and, earlier, it's associated with Louis Bachelier and even earlier with Jules Regnault (See this.PDF). It assumes the returns are random in this sense. Are they?

Looking at all intervals of length 1 month, 6 months, 12 months, etc., and calculating the SD for each set, we get this chart. Close, but no cigar.

P.S. The argument goes something like this:

If you start at x = 0 and take N steps of length $x_1, x_2, \ldots, x_N$, then your distance from the origin (x = 0) is R, where $R = x_1 + x_2 + \ldots + x_N$. To get the Standard Deviation of the set of distances from the origin (over all possible sets $x_1, x_2, \ldots$), we consider the Mean Square of R ... hence we consider:
$$R^2 = (x_1 + x_2 + \ldots + x_N)^2 = x_1^2 + x_2^2 + \ldots + x_N^2 + 2x_1x_2 + 2x_1x_3 + \ldots$$
If the Mean = 0 (for the steps $x_1$, etc.), then each is as likely to be positive as negative and, for large N, the cross terms like $x_1x_2$ will average to zero and we're left with
$$R^2 = x_1^2 + x_2^2 + \ldots + x_N^2 = N\left\{\frac{x_1^2 + x_2^2 + \ldots + x_N^2}{N}\right\} = N\,\{\text{Standard Deviation of the x's}\}^2$$

leaving us with: $R = N^{1/2}\,\{\text{Standard Deviation of the x's}\}$ * ... and the square root pops up.
If the number of steps, N, increases with time, then this is proportional to SQRT(time).
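A simulation makes the square root visible. The sketch below assumes independent, Normally distributed steps with Mean 0 and SD 1 (an assumption made for the illustration, not a claim about real returns); `rms_distance` is a hypothetical helper and the number of walks is arbitrary.

```python
import math
import random

random.seed(42)  # fixed seed so the run is repeatable

def rms_distance(n_steps, n_walks=2000, step_sd=1.0):
    """Root mean square of R = x1 + ... + xN over many simulated walks."""
    total = 0.0
    for _ in range(n_walks):
        r = sum(random.gauss(0.0, step_sd) for _ in range(n_steps))
        total += r * r
    return math.sqrt(total / n_walks)

for n in (1, 4, 16, 64):
    # The simulated value should land near sqrt(N) times the step SD.
    print(n, round(rms_distance(n), 2), math.sqrt(n))
```

Quadrupling the number of steps roughly doubles the root mean square distance, which is the SQRT(time) behaviour described above.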

* Note that $\{\text{Standard Deviation of the x's}\}^2$ is the difference between the average of the squares and the square of the average. That is:
$$\{\text{Standard Deviation of the x's}\}^2 = \frac{1}{N}\sum x_k^2 - \left\{\frac{1}{N}\sum x_k\right\}^2$$
But, if the average value of the x's is zero, then $\sum x_k = 0$ so
$$\frac{1}{N}\sum x_k^2 = \{\text{Standard Deviation of the x's}\}^2$$

We might also do this in 2 dimensions.
Suppose we start at the origin (0,0) and move left or right in steps of length $x_1, x_2, \ldots$
and, at the same time, move up or down in steps of length $y_1, y_2, \ldots$

Then, after N steps, we're at position $(x_1 + x_2 + \ldots + x_N,\; y_1 + y_2 + \ldots + y_N)$

Our distance from the origin is then R where:

$$R^2 = (x_1 + x_2 + \ldots + x_N)^2 + (y_1 + y_2 + \ldots + y_N)^2$$
As above (assuming that the Mean of the x's and y's is zero and there is no correlation between successive x's or y's), we'd get
$$R^2 = N\left[\{\text{Standard Deviation of the x's}\}^2 + \{\text{Standard Deviation of the y's}\}^2\right]$$

or (since the Variance is the square of the Standard Deviation)