Transcript

In Part 1 of this video we used an experiment to show that population and sample variances differ from each other, and in Part 2 we showed why they differ.  Here, in Part 3, we will show how these variances are related to each other mathematically. Some math is required, so you had better hang on! But it’s not that hard – you can do it – and it won’t take any longer than a roller coaster ride.

(0:29/6:18)

Recall that the population variance is defined as the sum of the squared differences between each member of the population and the population mean, divided by the number of members in the population.
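For reference, in symbols (writing x_i for the individual members, N for the population size, and mu for the population mean — notation assumed here rather than taken from the video):

$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mu\right)^2$$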

However, in order to calculate this quantity, we would need to know the values of every member of the population. When the population is large, perhaps consisting of hundreds, millions, or billions of members, it becomes impractical to measure every member of the population. So what we do instead is to sample a few of the members of the population and try to infer the properties of the whole population from that sample.

It turns out that if we do this same kind of calculation for just a sample of size n, we get an unbiased estimate of the population variance. The estimate is not exact, so we call it an expected value and write it like this.
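In the same assumed notation, with x_1 through x_n a sample of size n, the unbiasedness claim reads:

$$E\left[\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \mu\right)^2\right] = \sigma^2$$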

But there is one more problem: if we only have a sample, we won’t know the true population mean, mu. As a result, the best we can do is to calculate the squared differences in the formula with respect to the sample mean rather than the population mean. We recognize the new quantity we get, that is, the sum of x-sub-i minus x-bar squared divided by n, as being the sample variance, the version with an n in the denominator.
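In symbols, with x-bar the sample mean, this n-denominator sample variance is (the label s_n^2 is chosen here for reference; the video may use different notation):

$$s_n^2 = \frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$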

So we have two problems: we don’t know mu, and we only have limited sample data.

Using a large enough sample addresses the limited-data problem, but dealing with not having the value of mu will take a bit more work.

Now, let’s think about how the expected population variance, the quantity we want, is related to the sample variance, the quantity we can actually calculate.

(2:16/6:18)

Fortunately, there are strong similarities between these two equations. Let’s rewrite them in a form that will make it easier to do some mathematical manipulations.

Then let’s expand them and write them as separate sums.
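A sketch of this expansion step, in the assumed notation — first the mu-based expression, then the sample variance:

$$\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 \;-\; \frac{2\mu}{n}\sum_{i=1}^{n} x_i \;+\; \frac{1}{n}\sum_{i=1}^{n} \mu^2$$

$$s_n^2 = \frac{1}{n}\sum_{i=1}^{n} x_i^2 \;-\; \frac{2\bar{x}}{n}\sum_{i=1}^{n} x_i \;+\; \frac{1}{n}\sum_{i=1}^{n} \bar{x}^2$$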

Now, let’s add and subtract the sums of x-bar squared over n and two times x-sub-i times x-bar divided by n to the top expression, a step that does not change its value. Why we do so will become apparent in a moment.

Next, let’s rearrange the upper equation so that the first three terms are exactly the same as those in the sample variance. This allows us to replace them with the sample variance.
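Sketching these two steps in symbols — adding and subtracting the extra sums, then grouping the first three terms into the sample variance:

$$\frac{1}{n}\sum(x_i - \mu)^2 = \underbrace{\frac{1}{n}\sum x_i^2 - \frac{2}{n}\sum x_i\bar{x} + \frac{1}{n}\sum \bar{x}^2}_{s_n^2} \;+\; \frac{1}{n}\sum \mu^2 \;-\; \frac{1}{n}\sum \bar{x}^2 \;+\; \frac{2}{n}\sum x_i\bar{x} \;-\; \frac{2\mu}{n}\sum x_i$$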

Next we notice that some of the remaining sums can be simplified.

The first one, mu squared added to itself n times and divided by n, simplifies to mu squared. The next one is x-bar squared added to itself n times and divided by n, which simplifies to x-bar squared. In the next term we see that the two and the x-bar are constants inside the sum, so we can take them out front and we get two x-bar times the sum of x-sub-i over n; that sum, of course, is simply x-bar, so the term simplifies to two x-bar squared. And for the very last term we can take the two and the mu out in front, and it simplifies to two mu times x-bar. We can further rearrange these terms into the following expression, which finally simplifies to our end result: the expected value of sigma squared is equal to the sample variance plus the square of x-bar minus mu.
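Putting that chain of simplifications in symbols, the leftover terms combine into a perfect square:

$$\frac{1}{n}\sum_{i=1}^{n}(x_i - \mu)^2 = s_n^2 + \mu^2 - \bar{x}^2 + 2\bar{x}^2 - 2\mu\bar{x} = s_n^2 + \left(\bar{x} - \mu\right)^2$$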

(4:08/6:18)

Before we can continue we need to make a small diversion, but don’t forget this result. You may remember that the sample mean can be considered a random variable equal to the sum of the random variables x-sub-i, that is, the members of the sample, divided by n. As a result, the variance of x-bar is equal to one over n-squared times the sum of the variances of the data x-sub-i. This is equal to one over n-squared times n times sigma squared, which can be simplified to sigma squared over n. The variance of x-bar can be interpreted as the expected squared difference between the sample and population means. Thus we can write that the expected value of the squared difference between x-bar and mu is equal to sigma squared over n.
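In symbols, assuming (as the argument implies) that the sample members are independent draws each with variance sigma squared:

$$\operatorname{Var}(\bar{x}) = \operatorname{Var}\left(\frac{1}{n}\sum_{i=1}^{n} x_i\right) = \frac{1}{n^2}\sum_{i=1}^{n}\operatorname{Var}(x_i) = \frac{1}{n^2}\, n\,\sigma^2 = \frac{\sigma^2}{n}$$

$$E\left[\left(\bar{x} - \mu\right)^2\right] = \frac{\sigma^2}{n}$$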

(5:04/6:18)

Returning to our earlier result, we substitute sigma squared over n for the squared difference between the sample and population means. The substitution is not exact, since we are replacing a true difference specific to a particular sample with an expected value, that is, the average difference one would expect over many samples.
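After the substitution, and with the understanding that the equality now holds only in expectation, the earlier result reads roughly:

$$\sigma^2 \approx s_n^2 + \frac{\sigma^2}{n}$$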

Since sigma squared is only available to us through its expected value, we replace sigma squared with its expected value wherever it appears.

Next we do a few algebraic manipulations to solve for the expected value of sigma squared.
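A sketch of that algebra, in the video’s loose notation where E(sigma squared) stands for the expected-value version of the population variance:

$$E(\sigma^2) = s_n^2 + \frac{E(\sigma^2)}{n} \quad\Rightarrow\quad E(\sigma^2)\left(1 - \frac{1}{n}\right) = s_n^2 \quad\Rightarrow\quad E(\sigma^2) = \frac{n}{n-1}\, s_n^2$$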

The n divided by (n-1) factor that appears in the final equation is called Bessel’s correction.

Now, if we write out the definition of the sample variance, and do a bit more simplification, we get the answer we were seeking – a relationship between sample variance and population variance.
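Writing out s_n^2 and cancelling the factors of n gives that final relationship:

$$E(\sigma^2) = \frac{n}{n-1}\cdot\frac{1}{n}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2 = \frac{1}{n-1}\sum_{i=1}^{n}\left(x_i - \bar{x}\right)^2$$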

And so, your patience has paid off: you have finally arrived at the famous and often mysterious variance formula with an n-1 in its denominator. Putting an n-1 in the denominator rather than an n removes the bias that would otherwise exist.