Sampling Distributions
-
Introduction and Definitions
At the beginning of this course you were introduced
to populations, samples, and sampling from a population. It
was stated that samples were to be used to make inferences about populations.
You then learned to describe populations and samples graphically (histograms,
boxplots, etc.) and numerically (means, medians, standard deviations,
etc.). Next, you were introduced to concepts in probability,
and you learned to apply these probability concepts to random variables.
Finally, in the chapters leading up to sampling distributions, you
were introduced to certain discrete (binomial, geometric, etc.) and
continuous (normal, exponential, and uniform) random variables.
In this section on sampling distributions these ideas
are combined into a method that can be used to make inferences about
a population based on a random sample taken from the population.
This link takes you to a
web page from Canada that expands on the concepts described in the
previous paragraph.
-
Parameters and Statistics
A population can be described numerically by its mean,
standard deviation, median, and in many other numeric ways.
When such a number is computed for a population, it is called a parameter
of the population. Two parameters of populations that will be
needed here are the population mean and population standard deviation.
The formulas and symbols used to represent them are shown next, first
the population mean and then the population standard deviation.
Elements of the population are denoted by x1, x2, ... , xN.
μ = (x1 + x2 + ... + xN)/N
σ = √[ ((x1 - μ)² + (x2 - μ)² + ... + (xN - μ)²)/N ]
A sample can be described numerically in the same way
as a population. However, the numeric quantities that describe
a sample are called statistics. Two statistics to be used here
are the sample mean and sample standard deviation. The formulas
and symbols for the sample mean and sample standard deviation statistics
are shown next. Again, the first formula is for the sample mean
and the second is for the sample standard deviation. The elements
of a sample of size n taken from the population of size N are denoted
by x1, x2, ... , xn.
x̄ = (x1 + x2 + ... + xn)/n
s = √[ ((x1 - x̄)² + (x2 - x̄)² + ... + (xn - x̄)²)/(n - 1) ]
Notice that the formulas for the mean of a population
and the mean of a sample are the same (except for the size of the
population, N, and size of the sample, n). However, the formulas
for standard deviation are different. The divisor is N in the
formula for population standard deviation while it is n-1 for the
sample standard deviation. This slightly different divisor is
used because it gives a better estimate of the population standard
deviation (in statistical terminology, dividing by n-1 makes the sample
variance an unbiased estimator of the population variance).
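The short Python sketch below is an illustration added to this discussion (the helper names are not from the original page); it computes the population mean and standard deviation with the divisor N, and the sample mean and standard deviation with the divisor n-1, for the small population used later in this section.
```python
# A minimal sketch; the helper names are illustrative and simply mirror
# the formulas described above.
import math

def population_mean_sd(values):
    N = len(values)
    mu = sum(values) / N
    sigma = math.sqrt(sum((x - mu) ** 2 for x in values) / N)       # divisor N
    return mu, sigma

def sample_mean_sd(values):
    n = len(values)
    xbar = sum(values) / n
    s = math.sqrt(sum((x - xbar) ** 2 for x in values) / (n - 1))   # divisor n-1
    return xbar, s

population = [1, 2, 3, 4, 5, 6]
print(population_mean_sd(population))   # (3.5, 1.7078...)
print(sample_mean_sd([2, 5]))           # one sample of size 2: (3.5, 2.1213...)
```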
-
Sampling Distributions of Statistics
The sampling distribution of a statistic is the distribution
of that statistic for all possible samples of fixed size, say n, taken
from the population. For example, if the population consists
of numbers 1,2,3,4,5, and 6, there are 36 samples of size 2 when sampling
with replacement. If the sample mean is computed for each of
these 36 samples, the distribution of these 36 sample means is the
sampling distribution of sample means for samples of size 2 taken
with replacement from the population 1,2,3,4,5, and 6. Likewise,
you could compute the sample standard deviation for each of the 36
samples. The distribution of these 36 sample standard deviations
is the sampling distribution of sample standard deviations for all
samples of size 2 taken with replacement from the given population.
The sampling distributions of these and other statistics
need to be studied in order to develop principles for making inferences
about a population based on a random sample from that population.
In practice, a single sample of a certain size, n, is usually selected,
and population inferences are made from this single sample.
However, in order to see what can be inferred about the population
from a single sample, we must first look at all, or at least a large
number, of the samples of size n taken from the given population.
For each sample the statistic of interest is computed, and the distribution
of all (or a large number) of these statistics is determined.
From this sampling distribution, principles of inference are developed.
In this presentation the sampling distributions of sample means and
sample standard deviations are introduced.
-
Sampling Distribution of
Sample Means
The sampling distribution of a sample mean is the distribution
of all sample means for samples of a fixed size, say n, taken from some
population, usually without replacement, although for mathematical convenience,
sampling with replacement is investigated first. Also, in most
cases the population has many members (i.e., the population size, N,
is large). The size of the population is often the major reason
for using sampling--if the population were very small, you could survey
the entire population and make statements based on the entire population.
For convenience, a very small population is used in the next example.
In this first example, the population consists of the
numbers 1,2,3,4,5, and 6. The 36 random samples of size 2 taken
with replacement from this population are shown in the next table.
Also shown are the sample means, sample standard deviations (stdev),
and sample variances (var) for each sample. This sampling situation
can be simulated by tossing a pair of fair dice--for convenience, suppose
one die is colored green and the other is the normal white color.
In each sample listed below, the first number is the number on the
white die and the second is the number on the green die.
White die = 1:
  1,1  mean=1    stdev=0     var=0
  1,2  mean=1.5  stdev=0.71  var=0.5
  1,3  mean=2    stdev=1.41  var=2
  1,4  mean=2.5  stdev=2.12  var=4.5
  1,5  mean=3    stdev=2.83  var=8
  1,6  mean=3.5  stdev=3.54  var=12.5
White die = 2:
  2,1  mean=1.5  stdev=0.71  var=0.5
  2,2  mean=2    stdev=0     var=0
  2,3  mean=2.5  stdev=0.71  var=0.5
  2,4  mean=3    stdev=1.41  var=2
  2,5  mean=3.5  stdev=2.12  var=4.5
  2,6  mean=4    stdev=2.83  var=8
White die = 3:
  3,1  mean=2    stdev=1.41  var=2
  3,2  mean=2.5  stdev=0.71  var=0.5
  3,3  mean=3    stdev=0     var=0
  3,4  mean=3.5  stdev=0.71  var=0.5
  3,5  mean=4    stdev=1.41  var=2
  3,6  mean=4.5  stdev=2.12  var=4.5
White die = 4:
  4,1  mean=2.5  stdev=2.12  var=4.5
  4,2  mean=3    stdev=1.41  var=2
  4,3  mean=3.5  stdev=0.71  var=0.5
  4,4  mean=4    stdev=0     var=0
  4,5  mean=4.5  stdev=0.71  var=0.5
  4,6  mean=5    stdev=1.41  var=2
White die = 5:
  5,1  mean=3    stdev=2.83  var=8
  5,2  mean=3.5  stdev=2.12  var=4.5
  5,3  mean=4    stdev=1.41  var=2
  5,4  mean=4.5  stdev=0.71  var=0.5
  5,5  mean=5    stdev=0     var=0
  5,6  mean=5.5  stdev=0.71  var=0.5
White die = 6:
  6,1  mean=3.5  stdev=3.54  var=12.5
  6,2  mean=4    stdev=2.83  var=8
  6,3  mean=4.5  stdev=2.12  var=4.5
  6,4  mean=5    stdev=1.41  var=2
  6,5  mean=5.5  stdev=0.71  var=0.5
  6,6  mean=6    stdev=0     var=0
The collection of 36 sample means constitutes the sampling
distribution of sample means for samples of size 2 taken with replacement
from the population 1,2,3,4,5, and 6. Since each one of these
36 samples occurs with the same probability (1/36), the probability distribution
of the sample means can easily be found and is displayed in the next
table. Later, the probability distribution of sample variances
will be studied.
Sample Mean:   1     1.5   2     2.5   3     3.5   4     4.5   5     5.5   6
Probability:   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36
The mean of this sampling distribution of sample means
for samples of size 2 equals (1)(1/36)+(1.5)(2/36)+(2)(3/36)+...+(6)(1/36)
= 3.5.
The variance of this distribution is E[X²]-(E[X])².
E[X] was just computed and equals 3.5. E[X²]=(1²)(1/36)+(1.5²)(2/36)+(2²)(3/36)+...+(6²)(1/36)
= 13.71. Then Var[X]=13.71-(3.5)²=1.458. The
standard deviation is the square root of the variance, or 1.21.
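If you want to verify the table and these computations, the following Python sketch (an illustration added here, not part of the original page) enumerates all 36 samples of size 2 taken with replacement from 1,2,3,4,5, and 6, tabulates the sampling distribution of the sample mean, and recomputes its mean and variance.
```python
# Enumerate all 36 ordered samples of size 2 (sampling with replacement)
# and build the sampling distribution of the sample mean.
from itertools import product
from collections import Counter
from fractions import Fraction

population = [1, 2, 3, 4, 5, 6]
samples = list(product(population, repeat=2))            # all 36 ordered samples
means = [Fraction(a + b, 2) for a, b in samples]

dist = Counter(means)                                    # sampling distribution of the mean
for m in sorted(dist):
    print(f"mean {float(m):<4}  probability {dist[m]}/36")

mean_of_means = sum(m * Fraction(c, 36) for m, c in dist.items())
var_of_means = sum((m - mean_of_means) ** 2 * Fraction(c, 36) for m, c in dist.items())
print(float(mean_of_means))   # 3.5
print(float(var_of_means))    # 1.4583..., whose square root is about 1.21
```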
The graph of the sampling distribution of sample means
is shown next.
This probability distribution doesn't look like the distribution
of the population from which the samples were selected. The distribution
of the population is shown in the next table followed by a graph of
that distribution.
Number:       1    2    3    4    5    6
Probability:  1/6  1/6  1/6  1/6  1/6  1/6
The mean or expected value of the population is (1)(1/6)+(2)(1/6)+(3)(1/6)+(4)(1/6)+(5)(1/6)+(6)(1/6)=3.5.
The variance of this distribution is E[X²]-(E[X])².
E[X] was just computed and equals 3.5. E[X²]=(1²)(1/6)+(2²)(1/6)+(3²)(1/6)+(4²)(1/6)+(5²)(1/6)+(6²)(1/6)=15.17.
Then Var[X]=15.17-(3.5)² =2.92, so the standard deviation is
the square root of 2.92, or 1.71.
The graph of this population probability distribution
is shown below.
Looking at the graphs of these two probability distributions
and their underlying probability tables, what are the relationships
between them? First, the means are equal. Second, the standard
deviation of the sampling distribution is smaller than the standard deviation
of the population. Finally, what about the shapes of the graphs?
To answer this question, take a look at the next link.
Follow
this link to reach a page that shows a simulation of the distribution
of sample means and other statistics for the dice experiment.
When you reach the page, press the red die in front of exercise 2 to
see the dice experiment simulation. Use this simulation to investigate
the theoretical probability distribution of sample means (blue histogram)
for samples of size n as n is increased.
Perhaps the symmetry and uniformity of the population
is the reason that the distribution of sample means looks more like a normal
distribution as the sample size increases. To see a
Java simulation that shows the distribution of sample means approaching
a normal distribution regardless of the population's shape, follow this link.
When the Java applet opens, you can choose the shape of the population.
The simulation shows what happens when a large number of samples of a
certain size, rather than all of them, are taken.
The main points demonstrated in these examples:
-
The mean of the distribution of sample means equals
the mean of the population, or symbolically, μ_x̄ = μ.
-
The standard deviation of the distribution of sample
means for samples of size n equals the standard deviation of the
population divided by the square root of the sample size, or symbolically,
σ_x̄ = σ/√n. Or, equivalently, in terms of variance, σ_x̄² = σ²/n.
-
Central limit theorem: The sampling distribution of
sample means is approximately normally distributed. The approximation
is better for larger values of n. If the population has a
normal distribution, the sampling distribution of sample means is
exactly normally distributed. (A short simulation sketch illustrating
these points follows this list.)
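The simulation sketch below is illustrative (the sample size n = 30 and the number of repetitions are arbitrary choices, not values from the text); it draws many samples with replacement from the population 1,2,3,4,5, and 6 and checks the first two points: the sample means average out to the population mean, and their standard deviation is close to the population standard deviation divided by the square root of n.
```python
# Repeatedly sample with replacement and summarize the observed sample means.
import random
import statistics

population = [1, 2, 3, 4, 5, 6]
mu = statistics.fmean(population)         # 3.5
sigma = statistics.pstdev(population)     # about 1.7078 (divisor N)

n = 30
reps = 20_000
sample_means = [statistics.fmean(random.choices(population, k=n)) for _ in range(reps)]

print(statistics.fmean(sample_means))     # close to mu = 3.5
print(statistics.stdev(sample_means))     # close to sigma / sqrt(n)
print(sigma / n ** 0.5)                   # about 0.3118
```
A histogram of these simulated sample means will also look roughly bell-shaped, even though the population itself is uniform.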
-
Normal Approximation
to Binomial
The normal approximation to the binomial distribution
was a more useful computational aid in the days before the powerful
computers and hand-held calculators that are available today.
It is introduced here as an application of the central limit theorem.
Recall that a binomial random variable, Y, with parameters n and p is
the count of successes in n independent experiments, each of which can
result in a success with probability p and failure with probability
q=1-p. Recall that defining X1=1 if the 1st experiment
is a success and 0 otherwise, X2=1 if the 2nd experiment
is a success and 0 otherwise, ..., and Xn=1 if the nth experiment
is a success and 0 otherwise, Y=X1+X2+...+Xn.
Each of these random variables has a Bernoulli distribution with parameter
p--this implies that each of the X's has mean p and variance pq.
Y has a mean of np and variance of npq. From the result noted
above, if n is 'large,' Y/n will have an approximate normal distribution
with mean E[Y/n]=(1/n)E[Y]=np/n=p and variance Var[Y/n]=(1/n²)Var[Y]=
npq/n²=pq/n. It is then easy to believe that Y=n(Y/n)
should have an approximate normal distribution with mean np and variance
npq.
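The next sketch is a quick sanity check of these mean and variance claims (n = 200 and p = 0.35 are arbitrary illustrative values, not from the text): it simulates a binomial count as a sum of Bernoulli indicators and checks that the count has mean np and variance npq, while the proportion Y/n has variance pq/n.
```python
# Each simulated count is a sum of n independent Bernoulli(p) indicators.
import random
import statistics

n, p = 200, 0.35
q = 1 - p
reps = 20_000

counts = [sum(random.random() < p for _ in range(n)) for _ in range(reps)]

print(statistics.fmean(counts), n * p)                 # both close to 70
print(statistics.pvariance(counts), n * p * q)         # both close to 45.5
proportions = [y / n for y in counts]
print(statistics.pvariance(proportions), p * q / n)    # both close to 0.0011375
```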
The next graph shows the pdf of a binomial random variable
with n=20 and p=0.35 together with an approximating normal curve.
The mean is 20(0.35)=7 and variance is 20(0.35)(0.65)=4.55 so standard
deviation=2.13. A rule of thumb says that whenever np and n(1-p)
are both greater than 5, the normal approximation to the binomial can
be used.
Suppose you are asked to compute the probability of getting
exactly 50 heads in 100 tosses of a fair coin. The number of heads
in 100 tosses is a binomial random variable with parameters n=100 and
p=1/2. P[50 heads] = 100C50(1/2)^50(1/2)^50.
You can find the value of this on most calculators, but this computation
caused an overflow or underflow on many calculators that were in use
10 years ago. Since np = n(1-p) = 100(1/2) = 50 > 5, the normal approximation
to the binomial can be used. The graph of the binomial is shown
in red, with the probability of 50 heads equal to the area of the red
bar centered at 50.
The normal curve provides a good approximation to the
binomial. To approximate the probability of 50 heads, find the
area under the normal curve between the left and right hand sides of
the red bar centered at 50. To do this you must find the z-values
at 49.5 and 50.5. They are (49.5-50)/5=-0.1 and (50.5-50)/5=0.1.
You can use the normal table to find that the approximate probability
of 50 heads is 0.0797. Using the formula for binomial probabilities,
you would get 0.0796.
In the experiment of tossing a fair coin 100 times, what
is the probability that the number of heads will be between 48 and 54,
inclusive? To find this exactly you would need to add the probabilities
of 48, 49, 50, 51, 52, 53, and 54 heads together. In the graph
shown above this would be equivalent to finding the sum of the areas
of the red bars beginning with the bar centered at 48 and ending with
the bar centered at 54. Using the normal approximation, you could
find the z-score at the left side of the smallest bar, that is at 47.5,
and the z-score at the right side of the largest bar, that is at 54.5,
and then use the normal table to find the area between. If you
carry this out, you get a normal approximation probability of 0.5074.
If you used the binomial formula, you would find the exact probability
is 0.5072.
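Both approximations can be checked numerically. The sketch below assumes the scipy library is available (an assumption; any binomial and normal pmf/cdf routines would work the same way) and compares the exact binomial probabilities with the normal approximation using the continuity correction described above.
```python
# Compare exact binomial probabilities with the continuity-corrected
# normal approximation for 100 tosses of a fair coin.
from scipy.stats import binom, norm

n, p = 100, 0.5
mu, sd = n * p, (n * p * (1 - p)) ** 0.5      # mean 50, standard deviation 5

# P[exactly 50 heads]
exact_50 = binom.pmf(50, n, p)
approx_50 = norm.cdf(50.5, mu, sd) - norm.cdf(49.5, mu, sd)
print(exact_50, approx_50)                    # about 0.0796 and 0.0797

# P[48 <= heads <= 54]
exact_48_54 = binom.cdf(54, n, p) - binom.cdf(47, n, p)
approx_48_54 = norm.cdf(54.5, mu, sd) - norm.cdf(47.5, mu, sd)
print(exact_48_54, approx_48_54)              # about 0.5072 and 0.5074
```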
The link shown in the next sentence provides comparisons
of exact binomial probabilities and the normal approximations.
A
link to the normal approximation to a binomial random variable is found
here.
-
Sampling Distribution
of Sample Variance
From the above table showing all samples of size 2 taken with
replacement from the population 1,2,3,4,5, and 6, you can construct
the sampling distribution of the sample variance. Simply square
each of the standard deviations and pair the resulting variances with
their probabilities, as shown in the next table.
Sample Variance:  0     0.5    2     4.5   8     12.5
Probability:      6/36  10/36  8/36  6/36  4/36  2/36
The expected value of this sampling distribution is (0)(6/36)+(0.5)(10/36)+(2)(8/36)+(4.5)(6/36)+(8)(4/36)+(12.5)(2/36)=2.92.
This is the variance of the population.
The variance of this sampling distribution can be computed by finding
the expected value of the square of the sample variance and subtracting
the square of 2.92. This variance is about 11.62.
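As a check, the following sketch (an illustration added here, not from the original page) recomputes the expected value and variance of the sample variance directly from the 36 samples of size 2.
```python
# Enumerate all 36 samples of size 2 and summarize their sample variances.
from itertools import product
from statistics import variance, pvariance

population = [1, 2, 3, 4, 5, 6]
samples = list(product(population, repeat=2))

sample_vars = [variance(s) for s in samples]       # sample variance, divisor n-1
mean_of_vars = sum(sample_vars) / len(sample_vars)
var_of_vars = sum((v - mean_of_vars) ** 2 for v in sample_vars) / len(sample_vars)

print(pvariance(population))   # 2.9167 (population variance)
print(mean_of_vars)            # 2.9167, matching the population variance
print(var_of_vars)             # about 11.62
```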
The probability distribution for the sample variances is shown next.
This graph shows no negative values on the horizontal axis; this
is always true for variances because variances can't be negative.
Also, the graph does not have the symmetric look of the graph of
sample means; in fact, the distribution of sample variances is
typically skewed to the right.
From this sampling distribution of sample variances, the
only conclusion that can be made is that the expected or mean value
of sample variances is the population variance. You can
follow this link to see a simulation of sample variances when sampling
from any type of population. In order to make further statements
about the sampling distribution of sample variances, the population
from which samples are selected must have a normal distribution.
In that case, it can be shown that a rescaled version of the sample
variance follows a special distribution called a chi-square distribution with one
parameter, the parameter being the sample size minus one (n-1).
This parameter is called the degrees of freedom of the chi-square distribution.
The next graph shows the probability density function of a chi-square
distribution with 5 degrees of freedom. Notice that it is skewed
to the right.
In general, when samples of size n are taken from a normal
distribution with variance σ²,
the sampling distribution of (n-1)s²/σ² has
a chi-square distribution with n-1 degrees of freedom.
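The following simulation sketch is illustrative (the normal population parameters and the sample size n = 6 are arbitrary choices, not values from the text): it draws repeated samples from a normal population, forms (n-1)s²/σ² for each sample, and checks that the results have the mean (n-1) and variance 2(n-1) of a chi-square distribution with n-1 degrees of freedom.
```python
# Simulate (n-1)s^2/sigma^2 for samples drawn from a normal population.
import random
import statistics

mu, sigma, n = 10.0, 2.0, 6
reps = 50_000

scaled = []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    s2 = statistics.variance(sample)            # sample variance, divisor n-1
    scaled.append((n - 1) * s2 / sigma ** 2)

print(statistics.fmean(scaled))      # close to n-1 = 5 (chi-square mean)
print(statistics.variance(scaled))   # close to 2(n-1) = 10 (chi-square variance)
```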
Link
to a calculator page from the UCLA Statistics Department that will allow
you to find the cdf (or pdf) for any chi-square distribution (or any
other discrete or continuous random variable).