A sampling distribution describes the behavior of a sample statistic (e.g., the sample mean) by noting what happens to the statistic for many different samples of the same size.
Recall from Section 4.1 that we use statistics to estimate parameters. If the sample is an SRS, then the statistic should be close in value to the parameter.
For any population with a finite mean μ, the law of large numbers tells us that as the sample size increases, the sample mean x-bar tends to get closer to the parameter μ.
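A quick simulation can make the law of large numbers concrete. The sketch below is only an illustration: the population values (a normal population with μ = 100 and σ = 15) and the seed are assumptions chosen for the demonstration, not values from the text.

```python
import random
from statistics import mean

random.seed(0)                 # fixed seed so the run is repeatable
mu, sigma = 100.0, 15.0        # hypothetical population parameters

# Draw one long sequence of observations, then look at the mean of the
# first n observations for increasing n.
draws = [random.gauss(mu, sigma) for _ in range(100_000)]

for n in (10, 100, 1_000, 10_000, 100_000):
    xbar = mean(draws[:n])
    print(f"n={n:>6}  x-bar={xbar:8.3f}  |x-bar - mu|={abs(xbar - mu):.3f}")
```

The gap between x-bar and μ need not shrink at every single step, but it should tend toward zero as n grows.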
Steps to constructing a sampling distribution:
The sampling distribution of a statistic is the distribution of values taken by the statistic in all possible samples of the same size from the same population.
If x-bar is the mean of an SRS of size n drawn from a large population with mean μ and standard deviation σ, then the mean of the sampling distribution of x-bar is μ and its standard deviation is σ/√n. Note that the larger the sample, the smaller the standard deviation (i.e., the smaller the spread) and the closer the values of x-bar will tend to be to μ.
The upshot is that the statistic x-bar is an unbiased estimator of the parameter μ, because the mean of its sampling distribution equals μ.
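These two facts (mean μ, standard deviation σ/√n) can be checked by simulation. The following is a minimal sketch, assuming a hypothetical normal population with μ = 50 and σ = 10 and 2,000 repeated samples per sample size; the function name and all numbers are illustrative choices, not part of the text.

```python
import random
from math import sqrt
from statistics import mean, stdev

random.seed(1)
mu, sigma = 50.0, 10.0         # hypothetical population parameters

def simulate_xbar(n, reps=2000):
    """Return (mean, standard deviation) of x-bar over `reps` samples of size n."""
    xbars = [mean(random.gauss(mu, sigma) for _ in range(n)) for _ in range(reps)]
    return mean(xbars), stdev(xbars)

results = {}
for n in (4, 25, 100):
    results[n] = simulate_xbar(n)
    center, spread = results[n]
    print(f"n={n:>3}  mean of x-bar={center:6.2f}  "
          f"sd of x-bar={spread:5.2f}  sigma/sqrt(n)={sigma / sqrt(n):5.2f}")
```

In each row the simulated mean of x-bar should sit near μ, and the simulated standard deviation should sit near σ/√n, shrinking as n increases.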
The sampling distribution of the sample mean of a normally-distributed population (i.e., N(μ, σ)) is also normally-distributed with mean μ and standard deviation σ/√n (i.e., N(μ, σ/√n)).
Central Limit Theorem (CLT)
Draw an SRS of size n from any population (i.e., any type of population distribution) with mean μ and standard deviation σ. When n is large, the sampling distribution of the sample mean is approximately normal (i.e., approximately N(μ, σ/√n)).
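One way to see the CLT at work is to start from a clearly non-normal population. The sketch below assumes an exponential population with rate 1 (so μ = 1 and σ = 1), which is strongly right-skewed; this population, the seed, and the helper name are assumptions made for illustration. For a normal distribution, about 68% of values fall within one standard deviation of the mean, so the fraction of sample means within σ/√n of μ should approach roughly 0.68 as n grows.

```python
import random
from math import sqrt
from statistics import mean

random.seed(2)
mu = sigma = 1.0               # exponential(rate=1) has mean 1 and sd 1

def frac_within_one_sd(n, reps=4000):
    """Fraction of simulated sample means within sigma/sqrt(n) of mu."""
    se = sigma / sqrt(n)
    xbars = [mean(random.expovariate(1.0) for _ in range(n)) for _ in range(reps)]
    return sum(abs(x - mu) <= se for x in xbars) / reps

for n in (1, 5, 50):
    print(f"n={n:>2}  fraction within one sd = {frac_within_one_sd(n):.3f}")
```

For n = 1 the fraction reflects the skewed population itself (about 0.86), while for n = 50 it should be close to the normal value of about 0.68.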
Explore the sampling distribution and CLT.
Example
Question 4.43, p. 247: Given μ = 18.6 and σ = 5.9.
(a) The probability that a single student randomly chosen [from the population] scores 21 or greater is P(X ≥ 21) = 0.3421 using NDAREA with the above mean and standard deviation.
(b) For an SRS of 50 students, the sampling distribution of x-bar has mean 18.6 and standard deviation 5.9/√50 = 0.8344.
(c) The probability that the mean score [of the SRS] is 21 or greater is P(x-bar ≥ 21) = 0.0020 using NDAREA with the mean and standard deviation found in (b).
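The three parts can also be reproduced without NDAREA. The sketch below uses Python's standard-library NormalDist in place of NDAREA and assumes, as the calculation above does, that scores are normally distributed; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 18.6, 5.9, 50

# (a) One student: X ~ N(mu, sigma), upper-tail area at 21.
p_single = 1 - NormalDist(mu, sigma).cdf(21)

# (b) Standard deviation of the sampling distribution of x-bar.
sd_xbar = sigma / sqrt(n)

# (c) Mean of 50 students: x-bar ~ N(mu, sigma / sqrt(n)).
p_mean = 1 - NormalDist(mu, sd_xbar).cdf(21)

print(f"(a) P(X >= 21)     = {p_single:.4f}")
print(f"(b) sd of x-bar    = {sd_xbar:.4f}")
print(f"(c) P(x-bar >= 21) = {p_mean:.4f}")
```

The printed values should agree with 0.3421, 0.8344, and 0.0020 above. A score of 21 is unremarkable for one student but very unusual for the mean of 50 students, because x-bar varies much less than an individual score.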