www.john-weber.com  

Chapter 8: Inference for Proportions

Introduction

This chapter presents the z procedures for one–sample and two–sample inference about population proportions. A population proportion, p, is the proportion of the population that has some property of interest (a "success"). Because p is unknown, we use the statistic $\hat{p}$, the proportion of successes in the sample(s), to estimate p.

Section 8.1: Inference for a Population Proportion

The sampling distribution of $\hat{p}$

If $\hat{p}$ is the sample proportion of successes of an SRS of size n from a large population, then

  1. the sampling distribution of $\hat{p}$ is approximately normal for large n
  2. the mean of the sampling distribution is p (i.e., $\hat{p}$ is an unbiased estimator of p)
  3. the standard deviation of the sampling distribution is $\sqrt{p(1 - p)/n}$
Note: the s.d. decreases as n increases, so $\hat{p}$ tends to fall closer to p when n is large.
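
As a rough illustration of the note above, the following sketch evaluates the standard deviation $\sqrt{p(1 - p)/n}$ for a few sample sizes. The value p = 0.3 and the function name `sd_of_p_hat` are assumptions made up for this example; they do not come from the text.

```python
import math


def sd_of_p_hat(p: float, n: int) -> float:
    """Standard deviation of the sampling distribution of p-hat: sqrt(p(1-p)/n)."""
    return math.sqrt(p * (1 - p) / n)


# Hypothetical population proportion, chosen only for illustration.
p = 0.3

# Each time n is multiplied by 4, the s.d. is cut in half.
for n in (25, 100, 400):
    print(n, round(sd_of_p_hat(p, n), 4))
```

Because n sits under a square root, halving the s.d. requires four times as many observations.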

Assumptions for inference

The z statistic is $z = \dfrac{\hat{p} - p}{\sqrt{p(1 - p)/n}}$.

The distribution of the z statistic is approximately standard normal, N(0, 1) when:

  1. the population is at least 10 times larger than the sample
  2. np ≥ 10 and n(1 – p) ≥ 10
The second condition means the estimation of p using $\hat{p}$ is NOT valid when there are too few or too many successes in the sample.

Since p is NOT known, we need to use the standard error as an estimate of the s.d.: SE = $\sqrt{\hat{p}(1 - \hat{p})/n}$. This is valid because for large n, $\hat{p}$ is close to p.
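
The two assumptions and the standard error can be checked with a few lines of code. This is only a sketch: the function names, the sample counts, and the population size are hypothetical, and the value passed as p would in practice be a hypothesized value or an estimate, since the true p is unknown.

```python
import math


def conditions_ok(n: int, p: float, population_size: int) -> bool:
    """Check the two assumptions listed above for the z procedures."""
    large_population = population_size >= 10 * n
    enough_successes_and_failures = n * p >= 10 and n * (1 - p) >= 10
    return large_population and enough_successes_and_failures


def standard_error(successes: int, n: int) -> float:
    """SE = sqrt(p_hat * (1 - p_hat) / n), using p_hat in place of the unknown p."""
    p_hat = successes / n
    return math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical sample: 40 successes out of n = 100, population of 10,000.
print(conditions_ok(n=100, p=0.4, population_size=10_000))
print(round(standard_error(successes=40, n=100), 4))
```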

The z procedures

A level C CI for p is $\hat{p} \pm z^* \times SE$, where z* is the critical value from N(0, 1) with area C between –z* and z*. Luckily, the TI-83 can calculate the one–sample proportion z CI!
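
For readers without a calculator at hand, the interval can also be computed with a short script. The sketch below assumes Python 3.8 or later (for `statistics.NormalDist`); the function name `proportion_ci` and the sample counts are invented for illustration.

```python
import math
from statistics import NormalDist


def proportion_ci(successes: int, n: int, confidence: float = 0.95):
    """Level-C confidence interval for p: p_hat +/- z* x SE."""
    p_hat = successes / n
    se = math.sqrt(p_hat * (1 - p_hat) / n)
    # z* leaves area (1 - C)/2 in each tail of N(0, 1).
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    margin = z_star * se
    return p_hat - margin, p_hat + margin


# Hypothetical sample: 40 successes out of 100 observations.
low, high = proportion_ci(successes=40, n=100, confidence=0.95)
print(round(low, 3), round(high, 3))
```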

The one–sample z–test for a population proportion:

  1. State the hypotheses: H0: p = p0 against Ha: p > p0, Ha: p < p0, or Ha: p ≠ p0
  2. Calculate the P–value
  3. Make your conclusion
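
The three steps above can be sketched in code. This is a minimal illustration, not the text's own procedure: it assumes Python 3.8 or later, uses the null value p0 in the standard deviation of the test statistic, and the function name, the `alternative` labels, and the data are all hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z_test(successes: int, n: int, p0: float,
                      alternative: str = "two-sided"):
    """Return (z, P-value) for H0: p = p0 against the chosen alternative."""
    p_hat = successes / n
    # Under H0 the s.d. of p_hat is sqrt(p0 * (1 - p0) / n).
    z = (p_hat - p0) / math.sqrt(p0 * (1 - p0) / n)
    std_normal = NormalDist()
    if alternative == "greater":        # Ha: p > p0
        p_value = 1 - std_normal.cdf(z)
    elif alternative == "less":         # Ha: p < p0
        p_value = std_normal.cdf(z)
    elif alternative == "two-sided":    # Ha: p != p0
        p_value = 2 * (1 - std_normal.cdf(abs(z)))
    else:
        raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
    return z, p_value


# Hypothetical data: 60 successes in 100 trials, testing H0: p = 0.5.
z, p_value = one_sample_z_test(successes=60, n=100, p0=0.5)
print(round(z, 2), round(p_value, 4))
```

A small P-value is evidence against H0; the conclusion in step 3 still has to be stated in the context of the problem.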

Recall: H0 and Ha always refer to the population and NOT to a particular outcome. It is often easier (and more appropriate) to state H0 and Ha before looking at the data.

NOTE: Here is a great quote from the text: "statistics in practice involves much more than recipes for inference" (p. 436).

Choosing the sample size

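The margin of error is $m = z^* \sqrt{p^*(1 - p^*)/n}$. Rearranging, we get

$n = \left(\dfrac{z^*}{m}\right)^2 p^*(1 - p^*)$

where z* is determined from N(0, 1) and p* is either a guess about p OR p* = 0.5. The latter case is called conservative since p*(1 – p*) is largest at p* = 0.5, so it results in the largest sample size for a given z* and m.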
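
The sample-size calculation can be sketched as follows, rounding up because n must be a whole number. The function name `required_sample_size`, the 3% margin, and the guess p* = 0.3 are assumptions chosen for illustration; the code assumes Python 3.8 or later.

```python
import math
from statistics import NormalDist


def required_sample_size(margin: float, confidence: float = 0.95,
                         p_guess: float = 0.5) -> int:
    """Smallest n with margin of error at most `margin`: (z*/m)^2 * p*(1 - p*)."""
    z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = (z_star / margin) ** 2 * p_guess * (1 - p_guess)
    return math.ceil(n)  # round up so the margin is not exceeded


# Conservative choice p* = 0.5 versus a hypothetical guess p* = 0.3.
print(required_sample_size(margin=0.03))
print(required_sample_size(margin=0.03, p_guess=0.3))
```

The conservative choice always gives the larger of the two sample sizes, which is the price of not having a guess for p.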

