www.john-weber.com  

Chapter 1: Examining Distributions

Section 1.2: Describing Distributions with Numbers

Measuring center: The mean

The mean is the most common measure of the center of a distribution. It is found by adding the values of the variable and dividing the sum by the number of observations. The notation for the mean is a bar placed over the variable: for a variable x, the mean is written x̄ (read "x-bar"), so x̄ = (x1 + x2 + ... + xn) / n.

The mean is NOT a resistant measure of the center. In other words, the mean is sensitive to the extreme values within the distribution.
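To see why the mean is not resistant, here is a minimal Python sketch. The data values are made up for illustration, and the function name mean is just a convenient label for the calculation described above.

```python
def mean(values):
    """Add the observations and divide by the number of observations."""
    return sum(values) / len(values)

# Hypothetical data, invented for this illustration.
scores = [2, 4, 4, 5, 5]
with_outlier = scores + [40]   # the same data plus one extreme value

print(mean(scores))        # 4.0
print(mean(with_outlier))  # 10.0
```

A single extreme observation moves the mean from 4 to 10, even though five of the six values are 5 or less.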

Measuring center: The median

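In Section 1.1, we used the idea of the center of a distribution informally. The median makes this idea precise: it is the midpoint of the distribution, the value with half of the observations at or below it and half at or above it.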

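Here are the steps to find the median:

  1. Arrange all observations in increasing order.
  2. Count the number of observations, n.
  3. If n is odd, the median is the middle observation in the ordered list. If n is even, the median is the mean of the two middle observations.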

The median IS a resistant measure of the center. In other words, the median is NOT sensitive to the extreme values within the distribution.
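The procedure for the median can be sketched in Python as follows. This assumes the usual rule: the middle observation when the count is odd, and the mean of the two middle observations when the count is even. The data are the same made-up values used to illustrate a hypothetical outlier.

```python
def median(values):
    """Middle value of the ordered observations."""
    ordered = sorted(values)          # arrange in increasing order
    n = len(ordered)                  # count the observations
    mid = n // 2
    if n % 2 == 1:                    # odd count: single middle observation
        return ordered[mid]
    # even count: mean of the two middle observations
    return (ordered[mid - 1] + ordered[mid]) / 2

# Hypothetical data, invented for this illustration.
scores = [5, 2, 4, 5, 4]
with_outlier = scores + [40]

print(median(scores))        # 4
print(median(with_outlier))  # 4.5
```

Adding the extreme value 40 shifts the median only from 4 to 4.5, which is what "resistant" means in practice.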

Comparing the mean and the median

If the distribution is exactly symmetric, the mean and median are equal; if it is roughly symmetric, they are close together. If the distribution is right-skewed, the mean is usually greater than the median, and if it is left-skewed, the mean is usually smaller than the median, because the mean is pulled toward the long tail.
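A small Python check of this comparison, using the standard library's statistics module. The three data sets are invented for illustration: one symmetric, one with a long right tail, and one with a long left tail.

```python
from statistics import mean, median

# Hypothetical data sets, invented for this illustration.
symmetric = [1, 2, 3, 4, 5]
right_skewed = [1, 2, 3, 4, 20]    # one large value stretches the right tail
left_skewed = [-14, 2, 3, 4, 5]    # one small value stretches the left tail

for name, data in [("symmetric", symmetric),
                   ("right-skewed", right_skewed),
                   ("left-skewed", left_skewed)]:
    print(f"{name}: mean = {mean(data)}, median = {median(data)}")
```

All three data sets have median 3, but the means are 3, 6, and 0: the mean follows the tail while the median stays put.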

Measuring spread: The quartiles

The mean and median are not enough to describe a distribution. We need to describe the spread or variability of the distribution.

Measures of spread:

The first quartile is 1/4 of the way up the ordered list of observations. Equivalently, the first quartile is the median of the observations that lie between the minimum and the overall median.

Here are the steps to find the quartiles:

  1. Arrange all observations in increasing order and locate the median.
  2. The first quartile (Q1) is the median of the observations located below the overall median in the ordered list.
  3. The third quartile (Q3) is the median of the observations located above the overall median in the ordered list.
  4. The second quartile (Q2) is the median of all the observations.
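The steps above can be sketched in Python. This is one reading of the procedure, assuming that when the number of observations is odd, the overall median itself is left out of both halves. Calculators and software packages may use slightly different quartile rules, so results can differ a little from this sketch. The data and function names are made up for illustration, and the sketch assumes at least two observations.

```python
def median_of_sorted(ordered):
    """Median of a list that is already in increasing order."""
    n = len(ordered)
    mid = n // 2
    if n % 2 == 1:
        return ordered[mid]
    return (ordered[mid - 1] + ordered[mid]) / 2

def quartiles(values):
    """Return (Q1, Q2, Q3) using the median-of-halves rule."""
    ordered = sorted(values)                 # step 1: increasing order
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]                   # observations below the median
    # skip the median itself when n is odd
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (median_of_sorted(lower),         # Q1
            median_of_sorted(ordered),       # Q2, the median
            median_of_sorted(upper))         # Q3

# Hypothetical data, invented for this illustration.
data = [7, 1, 3, 9, 5, 11, 13]
print(quartiles(data))  # (3, 7, 11)
```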

The five-number summary and boxplots

The five-number summary of a data set consists of:

Minimum   Q1   Median   Q3   Maximum
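As a rough illustration, the five numbers can be computed in Python as shown here. The sketch assumes the median-of-halves rule for quartiles (the overall median is excluded from both halves when the count is odd); a calculator or other software may use a slightly different rule. The function name and data are made up for illustration.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum)."""
    ordered = sorted(values)
    n = len(ordered)
    half = n // 2
    lower = ordered[:half]        # observations below the median
    upper = ordered[n - half:]    # observations above the median
    return (ordered[0], median(lower), median(ordered),
            median(upper), ordered[-1])

# Hypothetical data, invented for this illustration.
data = [13, 1, 9, 3, 7, 5, 11, 15]
print(five_number_summary(data))  # (1, 4.0, 8.0, 12.0, 15)
```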

The five-number summary offers a reasonably complete description of the center and spread of a distribution. These numbers lead to a new type of graph, the boxplot. Here are the steps to construct a boxplot on the TI-83.

Boxplots show less detail than histograms. Thus, they are used mainly for side-by-side comparisons of distributions.

Here is how to find the five-number summary on the TI-83.

Measuring spread: The standard deviation

The standard deviation is the most common measure of the spread of a distribution. It measures how far the observations are from their mean. To find the standard deviation by hand, you first need to find the variance, s². The variance is an average of the squares of the deviations of the observations from their mean, where the sum of squared deviations is divided by n - 1 rather than n (see the formula on p. 38). The standard deviation, s, is the square root of the variance. Luckily, the TI-83 calculator can find the standard deviation for us.
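The hand calculation can be sketched in Python. This assumes the sample version of the variance, in which the sum of squared deviations is divided by n - 1; the formula on p. 38 is not reproduced here, so treat the divisor as an assumption. The function names and data are made up for illustration.

```python
from math import sqrt

def variance(values):
    """Sample variance: sum of squared deviations divided by n - 1."""
    n = len(values)
    xbar = sum(values) / n                        # the mean
    squared_deviations = [(x - xbar) ** 2 for x in values]
    return sum(squared_deviations) / (n - 1)      # assumed n - 1 divisor

def standard_deviation(values):
    """Square root of the variance."""
    return sqrt(variance(values))

# Hypothetical data, invented for this illustration.
data = [1, 1, 5, 9, 9]
print(variance(data))            # 16.0
print(standard_deviation(data))  # 4.0
```

Here the mean is 5, the squared deviations are 16, 16, 0, 16, 16, and their sum (64) divided by 4 gives a variance of 16, so s = 4. The standard library function statistics.stdev uses the same n - 1 divisor.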

Properties of the standard deviation:

The standard deviation is a very useful measure of the spread of a distribution.

Choosing measures of center and spread

ALWAYS plot your data.

Choose the five-number summary to describe a skewed distribution.

Choose the mean and standard deviation to describe a relatively symmetric distribution.

