We are often interested in comparing several distributions or in relationships among several variables. A study of data often leads us to ask whether there is a relationship between two variables that appear to be linked in the data.
Types of variables
explanatory variable | explains changes in a response variable | graphed on horizontal axis |
response variable | measures outcome of a study | graphed on vertical axis |
To study a relationship between two variables, we need to measure both variables on the same individuals. We must also be cautious of possible lurking variables, i.e., variables not being studied that may influence the apparent relationship between the explanatory and response variables.
Another caution: a relationship between two variables may not be a causal one.
General procedure for studying possible relationships between two variables:
A scatterplot is the most common way to display the relationship between two quantitative variables. Each explanatory–response pair is graphed on a set of axes as the point (x, y), with the explanatory variable as x and the response variable as y. If there is no explanatory–response distinction, either variable can go on the horizontal axis.
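As a rough illustration of the axis convention, here is a minimal plain-text sketch in Python that uses only the standard library. The function name `text_scatterplot`, the grid size, and the hours/score data are all made up for this example; in practice a plotting library or a calculator would draw the graph.

```python
def text_scatterplot(x, y, width=40, height=12, symbol="*"):
    """Return a plain-text scatterplot as a list of rows (top row first).

    x holds the explanatory variable (horizontal axis) and y the response
    variable (vertical axis); both must be measured on the same individuals.
    """
    if len(x) != len(y):
        raise ValueError("x and y must be measured on the same individuals")
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    grid = [[" "] * width for _ in range(height)]
    for xi, yi in zip(x, y):
        # Scale each value to a column/row index; a constant variable maps to 0.
        col = round((xi - x_min) / (x_max - x_min) * (width - 1)) if x_max > x_min else 0
        row = round((yi - y_min) / (y_max - y_min) * (height - 1)) if y_max > y_min else 0
        grid[height - 1 - row][col] = symbol  # row 0 is the top of the plot
    return ["".join(r) for r in grid]


# Hypothetical data: hours studied (explanatory) and exam score (response).
hours = [1, 2, 3, 4, 5, 6]
score = [52, 58, 61, 70, 74, 83]

rows = text_scatterplot(hours, score)
print("\n".join("|" + r for r in rows))  # vertical axis
print("+" + "-" * 40)                    # horizontal axis
```

Each individual contributes exactly one point, so the two lists must have the same length and the same order.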
Here is how to construct a scatterplot on the TI-83.
We describe the overall pattern of a scatterplot by examining its form, direction, and strength.
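Direction and strength can also be summarized numerically. The sketch below assumes the form is roughly linear and uses the Pearson correlation coefficient r, which these notes have not defined at this point; the helper names and the data are hypothetical. It complements looking at the plot and does not replace it, since r says nothing about form.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need at least two (x, y) pairs of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("r is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


def direction(r):
    """Direction of a linear association, read from the sign of r."""
    if r > 0:
        return "positive"
    if r < 0:
        return "negative"
    return "none"


# Hypothetical data: hours studied (explanatory) and exam score (response).
hours = [1, 2, 3, 4, 5, 6]
score = [52, 58, 61, 70, 74, 83]

r = pearson_r(hours, score)
print(direction(r), round(abs(r), 2))  # sign gives direction, |r| gives strength
```

Values of |r| near 1 indicate a strong linear relationship and values near 0 a weak one.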
We can add a categorical variable to a scatterplot by using different colors or symbols to plot points. See examples 2.5 and 2.6 on pp. 87–89 of the text.
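The idea of plotting each category with its own symbol can be sketched in plain-text Python as follows. This is not the textbook's example: the function name `grouped_text_scatterplot`, the section labels "A" and "B", and the data are all made up for illustration.

```python
def grouped_text_scatterplot(x, y, group, symbols, width=40, height=12):
    """Return a plain-text scatterplot whose symbol depends on a category.

    group holds one category label per individual, and symbols maps each
    label to the single character used to plot it.
    """
    if not (len(x) == len(y) == len(group)):
        raise ValueError("x, y and group must describe the same individuals")
    x_min, x_max = min(x), max(x)
    y_min, y_max = min(y), max(y)
    grid = [[" "] * width for _ in range(height)]
    for xi, yi, gi in zip(x, y, group):
        col = round((xi - x_min) / (x_max - x_min) * (width - 1)) if x_max > x_min else 0
        row = round((yi - y_min) / (y_max - y_min) * (height - 1)) if y_max > y_min else 0
        # If two points share a cell, the later one overwrites the earlier one.
        grid[height - 1 - row][col] = symbols[gi]
    return ["".join(r) for r in grid]


# Hypothetical data: hours studied, exam score, and class section.
hours = [1, 2, 3, 4, 5, 6]
score = [52, 58, 61, 70, 74, 83]
section = ["A", "B", "A", "B", "A", "B"]

rows = grouped_text_scatterplot(hours, score, section, {"A": "o", "B": "x"})
print("\n".join("|" + r for r in rows))
print("+" + "-" * 40)
print("o = section A, x = section B")
```

A legend line such as the last `print` matters here, because the symbols carry no meaning without it.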