The following data is based on the first class letter postage for the US Mail from 1933 to 1966.
|YEAR||RATE (IN CENTS)|
Let's look at the scatter plot of this data:
Does this data follow a linear, logarithmic, power, or exponential model? The scatter plot appears exponential, but we can't rely on our eyes alone to determine which model fits the data best.
Since a power function and an exponential function look quite similar, let's examine both models.
A power regression has the form y = ax^b. By transforming the equation into a linear form, we can fit a line and look at its correlation coefficient. Comparing the correlation coefficients of the linearized power and exponential models lets us decide which one fits the data better.
Let's begin with the form of the power function:
y = ax^b
Take the log of both sides:
log y = log (ax^b)
Use log properties to obtain the following:
log y = log a + b log x
Let log y = Y, log a = A, and log x = X.
Therefore Y = A + bX. We now have a linear form of the power function, so we can graph the data using (log x, log y). The correlation coefficient of the points (log x, log y) then tells us whether the power model is a good fit for the data.
|log x||log y|
The graph below shows the stamp data plotted as (log x, log y), the linear form of the power function. We have drawn the trend line for the data and found a correlation coefficient of 0.96. The closer the correlation coefficient is to one, the better the line fits the data.
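The log-log transformation can be sketched in a few lines of Python. This is only an illustration: the `xs` and `ys` values are hypothetical numbers generated from y = 2x^1.5 (they are not the stamp data), and the helper name `linear_fit` is made up for this example.

```python
import math

# Hypothetical points that follow y = 2 * x**1.5 exactly (NOT the stamp data).
xs = [1.0, 2.0, 4.0, 8.0, 16.0]
ys = [2.0 * x ** 1.5 for x in xs]

def linear_fit(us, vs):
    """Least-squares line v = intercept + slope * u, plus the correlation coefficient."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    slope = suv / suu
    intercept = mean_v - slope * mean_u
    r = suv / math.sqrt(suu * svv)
    return intercept, slope, r

# Transform to (log x, log y) and fit Y = A + bX.
X = [math.log10(x) for x in xs]
Y = [math.log10(y) for y in ys]
A, b, r = linear_fit(X, Y)

# Undo the substitution log a = A to recover a.
a = 10 ** A
print(f"a = {a:.4f}, b = {b:.4f}, r = {r:.4f}")
```

Because the sample points were built from an exact power function, the fit should recover a and b almost exactly and give a correlation coefficient of essentially one. Real data such as the stamp rates will give a value below one.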
Now let's look at the exponential model and see whether its correlation coefficient indicates a better fit.
Let's begin with the form y = ab^x
Take the log of both sides:
log y = log (ab^x)
Use the properties of logs to obtain the following:
log y = log a + x log b
Let log y = Y, log a = A, and log b = B.
Then Y = A + Bx.
We now have a linear form of the exponential function, so we can graph the data using (x, log y). The correlation coefficient of the points (x, log y) then tells us whether the exponential model is a good fit for the data.
The graph below shows the stamp data plotted as (x, log y), the linear form of the exponential function. We have drawn the trend line for the data and found a correlation coefficient of 0.9598. The closer the correlation coefficient is to one, the better the line fits the data.
Since the correlation coefficients of the exponential model (0.9598) and the power model (0.96) are very close, either model will work.
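The comparison of the two correlation coefficients can be sketched as follows. The data here are hypothetical values generated from y = 3(1.2)^x, chosen so that the exponential model is the true one. They are not the stamp data, and the function name `correlation` is an illustrative choice.

```python
import math

def correlation(us, vs):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(us)
    mean_u = sum(us) / n
    mean_v = sum(vs) / n
    suv = sum((u - mean_u) * (v - mean_v) for u, v in zip(us, vs))
    suu = sum((u - mean_u) ** 2 for u in us)
    svv = sum((v - mean_v) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)

# Hypothetical points that follow y = 3 * 1.2**x exactly (NOT the stamp data).
xs = [1, 2, 3, 4, 5, 6]
ys = [3.0 * 1.2 ** x for x in xs]

log_x = [math.log10(x) for x in xs]
log_y = [math.log10(y) for y in ys]

# Power model: correlation of (log x, log y).
r_power = correlation(log_x, log_y)
# Exponential model: correlation of (x, log y).
r_exponential = correlation(xs, log_y)

print(f"power: r = {r_power:.4f}")
print(f"exponential: r = {r_exponential:.4f}")
```

With data that really are exponential, the (x, log y) correlation comes out at essentially one and the (log x, log y) correlation is lower. When the two values are as close as they are for the stamp data, the comparison does not single out either model.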
Now we need to find the equation of the line for each model so we can use the equations to predict years and rates.
To find each equation, we need the slope and y-intercept of the line Y = intercept + slope × X.
To find the slope: correlation coefficient × (standard deviation of Y ÷ standard deviation of X), where Y is the quantity being predicted and X is the quantity used to predict it.
To find the y-intercept: mean of Y − slope × mean of X.
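As a rough sketch, the slope and intercept can be computed from summary statistics like this. To avoid any confusion over which variable is x and which is y, the code names the two lists `predictor` and `predicted`. Both lists hold hypothetical numbers, not the stamp data, and `line_from_summary` is an illustrative name.

```python
import statistics

def line_from_summary(predictor, predicted):
    """Slope and intercept of the line that predicts `predicted` from `predictor`."""
    n = len(predictor)
    mean_p = statistics.mean(predictor)
    mean_q = statistics.mean(predicted)
    sd_p = statistics.pstdev(predictor)   # population standard deviation
    sd_q = statistics.pstdev(predicted)
    cov = sum((p - mean_p) * (q - mean_q)
              for p, q in zip(predictor, predicted)) / n
    r = cov / (sd_p * sd_q)               # correlation coefficient
    slope = r * sd_q / sd_p               # SD of the predicted variable on top
    intercept = mean_q - slope * mean_p
    return slope, intercept, r

# Hypothetical values (NOT the stamp data).
predictor = [3.0, 4.0, 6.0, 8.0, 10.0]
predicted = [10.0, 13.0, 17.0, 20.0, 25.0]

slope, intercept, r = line_from_summary(predictor, predicted)
print(f"slope = {slope:.4f}, intercept = {intercept:.4f}, r = {r:.4f}")
```

The standard deviation of the predicted variable goes in the numerator. Swapping the two standard deviations gives the slope of the line that predicts in the opposite direction.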
The following spreadsheet shows how the standard deviations of x and y, the slope, and the y-intercept were calculated.
|Total Sum Squared||8204||2412.2368|
|Divide by Sample||512.75||150.7648|
Now let's take a look at the equations for the power function and the exponential function:
Power: Y = 1943.9823 + 1.77041096 X
Exponential: Y = 1943.98886 + 1.77004212 X
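The two slopes can be checked against the spreadsheet values with a short calculation. The sketch below assumes that the standard deviation is the square root of each "Divide by Sample" entry, and that the first spreadsheet column belongs to the predicted quantity and the second to the predictor. The variable names are illustrative.

```python
import math

# Values copied from the spreadsheet rows.
total_first, total_second = 8204, 2412.2368         # "Total Sum Squared"
variance_first, variance_second = 512.75, 150.7648  # "Divide by Sample"

# Both columns were divided by the same count.
divisor_first = total_first / variance_first
divisor_second = total_second / variance_second

sd_first = math.sqrt(variance_first)
sd_second = math.sqrt(variance_second)

# Assumption: slope = r * (SD of first column / SD of second column).
slope_power = 0.96 * sd_first / sd_second          # r from the (log x, log y) graph
slope_exponential = 0.9598 * sd_first / sd_second  # r from the (x, log y) graph

print(divisor_first, divisor_second)
print(round(slope_power, 8), round(slope_exponential, 8))
```

Under these assumptions the results can be compared directly with the slopes 1.77041096 and 1.77004212 in the two equations. The slopes differ only because the two correlation coefficients differ.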
Now we can use these equations to predict the year in which the stamp will reach $1.00.
Since the functions are so close, it doesn't matter which one you choose.
When will the cost of the stamp be 64 cents?
How soon should we expect the next 3 cent increase?
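One way to explore these questions is to evaluate the two equations directly. The sketch below takes the equations at face value and assumes that X is the rate in cents and Y is the calendar year. It also assumes the first equation is the power fit and the second the exponential fit. The names `predicted_year`, `POWER`, and `EXPONENTIAL` are illustrative.

```python
def predicted_year(rate_cents, intercept, slope):
    """Evaluate Y = intercept + slope * X for a rate given in cents."""
    return intercept + slope * rate_cents

# (intercept, slope) pairs copied from the two equations.
POWER = (1943.9823, 1.77041096)
EXPONENTIAL = (1943.98886, 1.77004212)

for rate in (64, 100):
    year_power = predicted_year(rate, *POWER)
    year_exponential = predicted_year(rate, *EXPONENTIAL)
    print(f"{rate} cents: {year_power:.1f} (power), {year_exponential:.1f} (exponential)")

# Under the same assumption, the slope is the number of years per one-cent
# increase, so a 3 cent increase corresponds to about 3 * slope years.
years_for_three_cents = 3 * POWER[1]
print(f"3 cent increase: about {years_for_three_cents:.1f} years")
```

The two equations give nearly identical years for any rate in this range, which is why the choice between them makes little practical difference.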
Return to: Jennifer Weaver's Home Page