Spreadsheet software has been around almost as long as the personal computer. The early versions, such as Visicalc, were very limited in scope, and could do little more than sort, add, multiply and present numerical data. It was easy to see the potential of spreadsheet software. There almost immediately began efforts by programmers and users to make spreadsheets do more. Today spreadsheet software, such as MicroSoft Excel©, can do just about anything you want through the use of internal functions and after market addins. Also, the early spreadsheet software had no graphics capability. Now, most spreadsheet software has graphing capability for just about any type of graph that you would want for a presentation.
I will attempt to show how this graphing capability and the other capabilities of MicroSoft Excel© can be used as a curve fitting tool. There are addins available for Excel© that will do non-linear regression. What I will show here is that you can do non-linear regression, though maybe not as accurately, without these addins. All you need to know is what types of mathematical models fit the shape of a particular curve.
Elementary cooling can be modeled by the function where Ta is the ambient temperature, To is the temperature at time zero and k is a constant. If the ambient temperature and the temperature at time zero are known, it is a very easy process to calculate k and write the equation. What if we have a set of cooling data and the person who created the data forgot to record the ambient temperature? Then we must use more of a trial and error method to find the cooling function.
The data set shown below is the cooling data from a cooling experiment done with water. The water was heated to boiling and then allowed to cool. The temperature of the water was recorded every minute for 30 minutes. No ambient temperature was given. We will find the cooling function and, in the process, estimate the ambient temperature.
Time Temperature 0 212 1 205 2 201 3 193 4 189 5 184 6 181 7 178 8 172 9 170 10 167 11 163 12 161 13 159 14 155 15 153 16 152 17 150 18 149 19 147 20 145 21 143 22 141 23 140 24 139 25 137 26 135 27 133 28 132 29 131 30 130
This data can be placed into a spreadsheet in Excel©. Part of the spreadsheet is shown below with the data.
Let's make an initial guess at a cooling function. Assume an ambient temperature of 75 degrees F and calculate the corresponding value of k at t = 30 minutes. (see below)
Let's use this value for k and write an initial guess at the cooling function.
Using this guess, let's create another column in the spreadsheet and compare the values from the function with the empirical values in the spreadsheet.
I also added a column for residuals; that is, the differences between our function estimates and the empirical data. Let's look at the last few rows of the spreadsheet and see what the sum of the residuals is. We will use this value as a goodness of fit test as we make adjustments to the parameters in the function
As our function gets to be a better model for the cooling curve, sum of the residuals should decrease. We can also graph this data to get a visual idea of how well our function fits the data.
Not a very good fit, is it? What is wrong? First, from the shape of the blue diamonds representing the empirical data it would seem that the ambient temperature is somewhat greater than 75 degrees. We can say this because the graph is much less steep between 25 and 30 minutes, indicating that an asymptote was being approached. Let's try an ambient temperature of 100 degrees F and see what parameters we get and what the effect is on the residuals. With an assumed ambient of 100 degrees, k = -0.04391. Using this value for k we replace the function and look at the result on the residuals and the graph.
Much better. The sum of the residuals has decreased to -91.176 and the graph of the function looks much closer to the empirical data. However, the size of the sum of the residuals indicates that we can still do better. Lets try 110 degrees F as an estimate of the ambient temperature and see what happens. At 110 degrees k = -0.054308. Using these values in the function we will again look at the residuals and the graph.
Another dramatic decrease in the sum of the residuals to -34.28, and the graph appears to be almost perfect. Using the sum of the residuals as a control we can continue incremental increases in the estimate of the ambient temperature. At some point, the sum of the residuals will no longer decrease. Then we must back up and use smaller increments. Using this method, I found that the probable ambient temperature for this experiment was 114.2 degrees F. This seems a little high, but if the container for the water was left on something like a stovetop, this is not unreasonable. Using this value for the ambient temperature, k = -0.06076, and we get the following:
The sum of the residuals is now less than 5 one hundredths and the graphs of the function and the empirical data seem to lay one on top of the other. This manual technique is similar to the more rigorous statistical techniques in that the objective is to minimize the sum of the residuals. In the absence of sophisticated statistical software, such as SPSS or MiniTab, this technique can be useful. There are four more spreadsheets available for different types of data where I have used this technique to do curve fitting. If you wish to view the spreadsheets, click on one of the selections below.
If you want to view the spreadsheet for the cooling problem click here.