This is the write-up of Assignment #12
Brian R. Lawler
A slinky was held above the ground with a distance sensor placed below it. After the slinky was released, a CBL probe recorded the height of the bottom of the slinky: as the slinky bobbed up and down, the probe measured the distance from the floor 295 times in about 30 seconds. Click here to see the data.
The task is to generate a function that predicts the observed data. Calculate a measure of the error between your model and the observed data: square the difference at each time, sum the squares, and divide by the number of data points.
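The error measure described in the task is a mean squared error. Here is a minimal Python sketch of that arithmetic. The original work was done in Excel, so the function name and the three sample values below are made up purely for illustration.

```python
def mean_squared_error(observed, predicted):
    """Average of the squared differences between observed and predicted values."""
    if len(observed) != len(predicted):
        raise ValueError("observed and predicted must have the same length")
    if not observed:
        raise ValueError("need at least one data point")
    squared_diffs = [(o - p) ** 2 for o, p in zip(observed, predicted)]
    return sum(squared_diffs) / len(observed)


# Made-up heights (NOT the slinky data), just to show the calculation.
observed = [1.0, 0.5, 0.75]
predicted = [0.5, 0.5, 1.0]
print(mean_squared_error(observed, predicted))
```

Dropping the final division gives the plain sum of squared differences, which ranks candidate models in the same order because the number of data points is fixed.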
To begin looking for a function to match this data, I first copied it into an Excel worksheet. Next, I plotted the data as an x-y scatterplot.
Next I began recording some observations. Most obviously, the data is sinusoidal, or oscillating, which fits the physical situation: the height of the bottom of a bobbing slinky oscillates. I also noticed that the amount of this oscillation decreases. In other words, the amplitude of the data decreases with time. And finally, it appears that the maximum heights may be coming down more than the minimum heights are coming up. In other words, a sort of midline for the data would not be horizontal, but would have a slight negative slope. There is an occasional "blip" in the data; however, I decided this is typical of collecting this sort of data under these conditions and chose not to worry about these outliers.
My next step was to begin testing potential functions. I began with the linear function. I looked for a function that would go through the center of the data with a slight downward trend. After some initial testing (and some fine tuning later) I decided y = 0.73 - 0.0002x fit well. This looks like:
Next I jumped into working towards the sinusoid. I decided to combine the general sinusoid with this linear function and work towards something in the form y = (a sin(bx + c) + d) * (e - fx). This became much more difficult than I initially thought. I decided I could compute the amplitude by dividing the range by 2. The peak of the data is 1.096 and the first trough is at 0.397. Subtracting and dividing by two yields an amplitude of about 0.35. This served as my a value. Next I tackled the other coefficients through trial and error. The result was the function y = (0.35 sin(0.2335x + 0.51) + 0.75) * (1 - 0.00035x), which looks like the picture below. (Click on it to view the Excel file.)
I should say that in later work I veered away from this model a little bit. In the end, instead of multiplying by the linear function, I decided to add it. If I were to extend this investigation further, I would like to understand better the symbolic relationships behind changes such as this.
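The amplitude estimate and the first trial function can be checked with a few lines of Python. This is only a sketch: the numbers come from the paragraph above, while the function name `first_model` and the period calculation are additions for illustration.

```python
import math

peak = 1.096    # highest reading, as reported above
trough = 0.397  # first trough, as reported above
amplitude = (peak - trough) / 2
print(f"{amplitude:.4f}")  # half the range, which rounds to the 0.35 used for a


def first_model(x, a=0.35, b=0.2335, c=0.51, d=0.75, e=1.0, f=0.00035):
    """y = (a*sin(b*x + c) + d) * (e - f*x), using the trial-and-error values."""
    return (a * math.sin(b * x + c) + d) * (e - f * x)


# The coefficient b fixes the period of the sine: one full cycle every 2*pi/b
# units of x (x is whatever the horizontal axis of the spreadsheet measures).
period = 2 * math.pi / 0.2335
print(round(period, 1))
print(first_model(0))
```

Note that multiplying by (e - fx) scales the oscillation and the midline together, whereas adding a linear term would shift only the midline.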
I should also briefly discuss my reasoning up to this point as to why this is a reasonably well-fit curve. I liked that my modeling function was decreasing, which acknowledged my earlier work with the linear function. I felt this was an important part of the model, since it was likely that the spring holder's arm became tired and drooped a bit. Also, the initial amplitude and the period of the model function match the data well. The most significant problem, which is visually evident, is that I have not yet decreased the amplitude as time elapses. Clearly, the bouncing spring expands and contracts less and less over time.
The next step for me was to figure out how to incorporate the damped oscillation into the function. I turned to another piece of software called Graphing Calculator. I was familiar with the notion of combining an exponential decay function of some sort with the sine curve to obtain this damping. In conversation with classmates, someone alluded to the notion of the exponential behaving as a "boundary curve". I toyed with this for a little while in the software and more or less discovered that multiplying an exponential by a sine function would yield the type of damping I hoped for. I initially thought that I might need to use a composition of functions, but experimentation proved that conjecture faulty.
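The "boundary curve" idea can be checked numerically: because a sine never exceeds 1 in absolute value, the product exp(-kx) * sin(wx + p) can never leave the band between -exp(-kx) and +exp(-kx). The Python sketch below tests this on a grid of points. The values of k, w, and p are arbitrary illustrative choices, not fitted parameters.

```python
import math


def damped_sine(x, k=0.5, w=12.0, p=0.8):
    """Product of an exponential decay and a sine wave."""
    return math.exp(-k * x) * math.sin(w * x + p)


def envelope(x, k=0.5):
    """Upper boundary curve; the lower boundary is its negative."""
    return math.exp(-k * x)


xs = [i * 0.01 for i in range(1001)]  # x from 0 to 10
stays_inside = all(abs(damped_sine(x)) <= envelope(x) + 1e-12 for x in xs)
print(stays_inside)
```

A composition such as sin(exp(-kx)) would instead change how fast the wave oscillates rather than how tall it is, which is one way to see why the product is the form that produces damping.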
The graph below is an example to help illustrate my investigation and conclusion. The purple function is y = (e^-0.5x)(sin(12x + 0.8) + 1. Notice that it is a damped sinusoid. It is bounded by the exponential functions, both in blue, y = e^0.5x and y = -e^0.5x. Click on the image below to further investigate on your own.
Next I went back to Excel and tried putting this knowledge to use. I wrote down a generic curve and tried to make it work: y = a - bx + (c^(dx + e))*j(sin(fx + g) + h). Even with my best monkeying around, I could not obtain the exact behaviors I wanted. Click the image below to see my trials and the specific parameters that yielded the attempt below.
At this point, I realized that some of the parameters in my form of this function might be redundant. In hopes of streamlining my efforts, namely by giving myself fewer parameters to estimate, I looked up a general damped harmonic function in a physics textbook and saw that y = a*exp(-bx) * cos(cx + d) involved some of the features I hoped for, namely the exponential decay and the oscillating function, in this case cosine. Also, I was happy to see it used base "e". However, it did not take into account the linear decrease I observed. I incorporated that by simply adding the linear function I found earlier, and this worked out well.
After further guess-and-test refinement of the parameters, I concluded that I was getting quite close with my estimate of the function to model the slinky data. This conclusion was based solely on my eyeballing the data and my function graph. I realized it was time to incorporate a more quantifiable analysis of the error in my model. I chose to look at the sum of the squares of the residuals. Squaring each residual results in a nonnegative number, so errors above and below the curve cannot cancel. My goal then was to continue modifying parameters in search of the smallest sum of these values. My final function, as reported in the conclusion below, was the most accurate (i.e. lowest sum of the squares of the residuals) I could find as I adjusted each parameter on its own to at least 2 significant digits.
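As a sketch of the combined form, here is the textbook damped cosine with a linear term added, written in Python. The names m and k for the linear part are labels introduced here for illustration, and the decay rate and phase in the example call are placeholders rather than fitted values.

```python
import math


def damped_model(x, a, b, c, d, m, k):
    """y = a*exp(-b*x)*cos(c*x + d) + (m - k*x).

    a: initial amplitude, b: decay rate, c: angular frequency, d: phase,
    m and k: intercept and downward slope of the midline (hypothetical names).
    """
    return a * math.exp(-b * x) * math.cos(c * x + d) + (m - k * x)


# Placeholder parameters: amplitude and frequency echo the earlier trial,
# the midline is the earlier linear estimate, decay and phase are invented.
for x in (0, 100, 200):
    print(x, round(damped_model(x, 0.35, 0.004, 0.2335, 0.0, 0.73, 0.0002), 4))
```

Six parameters remain, and each has a separate visual role (height of the bounce, how fast it dies out, spacing of the peaks, horizontal shift, and the position and tilt of the midline), which makes guess-and-test tuning more manageable.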
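The one-parameter-at-a-time adjustment can be automated. The Python sketch below is an assumption about how such a search might look, not a record of what was done in Excel: it nudges each parameter up or down by a small step, keeps the nudge only if the sum of squared residuals drops, and halves the steps when nothing helps. The data is synthetic, generated from arbitrary parameters, since the slinky readings are not reproduced here.

```python
import math


def model(x, p):
    a, b, c, d, m, k = p
    return a * math.exp(-b * x) * math.cos(c * x + d) + m - k * x


def sse(p, xs, ys):
    """Sum of the squares of the residuals."""
    return sum((y - model(x, p)) ** 2 for x, y in zip(xs, ys))


def refine(p, xs, ys, steps, rounds=50):
    """Adjust one parameter at a time; keep a change only if SSE decreases."""
    p = list(p)
    best = sse(p, xs, ys)
    for _ in range(rounds):
        improved = False
        for i, step in enumerate(steps):
            for delta in (step, -step):
                trial = p[:]
                trial[i] += delta
                err = sse(trial, xs, ys)
                if err < best:
                    p, best, improved = trial, err, True
        if not improved:
            steps = [s / 2 for s in steps]  # finer adjustments once stuck
    return p, best


# Synthetic stand-in data (NOT the slinky readings), 295 points like the probe.
true_p = [0.36, 0.004, 0.23, 5.3, 0.73, 0.0002]
xs = list(range(295))
ys = [model(x, true_p) for x in xs]

start = [0.30, 0.005, 0.23, 5.0, 0.70, 0.0]
steps = [0.01, 0.0005, 0.001, 0.05, 0.01, 0.00005]
start_err = sse(start, xs, ys)
fitted, final_err = refine(start, xs, ys, steps)
print(start_err, final_err)
```

A search like this can only move downhill from its starting guess, so it shares the main weakness of manual guess-and-test: a poor initial frequency or phase may leave it stuck far from the best fit.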
As a brief aside, before presenting my conclusion, I wish to show a graph of the residuals associated with my best fit curve.
Surprisingly, an oscillating pattern quite obviously emerges in the residuals. Hmm... I am unsure if this means that my model function is consistently incorrect in some manner that I could fix with an adjustment to the parameters I've used, or if this means that my model function could be improved structurally... maybe with an embedded sine function. At least it is comforting to notice that early in the data, the residuals seem to be much more random in their behavior, which is something I am fairly certain I could not deal with very effectively by adjusting the function.
The equation y = 0.36 * exp(-0.0038x) * cos(0.233x + 5.3) + 0.73 - 0.00019x is my modeling function for the slinky data. The sum of the squares of the residuals is less than 0.10. And to the eye, this function matches the trends of the data extremely well. As always, click on the picture to bring up the Excel file in which you may further explore my investigation and modify it as you wish.
First of all, as I've said throughout this investigation, I have many more questions I am interested in answering. Early in my work, I wondered what the symbolic connection is between the various forms of functions that I worked with. I am curious as to which could be equivalent with the appropriate parameters and which are structurally distinct functions.
I am also interested in the challenge of linearizing the set of data in stages and applying each of the sinusoidal, exponential, and linear functions in a more symbolic manner as opposed to the mainly guess-and-test approach I undertook. I am aware of these techniques and software such as Maple that can aid in this analysis - certainly the number crunching involved in obtaining a linear regression is too tiring to want to do it by hand, this decade.
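As one small piece of that staged approach, the linear stage can be handled with the closed-form least-squares formulas instead of by eye. The Python sketch below is an illustration only. It is demonstrated on points taken from an exact line, and applying it directly to oscillating data would work best over a whole number of cycles so the bounce averages out.

```python
def linear_fit(xs, ys):
    """Least-squares slope and intercept for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Check on points from a known line, y = 0.73 - 0.0002x (the midline
# estimated by eye earlier), so the fit should recover those two numbers.
xs = list(range(0, 300, 10))
ys = [0.73 - 0.0002 * x for x in xs]
slope, intercept = linear_fit(xs, ys)
print(slope, intercept)
```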
I would also further explore the residuals. I could look for significant trends in the data in a more formal, statistical manner rather than the eyeballing technique I have currently employed. The resulting sinusoidal behavior is rather intriguing.
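One simple formal check, offered here only as a possibility, is the autocorrelation of the residuals: random residuals give values near zero at every lag, while an oscillating pattern gives strongly positive values at lags equal to its period and strongly negative values at half the period. The Python sketch below uses made-up sinusoidal residuals with a period of 20 samples, not the residuals from the slinky fit.

```python
import math


def autocorrelation(values, lag):
    """Sample autocorrelation of a sequence at the given lag."""
    n = len(values)
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values)
    cov = sum((values[i] - mean) * (values[i + lag] - mean)
              for i in range(n - lag))
    return cov / var


# Made-up residuals that oscillate with a period of 20 samples.
residuals = [0.01 * math.sin(2 * math.pi * i / 20) for i in range(200)]

print(round(autocorrelation(residuals, 20), 2))  # one full period apart
print(round(autocorrelation(residuals, 10), 2))  # half a period apart
```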
I am also aware of how an ANOVA analysis can state what portion of the variance can be attributed to the data and how much to the mismatch of the modeling function. It may be interesting to refresh my memory and apply those techniques in this case.
And finally, if I were given unlimited time (and funding), I would enjoy building a stochastic model that accounts for the minor randomness in the data. I don't think I could determine whether this variance can be attributed to error in the measuring device, other components of the experiment, the natural behavior of a slinky, or some other cause. I could either use Excel's random function or turn to STELLA, a system-modeling package I am familiar with.
This problem was enjoyable because I got to bring back to the surface MANY different areas of mathematics that I am more or less comfortable in. I feel like I thought symbolically, numerically, graphically, geometrically, and statistically. Not to mention the always-present learning curve of putting these write-ups to electronic format. Thanks for reading!
Comments? Questions? e-mail me at firstname.lastname@example.org
Last revised: December 28, 2000