For this assignment, we begin with four numbers, A, B, C, and D. Because we're going to use a spreadsheet, we'll restrict them to real numbers and avoid the complex plane. Given two numbers, we define the distance.
Distance: How far we would have to travel along a number line from one the the other.
In symbolic math, distance is absolute value. dist(A, B) = |A - B|
For example, the distance between A = 4 and B = 2 is 2. The distance between 2 and 4 is also 2, since we travel the same amount from point A to B, just in the other direction. The distance between 5 and (-3) is 8. The distance between 2.45 and 0.984 is 1.466.
If we travel from A to B, then B to C, C to D, and D to A, we form a tour. Our tour distance is the sum of the four trips, |A - B| + |B - C| + |C - D| + |D - A|. The minimum tour distance is zero, when A = B = C = D. There's a whole body of research about how to form tours of minimum length, the traveling salesman problem. Our problem is not like that, because we're fixed in our path. We can't jump from A to C or anything like that.
Instead, we're going to imagine taking repeated tours on the resulting distances. For example, if we start with A = 4, B = 2, C = -2, D = 5, the four journeys are of length |A - B| = 2, |B - C| = 4, |C - D| = 7, and |D - A| = 1. We'll then use these four numbers to conduct another trip, yielding 2, 3, 6, and 1. If we do it again, we get 1, 3, 5, and 1. If we do it ... well, you get the picture. This works well with a spreadsheet, because we can make the computer repeat the distance taking. I created a sheet in Microsoft Excel that does this task. For (4, 2, -2, 5), on the sixth step all the distances become 2, and then on the seventh step total tour distance becomes zero.
Let's say we are trying to find a set of A, B, C, and D that had a large number of steps before tour distance becomes zero. (Maybe it's a contest for fame and fortune.) The first approach to doing this is trial and error. If you want to try this on your own, here's the Microsoft Excel worksheet. Have fun!
Trial and error isn't very efficient. We'd like to make the computer work. One approach is to have the computer loop through and look at all the possibilities, keeping track of which one gives the most steps. This approach is called Grid Search. It's more brute force than elegant, because the computer will have to try a lot of bad possbilities. But it will find the maximum, eventually. Microsoft Excel is very slow, so I wrote some functions in the statistical programming language R. If you want them, they're linked at the bottom of the page. All we need now is a computer and time.
The problem with grid search is that we need LOTS of time. For instance, let's say we wanted to look at integer values of A, B, C, and D between 1 and 20. If we just ran loops of that size, we'd need to run the algorithm 20 \times 20 \times 20 \times 20 = 160,000 times. In general, searching from 1 to N means N^4 attempts. That power of 4 is really scary, because doubling the possible numbers means the number of trials increases 16 times!
Mathematics can help us by reducing the number of searches. The most important aid is translation. Say we have A, B, C, and D. We add the same number X to all four values, yielding (A + X), (B + X), (C + X), and (D + X). What happens to the distances?
The distance between (A + X) and (B + X) is the same as the distance between A and B. This makes physical sense, because moving two places down the road doesn't change the length between them. Knowing this means that we can fix one of the values, then generate additional solutions by translation. Because we like simple numbers, let's fix A = 0. Doing so reduces our N^4 attempts to N^3 , which is very helpful.
There's another symmetry trick we can use. Now that we've fixed A = 0, if we look at the starting set (0, B, C, D), the first step becomes |B|, |C - B|, |D - C|, |D|. If we flip D and B, starting with the set (0, D, C, B), we get a first step of |D|, |C - D|, |B - C|, |B|. Absolute value is symmetric, meaning |B - C| = |C - B| and |C - D| = |D - C|.
This means the first step for (0, D, C, B) has the same numbers as for (0, B, C, D), in reverse order: |D|, |D - C|, |C - B|, |B|. Physically, our trip is a circle, so it doesn't matter in which direction we go. The two start sets (0, B, C, D) and (0, D, C, B) lead to the same answer, and we only need to test one of them. This symmetry reduces the problem by a factor of 2.
Even with the reduction from N^4 attempts to N^3 / 2 , grid search will still take a long time. I ran the program to test the integers from 1 to 1000, and it took about 7 hours. It's a good thing we found the translation and symmetry, though. Without that, it would have taken 7 \times 2 \times 1000 = 14,000 hours, roughly a year and a half. That might be a problem.
|Smallest sets for given number of trials|
Even with a faster computer, grid search will not cover enough space in any reasonable amount of time. We need to rely more on active thought than brute weaponry. We can look for patterns in solutions from the computer. One possibility is the multipicative gap between B, C, and D. I used Excel to look at those ratios, C / B and D / C.
It appears that the optimal ratio for C / B is converging to something around 2.83 or 2.84, and the optimal D / C ratio is around 2.19. Instead of wandering throughout the entire space, we can focus on that subspace and get progressively tighter. Once we find a higher number of trials, we focus on combinations near that new set of B, C, and D values. Increasing depth gets us longer and longer paths. The details are more about computer science than math, so we'll leave them for another day.
If we use numbers that are 9 digits long, we get the set (0; 100,000,000; 283,928,677; 622,226,257). There are 37 non-zero steps!
We're limited only by the ability of our computer program to precisely handle numbers with lots of digits. The language R handles 16 significant digits. By repeatedly focusing, I achieved a sequence of 66 non-zero steps. Note that the first few digits of these numbers are the same as the 9 digit set.
As a historical note, the name grid search arose from searching for missing persons. The area where a person might be is drawn on a map, then people are assigned to travel across each area. Here's an example of setting up grids. While slow and often painful, grid search is still used successfully. For instance, here's a 2008 missing person case.
If you're an R programmer, all four functions are available in this text file. Comments are included.