Thirteen Spheres
In 1694, Isaac Newton and David Gregory debated whether or not thirteen non-overlapping spheres could simultaenously touch a central sphere. It was known that this was possible to do this with twelve spheres. They would never get a definitive answer, and a proof that this is not possible only emerged in 1953.
Above, we try to get a hint as to the answer by using an optimization technique known as simiulated annealing, so named because of its similarity to the annealing process of manipulating a material's final properties by heating it up and cooling it down in a slow and controlled manner. In essence, the technique works by starting at a random state, and by constantly modifying it slightly. We progress slowly to an optimal solution by accepting modifications that lead to worse outcomes with a probability less than 1. If the modification is better, we always accept it. We slowly decrease this negative acceptance probability to 0, in order to arrive at a final solution. By accepting worse outcomes with a certain probability we get to explore more of the state space than we would if we used a greedier algorithm.
In the case of the thirteen spheres, we place the spheres randomly. Then we proceed by slightly nudging the location of one of the spheres. In each case, we calculate the maximum radius, \(R\) that the spheres can have without overlapping, and accept the new configuration with some probability less than one if the maximum radius has gotten smaller. To converge on a final solution, I decrease this negative acceptance probability according to the formula \[p=0.2 \exp(-t/5000),\] where \(t\) is the number of milliseconds that the animation has been running.
To make it clear which spheres are limiting the maximum size of the spheres, I have coloured the touching spheres red.
In the case of 13 spheres, we usually land at a final radius of between 0.9 and 0.91, which suggests that it may not be possible to simultaneously position 13 spheres of unit radius.
Of course, this technique applies to any number of spheres, not just 13. In the case of 12 spheres, the algorithm readily discovers that you can simultaneously place 12 unit spheres, and in fact you can position somewhat larger spheres, with a radius of around 1.1.
Simulated annealing is a very widely used optimization technique because of its extreme simplicity and generality. For the technique to apply, all you need is to have some notion of modifying the configuration to a nearby one.
Simulated annealing is an example of a Monte Carlo Markov Chain technique by the way that states are selected at random and future states are selected probabilistically.
On the other hand there are some shortcomings to the simulated annealing algorithm. Doing the algorithm repeatedly can be slow, especially if it slow to compute the cost function. The algorithm also uses very little information about the problem, which can make it inefficient. For instance, when the landscape of the cost function is smooth, stochastic gradient descent would be a better choice.
The simulated annealing algorithm works better the more moves you make and the slower you decrease the acceptance probability. In the demonstration, we decrease the temperature faster than you probably would if you were serious about finding an optimal solution.
There are some theorems that have been proven regarding the convergence of the simulated annealing algorithm to the optimal solution, but these generally require an impractical number of moves to be made, and are not relevant to practice.
I got the idea to study this problem from watching this video by the École Normale Supérieure, which covers exactly this topic.