August | 2010 | Things

One of the most elegant algorithms I have seen in a while is Timothy Chan’s algorithm for finding the minimum enclosing cylinder of a bunch of points in the streaming model. The actual paper can be found here, in case you are interested.

The minimum enclosing cylinder of a set of points is basically the cylinder with minimum radius that encloses all the points in the set. It is equivalent to finding the straight line for which the most distant point in the set is the closest.

Streaming model is basically a model where you are allowed to look at the input exactly once and you are not allowed to store the input in your memory, i.e., you are allowed only sub-linear amount of memory. So in this particular case, it means that you are shown the points one by one and exactly once. You are allowed to do computations as and when you feel like. In the end, you must return the minimum enclosing cylinder of all the points that were shown.

Now, this definitely is a difficult task, in fact, it might even be impossible to return the exact minimum enclosing cylinder in such a restrictive model. Therefore, we allow some error. We say that it’s fine even if the radius of the cylinder that you return is at most some constant (that does not depend on the number of points) times the radius of the actual minimum enclosing cylinder.

Consider the following algorithm. Call the first point $o$ and the second point $v$ . Assume that this is the axis of your cylinder and whenever a new point is shown, calculate the distance of that point from the line $ov$ . If this distance is bigger than the previous maximum, then update. Whatever you have in the end is the radius of the cylinder. Is this algorithm good? Or in other words, can we say that the radius that we have in the end is at most some constant times the radius of the actual minimum enclosing cylinder? No, we can’t. The reason is that it might turn out that $v$ is really close to $o$ and all the other points that we get are really really far away from $ov$ , but are very close to the perpendicular bisector of $ov$ . In that case, the cylinder with the perpendicular bisector of $ov$ as its axis would have been a much better choice, and the cylinder that this algorithm is returning can be very large compared to this cylinder. So this algorithm is no good.

Notice however, that the problem was caused by all those points that were very far away from $o$ . So if we can somehow make sure that there are no such points, then we could probably get somewhere. In fact, the paper I am talking about proves a more precise version of the statement above. In particular, it proves that if $d(p, ov)$ is the distance between the point $p$ and the line $ov$ and if $Rad(\{o, v, p\})$ be the radius of the minimum enclosing cylinder of the points $o, v,$ and $p$ , then

$d(p, ov) \leq 2 ( \frac{\|op\|}{\|ov\|} + 1 ) Rad(\{o, v, p\})$

This says that if we fix the radius of the minimum enclosing cylinder of the points $o, v,$ and $p$ and the ratio $\frac{\|op\|}{\|ov\|}$ , then that gives an upper bound on the distance between $p$ and $ov$ . So if we somehow make sure that the ratio $\frac{\|op\|}{\|ov\|}$ is small, say, bounded by 2, then that immediately tells us that the distance between $p$ and $ov$ is not more than $2(2+1) = 6$ times the radius of the minimum enclosing cylinder of the points $o, p,$ and $v$ . Therefore the cylinder with $ov$ as the axis won’t be that bad after all.

The only problem is that we don’t have such a bound on the ratio $\frac{\|op\|}{\|ov\|}$ . One way out is to cheat and say that we will look at the input twice instead of once. In the first pass, we will find the point $v$ that is farthest from the point $o$ and in the second pass we will use the above algorithm with $ov$ as the axis. This will make sure that $\frac{\|op\|}{\|ov\|} \leq 1$ and therefore we will know that the radius we return will be at most 4 times the radius of the actual minimum enclosing cylinder.

The other solution is to not cheat, and use the awesome two line algorithm given in the paper that, in a way, does both passes in just one pass. The algorithm starts with calling the first point $o$ , the second point $v$ , and assigning the value $0$ to the radius $r$ initially. Next, whenever it encounters a new point $p$ , it does the following.

Find $Rad(\{o, v, p\})$ . If it’s more than $r$ , then update $r$ , else do nothing.
If $\|op\| \geq 2\|ov\|$ , then change $v$ to $p$ , else do nothing.

The claim is that the final $ov$ is a good axis to choose for the cylinder. To see why this is true, let’s say that $v_1, v_2, \ldots , v_f$ are the different values that $v$ gets and $r_f$ is the final value of $r$ . The points that came after $v$ became $v_f$ were obviously close to $ov_f$ (when compared to $r_f$ ), because for all those points, the ratio $\frac{\|op\|}{\|ov_f\|}$ was small. The points that came between $v_{f-1}$ and $v_f$ were close to $ov_{f-1}$ for the same reason. Also, since $\frac{\|ov_{f-1}\|}{\|ov_f\|}$ is also not more than 2, we know that $v_{f-1}$ is close to $ov_f$ (when compared to $r_f$ ). Therefore, using some geometry, one can prove that points that came between $v_{f-1}$ and $v_f$ were not just close to $ov_{f-1}$ , but also to $ov_f$ . Similarly, points that came between $v_1$ and $v_2$ were close to $v_3$ and therefore were also close to $v_4$ and so on till $ov_f$ . In fact, the paper proves that all the points are within a distance of 16 times $r_f$ from $ov_f$ . Therefore, the cylinder with $ov_f$ as its axis and $16r_f$ as its radius contains all the points. The reason why it is a good cylinder to return is that the minimum enclosing cylinder has its radius at least as large as $r_f$ .

M	T	W	T	F	S	S
						1
2	3	4	5	6	7	8
9	10	11	12	13	14	15
16	17	18	19	20	21	22
23	24	25	26	27	28	29
30	31

	Connor Ritter on When is a fact a fact and why…
	Mariah Jackson on An argument against universal…
	Owen P on On Hofstadter’s Law (Par…
	Debra O on The Nash equilibrium of online…
	Aravind on The two most important lessons…

Things

"Things are not what they appear to be: nor are they otherwise." -Surangama Sutra

Menu

Monthly Archives: August 2010

Minimum enclosing cylinder