Wednesday, February 08, 2006

Hercules vs. Hydra



While we're on the topic of Greek mythology, which we still sort of are due to the recent post about Echidnas, I recently came across a fun bit of math known as the "Battle of Hercules and the Hydra". As the ancient myth goes, one of the 12 labors of Hercules was to go kill the Hydra, a nasty creature with lots and lots of heads. When you'd cut its heads off, the Hydra would just grow more of them. It's one of the better-known Hercules stories; snakes and snake-like creatures look good in art, and they're easy to draw, so the battle's been a popular topic for artists over the centuries. Here's John Singer Sargent's take on it. Here's a version by Francisco de Zurbarán a Spanish painter of the 1630s. And one by the Italian Renaissance artist Antonio del Pollaiuolo. And that's not even the tip of the iceberg.

But the big reason I think the story's so well known is that we've all been in situations like that at some time. Well, not precisely like that, but situations where the problem seems to keep piling up and getting worse despite, or possibly because of, one's best efforts to solve it. It turns out that the situation is less hopeless than you might think, at least in theory. It turns out that Hercules always wins eventually, by simply standing there lopping off Hydra heads, if he just sticks to it long enough, and doesn't get discouraged by the rapidly mounting number of heads early on. Hercules didn't actually need that torch-wielding assistant, after all.

This paper quotes the exact mathematical definition for the Hercules vs. Hydra game:

At stage n (n a positive integer), Hercules chops one head h of the hydra H. If the predecessor of h is the root, nothing happens. Otherwise let h1 and h2 be respectively the father and the grandfather of h. The hydra sprouts n copies of the principal subtree determined by h1 without the head h from the node h2 (the the roots of the new copies are immediate successors of the node h2). Hercules wins the battle if he reduces in a finite number of attacks the hydra to its root.


Another paper mentioning the problem can be found here.

In short, most of the time you end up with more heads than before, but not always, and that's apparently enough to guarantee that Hercules will win in the end. But it'll take him a good while, but hey, he's a demigod and all, he's got the time. You'll notice that the problem as formulated isn't precisely the same as the mythological account. If anything, it's even more of a challenge, since each lopped head is replaced with two new ones. The reason for the difference is that the problem was formulated not to resolve a hair-splitting dispute over an ancient myth, but to express certain ideas in the mathematical discipline of proof theory. Namely, the battle is a "natural" example of a theorem which is true but not provable within the normal realm of addition, multiplication, and exponentiation (known in the trade as "Peano Arithmetic"), therefore requiring the use of stronger axioms. If I understand the explanation properly, anyway. The whole discipline is highly esoteric, and discussions aimed at anyone below grad student level are few and far between. And even then it seems to be assumed you've got a faculty advisor holding your hand and guiding you through the material.

It's reasonable to wonder why I'm bothering to poke around in all this mysterious abstract stuff. Well, it turns out that proof theorists are among the very few people on the planet who have a genuine need for really large infinite ordinal numbers, which is a subject I find weirdly fascinating. I'm not even going to try to explain what these numbers are used for, since I'm sure I'd bungle the attempt. For the time being, we're just interested in the numbers themselves, not so much what they're used for. Think of it as a grownup version of the playground "who can count the highest?" game.

First a quick note on notation: When you see a 'w', it's really supposed to be a lower-case 'omega', which looks sort of like a curly 'w', and which represents the first infinite ordinal. The plain vanilla one that I imagine is what everyone has in mind when they say "infinity", which is followed directly by w+1, the legendary "infinity plus one" you knew existed back on the playground in third grade, before your math teacher told you otherwise. Following typical comp-sci notation, w*2 is omega times two, and w^w is omega to the omegath power. When you see an underscore, it indicates something's a subscript. And when Greek letters are named, the naming's sort of case sensitive: "epsilon_0" indicates a lower-case epsilon with a subscripted zero, while "Gamma_0" indicates an upper-case gamma, for example.

Most discussions of ordinal numbers tail off after you get to epsilon_0, like the Wikipedia article referenced above. If you're lucky, you'll get a brief mention that there are additional epsilon numbers -- epsilon_1, epsilon_2, epsilon_w, epsilon_epsilon_0, and so forth, just to reinforce that the process goes on forever. Recall that epsilon_0 is defined as w^w^w^w... , and it's the point where you can't get any further with just finite ordinals, 'w', "+', '*' and '^'. Your system's run out of steam, and adding another w to the already-infinite stack of w's doesn't change anything. Epsilon_0 = w^epsilon_0. We've reached what's known as a fixed point. You'd think this would be a bad thing, but far from it. They're solid gold. The ordinals, even just the countable ordinals, are an unimaginably vast collection, and fixed points of various functions are among the rare guideposts in the territory. So instead of bailing out, we just introduce the new symbol epsilon_0, and we can take its successor ( epsilon_0+1) and otherwise multiply, divide, and exponentiate to our hearts' content, until we reach a new fixed point, namely epsilon_1. And so on, until the inevitable, infinitely nested epsilon_epsilon_epsilon... , also known as the first "critical epsilon number", and as phi_2(0). What you're looking at there is an example from the hierarchy of Veblen functions, which basically take the fixed point thing and abstract it up another level. Each phi function enumerates the fixed points of the previous phi function. You can start out a Veblen-style function hierarchy with whatever function you like, but classically you start out with phi_0(n) = n, i.e. 0,1,2,3...w,w+1...w^2...w^w., etc. And phi_1(n) enumerates the fixed points of phi_0(n): epsilon_1, epsilon_2, epsilon_3, and so forth, on and on and finally we hit that infinite stack of epsilons mentioned earlier. the first fixed point of phi_1(n), giving us phi_2(0). And then we've got phi_3, phi_4, phi_w, phi_(w^w), and so on.

This may be a good point to pause a moment and try to imagine, for example, the size of the gap between phi_42(666) and phi_42(667). That's 41 levels of nested fixed points, which themselves are an incredibly-widely-spaced set of little corks bobbing in a vast sea of 'normal' ordinals. Ok, maybe it's better not to try to imagine that. Perhaps the only reasonable thing to do at this point is forget that we're talking about infinity at all, and just think of it as an exercise in abstract symbol manipulation, with levels upon countless levels of nested recursion. Which is still fun to think about, at least for a computer geek like myself. We certainly aren't going to resolve the question of whether these rarefied numbers we're talking about here are "real" or not. It's not that I'm against philosophical debates, per se. It's just that I usually have very little to add. If I was forced to pick a position, I'd incline toward a sort of Platonist notion that infinitely large and small numbers are somehow reflective of an underlying reality independent of human observers. Real, but perhaps not truly graspable by anybody with a mere few pounds of grey matter to work with. But that's just a personal hunch, and I can't imagine how you'd go about testing it.

In any case, the phi_n hierarchy itself runs out of steam eventually, and its limit is the ordinal Gamma_0. Gamma_0 is a milestone in that it's the first "impredicative" ordinal. I don't really understand the implications of this very well, but I gather that at least some schools of thought hold it's something to be avoided if possible. Or at least not overly enjoyed.

Beyond Gamma_0, the landscape is less clear (to me, anyway). Well, for starters you have Gamma_0+1, I guess, and then you repeat the whole process that got you to Gamma_0 until you run out of steam again, at which point you're at Gamma_1, Gamma_2, and so forth. The subject of proper ordinal notation for "large" ordinals seems to be the subject of ongoing research and debate. A Google search on the term "ordinal notation" will bring up lots and lots of hits. Here's one somewhat comprehensible treatment of the subject. Here's a decent article titled "Transfinite Ordinals and their Notations: For the Uninitiated". [Note: Like a lot of academic papers, it's on the net in PostScript (not PDF) format, which is sort of inconvenient for most people. One way to go is save it to a file, and run it through a PS to PDF converter, like this one.]

The referenced article describes Bachmann's theta notation, which is a more powerful extension of what we've looked at so far. Theta_0(0) = 1, Theta_1(0) = epsilon_0, Theta_2(0) = phi_2(0), Theta_Omega(0) = Gamma_0, and that's just the beginning. Note the capital 'O' in Omega there. This is not a reference to the familiar 'w', the first infinite ordinal, but rather the first uncountable ordinal, a.k.a. the cardinal number Aleph_1. Note: I'd originally had a comment here saying that all depended on what you thought of the Continuum Hypothesis, but that was a momentary bit of boneheadedness on my part. Ordinals actually don't have anything to do with CH, as far as I can tell. In any case, cardinal numbers aren't important to the present topic anyway. The important, and weird, thing here is that Omega (also alternately known as w1/omega_1) is much larger than the countable, recursive ordinals we're talking about right now, but it, and even larger ordinals, are necessary as indexes into the massive Theta hierarchy. We're no longer defining numbers as combinations of smaller, simpler numbers; no, at this point to describe a given number we need to assume the existence of a vastly larger number, and use that number to help us describe the number we really care about. Some people are ok with this, others get the philosophical heebie-jeebies.

Since not everyone is comfortable with monkeying around in the higher number classes just to come up with names for plain old countable, recursive ordinals, there are a number of other approaches one can use. The simplest is just to extend the classical phi hierarchy and create ternary, or even n-place Veblen functions. The notation usually gets modified a little, dropping the function notation and subscripts and such, so it looks like you're naming a number, not a function, i.e. instead of something like phi_1(0,0) we just have phi100 -- which just happens to be yet another name for our friend Gamma_0. It might help to think of the levels of nesting as the ordinal equivalent of decimal places, except that each 'place' can be as big as we like: phi110 (= Gamma_1), phi[w]00, phi[epsilon_0]00, etc., where the square brackets are just something I'm adding for clarity since I've been trying to avoid using actual greek letters. Anything in square brackets is really a single "digit". You don't often see phi extended past 3 places, but apparently it can be done if you're so inclined. I suppose the next logical step would be to give a count of the number of places, assuming zeros for trailing "digits". This is kind of the equivalent of scientific notation, so let's say phi100 = phi1^3 (in a trivial case), with phi6^4 = phi6000, phi6.6^4 = phi6600, phi1^10, phi1^100, ad infinitum. I don't know enough about the n-place Veblen thing to be able to say if there's a limit to this madness. Can one have a non-finite number of places? Is there such a beast as phi1^w? phi1^[phi100]? phi1^[phi1^[phi1...]]]? I really don't know.

There are also more complicated notations out there: Something called "Klammersymbolen", which translates simply to "bracket symbols", for one. I don't really understand how they work, but the notation is quite complex: There's an upper and lower row, each of which specifies a sequence of ordinals, with the whole assembly surrounded with large round brackets. There's a further extension of the idea called "ordinal systems". Concrete examples of either are hard to come by, so I don't know what the equivalent Klammer notation would be for, say, phi[epsilon_0]^[10^100].

Continuing on, we run into a few semi-important ordinals: The Ackermann ordinal (Theta_Omega^2(0)), and the small (Theta_Omega^w(0)) and large (Theta_Omega^Omega(0)) Veblen ordinals. The literature varies on the exact notation for all three, and it's unclear which but either the Ackermann or the small Veblen is the limit of finite n-place Veblen notation, in other words one of the two is equal to phi1^w, but I'm not sure which one. Which does indicate that the bigger-cardinal notation is way more extensible than the idea of simply adding more places to your phi notation, all philosophical objections aside.

The next really important landmark out there is an entity known as the Howard (or Bachmann-Howard) Ordinal, not to be confused with "Bachmann Turner Overdrive", of course. A lot of the discussion I've seen treats this number like some sort of Holy Grail, without explaining why. So I know they think it's very important, for some reason, I can say that much. In the Theta notation, it's Theta_(Omega^Omega^Omega...)(0), or Theta_epsilon_(Omega+1)(0). Which really tells me nothing useful, and leaves me scratching my head, I have to say.

As an example of how the notation can trip you up, this post used to assert that the Bachmann-Howard ordinal was Theta_(Omega_Omega_Omega...)(0) rather than the correct Theta_(Omega^Omega^Omega...)(0). That may seem like a trivial difference, but the former is vastly larger, and gets us into the thrilling world of large cardinals, specifically (I think) the first inaccessible cardinal (also known as "Aleph_Aleph_Aleph...").
If we're going to take the step into large cardinal land, we can go on iterating the Theta hierarchy as long as we need to, letting the big cardinal do all the heavy lifting. Alternately, Setzer's "ordinal systems" approach is supposed to get you well beyond the Howard ordinal without ever referring to any number larger than itself. If I had examples of this, I'd show them, but I don't. I understand that the ordinal systems notation is an extension of the already-complex Klammersymbolen, so if there are any examples out there, there probably aren't any concise examples out there.

And far beyond that we have the Church-Kleene ordinal, often denoted by w1CK. This is the first non-recursive, or non-computable, ordinal, the absolute limit of what it's possible to do recursively. Think of a rough analogy with the Busy Beaver functions discussed a few days back (although BB(x) is merely an example of a noncomputable funtion, not the actual upper bound on recursive functions). Any scheme you come up with for counting really, really high is guaranteed to run out of steam before you get to w1CK. It's kind of like the lightspeed of counting. The higher you go, the harder it gets to go even higher, and you'll still never get to w1CK by counting, no matter how hard you try, no matter what clever tricks you devise. Which is not to say we're out of ordinals. Oh, no, far from it. It seems that w1CK is just the first of something called "admissible ordinals", so that there's also a w2CK, w3CK, and so on. The fundamental thing here is that while you can count countable ordinals as long as you like, the set as a whole is uncountable. Unlike the (also uncountable) real numbers, which are uncountable due to their overwhelming density everywhere on the number line, ordinals stick the uncountability out on one end. The reals you can't count anywhere, but ordinals you can count, and count, and count, and it'll just never be enough. Not only can you not count to Omega (the next higher cardinal), you can't even count to a lesser milestone on the way there (w1CK). And then there's still an unimaginably huge sea of ordinals separating w1CK and Omega.

Even then, the ordinal Omega (or omega_1) (a.k.a Aleph_1) still lies off in the distance, far, far away, and then there's omega_2, omega_3, omega_omega, stack_of_omegas, and on, and on, and on.

All of which really makes me want a beer. And I see -- in a remarkable coincidence -- it's right about time for happy hour. So I think I'll stop right here for the time being.

Updated: Two subsequent posts of mine sort of elaborate on the material here. So if you liked this one, you'll love SuperHyperRedux and No Name. To teh Megist(r)on is also somewhat related.

1 comment :

Jim Apple said...

I used to think Omega_Omega_Omega... = Aleph_Aleph_Aleph... was the first inaccessible cardinal, but it is not regular (it has cofinality Aleph_0).