Sunday, June 13, 2010

Cross-Post: How Many Books are in the Library of Babel?

I remembered at some point in my cogitations upon qualitative wrongitude that I had actually covered something even wronger back on my Playskool blog. Here it is. I mixed up a couple things, like math at one point, and Borges actually lays out some of his figures in the story and I didn't take this into account. Research fail. But I wanted to mainly highlight that Garou is wronger here than the Creationists are in, well, any context I could think of, but he still manages to change his mind. Good on him! The difference, of course, is that Garou was willing to listen to reason and concede defeat instead of dogmatically defending his misconceptions. Anyway. Enjoy!
Jack and I were talking in Borders the other day about a short story by Jorge Luis Borges called The Library of Babel (Wikipedia page here - I've skimmed parts of it, and there are some differences between it and what we had talked about). In this story, according to Jack (and that's what I'm working with here, since this was the frame for the original disagreement), Borges describes a library containing a series of books, all 450 pages in length, and each book is one of all the possible combinations of characters that can be placed in 450 pages of space. Jack went into the details of the story, and then we started talking about just how many books that would be. After ruminating on all the possible combinations of Hamlet with any number of typos (including moving the word "fuck" one space to the right in successive iterations, as well as multiple repetitions of Hamlet and variations thereof, such as Hamlet-Tom Clancy Novel-Hamlet, or Backwards-Hamlet with or without the character of Hamlet being named "Backwards Hamlet"), Jack decided that it was more books than there are atoms in the Universe. I readily agreed.

I related the conversation to Silver Garou, who expressed extreme skepticism that there were more books in that Universe than atoms in ours. In fact, I believe his exact words were, "There's no way there's more books in that library than atoms in our Universe!" Or something to that effect. So today, I decided to do some math. By some standard measurements, there are 250 words per page, and a "word" - for publishing purposes - means six letters. Working with 450-page books, that gives us:

250 words/page x 6 letters/word x 450 pages/book = 675,000 letters/book

As for how many books this is, we can think of each book as a number - a long number in a strangely high base. For instance, if we were looking at all the "books" we could have using only the numbers zero through nine, and each "book" is only two characters long, that leaves us with 100 books - 00 thru 99 - or 1x10^2 books. Each of our Babel books is simply a number that is 675,000 characters long, and for each number in that series, we have a single book. In base ten, this would be every combination from 675,000 zeroes in a row to 675,000 nines in a row, for a total of 1x10^675,000 books. So... what's our base? That's determined by how many characters are in our total alphabet, as each one of those can be a digit in our number:
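The two-character example is small enough to check by brute force. Here is a minimal Python sketch that lists every such "book" and compares the count with base^length; the function name `count_books` is just my own label for illustration.

```python
from itertools import product

def count_books(alphabet: str, length: int) -> int:
    """Count every string of `length` characters drawn from `alphabet`, by brute force."""
    return sum(1 for _ in product(alphabet, repeat=length))

digits = "0123456789"
print(count_books(digits, 2))  # 100, i.e. "00" through "99"
print(len(digits) ** 2)        # the same count from base^length
```

Brute force stops being an option almost immediately, which is why the rest of the argument leans on the base^length shortcut instead.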

52 alpha characters (26x2 - for caps)
30 accented characters (tilde, both ways accents, umlaut, caret, horizontal line - 6 accents over each of 5 vowels)
48 greek characters (again with the caps)
10 numbers
32 additional characters on a keyboard
2 more for the cedilla (that fuckin' French C with the curlicue beneath it, caps & lower)
1 space
TOTAL: 175 characters
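
For anyone who wants to re-add the tally, a short Python sketch. The dictionary keys are my own shorthand for the rows above; the counts are copied straight from the list.

```python
# Character tallies copied from the list above (keys are shorthand labels).
alphabet = {
    "alpha (upper and lower)": 52,
    "accented vowels": 30,
    "greek (upper and lower)": 48,
    "numbers": 10,
    "other keyboard characters": 32,
    "cedilla (upper and lower)": 2,
    "space": 1,
}

base = sum(alphabet.values())       # size of the alphabet = our number base
letters_per_book = 250 * 6 * 450    # words/page x letters/word x pages/book

print(base)              # 175
print(letters_per_book)  # 675000
```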

So we're looking at 1x10^675,000 books - in base 175. So we're clear, this is a severe lower-bound number, as I'm excluding Egyptian/Chinese/Arabic/etc. characters. Mainly because I don't know how many characters there are in those languages. But anyway, imagine that you had to count to a number, but your first digit had to get up to 175 before you got to "10," and you had to get to 175 175's before you got to "100" (ten tens), and you had to keep counting until the number was 675,000 digits long, and then exhaust all of those possibilities (you get to stop counting right before the next number in sequence would make your number 675,001 digits long).

We would have 1x175^675,000 books in our Library of Babel. At least. The reason this works, in short, is that scientific notation is awesome. At length, any number can be represented as [number]_[base] x [base]_[base]^([power]_[base]), writing the base of each numeral after an underscore - and each of those numbers has its own base. As long as you're sticking with the same base throughout, then you don't need to worry about notating it and that will give you your straightforward number (we use base ten most of the time, so we don't even bother).

A "base" determines how high you can count on one digit before you need to go back to zero and count with the next number, or when they all go back to zero you add another digit: base two (binary) counts 1, 10, 11, 100, 101, 110, 111, 1,000, etc. The pattern is that you get one number (zero doesn't count because it's 0, 00, 000, and so on), then have to increase the digit count to count higher, then you get two numbers, then increase the digit count, then get four numbers, and increase the digit count. Base three (ternary) counts 1, 2, 10, 11, 12, 20, 21, 22, 100, 101, 102, 110, 111, 112, 120, 121, 122, 200, 201, 202, 210, 211, 212, 220, 221, 222, 1,000, etc. The pattern now is that you get two numbers and then have to increase the digit count, then you get six numbers and increase digit count, then get eighteen numbers and increase digit count, and so on. In base ten, we humans count 1-9, 10-99, 100-999, and so on. The pattern here is that you get nine numbers and increase, then 90, then 900. Here's the magic: 1, 2, 4; 2, 6, 18; 9, 90, 900; are all series of the same composition, namely (x-1) x (10_x)^(n-1), where x is your base, 10_x is "10" written in that base (so it is worth x), and n is your step in the series. Looks an awful lot like scientific notation, doesn't it?
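
The 1, 2, 4 / 2, 6, 18 / 9, 90, 900 pattern can be checked by counting. This is a minimal Python sketch, with helper names of my own choosing, that counts how many positive numbers have exactly n digits in a given base and compares that with (base - 1) times base^(n-1).

```python
def digit_count(value: int, base: int) -> int:
    """Number of digits needed to write a positive integer in `base`."""
    count = 0
    while value > 0:
        value //= base
        count += 1
    return count

def numbers_with_n_digits(base: int, n: int) -> int:
    """Brute-force count of positive integers with exactly n digits in `base`."""
    return sum(1 for v in range(1, base ** n) if digit_count(v, base) == n)

for base in (2, 3, 10):
    brute = [numbers_with_n_digits(base, n) for n in (1, 2, 3)]
    formula = [(base - 1) * base ** (n - 1) for n in (1, 2, 3)]
    print(base, brute, formula)
# 2 [1, 2, 4] [1, 2, 4]
# 3 [2, 6, 18] [2, 6, 18]
# 10 [9, 90, 900] [9, 90, 900]
```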

I think this might be some universal language among base number systems, or just an easily-convertible method of notating numbers (which is useless for anything else). I don't know, I kind of discovered this on my own while trying to figure out the answer to this problem. There's probably a name for this, and math majors probably know it. I don't (but I know the math works). Whatever, the point is that you can convert numbers from one base to another by "exporting" that base like I've done - I just left some labels out. I started with the figure "1x10^675,000," but I should have notated it as "1_10 x 10_10^(675,000_10)," or "One, base ten, times ten, base ten, to the power of six-hundred-seventy-five-thousand, base ten." For an example of how this works, the number 365 (days in the year) can be represented as:

3.65_10 x 10_10^(2_10)

This gives us 3.65 (in base ten) times ten (in base ten) to the power of two (in base ten). The first number (3.65) gives you the first few digits of the number, the second number (10) tells you your base, and the third number (2) tells you how long your number is (10^2 means that two zeroes come after the one).

Now, I want to find out how big a number is if it's a one (in base ten) with six-hundred-seventy-five-thousand (in base ten) zeroes after it, in base one-hundred-seventy-five. I could shortcut this as (1x10^675,000)_175, but this is useless; I want my answer to be in base ten. So how do I do this? Well, using base ten throughout, 1x10^[anything] will give me that many tens, all "times" each other - 1 is just ten, 2 is "ten times ten," 3 is "ten times (ten times ten)," 4 is "ten times (ten times (ten times ten))," and so on. Just replace every time I said "ten" with "one-hundred-seventy-five," and even King Douchebag of Fuckhead Hill (don't ask) - who says he's shitty at math (I tested this on him) - can understand that this is like counting in base 175, converted to base ten. So, the number I want to find is 1_10 x 10_175^(675,000_10), or "one, base ten, times ten, base one-hundred-seventy-five, to the power of six-hundred-seventy-five-thousand, base ten." I replace 10_175 with its decimal equivalent, 175_10 (just like 10_2 = 2_10 = 2_3, or 101_2 = 12_3 = 5_10 if you like advanced stuff), and do the math: 175^675,000, and put it back in scientific notation. Ka-pow, finished!
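
The base swap can be sanity-checked with a few lines of Python. This is only a sketch: `from_digits` is a hypothetical helper of mine that reads a list of digit values in whatever base you hand it (Python's built-in `int()` stops at base 36, so it can't do base 175 directly).

```python
def from_digits(digits, base):
    """Interpret digit values (most significant first) as a number in `base`."""
    value = 0
    for d in digits:
        if not 0 <= d < base:
            raise ValueError(f"digit {d} is not valid in base {base}")
        value = value * base + d
    return value

print(from_digits([1, 0], 2))                             # 2: "10" in binary
print(from_digits([1, 0, 1], 2), from_digits([1, 2], 3))  # 5 5
print(from_digits([1, 0], 175))                           # 175: "10" in base 175

# A one followed by k zeroes in base b is b**k (small k shown here).
k = 5
print(from_digits([1] + [0] * k, 175) == 175 ** k)        # True
```

The last line is the whole trick in miniature: a one with 675,000 zeroes after it in base 175 is 175 multiplied by itself 675,000 times.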

The trick, of course, is keeping your bases straight and knowing when to do the math and when not to. That done, it's a piece of cake, I swear!
This gives us... too large a number, it turns out. No calculator I was able to find had the capacity to tackle that straight on. I had to break it down:


175^675,000 can also be written as (175^6.75)^100,000, and 175^6.75=3.8x10^14. And, as everyone knows, (3.8x10^14)^100,000=3.8x10^1,400,000. That many books. (EDIT: I fucked up. In my original calculations, I had somehow substituted 650K for 675K, and I did a double-plus un-good math when I decided that 175^(6.75x10^5)=175^(6.75^(10^5)), and that puts me at the same roadblock I'm at in the next problem (outlined below), so results are pending the math professor's review. I'd given an outline of the problem to King Douchebag of Fuckhead Hill with instructions to the math professor to show work, but he never came through. Hoo-ha! Edit over.)
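
One way around the calculator overflow is to never build the number at all and work with its base-ten logarithm instead, since log10(175^675,000) is just 675,000 x log10(175). What follows is a minimal Python sketch of that idea, not a reviewed result; `power_magnitude` is a name I made up, and it assumes ordinary floating point is accurate enough for the leading digit or two.

```python
import math

def power_magnitude(base: float, exponent: float):
    """Return (mantissa, order) so that base**exponent ~ mantissa x 10^order."""
    log10_value = exponent * math.log10(base)
    order = math.floor(log10_value)
    mantissa = 10 ** (log10_value - order)
    return mantissa, order

mantissa, order = power_magnitude(175, 675_000)
print(f"175^675,000 is about {mantissa:.1f} x 10^{order:,}")
# 175^675,000 is about 4.8 x 10^1,514,050

# The intermediate step from the hand calculation, for comparison.
print(f"175^6.75 is about {175 ** 6.75:.2e}")
```

If this sketch is right, the exponent is about 1.5 million rather than 1.4 million, and 175^6.75 comes out nearer 1.4x10^15 than 3.8x10^14, so the hand-worked figure above looks low. Also, raising 3.8x10^14 to the 100,000th power raises the 3.8 to that power too, which is where the hand calculation loses track.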

Now our question is, how many atoms are there in the Universe? There are several answers to this question. I'm going to go with Wikipedia's calculations on matter content of the observable Universe, which yields two figures: a lower bound of 3x10^79, and an upper bound of 7x10^79. All the other figures I was able to find either corroborate these data, or are dramatically lower. The lower boundary is a rough-and-ready approximation of the number of atoms in all the stars, were they broken down to hydrogen atoms (so one helium atom counts as roughly four hydrogen atoms, by mass), and stars account for well over 90% of the mass in their systems. The upper boundary figure is based on the mean density of the whole observable Universe and its volume. Both of these figures account for all 80 billion galaxies, with the 3 to 7 x 10^22 stars therein (in sum, not each). Even supposing that we counted all mass, not just normal atoms, it would come to about 1.75x10^81 hydrogen atoms (were all mass converted into hydrogen). Keep in mind that though this is based on the "observable Universe," and there may be very much that we haven't observed, "the observable Universe" is every fucking thing we've seen, ever. Silver Garou suspects that these figures are "hugely off," but I don't think so - these numbers are still mind-bogglingly huge, just not quite on the order of the hugeness of those 450-page books.

Still, let's work in some margins of error. The orders of magnitude of difference here are, themselves, on the order of orders of further magnitude. Like, Creationists think the Universe is 6,000-12,000 years old when it's more like 14 billion; Bill Gates supposedly said nobody would ever need more than 640K of memory and we've got gigabytes; and then there's this (to be fair, Garou has simply supposed that there are far more atoms in the Universe than we can even get close to verifying, and by orders of magnitude, but on scales which it is beyond the capacity of the human mind to comprehend - he is not an expert in the field making a terrible prediction, or an asshole trying to shoehorn observed facts into taken-for-granted belief systems). Let's take this supposed number of atoms in the Universe and assume that it's off by the order of magnitude of itself, so:


...keeping in mind that 100^100 is 100 times itself, 100 times, this is like taking every atom in the observed Universe and splitting it into a number of atoms equal to the number of atoms in the observed Universe, and repeating the process a number of times equal to the number of atoms in the observed Universe.

Dammit. This also overflowed every calculator I put it into, since none of them can handle powers of more than two digits. But I have a sneaking suspicion it will still be short. I gave the problem to King Douchebag of Fuckhead Hill, and he's going to show it to some math professors tomorrow. We'll see how that goes. (EDIT AGAIN: That still didn't happen. But if anyone wants to correct my mistakes, or explain how to do the steps I'm missing, or even just link me to a page explaining how to do so, then I will happily correct it all!)
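
For what it's worth, the comparison doesn't need a calculator that can hold these numbers, only their logarithms. Here is a rough Python sketch under loudly stated assumptions: I take the padded figure to be N^N with N = 7x10^79 (the upper-bound atom count quoted earlier; the exact expression isn't reproduced here), and the constant names are mine.

```python
import math

ATOMS = 7e79        # assumption: upper-bound atom count quoted earlier
LETTERS = 675_000   # letters per book
ALPHABET = 175      # characters in the alphabet

log10_books = LETTERS * math.log10(ALPHABET)     # log10(175^675,000)
log10_atoms = math.log10(ATOMS)                  # log10(N)
log10_padded_atoms = ATOMS * math.log10(ATOMS)   # log10(N^N), assumed padding

print(f"log10(atoms)       ~ {log10_atoms:.1f}")
print(f"log10(books)       ~ {log10_books:,.0f}")
print(f"log10(atoms^atoms) ~ {log10_padded_atoms:.2e}")
print("books > atoms:      ", log10_books > log10_atoms)          # True
print("books > atoms^atoms:", log10_books > log10_padded_atoms)   # False
```

If the sketch and its assumptions hold, the books beat the plain atom count by more than a million orders of magnitude, but raising the atom count to its own power overshoots the Library by a wide margin rather than falling short. That margin of error is far more generous than it sounds.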


Ebonmuse said...

Great minds think alike, D! I did a post on this same topic a while back:

My assumptions were a bit different than yours, but math errors or no, the number I arrived at is actually pretty close to yours (for a given value of "close").

And for the record, you're quite right, and Garou is wrong; there are a lot more books in the Library of Babel than there are atoms in our observable universe, by many, many orders of magnitude. In fact, as I argued in my post, the infinitesimal fraction of Babel Space devoted to just one book and its close variants is hugely bigger than our universe!

D said...

The question is, if Backwards Hamlet invaded Tolstoy Space, would France still surrender?

OK, so now I have an idea for calculating the page of Babel, one front-and-back page of printed text from which all books in the Library could be constructed, and figuring out how many pages it would have to be. But that's like a little project and not something I can do in five minutes before I go to bed.

I'm surprised that nobody mentioned Three Versions of Judas. Maybe I'll go do that now.

Kold_Kadavr_flatliner said...

Isn't that a googleplex? 100X100? Think so.

D said...

No, a googol is a 1 with 100 zeroes after it, so 10^100. A googolplex is a 1 with a googol of zeroes after it, so 10^10^100 (ten to the power of [ten to the power of one hundred]).