TITLE: 4 to the Millionth Power AUTHOR: Eugene Wallingford DATE: September 22, 2005 8:09 PM DESC: A touchy argument led me to the sort of refuge I like best: a program. ----- BODY: First I risk veering off into a political discussion in my review of a Thomas Friedman talk, and now I will write about a topic that arose during a contentious conversation among the science faculty here about Intelligent Design. Oh, my. However, my post today isn't about ID, but rather numbers. And Scheme. Here's the set-up: The local chapter of a scientific research society has invited a proponent of ID to give a talk here next week. (It has also invited a strong opponent of ID to speak next month.) Most of the biology professors here have argued quite strenuously that the society should not sponsor the talk, on the grounds that to do so lends legitimacy to ID. A vigorous discussion broke out on the science faculty's mailing list a couple of days ago about the value of fostering open debate on this issue. That is all well and good, but I'm not much interested in commenting on the debate itself. However, in the course of the discussion, one writer attempted to characterize the size of the search space inherent in evolving DNA. He used as his example a string of size one million, though I think he was going more for effect than 100% accuracy on the size. Now, each element in a strand of DNA is one of four bases: adenine (A), thymine (T), cytosine (C), and guanine (G). So, for a string one million bases long, there are 41,000,000 unique possible sequences. In order to show the size of this number, he went off and wrote a C program using the longest floating-point values available. 4 to the millionth power overflowed the available space. So he tried 4100,000. That overflowed the available space, too, so he tried 410,000. Same result. Finally, 41,000 gave him an answer he could see in exponential notation, a number on the order of 10602. That's big. The next day, my colleague and I were discussing the combinatorics when I began to wonder how Dr. Scheme might handle his problem. So I turned to my keyboard and entered a Scheme expression at the interpreter prompt:
          > (expt 4 1000)
That answer popped out in no time flat. Emboldened, I tried 410,000. Again, an answer arrived as fast as Dr. Scheme could print it. My colleague, a C programmer who doesn't use Scheme, was impressed. "Try 4100,000," he says. This one took a few seconds -- less than five -- before printing the answer. Almost all of the delay was for I/O; the computation time still registered 0 milliseconds. The look on his face revealed admiration. But we still hadn't produced the number we really wanted, 41,000,000. So we tried. Dr. Scheme sat quietly, chugging away. Within a few seconds it had its answer, but it took a while longer to produce its output -- a few minutes, in fact. But there it was, the 602,060-digit number that is 41,000,000. Very nice indeed! I was again impressed with how well Scheme works with big numbers. You can imagine how impressed my colleague was. C compilers produce fast, tight code, but you need something beyond the base language to compute numbers this large. Scheme was more than happy to do the job out of the box. Of course, if all my colleague wanted to know was the order of magnitude of 41,000,000, we could have had our answer much quicker, using a little piece we all learned in high school:
          > (* 1000000 (/ (log 4) (log 10)))
602,060 digits. That sounds about right! -----