TITLE: Refactoring as Rewriting
AUTHOR: Eugene Wallingford
DATE: October 07, 2009 8:11 PM
DESC:
-----
BODY:
Reader and occasional writer that I am, Michael Nielsen's Six Rules for Rewriting seemed familiar in an instant. I recognize their results in good writing, and even when I don't practice them successfully in my own writing I know they would often make it better.

Occasional programmer that I am, they immediately had me thinking... How well do they apply to refactoring? Programming is writing, and refactoring is one of our common forms of rewriting... So let's see.

Let's acknowledge up front that a writer's rewriting is not identical to a programmer's refactoring. First of all, the writer does not have automated tests to help her ensure that the rewrite doesn't break anything. It's not clear to me exactly what not breaking anything means for a writer, though I have a vague sense that it is meaningful for most writing.

Also, the term "refactoring" does not refer to any old rewrite of a code base. It has a technical meaning: to modify code without changing its essential functionality. There are rewrites of a code base that are not refactoring. I think that's true of writing in general, though, and I also think that Nielsen is clearly talking about rewrites that do not change the essential content or purpose of a text. His rules are about how to say the same things more effectively. That seems close enough to our technical sense of refactoring to make this exercise worth an effort.

Striking

Every sentence should grab the reader and propel them forward.

Can we say that every line of code should grab the reader and propel her forward?! I certainly prefer to read programs in which every statement or expression tells me something important about what the program is and does. Some programming languages make this harder to do, with boilerplate and often more noise than signal. Perhaps we could say that every line of code should propel the program forward, not get in the way of its functionality? This says more about the conciseness with which the programmer writes, and fits the spirit of Nielsen's rule nicely.
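
Here is a minimal sketch of the noise-versus-signal difference, in Python. The function names and the item data are hypothetical, invented only for illustration. Both functions compute the same total; the second spends its lines on the problem rather than on bookkeeping.

```python
# Hypothetical example: total the prices of the items that are in stock.

def total_in_stock_noisy(items):
    # Index bookkeeping and temporaries crowd out the actual idea.
    total = 0
    i = 0
    while i < len(items):
        item = items[i]
        in_stock = item["in_stock"]
        if in_stock == True:
            total = total + item["price"]
        i = i + 1
    return total


def total_in_stock(items):
    # Each part of the expression says something about the problem itself.
    return sum(item["price"] for item in items if item["in_stock"])


# Made-up sample data for trying out both versions.
items = [
    {"price": 5, "in_stock": True},
    {"price": 7, "in_stock": False},
    {"price": 3, "in_stock": True},
]
```

Rewriting the first function into the second is a refactoring in the technical sense: the behavior is unchanged and only the expression improves.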

Every paragraph should contain a striking idea, originally expressed.

Can we say that every function or class should contain a striking idea, originally expressed? Those that do not tend to get in the reader's way. In programming, though, we often write "helpers", auxiliary functions or classes that assist another in expressing an essential, even striking, idea. The best helpers capture an idea of deep value, but it may be the nature of decomposition that we sometimes create ones that are striking only in the context of the larger system.

The most significant ideas should be distilled into the most potent sentences possible.

Yes! The most significant ideas in our programs should be distilled into the most potent code possible: expressions, statements, functions, classes, whatever abstractions our language and style provide.

Style

Use the strongest appropriate verb.

Of course. Names matter. Use the strongest, most clearly named primitives and library functions possible. When we create new functions, give them strong, clear names. This rule applies to our nouns, too. Our variables and classes should carry strong names that clearly name their concept. No more "manager" or "process" nouns. They avoid naming the concept. What do those objects do? This rule also applies more broadly to coding style. It seems to me that Tell, Don't Ask is about strength in our function calls.
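
A small sketch of Tell, Don't Ask as a matter of strong verbs, in Python. The `Account` class and both helper functions are hypothetical, made up for this illustration. In the "ask" style the caller interrogates the object and makes its decision for it; in the "tell" style a single well-named verb carries the whole idea.

```python
class Account:
    """Hypothetical bank account, used only to contrast the two styles."""

    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        # The account enforces its own rule and updates its own state.
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


def ask_style(account, amount):
    # Ask: the caller inspects the object's state and decides on its behalf.
    if account.balance >= amount:
        account.balance = account.balance - amount
    else:
        raise ValueError("insufficient funds")


def tell_style(account, amount):
    # Tell: one strong verb; the decision lives with the data it concerns.
    account.withdraw(amount)
```

Both functions leave the account in the same state, but only the second reads like a sentence with a verb in it.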

Beware of nominalization.

In code, this guideline prescribes a straightforward idea: Don't make a class when a function will do. You Aren't Gonna Need It.

Meta

None of the above rules should be consciously applied while drafting material.

Anyone who writes a lot knows how paralyzing it can be to worry about writing good prose before getting words down onto paper, or into an emacs buffer. Often we don't know what to write until we write it; why try to write that something perfect before we know what it is?

This rule fits nicely with most lightweight approaches to programming. I even encourage novice programmers to write code this way, much to the chagrin of my more engineering-oriented colleagues. Don't be paralyzed by the blank screen. Write freely. Make something work, anything on the path to a solution, and only then worry about making it right and fast. Do the simplest thing that will work. Only after your code works do you rewrite to make it better.

Not all rewriting is refactoring, but all refactoring is rewriting. Write. Pass the test. Refactor.

Many people find that refactoring provides the most valuable use of design patterns, as a target toward which one moves the code. This is perhaps a more important use of patterns than initial design, at which time many of us tend to overdesign our programs. Joshua Kerievsky's book Refactoring to Patterns shows programmers how to do this safely and reliably. I wonder if there is any analogue to this book in the writing world, or if there even could be such a book?

I once wrote a post on writing in an agile style, and rewriting played a key role in that idea. Some authors like rewriting more than writing, and I think you can say the same thing of many, many programmers. Refactoring brings a different kind of joy, at getting something right that was before almost right -- which is, of course, another way of saying not yet right.

I recall once talking with a novelist over lunch about tools for writers. Yet even the most humble word processor has done so much to change how authors write and rewrite. One of the comments on Nielsen's entry asks whether new tools for writing have changed the way writers think. We might also ask whether new tools -- the ability to edit and rewrite so much more easily and with so much less technical effort -- have changed the product created by most writers. If not, could they?

New tools also change how we rewrite code. The refactoring browser has escaped the confines of the Smalltalk image and now graces IDEs for Java, C++, and C# programmers; indeed, refactoring tools exist for so many languages these days. Is that good or bad? Many of my colleagues lament that the ease of rewriting has led to an endemic sloppiness, to a rash of random programming in which students keep making seemingly random changes to their code until something compiles. Back in the good old days, we had to think hard about our code before we carved it into clay tablets... It seems clear to me that making rewriting and refactoring easier is a huge win, even as it changes how we need to teach and practice writing.

In retrospect, a lot of Nielsen's rules generalize to dicta we programmers will accept eagerly. Eliminate boilerplate. Write concise, focused code. Use strong, direct, and clear language. Certainly when we abstract the tasks to a certain level, writing and rewriting really are much the same in text and code.
-----