TITLE: Version Control for Writers and Publishers AUTHOR: Eugene Wallingford DATE: July 15, 2013 2:41 PM DESC: ----- BODY: Mandy Brown again, this time on on writing tools without memory:
I've written of the web's short-term memory before; what Manguel trips on here is that such forgetting is by design. We designed tools to forget, sometimes intentionally so, but often simply out of carelessness. And we are just as capable of designing systems that remember: the word processor of today may admit no archive, but what of the one we build next?
This is one of those places where the software world has a tool waiting to reach a wider audience: the version control system. Programmers using version control can retrieve previous states of their code all the way back to its creation. The granularity of the versions is limited only by the frequency with which they "commit" the code to the repository. The widespread adoption of version control and the existence of public histories at place such as GitHub have even given rise to a whole new kind of empirical software engineering, in which we mine a large number of repositories in order to understand better the behavior of developers in actual practice. Before, we had to contrive experiments, with no assurance that devs behaved the same way under artificial conditions. Word processors these days usually have an auto-backup feature to save work as the writer types text. Version control could be built into such a feature, giving the writer access to many previous versions without the need to commit changes explicitly. But the better solution would be to help writers learn the value of version control and develop the habits of committing changes at meaningful intervals. Digital version control offers several advantages over the writer's (and programmer's) old-style history of print-outs of previous versions, marked-up copy, and notebooks. An obvious one is space. A more important one is the ability to search and compare old versions more easily. We programmers benefit greatly from a tool as simple as diff, which can tell us the textual differences between two files. I use diff on non-code text all the time and imagine that professional writers could use it to better effect than I. The use of version control by programmers leads to profound changes in the practice of programming. I suspect that the same would be true for writers and publishers, too. Most version control systems these days work much better with plain text than with the binary data stored by most word processing programs. As discussed in my previous post, there are already good reasons for writers to move to plain text and explicit mark-up schemes. Version control and text analysis tools such as diff add another layer of benefit. Simple mark-up systems like Markdown don't even impose much burden on the writer, resembling as they do how so many of us used to prepare text in the days of the typewriter. Some non-programmers are already using version control for their digital research. Check out William Turkel's How To for doing research with digital sources. Others, such The Programming Historian and A Companion to Digital Humanities, don't seem to mention it. But these documents refer mostly to programs for working with text. The next step is to encourage adoption of version control for writers doing their own thing: writing. Then again, it has taken a long time for version control to gain such widespread acceptance even among programmers, and it's not yet universal. So maybe adoption among writers will take a long time, too. -----