TITLE: Using Programs and Data Analysis to Improve Writing, World Bank Edition AUTHOR: Eugene Wallingford DATE: June 06, 2017 2:39 PM DESC: ----- BODY: Last week I read a tweet that linked to an article by Paul Romer. He is an economist currently working at the World Bank, on leave from his chair at NYU. Romer writes well, so I found myself digging deeper and reading a couple of his blog articles. One of them, Writing, struck a chord with me both as a writer and as a computer scientist. Consider:
The quality of written prose should be higher in documents that will have many readers.
This is true of code, too. If a piece of code will be read many times, whether by one person or several, then each minute spent making it shorter and clearer improves reading comprehension every single time. That's even more important in code than in text, because so often we read code in order to change it. We need to understand it at even deeper level to ensure that our changes have the intended effect. Time spent making code better repays itself many times over. Romer caused a bit of a ruckus when he arrived at the World Bank by insisting, to some of his colleagues' displeasure, that everyone in his division writer clearer, more concise reports. His goal was admirable: He wanted more people to be able to read and understand these reports, because they deal with policies that matter to the public. He also wanted people to trust what the World Bank was saying by being able more readily to see that a claim was true or false. His article looks at two different examples that make a claim about the relationship between education spending and GDP per capita. He concludes his analysis of the examples with:
In short, no one can say that the author of the second claim wrote something that is false because no one knows what the second claim means.
In science, writing clearly builds trust. This trust is essential for communicating results to the public, of course, because members of the public do not generally possess the scientific knowledge they need to assess the truth of claim directly. But it is also essential for communicating results to other scientists, who must understand the claims at a deeper level in order to support, falsify, and extend them. In the second half of the article, Romer links to a study of the language used in World Bank's yearly reports. It looks at patterns such as the frequency of the word "and" in the reports and the ratio of nouns to verbs. (See this Financial Times article for a fun little counterargument on the use of "and".) Romer wants this sort of analysis to be easier to do, so that it can be used more easily to check and improve the World Bank's reports. After looking at some other patterns of possible interest, he closes with this:
To experiment with something like this, researchers in the Bank should be able to spin up a server in the cloud, download some open-source software and start experimenting, all within minutes.
Wonderful: a call for storing data in easy-to-access forms and a call for using (and writing) programs to analyze text, all in the name not of advancing economics technically but of improving its ability to communicate its results. Computing becomes a tool integrated into the process of the World Bank doing its designated job. We need more leaders in more disciplines thinking this way. Fortunately, we hear reports of such folks more often these days. Alas, data and programs were not used in this way when Romer arrived at the World Bank:
When I arrived, this was not possible because people in ITS did not trust people from DEC and, reading between the lines, were tired of dismissive arrogance that people from DEC displayed.
One way to create more trust is to communicate better. Not being dismissively arrogant is, too, though calling that sort of behavior out may be what got Romer in so much hot water with the administrators and economists at the World Bank in the first place. -----