January 29, 2019 1:46 PM

Dependencies and Customizable Books

Shriram Krishnamurthi, in Books as Software:

I have said that a book is a collection of components. I have concrete evidence that some of my users specifically excerpt sections that suit their purpose. ...
I forecast that one day, rich document formats like PDF will recognize this reality and permit precisely such specifications. Then, when a user selects a group of desired chapters to generate a thinner volume, the software will automatically evaluate constraints and include all dependencies. To enable this we will even need "program" analyses that help us find all the dependencies, using textual concordances as a starting point and the index as an auxiliary data structure.

I am one of the users Krishnamurthi speaks of, who has excerpted sections from his Programming Languages: Application and Interpretation to suit the purposes of my course. Though I've not written a book, I do post, use, adapt, and reuse detailed lecture notes for my courses, and as a result I have seen both sides of the divide he discusses. I occasionally change the order of topics in a course, or add a unit, or drop a unit. An unseen bit of work is to account for the dependencies among concepts, examples, problems, and code in the affected sections, but also in the new whole. My life is simpler than that of book writers who have to deal at least in part with rich document formats: I do everything in a small, old-style subset of HTML, which means I can use simple text-based tools for manipulating everything. But dependencies? Yeesh.

Maybe I need to write a big makefile for my course notes. Alas, that would not help me manage dependencies in the way I'd like, or in the way Krishnamurthi forecasts. As such, it would probably make things worse. I suppose that I could create the tool I need.
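
A minimal sketch of the core of such a tool, assuming each section already declares the sections it relies on. The section names and the DEPENDS_ON table are hypothetical; a real tool would have to discover these edges from the text itself, as Krishnamurthi suggests. Given a selection of sections, it collects everything they transitively depend on:

```python
# Hypothetical dependency table: section -> sections it relies on.
DEPENDS_ON = {
    "closures": ["functions", "environments"],
    "environments": ["substitution"],
    "functions": ["substitution"],
    "substitution": [],
    "types": ["functions"],
}


def with_dependencies(selected, depends_on):
    """Return the selected sections plus everything they transitively need."""
    needed = set()
    stack = list(selected)
    while stack:
        section = stack.pop()
        if section in needed:
            continue  # already handled; also guards against cycles
        needed.add(section)
        # Sections missing from the table are treated as having no prerequisites.
        stack.extend(depends_on.get(section, []))
    return needed


if __name__ == "__main__":
    print(sorted(with_dependencies(["closures"], DEPENDS_ON)))
```

Selecting only "closures" pulls in "functions", "environments", and "substitution" as well. The hard part, finding the edges in the first place, is exactly what this sketch assumes away.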


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

January 08, 2019 2:10 PM

Sometimes, Copy and Paste Is the Right Thing To Do

Last week I blogged about writing code that is easy to delete, drawing on some great lines from an old 'programming is terrible' post. Here's another passage from @tef's post that's worth thinking more about:

Step 1: Copy-paste code
Building reusable code is something that's easier to do in hindsight with a couple of examples of use in the code base, than foresight of ones you might want later. On the plus side, you're probably re-using a lot of code already just by using the file-system, why worry that much? A little redundancy is healthy.
It's good to copy-paste code a couple of times, rather than making a library function, just to get a handle on how it will be used. Once you make something a shared API, you make it harder to change.

There's not a great one-liner in there, but these paragraphs point to a really important lesson, one that we programmers sometimes have a hard time learning. We are told so often "don't repeat yourself" that we come to think that all repetition is the same. It's not.

One use of repetition is in avoiding what @tef calls, in another 'programming is terrible' post, "preemptive guessing". Consider the creation of a new framework. Oftentimes, designing a framework upfront doesn't work very well because we don't know the domain's abstractions yet. One of the best ways to figure out what they are is to write several applications first, and let the framework fall out of the applications. While doing this, repetition is our friend: it's most useful to know what things don't change from one application to another. This repetition is a hint on how to build the framework we need. I learned this technique from Ralph Johnson.

I use and teach a similar technique for programming in smaller settings, too. When we see two bits of code that resemble one another, it often helps to increase the duplication in order to eliminate it. (I learned this idea from Kent Beck.) In this case, the goal of the duplication is to find useful abstractions. Sometimes, though, code duplication is really a hint to think differently about a problem. Factoring out a function or class -- finding a new abstraction -- may be incidental to the learning that takes place.

For me, this line from the second programming-is-terrible post captures this idea perfectly:

... duplicate to find the right abstraction first, then deduplicate to implement it.

My spell checker objects to the word "deduplicate", but I'll allow it.
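
A small illustration of the idea, using hypothetical grade-report functions that are not taken from either post. The first two functions are the copy-and-paste stage. With two concrete uses side by side, it is easy to see that only the label varies, and the third function is the abstraction that falls out:

```python
# Stage 1: duplicate. Two copy-pasted functions that differ in one detail.
def report_quiz_average(scores):
    total = sum(scores)
    average = total / len(scores)
    return f"Quiz average: {average:.1f}"


def report_exam_average(scores):
    total = sum(scores)
    average = total / len(scores)
    return f"Exam average: {average:.1f}"


# Stage 2: deduplicate. The only thing that changed between the copies
# was the label, so the label becomes a parameter.
def report_average(label, scores):
    average = sum(scores) / len(scores)
    return f"{label} average: {average:.1f}"


if __name__ == "__main__":
    print(report_average("Quiz", [80, 90]))
```

Had we guessed at the abstraction before writing the second copy, we might have parameterized the wrong thing, such as the formatting or the averaging, rather than the label.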

All of these ideas taken together are the reason that I think copy-and-paste gets an undeservedly bad name. Used properly, it is a valuable programming technique -- essential, really. I've long wanted to write a Big Ball of Mud-style paper about copy-and-paste patterns. There are plenty of good reasons why we write repetitive code and, as @tef says in the two posts I link to above, sometimes leaving duplication in your code is the right thing to do.

One final tribute to repetition for now. While researching this blog post, I ran across a blog entry of mine from October 2016. Apparently, I had just read @tef's Write code that is easy to delete... post and felt an undeniable urge to quote and comment on it. If you read that 2016 post, you'll see that my Writing code that is easy to delete post from last week duplicates it in spirit and, in a few cases, even the details. I swear that I read @tef's post again last week and wrote the new blog entry from scratch, with no memory of the 2016 events. I am perfectly happy with this second act. Sometimes, ideas circle through our brains again, changing us in imperceptible ways. As @tef says, a little redundancy is healthy.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

January 06, 2019 10:10 AM

Programming Never Feels Easier. You Just Solve Harder Problems.

Alex Honnold is a rock climber who was the first person to "free solo" Yosemite's El Capitan rock wall. In an interview for a magazine, he was asked what it takes to reach ever higher goals. One bit of advice was to "aim for joy, not euphoria". When you prepare to achieve a goal, it may not feel like a big surprise when you achieve it because you prepared to succeed. Don't expect to be overwhelmed by powerful emotions when you accomplish something new; doing so sets too high a standard and can demotivate you.

This paragraph, though, is the one that spoke to me:

Someone recently said to me about running: "It never feels easier--you just go faster." A lot of sports always feel terrible. Climbing is always like that. You always feel weak and like you suck, but you can do harder and harder things.

As a one-time marathoner, never fast but always training to run a PR in my next race, I know what Honnold means. However, I also feel something similar as a programmer. Writing software often seems like a slog that I'm not very good at. I'm forever looking up language features in order to get my code to work, and then I end up with programs that are bulky and fight me at every turn. I refactor and rewrite and... find myself back in the slog. I don't feel terrible all that often, but I am usually a little on edge.

Yet if I compare the programs I write today with ones I wrote 5 or 10 or 30 years ago, I can see that I'm doing more interesting work. This is the natural order. Once I know how to do one thing, I seek tougher problems to solve.

In the article, the passage quoted above is labeled "Feeling awful is normal." I wonder if programming feels more accessible to people who are comfortable with a steady, low-grade intellectual discomfort punctuated by occasional bouts of head banging. Honnold's observation might reassure beginning programmers who don't already know that feeling uneasy is a natural part of pushing yourself to do more interesting work.

All that said, even when I was training for my next marathon, I was always able to run for fun. There was nothing quite like an easy run at sunrise to boost my mood. Fortunately, I am still able to program that way, too. Every once in a while, I like to write code to solve some simple problem I run across on Twitter or in a blog entry somewhere. I find that these interludes recharge me before I battle the next big problem I decide to tackle. I hope that my students can still program in this way as they advance on to bigger challenges.


Posted by Eugene Wallingford | Permalink | Categories: Running, Software Development, Teaching and Learning

January 02, 2019 2:22 PM

Writing Code that is Easy to Delete

Last week someone tweeted a link to Write code that is easy to delete, not easy to extend. It contains a lot of great advice on how to create codebases that are easy to maintain and easy to change, the latter being an essential feature of almost any code that is the former. I liked this article so much that I wanted to share some of its advice here. What follows are a few of the many one- and two-liners that serve as useful slogans for building maintainable software, with light commentary.

... repeat yourself to avoid creating dependencies, but don't repeat yourself to manage them.

This line from the first page of the paper hooked me. I'm not sure I had ever had this thought, at least not so succinctly, but it captures a bit of understanding that I think I had. Reading this, I knew I wanted to read the rest of the article.

Make a util directory and keep different utilities in different files. A single util file will always grow until it is too big and yet too hard to split apart. Using a single util file is unhygienic.

This isn't the sort of witticism that I quote in the rest of this post, but it's solid advice that I've come to live by over the years. I have this pattern.

Boiler plate is a lot like copy-pasting, but you change some of the code in a different place each time, rather than the same bit over and over.

I really like the author's distinction between boilerplate and copy-and-paste. Copy-and-paste has valuable uses (heresy, I know; more later), whereas boilerplate sucks the joy out of almost every programmer's life.

You are writing more lines of code, but you are writing those lines of code in the easy-to-delete parts.

Another neat distinction. Even when we understand that lines of code are an expense as much as (or instead of) an investment, we know that sometimes we have to write more code. Just do it in units that are easy to delete.

A lesson in separating concerns, from Python libraries:

requests is about popular http adventures, urllib3 is about giving you the tools to choose your own adventure.

Layers! I have had users of both of these libraries suggest that the other should not exist, but they serve different audiences. They meet different needs in a way that more than makes up for the cost of the supposed duplication.

Building a pleasant to use API and building an extensible API are often at odds with each other.

There's nothing earth-shattering in this observation, but I like to highlight different kinds of trade-off whenever I can. Every important decision we make writing programs is a trade-off.

Good APIs are designed with empathy for the programmers who will use it, and layering is realising we can't please everyone at once.

This advice elaborates on the earlier quote about repeating code in order to avoid creating dependencies, but not to manage them. Creating a separate API is one way to avoid dependencies on code that are hard to delete.

Sometimes it's easier to delete one big mistake than try to delete 18 smaller interleaved mistakes.

Sometimes it really is best to write a big chunk of code precisely because it is easy to delete. An idea that is distributed throughout a bunch of functions or modules has to be disentangled before you can delete it.

Becoming a professional software developer is accumulating a back-catalogue of regrets and mistakes.

I'm going to use this line in my spring Programming Languages class. There are unforeseen advantages to all the practice we profs ask students to do. That's where experience comes from.

We are not building modules around being able to re-use them, but being able to change them.

This is another good bit of advice for my students, though I'll write this one more clearly. When students learn to program, textbooks often teach them that the main reason to write a function is that you can reuse it later, thus saving the effort of writing similar code again. That's certainly one benefit of writing a function, but experienced programmers know that there are other big wins in creating functions, classes, and modules, and that these wins are often even more valuable than reuse. In my courses, I try to help students appreciate the value of names in understanding and modifying code. Modularity also makes it easier to change and, yes, delete code. Unfortunately, students don't always get the right kind of experience in their courses to develop this deeper understanding.

Although the single responsibility principle suggests that "each module should only handle one hard problem", it is more important that "each hard problem is only handled by one module".

Lovely. The single module that handles a hard problem is a point of leverage. It can be deleted when the problem goes away. It can be rewritten from scratch when you understand the problem better or when the context around the problem changes.

This line is the heart of the article:

The strategies I've talked about -- layering, isolation, common interfaces, composition -- are not about writing good software, but how to build software that can change over time.

Good software is software that you can change. One way to create software you can change is to write code that you can easily replace.

Good code isn't about getting it right the first time. Good code is just legacy code that doesn't get in the way.

A perfect aphorism to close the article, and a perfect way to close this post: Good code is legacy code that doesn't get in the way.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

December 31, 2018 1:44 PM

Preserve Process Knowledge

This weekend I read the beginning of Dan Wang's How Technology Grows. One of the themes he presses is that when a country loses its manufacturing base, it also loses its manufacturing knowledge base. This in turn damages the economy's ability to innovate in manufacturing, even on the IT front. He concludes:

It can't be an accident that the countries with the healthiest communities of engineering practice are also in the lead in designing tools for the sector. They're able to embed knowledge into new tools, because they never lost the process knowledge in the first place.
Let's try to preserve process knowledge.

I have seen what happens within an academic department or a university IT unit when it loses process knowledge it once had. Sometimes, the world has changed in a way that makes the knowledge no longer valuable, and the loss is simply part of the organization's natural evolution. But other times the change that precipitated the move away from expertise is temporary or illusory, and the group suddenly finds itself unable to adapt to other changes in the environment.

The portion of the article I read covered a lot of ground. For example, one reason that a manufacturing base matters so much is that services industries have inherent limits, summarized in:

[The] services sector [has] big problems: a lot of it is winner-take-all, and much of the rest is zero-sum.

This longer quote ends a section in which Wang compares the economies of manufacturing-focused Germany and the IT-focused United States:

The US and Germany are innovative in different ways, and they each have big flaws. I hope they fix these flaws. I believe that we can have a country in which wealth is primarily created by new economic activity, instead of by inheritance; which builds new housing stock, instead of permitting current residents to veto construction; which has a government willing to think hard about new projects that it should initiate, instead of letting the budget run on autopilot. I don't think that we should have to choose between industry and the internet; we can have a country that has both a vibrant industrial sector and a thriving internet sector.

This paragraph is a good example of the paper's sub-title, "a restatement of definite optimism". Wang writes clearly and discusses a number of issues relevant to IT as the base for a nation's economy. How Technology Grows is an interesting read.


Posted by Eugene Wallingford | Permalink | Categories: General

December 29, 2018 4:41 PM

No Big Deal

I love this line from Organizational Debt:

So my proposal for Rust 2019 is not that big of a deal, I guess: we just need to redesign our decision making process, reorganize our governance structures, establish new norms of communication, and find a way to redirect a significant amount of capital toward Rust contributors.

A solid understatement usually makes me smile. Decision-making processes, governance structure, norms of communication, and compensation for open-source developers... no big deal, indeed. We all await the results. If the results come with advice that generalizes beyond a single project, especially the open-source compensation thing, all the better.

Communication is a big part of the recommendation for 2019. Changing how communication works is tough in any organization, let alone an organization with distributed membership and leadership. In every growing organization there eventually comes the time for intentional systems of communication:

But we've long since reached the point where coordinating our design vision by osmosis is not working well. We need an active and intentional circulatory system for information, patterns, and frameworks of decision making related to design.

I'm not a member of the Rust community, only an observer. But I know that the language inspires some programmers, and I learned a bit about its tool chain and community support a couple of years ago when an ambitious student used it successfully to implement his compiler in my course. It's the sort of language we need, being created in what looks to be an admirable way. I wish the Rust team well as they tackle their organizational debt and tackle their growing pains.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Software Development

December 26, 2018 2:44 PM

It's Okay To Say, "I Don't Know." Even Nobel Laureates Do It.

I ran across two great examples of humility by Nobel Prize-winning economists in recent conversations with Tyler Cowen. When asked, "Should China and Japan move to romanized script?", Paul Romer said:

I basically don't know the answer to that question. But I'll use that as a way to talk about something else ...

Romer could have speculated or pontificated; instead, he acknowledged that he didn't know the answer and pivoted the conversation to a related topic he had thought about (reforming spelling in English, for which he offered an interesting computational solution). By shifting the topic, Romer added value to the conversation without pretending that any answer he could give to the original question would have more value than as speculation.

A couple of months ago, Cowen sat with Paul Krugman. When asked whether he would consider a "single land tax" as a way to encourage a more active and more equitable economy, Krugman responded:

I just haven't done my homework on that.

... and left it there. To his credit, Cowen did not press for an uninformed answer; he moved on to another question.

I love the attitude that Krugman and Romer adopt and really like Krugman's specific answer, which echoed his response to another question earlier in the conversation. We need more people answering questions this way, more often and in more circumstances.

Such restraint is probably even more important in the case of Nobel laureates. If Romer and Krugman choose to speculate on a topic, a lot of people will pay attention, even if it is a topic they know little about. We might learn something from their speculations, but we might also forget that they are only uninformed speculation.

I think what I like best about these answers is the example that Romer and Krugman set for the rest of us: It's okay to say, "I don't know." If you have not done the homework needed to offer an informed answer, it's often best to say so and move on to something you're better prepared to discuss.


Posted by Eugene Wallingford | Permalink | Categories: General, Teaching and Learning

December 24, 2018 2:55 PM

Using a Text Auto-Formatter to Enhance Human Communication

More consonance with Paul Romer, via his conversation with Tyler Cowen: They were discussing how much harder it is to learn to read English than other languages, due to its confusing orthography and in particular the mismatch between sounds and their spellings. We could adopt a more rational way to spell words, but it's hard to change the orthography of a large language spoken by a large, scattered population. Romer offered a computational solution:

It would be a trivial translation problem to let some people write in one spelling form, others in the other because it would be word-for-word translation. I could write you an email in rationalized spelling, and I could put it through the plug-in so you get it in traditional spelling. This idea that it's impossible to change spelling I think is wrong. It's just, it's hard, and we should -- if we want to consider this -- we should think carefully about the mechanisms.

This sounds similar to a common problem and solution in the software development world. Programmers working in teams often disagree about the orthography of code, not the spelling so much as its layout, the use of whitespace, and the placement of punctuation. Being programmers, we often address this problem computationally. Team members can stylize their code any way they see fit but, when they check it into the common repository, they run it through a language formatter. Often, these formatters are built into our IDEs. Nowadays, some languages even come with a built-in formatting tool, such as Go and gofmt.

Romer's email plug-in would play a similar role in human-to-human communication, enabling writers to use different spelling systems concurrently. This would make it possible to introduce a more rational way to spell words without having to migrate everyone to the new system all at once. There are still challenges to making such a big change, but they could be handled in an evolutionary way.
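
A rough sketch of the word-for-word translation Romer describes, assuming the two spelling systems can be related by a simple lookup table. The TO_TRADITIONAL table and its "rationalized" spellings are invented for illustration; Romer does not specify a spelling system or a plug-in design.

```python
import re

# Hypothetical mapping from a "rationalized" spelling to traditional spelling.
TO_TRADITIONAL = {
    "thru": "through",
    "nite": "night",
    "tho": "though",
}

# The same table, inverted, translates in the other direction.
TO_RATIONALIZED = {trad: rat for rat, trad in TO_TRADITIONAL.items()}

WORD = re.compile(r"[A-Za-z]+")


def translate(text, mapping):
    """Replace each word found in the mapping, leaving everything else as-is."""
    def swap(match):
        word = match.group(0)
        replacement = mapping.get(word.lower())
        if replacement is None:
            return word
        # Keep a leading capital so sentence starts survive translation.
        return replacement.capitalize() if word[0].isupper() else replacement

    return WORD.sub(swap, text)


if __name__ == "__main__":
    print(translate("Tho it was late, we drove thru the nite.", TO_TRADITIONAL))
```

Because the translation works one word at a time, punctuation and spacing pass through untouched, much as a code formatter changes layout without changing what a program means. This sketch ignores real complications such as all-caps words and words whose spelling depends on context.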

Maybe Romer's study of Python is turning him into a computationalist! Certainly, being a programmer can help a person recognize the possibility of a computational solution.

Add this idea to his recent discovery of C.S. Peirce, and I am feeling some intellectual kinship to Romer, at least as much as an ordinary CS prof can feel kinship to a Nobel Prize-winning economist. Then, to top it all off, he lists Slaughterhouse-Five as one of his two favorite novels. Long-time readers know I'm a big Vonnegut fan and nearly named this blog for one of his short stories. Between Peirce and Vonnegut, I can at least say that Romer and I share some of the same reading interests. I like his tastes.


Posted by Eugene Wallingford | Permalink | Categories: Computing, General

December 23, 2018 10:45 AM

The Joy of Scholarship

This morning I read Tyler Cowen's conversation with Paul Romer. At one point, Romer talks about being introduced to C.S. Peirce, who had deep insights into "abstraction and how we use abstraction to communicate" (a topic Romer and Cowen discuss earlier in the interview). Romer is clearly enamored with Peirce's work, but he's also fascinated by the fact that, after a long career thinking about a set of topics, he could stumble upon a trove of ideas that he didn't even know existed:

... one of the joys of reading -- that's not a novel -- but one of the joys of reading, and to me slightly frightening thing, is that there's so much out there, and that a hundred years later, you can discover somebody who has so many things to say that can be helpful for somebody like me trying to understand, how do we use abstraction? How do we communicate clearly?
But the joy of scholarship -- I think it's a joy of maybe any life in the modern world -- that through reading, we can get access to the thoughts of another person, and then you can sample from the thoughts that are most relevant to you or that are the most powerful in some sense.

This process, he says, is the foundation for how we transmit knowledge within a culture and across time. It's how we grow and share our understanding of the world. This is a source of great joy for scholars and, really, for anyone who can read. It's why so many people love books.

Romer's interest in Peirce calls to mind my own fascination with his work. As Romer notes, Peirce had a "much more sophisticated sense about how science proceeds than the positivist sort of machine that people describe". I discovered Peirce through an epistemology course in graduate school. His pragmatic view of knowledge, along with William James's views, greatly influenced how I thought about knowledge. That, in turn, redefined the trajectory by which I approached my research in knowledge-based systems and AI. Peirce and James helped me make sense of how people use knowledge, and how computer programs might.

So I feel a great kinship with Romer in his discovery of Peirce, and the joy he finds in scholarship.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Personal, Teaching and Learning

November 28, 2018 1:56 PM

If it matters enough to be careful, it matters enough to build a system.

In Quality and Effort, Seth Godin reminds us that being careful can take us only so far toward the quality we seek. Humans make mistakes, so we need processes and systems in place to help us avoid them. Near the end of the post, he writes:

In school, we harangue kids to be more careful, and spend approximately zero time teaching them to build better systems instead. We ignore checklists and processes because we've been taught that they're beneath us.

This paragraph isolates one of the great powers we can teach our students, but also a glaring weakness in how most of us actually teach. I've been a professor for many years now, and before that I was a student for many years. I've seen a lot of students succeed and a few not do as well as they or I had hoped. Students who are methodical, patient, and disciplined in how they read, study, and program are much more likely to be in the successful group.

Rules and discipline sometimes get a bad rap these days. Creativity and passion are our catchwords. But dull, boring systems are often the keys that unlock the doors to getting things done and moving on to learn and do cooler things.

Students occasionally ask me why I slavishly follow the processes I teach them, whether it's a system as big as test-first development and refactoring or a process as simple as the technique for creating NFAs and converting them to DFAs. I tell them that I don't always trust myself but that I do trust the system: it almost always leads me to a working solution. Sure, in this moment or that I might be fine going off script, but... Good habits generate positive returns, while freewheeling it too often lets an error sneak in. (When I go off script in class, they too often get to see just that!)

We do our students a great favor when we help them learn systems that work. Following a design recipe or something like it may be boring, but students who learn to follow it develop stronger programming skills, enjoy the success of getting projects done successfully, and graduate on to more interesting problems. As the system becomes ingrained as habit, they usually begin to appreciate it as an enabler of their success.

I agree with Godin: If it matters enough to be careful, then it matters enough to build (or learn) a system.


Posted by Eugene Wallingford | Permalink | Categories: Software Development, Teaching and Learning