January 29, 2019 1:46 PM

Dependencies and Customizable Books

Shriram Krishnamurthi, in Books as Software:

I have said that a book is a collection of components. I have concrete evidence that some of my users specifically excerpt sections that suit their purpose. ...
I forecast that one day, rich document formats like PDF will recognize this reality and permit precisely such specifications. Then, when a user selects a group of desired chapters to generate a thinner volume, the software will automatically evaluate constraints and include all dependencies. To enable this we will even need "program" analyses that help us find all the dependencies, using textual concordances as a starting point and the index as an auxiliary data structure.

I am one of the users Krishnamurthi speaks of, who has excerpted sections from his Programming Languages: Application and Interpretation to suit the purposes of my course. Though I've not written a book, I do post, use, adapt, and reuse detailed lecture notes for my courses, and as a result I have seen both sides of the divide he discusses. I occasionally change the order of topics in a course, or add a unit, or drop a unit. An unseen bit of work is to account for the dependencies among concepts, examples, problems, and code in the affected sections, but also in the new whole. My life is simpler than book writers who have to deal at least in part with rich document formats: I do everything in a small, old-style subset of HTML, which means I can use simple text-based tools for manipulating everything. But dependencies? Yeesh.

Maybe I need to write a big makefile for my course notes. Alas, that would not help me manage dependencies in the way I'd like, or in the way Krishnamurthi forecasts. As such, it would probably make things worse. I suppose that I could create the tool I need.


Posted by Eugene Wallingford | Permalink | Categories: General, Software Development, Teaching and Learning

January 08, 2019 2:10 PM

Sometimes, Copy and Paste Is the Right Thing To Do

Last week I blogged about writing code that is easy to delete, drawing on some great lines from an old 'programming is terrible' post. Here's another passage from @tef's post that's worth thinking more about:

Step 1: Copy-paste code
Building reusable code is something that's easier to do in hindsight with a couple of examples of use in the code base, than foresight of ones you might want later. On the plus side, you're probably re-using a lot of code already just by using the file-system, why worry that much? A little redundancy is healthy.
It's good to copy-paste code a couple of times, rather than making a library function, just to get a handle on how it will be used. Once you make something a shared API, you make it harder to change.

There's not a great one-liner in there, but these paragraphs point to a really important lesson, one that we programmers sometimes have a hard time learning. We are told so often "don't repeat yourself" that we come to think that all repetition is the same. It's not.

One use of repetition is in avoiding what @tef calls, in another 'programming is terrible' post, "preemptive guessing". Consider the creation of a new framework. Oftentimes, designing a framework upfront doesn't work very well because we don't know the domain's abstractions yet. One of the best ways to figure out what they are is to write several applications first, and let framework fall out the applications. While doing this, repetition is our friend: it's most useful to know what things don't change from one application to another. This repetition is a hint on how to build the framework we need. I learned this technique from Ralph Johnson.

I use and teach a similar technique for programming in smaller settings, too. When we see two bits of code that resemble one another, it often helps to increase the duplication in order to eliminate it. (I learned this idea from Kent Beck.) In this case, the goal of the duplication is to find useful abstractions. Sometimes, though, code duplication is really a hint to think differently about a problem. Factoring out a function or class -- finding a new abstraction -- may be incidental to the learning that takes place.

For me, this line from from the second programming-is-terrible post captures this idea perfectly:

... duplicate to find the right abstraction first, then deduplicate to implement it.

My spell checker objects to the word "deduplicate", but I'll allow it.

All of these ideas taken together are the reason that I think copy-and-paste gets an undeservedly bad name. Used properly, it is a valuable programming technique -- essential, really. I've long wanted to write a Big Ball of Mud-style paper about copy-and-paste patterns. There are plenty of good reasons why we write repetitive code and, as @tef says in the two posts I link to above, sometimes leaving duplication in your code is the right thing to do.

One final tribute to repetition for now. While researching this blog post, I ran across a blog entry of mine from October 2016. Apparently, I had just read @tef's Write code that is easy to delete... post and felt an undeniable urge to quote and comment on it. If you read that 2016 post, you'll see that my Writing code that is easy to delete post from last week duplicates it in spirit and, in a few cases, even the details. I swear that I read @tef's post again last week and wrote the new blog entry from scratch, with no memory of the 2016 events. I am perfectly happy with this second act. Sometimes, ideas circle through our brains again, changing us in imperceptible ways. As @tef says, a little redundancy is healthy.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning

January 06, 2019 10:10 AM

Programming Never Feels Easier. You Just Solve Harder Problems.

Alex Honnold is a rock climber who was the first person to "free solo" Yosemite's El Capitan rock wall. In an interview for a magazine, he was asked what it takes to reach ever higher goals. One bit of advice was to "aim for joy, not euphoria". When you prepare to achieve a goal, it may not feel like a big surprise when you achieve it because you prepared to succeed. Don't expect to be overwhelmed by powerful emotions when you accomplish something new; doing so sets too high a standard and can demotivate you.

This paragraph, though, is the one that spoke to me:

Someone recently said to me about running: "It never feels easier--you just go faster." A lot of sports always feel terrible. Climbing is always like that. You always feel weak and like you suck, but you can do harder and harder things.

As a one-time marathoner, never fast but always training to run a PR in my next race, I know what Honnold means. However, I also feel something similar as a programmer. Writing software often seems like a slog that I'm not very good at. I'm forever looking up language features in order to get my code to work, and then I end up with programs that are bulky and fight me at every turn. I refactor and rewrite and... find myself back in the slog. I don't feel terrible all that often, but I am usually a little on edge.

Yet if I compare the programs I write today with ones I wrote 5 or 10 or 30 years ago, I can see that I'm doing more interesting work. This is the natural order. Once I know how to do one thing, I seek tougher problems to solve.

In the article, the passage quoted above is labeled "Feeling awful is normal." I wonder if programming feels more accessible to people who are comfortable with a steady, low-grade intellectual discomfort punctuated by occasional bouts of head banging. Honnold's observation might reassure beginning programmers who don't already know that feeling uneasy is a natural part of pushing yourself to do more interesting work.

All that said, even when I was training for my next marathon, I was always able to run for fun. There was nothing quite like an easy run at sunrise to boost my mood. Fortunately, I am still able to program that way, too. Every once in a while, I like to write code to solve some simple problem I run across on Twitter or in a blog entry somewhere. I find that these interludes recharge me before I battling the next big problem I decide to tackle. I hope that my students can still programming in this way as they advance on to bigger challenges.


Posted by Eugene Wallingford | Permalink | Categories: Running, Software Development, Teaching and Learning

January 02, 2019 2:22 PM

Writing Code that is Easy to Delete

Last week someone tweeted a link to Write code that is easy to delete, not easy to extend. It contains a lot of great advice on how to create codebases that are easy to maintain and easy to change, the latter being an essential feature of almost any code that is the former. I liked this article so much that I wanted to share some of its advice here. What follows are a few of the many one- and two-liners that serve as useful slogans for building maintainable software, with light commentary.

... repeat yourself to avoid creating dependencies, but don't repeat yourself to manage them.

This line from the first page of the paper hooked me. I'm not sure I had ever had this thought, at least not so succinctly, but it captures a bit of understanding that I think I had. Reading this, I knew I wanted to read the rest of the article.

Make a util directory and keep different utilities in different files. A single util file will always grow until it is too big and yet too hard to split apart. Using a single util file is unhygienic.

This isn't the sort of witticism that I quote in the rest of this post, but its solid advice that I've come to live by over the years. I have this pattern.

Boiler plate is a lot like copy-pasting, but you change some of the code in a different place each time, rather than the same bit over and over.

I really like the author's distinction between boilerplate and copy-and-paste. Copy-and-paste has valuable uses (heresy, I know; more later), whereas boilerplate sucks the joy out of almost every programmer's life.

You are writing more lines of code, but you are writing those lines of code in the easy-to-delete parts.

Another neat distinction. Even when we understand that lines of code are an expense as much as (or instead of) an investment, we know that sometimes we have write more code. Just do it in units that are easy to delete.

A lesson in separating concerns, from Python libraries:

requests is about popular http adventures, urllib3 is about giving you the tools to choose your own adventure.

Layers! I have had users of both of these libraries suggest that the other should not exist, but they serve different audiences. They meet different needs in a way that that more than makes up for the cost of the supposed duplication.

Building a pleasant to use API and building an extensible API are often at odds with each other.

There's nothing earth-shattering in this observation, but I like to highlight different kinds of trade-off whenever I can. Every important decision we make writing programs is a trade-off.

Good APIs are designed with empathy for the programmers who will use it, and layering is realising we can't please everyone at once.

This advice elaborates on the quote earlier to repeat code in order not to create dependencies, but not to manage them. Creating a separate API is one way to avoid dependencies to code that are hard to delete.

Sometimes it's easier to delete one big mistake than try to delete 18 smaller interleaved mistakes.

Sometimes it really is best to write a big chunk of code precisely because it is easy to delete. An idea that is distributed throughout a bunch of functions or modules has to be disentangled before you can delete it.

Becoming a professional software developer is accumulating a back-catalogue of regrets and mistakes.

I'm going to use this line in my spring Programming Languages class. There are unforeseen advantages to all the practice we profs ask students to do. That's where experience comes from.

We are not building modules around being able to re-use them, but being able to change them.

This is another good bit of advice for my students, though I'll write this one more clearly. When students learn to program, textbooks often teach them that the main reason to write a function is that you can reuse it later, thus saving the effort of writing similar code again. That's certainly one benefit of writing a function, but experienced programmers know that there are other big wins in creating functions, classes, and modules, and that these wins are often even more valuable than reuse. In my courses, I try to help students appreciate the value of names in understanding and modifying code. Modularity also makes it easier to change and, yes, delete code. Unfortunately, students don't always get the right kind of experience in their courses to develop this deeper understanding.

Although the single responsibility principle suggests that "each module should only handle one hard problem", it is more important that "each hard problem is only handled by one module".

Lovely. The single module that handles a hard problem is a point of leverage. It can be deleted when the problem goes away. It can be rewritten from scratch when you understand the problem better or when the context around the problem changes.

This line is the heart of the article:

The strategies I've talked about -- layering, isolation, common interfaces, composition -- are not about writing good software, but how to build software that can change over time.

Good software is software that can you can change. One way to create software you can change is to write code that you can easily replace.

Good code isn't about getting it right the first time. Good code is just legacy code that doesn't get in the way.

A perfect aphorism to close to the article, and to perfect way to close this post: Good code is legacy code that doesn't get in the way.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning