March 16, 2022 2:52 PM

A Cool Project For Manipulating Images, Modulo Wordle

Wordle has been the popular game du jour for a while now. Whenever this sort of thing happens, CS profs think, "How can I turn this into an assignment?" I've seen a lot of discussion of ways to create a Wordle knock-off assignment for CS 1. None of this conversation has interested me much. First, I rarely teach CS 1 these days, and a game such as Wordle doesn't fit into the courses I do teach right now. Second, there's a certain element of Yogi Berra wisdom driving me away from Wordle: "It's so popular; no one goes there any more."

Most important, I had my students implementing Mastermind in my OO course back in the 2000-2005 era. I've already had most of the fun I can have with this game as an assignment. And it was a popular one with students, though challenging. The model of the game itself presents challenges for representation and algorithm, and the UI has the sort of graphic panache that tends to engage students. I remember receiving email from a student who had transferred to another university, asking me if I would still help her debug and improve her code for the assignment; she wanted to show it to her family. (Of course I helped!)

However I did see something recently that made me think of a cool assignment: 5x6 Art, a Twitter account that converts paintings and other images into minimalist 5 block-by-6 block abstract grids. The connection to Wordle is in the grid, but the color palette is much richer. Like any good CS prof, I immediately asked myself, "How can I turn this into an assignment?"

a screen capture of 5x6art from Twitter

I taught our intro course using the media computation approach popularized by Mark Guzdial back in 2006. In that course, my students processed images such as the artwork displayed above. They would have enjoyed this challenge! There are so many cool ways to think about creating a 5x6 abstraction of an input image. We could define a fixed palette of n colors, then map the corresponding region of the image onto a single block. But how to choose the color?

We could compute the average pixel value of the range and then choose the color in the palette closest to that value. Or we could create neighborhoods of different sizes around all of the palette colors so that we favor some colors for the grid over others. What if we simply compute the average pixel for the region and use that as the grid color? That would give us a much larger but much less distinct set of possible colors. I suspect that this would produce less striking outputs, but I'd really like to try the experiment and see the grids it produces.

What if we allowed ourselves a bigger grid, for more granularity in our output images? There are probably many other dimensions we could play with. The more artistically inclined among you can surely think of interesting twists I haven't found yet.

That's some media computation goodness. I may assign myself to teach intro again sometime soon just so that I can develop and use this assignment. Or stop doing other work for a day or two and try it out on my own right now.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Teaching and Learning

March 14, 2022 11:55 AM

Taking Plain Text Seriously Enough

Or, Plain Text and Spreadsheets -- Giving Up and Giving In

One day a couple of weeks ago, a colleague and I were discussing a student. They said, I think I had that student in class a few semesters ago, but I can't find the semester with his grade."

My first thought was, "I would just use grep to...". Then I remembered that my colleagues all use Excel for their grades.

The next day, I saw Derek Sivers' recent post that gives the advice I usually give when asked: Write plain text files.

Over my early years in computing, I lost access to a fair bit of writing that was done using various word processing applications. All stored data in proprietary formats. The programs are gone, or have evolved well beyond the version and OS I was using at the time, and my words are locked inside. Occasionally I manage to pull something useful out of one of those old files, but for the most part they are a graveyard.

No matter how old, the code and essays I wrote in plaintext are still open to me. I love being able to look at programs I wrote for my undergrad courses (including the first parser I ever wrote, in Pascal) and my senior honors project (an early effort to implement Swiss System pairings for chess tournament). All those programs have survived the move from 5-1/4" floppies, through several different media, and still open just fine in emacs. So do the files I used to create our wedding invitations, which I wrote in troff(!).

The advice to write in plain text transfers nicely from proprietary formats on our personal computers to tools that run out on the web. The same week as Sivers posted his piece, a prolific Goodreads reviewer reported losing all his work when Goodreads had a glitch. The reviewer may have written in plain text, but his reviewers are buried in someone else's system.

I feel bad for non-tech folks when they lose their data to a disappearing format or app. I'm more perplexed when a CS prof or professional programmer does. We know about plain text; we know the history of tools; we know that our plain text files will migrate into the future with us, usable in emacs and vi and whatever other plain text editors we have available there.

I am not blind, though, to the temptation. A spreadsheet program does a lot of work for us. Put some numbers here, add a formula or two over there, and boom! your grades are computed and ready for entry -- into the university's appalling proprietary system, where the data goes to die. (Try to find a student's grade from a forgotten semester within that system. It's a database, yet there are no affordances available to users for the simplest tasks...)

All of my grade data, along with most of what I produce, is in plain text. One cost of this choice is that I have to write my own code to process it. This takes a little time, but not all that much, to be honest. I don't need all of Numbers or Excel; all I need most of the time is the ability to do simple computations and a bit of sorting. If I use a comma-separated values format, all of my programming languages have tools to handle parsing, so I don't even have to do much input processing to get started. If I use Racket for my processing code, throwing a few parens into the mix enables Racket to read my files into lists that are ready for mapping and filtering to my heart's content.

Back when I started professoring, I wrote all of my grading software in whatever language we were using in the class in that semester. That seemed like a good way for me to live inside the language my students were using and remind myself what they might be feeling as they wrote programs. One nice side effect of this is that I have grading code written in everything from Cobol to Python and Racket. And data from all those courses is still searchable using grep, filterable using cut, and processable using any code I want to write today.

That is one advantage of plain text I value that Sivers doesn't emphasize: flexibility. Not only will plain text survive into the future... I can do anything I want with it. I don't often feel powerful in this world, but I feel powerful when I'm making data work for me.

In the end, I've traded the quick and easy power of Excel and its ilk for the flexible and power of plain text, at a cost of writing a little code for myself. I like writing code, so this sort of trade is usually congenial to me. Once I've made the trade, I end up building a set of tools that I can reuse, or mold to a new task with minimal effort. Over time, the cost reaches a baseline I can live with even when I might wish for a spreadsheet's ease. And on the day I want to write a complex function to sort a set of records, one that befuddles Numbers's sorting capabilities, I remember why I like the trade. (That happened yet again last Friday.)

A recent tweet from Michael Nielsen quotes physicist Steven Weinberg as saying, "This is often the way it is in physics -- our mistake is not that we take our theories too seriously, but that we do not take them seriously enough." I think this is often true of plain text: we in computer science forget to take its value and power seriously enough. If we take it seriously, then we ought to be eschewing the immediate benefits of tools that lock away our text and data in formats that are hard or impossible to use, or that may disappear tomorrow at some start-up's whim. This applies not only to our operating system and our code but also to what we write and to all the data we create. Even if it means foregoing our commercial spreadsheets except for the most fleeting of tasks.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns

March 03, 2022 2:22 PM

Knowing Context Gives You Power, Both To Choose And To Make

We are at the point in my programming languages course where my students have learned a little Racket, a little functional programming, and a little recursive programming over inductive datatypes. Even though I've been able to connect many of the ideas we've studied to programming tasks out in the world that they care about themselves, a couple of students have asked, "Why are we doing this again?"

This is a natural question, and one I'm asked every time I teach this course. My students think that they will be heading out into the world to build software in Java or Python or C, and the ideas we've seen thus far seem pretty far removed from the world they think they will live in.

These paragraphs from near the end of Chelsea Troy's 3-part essay on API design do a nice job of capturing one part of the answer I give my students:

This is just one example to make a broader point: it is worthwhile for us as technologists to cultivate knowledge of the context surrounding our tools so we can make informed decisions about when and how to use them. In this case, we've managed to break down some web request protocols and build their pieces back up into a hybrid version that suits our needs.
When we understand where technology comes from, we can more effectively engage with its strengths, limitations, and use cases. We can also pick and choose the elements from that technology that we would like to carry into the solutions we build today.

The languages we use were designed and developed in a particular context. Knowing that context gives us multiple powers. One power is the ability to make informed decisions about the languages -- and language features -- we choose in any given situation. Another is the ability to make new languages, new features, and new combinations of features that solve the problem we face today in the way that works best in our current context.

Not knowing context limits us to our own experience. Troy does a wonderful job of demonstrating this using the history of web API practice. I hope my course can help students choose tools and write code more effectively when they encounter new languages and programming styles.

Computing changes. My students don't really know what world they will be working in in five years, or ten, or thirty. Context is a meta-tool that will serve them well.


Posted by Eugene Wallingford | Permalink | Categories: Patterns, Software Development, Teaching and Learning