April 16, 2022 2:32 PM

More Fun, Better Code: A Bug Fix for my Pair-as-Set Implementation

In my previous post, I wrote joyously of a fun bit of programming: implementing ordered pairs using sets.

Alas, there was a bug in my solution. Thanks to Carl Friedrich Bolz-Tereick for finding it so quickly:

Heh, this is fun, great post! I wonder what happens though if a = b? Then the set is {{a}}. first should still work, but would second fail, because the set difference returns the empty set?

Carl Friedrich had found a hole in my small set of tests, which sufficed for my other implementations because the data structures I used separate cells for the first and second parts of the pair. A set will do that only if the first and second parts are different!

Obviously, enforcing a != b is unacceptable. My first code thought was to guard second()'s behavior:

    if my formula finds a result
       then return that result
       else return (first p)

This feels like a programming hack. Furthermore, it results in an impure implementation: it uses a boolean value and an if expression. But it does seem to work. That would have to be good enough unless I could find a better solution.

Perhaps I could use a different representation of the pair. Helpfully, Carl Friedrich followed up with pointers to several blog posts by Mark Dominus from last November that looked at the set encoding of ordered pairs in some depth. One of those posts taught me about another possibility: Wiener pairs. The idea is this:

    (a,b) = { {{a},∅}, {{b}} }

Dominus shows how Wiener pairs solve the a == b edge case in Kuratowski pairs, which makes it a viable alternative.

Would I ever have stumbled upon this representation, as I did onto the Kuratowski pairs? I don't think so. The representation is more complex, with higher-order sets. Even worse for my implementation, the formulas for first() and second() are much more complex. That makes it a lot less attractive to me, even if I never want to show this code to my students. I myself like to have a solid feel for the code I write, and this is still at the fringe of my understanding.

Fortunately, as I read more of Dominus's posts, I found there might be a way to save my Kuratowski-style solution. It turns out that the if expression I wrote above parallels the set logic used to implement a second() accessor for Kuratowski pairs: a choice between the set that works for a != b pairs and a fallback to a one-set solution.

From this Dominus post, we see the correct set expression for second() is:

the correct set expression for the second() function

... which can be simplified to:

an expression for the second() function simplified to a logical statement

The latter expression is useful for reasoning about second(), but it doesn't help me implement the function using set operations. I finally figured out what the former equation was saying: if (∪ p) is same as (∩ p), then the answer comes from (∩ p); otherwise, it comes from their difference.

I realized then that I could not write this function purely in terms of set operations. The computation requires the logic used to make this choice. I don't know where the boundary lies between pure set theory and the logic in the set comprehension, but a choice based on a set-empty? test is essential.

In any case, I think I can implement the my understanding of the set expression for second() as follows. If we define union-minus-intersection as:

    (set-minus (apply set-union aSet)
               (apply set-intersect aSet))
then:
    (second p) = (if (set-empty? union-minus-intersection)
                     (set-elem (apply set-intersect aSet))
                     (set-elem union-minus-intersection))

The then clause is the same as the body of first(), which must be true: if the union of the sets is the same as their intersection, then the answer comes from the interesection, just as first()'s answer does.

It turns out that this solution essentially implements my first code idea above: if my formula from the previous blog entry finds a result, then return that result. Otherwise, return first(p). The circle closes.

Success! Or, I should: Success!(?) After having a bug in my original solution, I need to stay humble. But I think this does it. It passes all of my original tests as well as tests for a == b, which is the main edge case in all the discussions I have now read about set implementations of pairs. Here is a link to the final code file, if you'd like to check it out. I include the two simple test scenarios, for both a == b and a == b, as Rackunit tests.

So, all in all, this was a very good week. I got to have some fun programming, twice. I learned some set theory, with help from a colleague on Twitter. I was also reacquainted with Mark Dominus's blog, the RSS feed for which I had somehow lost track of. I am glad to have it back in my newsreader.

This experience highlights one of the selfish reasons I like for students to ask questions in class. Sometimes, they lead to learning and enjoyment for me as well. (Thanks, Henry!) It also highlights one of the reasons I like Twitter. The friends we make there participate in our learning and enjoyment. (Thanks, Carl Friedrich!)


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

April 13, 2022 2:27 PM

Programming Fun: Implementing Pairs Using Sets

Yesterday's session of my course was a quiz preceded by a fun little introduction to data abstraction. As part of that, I use a common exercise. First, I define a simple API for ordered pairs: (make-pair a b), (first p), and (second p). Then I ask students to brainstorm all the ways that they could implement the API in Racket.

They usually have no trouble thinking of the data structures they've been using all semester. Pairs, sure. Lists, yes. Does Racket have a hash type? Yes. I remind my students about vectors, which we have not used much this semester. Most of them haven't programmed in a language with records yet, so I tell them about structs in C and show them Racket's struct type. This example has the added benefit of seeing that Racket generates constructor and accessor functions that do the work for us.

The big payoff, of course, comes when I ask them about using a Racket function -- the data type we have used most this semester -- to implement a pair. I demonstrate three possibilities: a selector function (which also uses booleans), message passaging (which also uses symbols), and pure functions. Most students look on these solutions, especially the one using pure functions, with amazement. I could see a couple of them actively puzzling through the code. That is one measure of success for the session.

This year, one student asked if we could use sets to implement a pair. Yes, of course, but I had never tried that... Challenge accepted!

While the students took their quiz, I set myself a problem.

The first question is how to represent the pair. (make-pair a b) could return {a, b}, but with no order we can't know which is first and which is second. My students and I had just looked at a selector function, (if arg a b), which can distinguish between the first item and the second. Maybe my pair-as-set could know which item is first, and we could figure out the second is the other.

So my next idea was (make-pair a b){{a, b}, a}. Now the pair knows both items, as well as which comes first, but I have a type problem. I can't apply set operations to all members of the set and will have to test the type of every item I pull from the set. This led me to try (make-pair a b){{a}, {a, b}}. What now?

My first attempt at writing (first p) and (second p) started to blow up into a lot of code. Our set implementation provides a way to iterate over the members of a set using accessors named set-elem and set-rest. In fine imperative fashion, I used them to start whacking out a solution. But the growing complexity of the code made clear to me that I was programming around sets, but not with sets.

When teaching functional programming style this semester, I have been trying a new tactic. Whenever we face a problem, I ask the students, "What function can help us here?" I decided to practice what I was preaching.

Given p = {{a}, {a, b}}, what function can help me compute the first member of the pair, a? Intersection! I just need to retrieve the singleton member of ∩ p:

    (first p) = (set-elem (apply set-intersect p))

What function can help me compute the second member of the pair, b? This is a bit trickier... I can use set subtraction, {a, b} - {a}, but I don't know which element in my set is {a, b} and which is {a}. Serendipitously, I just solved the latter subproblem with intersection.

Which function can help me compute {a, b} from p? Union! Now I have a clear path forward: (∪ p) – (∩ p):

    (second p) = (set-elem
                   (set-minus (apply set-union p)
                              (apply set-intersect p)))

I implemented these three functions, ran the tests I'd been using for my other implementations... and they passed. I'm not a set theorist, so I was not prepared to prove my solution correct, but the tests going all green gave me confidence in my new implementation.

Last night, I glanced at the web to see what other programmers had done for this problem. I didn't run across any code, but I did find a question and answer on the mathematics StackExchange that discusses the set theory behind the problem. The answer refers to something called "the Kuratowski definition", which resembles my solution. Actually, I should say that my solution resembles this definition, which is an established part of set theory. From the StackExchange page, I learned that there are other ways to express a pair as a set, though the alternatives look much more complex. I didn't know the set theory but stumbled onto something that works.

My solution is short and elegant. Admittedly, I stumbled into it, but at least I was using the tools and thinking patterns that I've been encouraging my students to use.

I'll admit that I am a little proud of myself. Please indulge me! Department heads don't get to solve interesting problems like this during most of the workday. Besides, in administration, "elegant" and "clever" solutions usually backfire.

I'm guessing that most of my students will be unimpressed, though they can enjoy my pleasure vicariously. Perhaps the ones who were actively puzzling through the pure-function code will appreciate this solution, too. And I hope that all of my students can at least see that the tools and skills they are learning will serve them well when they run into a problem that looks daunting.


Posted by Eugene Wallingford | Permalink | Categories: Computing, Patterns, Teaching and Learning

April 04, 2022 5:45 PM

Leftover Notes

Like many people, I carry a small notebook most everywhere I go. It is not a designer's sketchbook or engineer's notebook; it is intended primarily to capturing information and ideas, a lá Getting Things Done, before I forget then. Most of the notes end up being transferred to one of my org-mode todo lists, to my calendar, or to a topical file for a specific class or project. Write an item in the notebook, transfer to the appropriate bin, and cross it off in the notebook.

I just filled the last line of my most recent notebook, a Fields Notes classic that I picked up as schwag at Strange Loop a few years ago. Most of the notebook is crossed out, a sign of successful capture and transfer. As I thumbed back through it, I saw an occasional phrase or line that never made into a more permanent home. That is pretty normal for my notebooks. At this point, I usually recycle the used notebook and consign untracked items to lost memories.

For some reason, this time I decided to copy all of the untracked down and savor the randomness of my mind. Who knows, maybe I'll use one of these notes some day.

The Feds

basic soul math

I want to be #0

routine, ritual

gallery.stkate.edu

M. Dockery

www.wastetrac.org/spring-drop-off-event

Crimes of the Art

What the Puck

Massachusetts ombudsman

I hope it's still funny...

chessable.com

art gallery

ena @ tubi

In Da Club (50 Cent)

Gide 25; 28 May : 1

HFOSS project

April 4-5: Franklin documentary

Mary Chapin Carpenter

"Silent Parade" by Keigo Higashino

www.pbs.org -- search Storm Lake

"Hello, Transcriber" by Hannah Morrissey

Dear Crazy Future Eugene

I recognize most of these, though I don't remember the reason I wrote all of them down. For whatever reason, they never reached an actionable status. Some books and links sound interesting in the moment, but by the time I get around to transcribing them elsewhere, I'm no longer interested enough to commit to reading, watching, or thinking about them further. Sometimes, something pops into my mind, or I see something, and I write it down. Better safe than sorry...

That last one -- Dear Crazy Future Eugene -- ends up in a lot of my notebooks. It's a phrase that has irrational appeal to me. Maybe it is destined to be the title of my next blog.

There were also three multiple-line notes that were abandoned:

poem > reality
words > fact
a model is not identical

I vaguely recall writing this down, but I forget what prompted it. I vaguely agree with the sentiment even now, though I'd be hard-pressed to say exactly what it means.

Scribble pages that separate notes from full presentation
(solutions to exercises)

This note is from several months ago, but it is timely. Just this week, a student in my class asked me to post my class notes before the session rather than after. I don't do this currently in large part because my sessions are a tight interleaving of exercises that the students do in class, discussion of possible solutions, and use of those ideas to develop the next item for discussion. I think that Scribble, an authoring system that comes with Racket, offers a way for me to build pages I can publish in before-and-after form, or at least in an outline form that would help students take notes. I just never get around to trying the idea out. I think the real reason is that I like to tinker with my notes right up to class time... Even so, the idea is appealing. It is already in my planning notes for all of my classes, but I keep thinking about it and writing it down as a trigger.

generate scanner+parser? expand analysis,
codegen (2 stages w/ IR -- simple exps, RTS, full)
optimization! would allow bigger source language?

This is evidence that I'm often thinking about my compiler course and ways to revamp it. This idea is also already in the system. But I keep to prompting myself to think about it again.

Anyway, that was a fun way to reflect on the vagaries of my mind. Now, on to my next notebook: a small pocket-sized spiral notebook I picked up for a quarter in the school supplies section of a big box store a while back. My friend Joe Bergin used to always have one of these in his shirt pocket. I haven't used a spiral-bound notebook for years but thought I'd channel Joe for a couple of months. Maybe he will inspire me to think some big thoughts.


Posted by Eugene Wallingford | Permalink | Categories: General, Personal