TITLE: Equality Check Patterns for Recursive Structures AUTHOR: Eugene Wallingford DATE: October 07, 2012 2:50 PM DESC: ----- BODY: I first encountered this trio of programming patterns when writing Smalltalk programs to manipulate graphs back in the late 1980s. These days I see them most often when comparing recursive data types in language processors. Andrew Black wrote about these patterns in the context of Smalltalk on the Squeak mailing list in the mid-2000s.
~~~~
Recursive Equality Check Problem You are working with a recursive data structure. In the simplest form, you might have a list or a record that can contain lists or records as parts. In a language application, you might have a structured type that can have the same type as one of its parts. For instance, a function type might allow a function as an argument or a function as a return value. You have two instances of the structure and need to compare them, say, for equality or for dominance. In the language app, for example, you will need to verify that the type of an argument matches the type of the formal parameter on a procedure. Solution Standard structural recursion works. Walk the two structures in parallel, checking to see that they have the same values in the same positions. When one of the positions holds values the same structure, make a recursive call to compare them. But what if ...
~~~~
Recursive Equality Check with Identity Check Problem An instance of the recursive structure can contain a reference to itself as a value, either directly or through mutual recursion. In the simplest form, this might be a dictionary that contains itself as a value in a key/vaue pair, as in Smalltalk, where the global variable Smalltalk is the dictionary of all global variables, including Smalltalk. Comparing two instances now raises concerns. Two instances may be identical, or contain identical components. In such cases, the standard recursive comparison will never terminate. Solution Check for identity first. Recurse only if the two values are distinct. But what if...
~~~~
Recursive Equality Check with Cache Problem You have two structures that do not share any elements, but they are structurally isomorphic. For example, this can occur in the simplest of structures, two one-element maps:
    a = { :self => a }
    b = { :self => b }
Now, even with an identity test up front, the recursive comparison will never terminate. Solution Maintain a cache of compared pairs. Before you begin to compare two objects, check to see if the pair is in the cache. If yes, return true. Otherwise, add the pair to the cache and proceed. This approach works even though the function has not finished comparing the two objects yet. If there turns out to be a difference between the two, the check currently in progress will find it elsewhere and answer false. There is no need to enter a recursive check.
~~~~
A variation of this caching technique can also be used in other situations, such as computing a hash value for a recursive structure. If in the course of computing the hash you encounter the same structure again, assume that the value is a suitable constant, such as 0 or 1. Hashes are only approximations anyway, so making only one pass over the structure is usually enough. If you really need a fixpoint, then you can't take this shortcut. Ruby hashes handle all three of these problems correctly:
    a = { :first  => :int,
          :second => { :first  => :int, :second => :int } }
    b = { :second => { :second => :int, :first  => :int },
          :first  => :int, }

    a == b             # --> true

    c = { :first  => c }
    a = { :first  => :int,
          :second => { :first => c, :second => :int } }
    b = { :first  => :int,
          :second => { :first => c, :second => :int } }

    a == b             # --> true

    a = { :self => a }
    b = { :self => b }

    a == b             # --> true
I don't know if either MRI Ruby or JRuby uses these patterns to implement their solutions, or if they use some other technique. -----