CS 2530 Intermediate Computing -- Session 27

Polymorphism, Domain Objects, and Primitives

Opening Exercise: Don't Say That!

Write a Java class named CensorInputStream. A CensorInputStream is a virtual stream that replaces all instances of one character with another. For instance, we might want to filter a file, replacing all '+' characters with '-' characters:

    InputStream f = new FileInputStream( "expression.txt" );
    InputStream c = new CensorInputStream( f, '+', '-' );

Or we might want to process standard input, replacing every 'a' with a '*':

    InputStream c = new CensorInputStream( System.in, 'a', '*' );

You can use LowerCaseInputStream, which we saw last time, as a model.

A Simple Solution

CensorInputStream looks a lot like LowerCaseInputStream. Its constructor takes a data source as an argument, but also must take the characters to remove and insert:

    public CensorInputStream( InputStream s, char remove, char insert )
    {
      dataSource = s;
      charToRemove = (int) remove;
      charToInsert = (int) insert;
    }

It implements read() by delegating to its data source. Of course, it implements its own special behavior by watching for the no-no character before returning:

    public int read() throws IOException
    {
      int c = dataSource.read();
      if ( c == charToRemove )
        return charToInsert;
      return c;
    }
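
Putting the two pieces together gives a complete class. What follows is a minimal sketch, not the session's actual source file: the instance variable declarations are inferred from the constructor above, and SimpleCensorDemo is a hypothetical driver that reads from an in-memory stream instead of a file so that it runs on its own. It assumes single-byte (ASCII) characters.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

// First version: extends InputStream directly and manages its own data source.
class CensorInputStream extends InputStream
{
  private final InputStream dataSource;
  private final int charToRemove;
  private final int charToInsert;

  public CensorInputStream( InputStream s, char remove, char insert )
  {
    dataSource = s;
    charToRemove = remove;
    charToInsert = insert;
  }

  @Override
  public int read() throws IOException
  {
    int c = dataSource.read();          // -1 means end of stream
    if ( c == charToRemove )
      return charToInsert;
    return c;
  }
}

// Hypothetical driver, not the CensorDemo program from the session.
class SimpleCensorDemo
{
  // Pushes text through a CensorInputStream one byte at a time.
  static String censor( String text, char remove, char insert )
  {
    InputStream source = new ByteArrayInputStream( text.getBytes( StandardCharsets.US_ASCII ) );
    InputStream censored = new CensorInputStream( source, remove, insert );
    StringBuilder result = new StringBuilder();
    try
    {
      int c;
      while ( (c = censored.read()) != -1 )
        result.append( (char) c );
    }
    catch ( IOException e )
    {
      throw new UncheckedIOException( e );
    }
    return result.toString();
  }

  public static void main( String[] args )
  {
    System.out.println( censor( "(1 + 2) * 5", '+', '-' ) );   // prints (1 - 2) * 5
  }
}
```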

We can use a CensorInputStream to convert additions to subtractions in an expression file:

    > cat expression.txt
    (1 + 2 + ((3 * 4 - 6 / 7) + 2 + 4) * 5)
    > java CensorDemo expression.txt + -
    (1 - 2 - ((3 * 4 - 6 / 7) - 2 - 4) * 5)

But we can also use it to censor the keyboard:

    > java CensorKeyboard a *
    aAAAajoPOaaaaaa-`iaq`sjikq34fij3org
    *AAA*joPO******-`i*q`sjikq34fij3org
    ^D
    >

That seems nice enough. Can we do better?

A "Filter" Virtual Stream

We can. CensorInputStream is too much like LowerCaseInputStream. Both constructors take a data source and store it in an instance variable. Both read() methods delegate to the data source, perform a check on the character that was read, and return a character.

It is worse than that. Both CensorInputStream and LowerCaseInputStream are incomplete. They really should include a couple of other methods, such as available(), because the method inherited from InputStream does not provide the preferred behavior. The available() methods in both CensorInputStream and LowerCaseInputStream should delegate to the stream's data source, so the methods in both classes will be identical.
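
To see why the inherited available() is not what we want, here is a small sketch. BareWrapper, DelegatingWrapper, and AvailableDemo are hypothetical names invented for this illustration. The available() method inherited from InputStream always returns 0, so a wrapper has to forward the message itself, and every virtual stream written this way would repeat the same one-line method.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical wrapper that forwards read() but forgets about available().
class BareWrapper extends InputStream
{
  protected final InputStream dataSource;

  BareWrapper( InputStream s ) { dataSource = s; }

  @Override
  public int read() throws IOException { return dataSource.read(); }
}

// Same wrapper with the delegated available() that each virtual stream would repeat.
class DelegatingWrapper extends BareWrapper
{
  DelegatingWrapper( InputStream s ) { super( s ); }

  @Override
  public int available() throws IOException { return dataSource.available(); }
}

class AvailableDemo
{
  // Asks a stream how many bytes it can supply without blocking.
  static int availableFrom( InputStream in )
  {
    try { return in.available(); }
    catch ( IOException e ) { throw new UncheckedIOException( e ); }
  }

  public static void main( String[] args )
  {
    byte[] data = new byte[ 5 ];
    System.out.println( availableFrom( new BareWrapper( new ByteArrayInputStream( data ) ) ) );        // prints 0
    System.out.println( availableFrom( new DelegatingWrapper( new ByteArrayInputStream( data ) ) ) );  // prints 5
  }
}
```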

We saw last time that, by using a different kind of stream, we can often reuse the processing code of an application without making any changes. This makes it attractive for Java programmers to implement virtual streams of their own. But we wouldn't want to have to duplicate the same instance variables, constructor, reading behavior, and other delegated methods over and over again in each new class.

To help us avoid this sort of duplication, Java provides a class, FilterInputStream, that provides the common behaviors. A FilterInputStream holds an instance variable named in and delegates all the messages expected of an InputStream to the instance variable. Programmers then create new virtual streams simply by writing a subclass of FilterInputStream and specializing the appropriate methods. This almost always involves overriding read(), because that is the method that provides the new stream's particular behavior.

We can use FilterInputStream to implement a new version of CensorInputStream. Client code can use this class without any changes. The new CensorInputStream isn't much shorter than the original, but it also provides correct implementations of all the methods defined in InputStream. It also tells the reader explicitly that it is a virtual stream.
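
Here is one way the FilterInputStream version might look. Treat it as a sketch under a few assumptions rather than the session's actual file: characters are single bytes, and FilterCensorDemo is a hypothetical driver. One detail is worth knowing: FilterInputStream's own read(byte[], int, int) simply forwards to the wrapped stream's bulk read rather than calling our read(), so this sketch overrides the bulk read as well. Otherwise, clients that read into arrays would see uncensored bytes.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class CensorInputStream extends FilterInputStream
{
  private final int charToRemove;
  private final int charToInsert;

  public CensorInputStream( InputStream s, char remove, char insert )
  {
    super( s );                       // FilterInputStream stores s in its field named in
    charToRemove = remove;
    charToInsert = insert;
  }

  @Override
  public int read() throws IOException
  {
    int c = in.read();
    return ( c == charToRemove ) ? charToInsert : c;
  }

  // Apply the same substitution to bytes that arrive through a bulk read.
  @Override
  public int read( byte[] b, int off, int len ) throws IOException
  {
    int count = in.read( b, off, len );           // -1 at end of stream, so the loop is skipped
    for ( int i = off; i < off + count; i++ )
      if ( (b[i] & 0xFF) == charToRemove )
        b[i] = (byte) charToInsert;
    return count;
  }
}

// Hypothetical driver that exercises the bulk read path.
class FilterCensorDemo
{
  static String censor( String text, char remove, char insert )
  {
    byte[] data = text.getBytes( StandardCharsets.US_ASCII );
    try ( InputStream in = new CensorInputStream( new ByteArrayInputStream( data ), remove, insert ) )
    {
      byte[] buffer = new byte[ data.length ];
      int count = in.read( buffer );              // read(byte[]) calls our read(byte[], int, int)
      return new String( buffer, 0, Math.max( count, 0 ), StandardCharsets.US_ASCII );
    }
    catch ( IOException e )
    {
      throw new UncheckedIOException( e );
    }
  }

  public static void main( String[] args )
  {
    System.out.println( censor( "1 + 2 + 3", '+', '-' ) );   // prints 1 - 2 - 3
  }
}
```

Methods such as available(), close(), and skip() are inherited from FilterInputStream and already delegate to the wrapped stream.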

Likewise, we can use FilterInputStream to re-implement LowerCaseInputStream, eliminating the need to store and manage its own instance variable while at the same time delegating all other messages to the wrapped data source.
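
A re-implemented LowerCaseInputStream might look something like the sketch below. Last session's version is not reproduced in these notes, so the details here are assumptions: bytes are treated as single-byte characters, and LowerCaseDemo is a hypothetical driver.

```java
import java.io.ByteArrayInputStream;
import java.io.FilterInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class LowerCaseInputStream extends FilterInputStream
{
  public LowerCaseInputStream( InputStream s )
  {
    super( s );                       // no instance variable of our own to manage
  }

  @Override
  public int read() throws IOException
  {
    int c = in.read();
    return ( c == -1 ) ? c : Character.toLowerCase( c );
  }

  // The inherited bulk read bypasses read(), so lower-case those bytes too.
  @Override
  public int read( byte[] b, int off, int len ) throws IOException
  {
    int count = in.read( b, off, len );
    for ( int i = off; i < off + count; i++ )
      b[i] = (byte) Character.toLowerCase( b[i] & 0xFF );
    return count;
  }
}

// Hypothetical driver that reads one byte at a time.
class LowerCaseDemo
{
  static String lowerCase( String text )
  {
    byte[] data = text.getBytes( StandardCharsets.US_ASCII );
    StringBuilder result = new StringBuilder();
    try ( InputStream in = new LowerCaseInputStream( new ByteArrayInputStream( data ) ) )
    {
      int c;
      while ( (c = in.read()) != -1 )
        result.append( (char) c );
    }
    catch ( IOException e )
    {
      throw new UncheckedIOException( e );
    }
    return result.toString();
  }

  public static void main( String[] args )
  {
    System.out.println( lowerCase( "Hello, WORLD" ) );   // prints hello, world
  }
}
```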

Generic Decorators

As we saw last time, virtual streams are decorators. FilterInputStream is a generic decorator.

[Figure: a generic decorator hierarchy]

It provides the common behavior of all decorators, in particular managing the helper object and delegating all messages to it. Subclasses of FilterInputStream inherit all this behavior and customize only the parts that make them different.

We can use the same idea to make our BallWorlds easier to extend. We can write a generic DecoratedBall class and have DeceleratingBall extend DecoratedBall. Notice how use of the generic decorator as a superclass simplified DeceleratingBall. It is now nearly identical to the original DeceleratingBall class we wrote as an extension of Ball! By factoring the common behavior out into a superclass, we have made the classes we need easier to write and understand.

For kicks, I wrote another ball decorator, ExpandingBall, to make balls grow as they move. (This can be useful for simulating an object getting closer to the viewer.) ExpandingBall extends DecoratedBall, too, so it needs only to override the move() method. Then I modified MultiBallWorldFrame to be able to create instances of decorated balls. Notice Case 3 in MultiBallWorldFrame's constructor. It creates an expanding, decelerating, bounded ball. This demonstrates a nice feature of decorators: a DecoratedBall can decorate any Ball -- even another DecoratedBall. Beautiful!!
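
The session's BallWorld code is not reproduced in these notes, so here is a stripped-down sketch of the idea. Everything in it is an assumption made for illustration: the tiny Ball class, the scaleSpeed() and growBy() messages, the constants, and the DecoratedBallDemo driver. It leaves out painting and BoundedBall entirely. What it does show is the shape of the design: DecoratedBall forwards every message to its worker ball, each concrete decorator overrides only move(), and decorators stack.

```java
// Hypothetical, stripped-down Ball; the Ball class used in the course does more.
class Ball
{
  private double x, y, dx, dy, radius;

  Ball( double x, double y, double radius, double dx, double dy )
  {
    this.x = x;  this.y = y;  this.radius = radius;  this.dx = dx;  this.dy = dy;
  }

  public void move()                       { x += dx;  y += dy; }
  public void scaleSpeed( double factor )  { dx *= factor;  dy *= factor; }
  public void growBy( double amount )      { radius += amount; }
  public double x()                        { return x; }
  public double radius()                   { return radius; }
}

// Generic decorator: holds a worker Ball and forwards every message to it.
class DecoratedBall extends Ball
{
  private final Ball workerBall;

  protected DecoratedBall( Ball aBall )
  {
    super( 0, 0, 0, 0, 0 );    // inherited fields go unused; the worker holds the real state
    workerBall = aBall;
  }

  @Override public void move()                      { workerBall.move(); }
  @Override public void scaleSpeed( double factor ) { workerBall.scaleSpeed( factor ); }
  @Override public void growBy( double amount )     { workerBall.growBy( amount ); }
  @Override public double x()                       { return workerBall.x(); }
  @Override public double radius()                  { return workerBall.radius(); }
}

// Halves the ball's speed after each move (the factor is an arbitrary choice).
class DeceleratingBall extends DecoratedBall
{
  DeceleratingBall( Ball aBall ) { super( aBall ); }

  @Override public void move() { super.move();  scaleSpeed( 0.5 ); }
}

// Grows the ball by one unit after each move (again, an arbitrary amount).
class ExpandingBall extends DecoratedBall
{
  ExpandingBall( Ball aBall ) { super( aBall ); }

  @Override public void move() { super.move();  growBy( 1.0 ); }
}

class DecoratedBallDemo
{
  public static void main( String[] args )
  {
    // A decorator can wrap any Ball, including another decorator.
    Ball ball = new ExpandingBall( new DeceleratingBall( new Ball( 0, 0, 10, 4, 0 ) ) );
    ball.move();
    ball.move();
    System.out.println( ball.x() + " " + ball.radius() );   // prints 6.0 12.0
  }
}
```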

A Thought Exercise

Suppose we are writing an application that requires sets of objects. (The collection of followers in our Twitter knock-off could be modeled using a set.) A set is an unordered collection of values that responds to a limited set of messages:

add
remove
contains?
empty?
size

How might we implement a set?

It turns out that a Java Vector provides all of these behaviors:

void addElement( Object )
boolean removeElement( Object )
boolean contains( Object )
boolean isEmpty()
int size()

The only difference is that a set lets us add an element only if the set does not already contain it.

So, we might decide to define a Set as a Vector:

    public class Set extends Vector {
      ...
    }

But now we can do this:

    Vector order = new Set();

Is a set really a vector? No, because Vectors respond to other messages, too, in particular:

    int location = order.indexOf(user);
    User next = (User) order.elementAt(location+1);

Ack! We need to make sure that doesn't happen. We change our Set class:

    public class Set extends Vector
    {
      // same code as before, plus:

      public int indexOf( Object object ) {
        System.err.println( "indexOf is not allowed on sets." );
        return -1;
      }

      public Object elementAt( int i ) {
        System.err.println( "elementAt is not allowed on sets." );
        return null;
      }
    }

That's a problem. Read the short section on inheritance at the end of these notes to learn a bit about why.

Instead, we should define a Set as using a Vector to provide its behavior:

    public class Set
    {
      private Vector elements;

      public Set() {
        elements = new Vector();
      }

      ...
    }

We write a bit more code, but the code behaves correctly and does not mislead programmers who use it.
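
For example, a composed Set might be filled in along these lines. This is only a sketch: the method names follow the list of set messages above, the Vector is declared as Vector<Object> to avoid raw-type warnings, and SetDemo is a hypothetical driver.

```java
import java.util.Vector;

// A set that uses a Vector without being one.
class Set
{
  private final Vector<Object> elements;

  public Set()
  {
    elements = new Vector<Object>();
  }

  // The one real difference from Vector: no duplicates.
  public void add( Object o )
  {
    if ( !elements.contains( o ) )
      elements.addElement( o );
  }

  public void remove( Object o )      { elements.removeElement( o ); }
  public boolean contains( Object o ) { return elements.contains( o ); }
  public boolean isEmpty()            { return elements.isEmpty(); }
  public int size()                   { return elements.size(); }
}

class SetDemo
{
  public static void main( String[] args )
  {
    Set followers = new Set();
    followers.add( "alice" );
    followers.add( "alice" );                  // ignored: already present
    followers.add( "bob" );
    System.out.println( followers.size() );    // prints 2
  }
}
```

Because Set no longer extends Vector, client code cannot assign a Set to a Vector variable, and messages such as indexOf() and elementAt() are not available at all.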

But there's more.

Domain Objects Versus Primitive Objects

There is another option, of course: Don't write a Set class at all. Use a Vector in our application.

    public class Twitter
    {
      private Vector followers;
      ...
    }

This makes our entire Twitter class dependent on the decision to use a Vector. What if we decide later that a Hashtable is a better implementation? Or an array of Users? We have to change all references to followers to use the new type. That code is interspersed throughout the class, along with code that deals with other aspects of the application. And that makes the changes difficult to make, and error-prone.

But you may be willing to live with that potential inconvenience as a way to get done sooner. Coming out of their intro courses, many students ask: Why, indeed, write Set at all? Or CensorInputStream?

There are some conveniences to doing so, for reading, debugging, and modifying code. And polymorphic variables can magnify the power of a humble class such as CensorInputStream by making it usable in applications that expect only an InputStream.

But there is a more important reason. When we write a program, it should be written -- as much as possible -- in terms of objects from the problem domain, not primitive objects from the language.

Primitive types are almost always an implementation detail. They don't exist in the world we are modeling. They are the tools we use to simulate the world, to solve a problem or provide a service in that world.

Recall the Principle of Continuity: A change that is small in the business sense should be small in the program. This principle is really only the beginning of a more expansive sense of continuity between the problem domain and our solutions in code.

When we write programs in terms of the problem we are solving, we create a set of objects that enable developers to:

talk to the client about the problem and its solution using a common vocabulary
talk to one another without having to know the details of how every class and method has been implemented
change underlying implementations without having to change the rest of the program

Implementation details are much more likely to change than the domain objects. Writing the program in terms of domain objects means having a design vocabulary that doesn't change all that often -- and when it does, it's important.

We had a great example of this in your solutions for Homework 6. Each Cell has zero or more neighbors. Students implemented the neighbors instance variable using a variety of Java types and classes:

individual Cell instance variables
an array of Cells
a Vector of Cells
a Hashtable mapping directions to Cells

Which is best? There probably isn't one correct answer, though using individual Cell instance variables limits us in ways that the others don't.

I don't necessarily know what "the right answer" is, but I do know this: We should be able to change our implementation once we figure it out.

But all of these approaches hardcode the implementation detail throughout the Cell class -- and, for some students, throughout the Pousse classes they wrote for Homework 8!

That makes changing our implementation later a big pain. One of the goals of Homework 9 and its successors is to give you an opportunity to see what happens as an application grows and changes. They give you a chance to live with the consequences of implementation details and to benefit from the choice to create separate classes for the objects in the problem domain.

Back to Homework 6... One student did write code that hid most of the details about whether a Cell had a neighbor or not. The details were encapsulated in a CellPackage class. Instances of this class didn't do much other than hide that one detail.

I did something similar in my solution to Homework 6. I created a Neighbors class to represent the collection of a Cell's neighbors. This enabled me to write a simple implementation (an array of Cells) with the ability to change it to something more sophisticated later (such as a Hashtable mapping directions to Cells) without modifying the Cell class.
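
My actual Homework 6 code is not shown in these notes, so the following is only a sketch of the idea. The Direction enum, the method names, and this tiny Cell are all assumptions made for illustration. The point is that Cell talks to a Neighbors object and never sees the array.

```java
// Hypothetical set of directions; the real assignment may use something different.
enum Direction { NORTH, EAST, SOUTH, WEST }

// Domain object: the collection of a Cell's neighbors.
class Neighbors
{
  // Simple implementation: one array slot per direction. This could later
  // become a Hashtable mapping directions to Cells without touching Cell.
  private final Cell[] cells = new Cell[ Direction.values().length ];

  public void put( Direction d, Cell c ) { cells[ d.ordinal() ] = c; }
  public Cell get( Direction d )         { return cells[ d.ordinal() ]; }
  public boolean has( Direction d )      { return cells[ d.ordinal() ] != null; }

  public int count()
  {
    int total = 0;
    for ( Cell c : cells )
      if ( c != null )
        total++;
    return total;
  }
}

// Cell is written in terms of Neighbors, not in terms of arrays.
class Cell
{
  private final Neighbors neighbors = new Neighbors();

  public void connect( Direction d, Cell other ) { neighbors.put( d, other ); }
  public boolean hasNeighbor( Direction d )      { return neighbors.has( d ); }
  public Cell neighbor( Direction d )            { return neighbors.get( d ); }
  public int neighborCount()                     { return neighbors.count(); }
}

class NeighborsDemo
{
  public static void main( String[] args )
  {
    Cell left = new Cell();
    Cell right = new Cell();
    left.connect( Direction.EAST, right );
    right.connect( Direction.WEST, left );
    System.out.println( left.hasNeighbor( Direction.EAST ) );   // prints true
    System.out.println( left.neighborCount() );                 // prints 1
  }
}
```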

Both my Neighbors object and the student's CellPackage objects could have done more, and probably should have. If we had developed our Pousse game further, perhaps with more options and other interfaces, we likely would have made those objects "smarter" along the way. But just creating them in the first place is a giant leap toward making the program more flexible.

Designing classes to use domain objects, not language primitives, results in programs that are easier to modify and extend. Sometimes using this style means simply creating a class like Neighbors that wraps an instance of a language primitive and uses it to implement an idea. This simple step hides the implementation from the rest of the program.

If you go on to do more object-oriented programming after this course, you will almost certainly hear more about this idea. The tendency to write code using language primitives is sometimes called primitive obsession. It is a sign of a program that is more brittle than it needs to be.

So: Create classes to model domain objects. Write your programs in terms of them sending messages to one another. Remove the need to understand their underlying implementation from as much of your code as possible.

(You can even take this idea one step farther and defer thinking about how to implement domain objects as long as possible while writing your program. The results can be surprising!)

Wrap Up

Reading. As always, study the code for this session along with these notes. If you'd like to learn more about the particular Java library classes we used today, follow the links above and browse the Javadoc.
Read the short section about inheritance at the end of these notes.

Homework. Homework 10 is available and due at the end of the week.

Inheritance

What?

A mechanism for reusing code in an existing class.
A mechanism for organizing kinds of objects that have the same implementation.

How?

Create a class that extends another class.

Why?

Who wants to rewrite code?
Reuse provides:

reliability through continual testing
shorter development time
the ability to build frameworks (Don't call us...)

You can quickly build an application for demonstration purposes.

In one sense, a subclass is an expansion of its superclass. A subclass can add instance variables and methods.

In another sense, a subclass is a contraction of its superclass. A subclass defines a subset of instances of its superclass.

In Java, all classes are subclasses, whether we say so or not. By default, any class that does not have an extends clause extends the class Object.
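
We can check this for ourselves with a couple of lines of code. Plain and DefaultSuperclassDemo are hypothetical names used only for this quick illustration.

```java
// A class with no extends clause.
class Plain { }

class DefaultSuperclassDemo
{
  public static void main( String[] args )
  {
    Plain p = new Plain();
    System.out.println( Plain.class.getSuperclass() == Object.class );   // prints true
    System.out.println( p.equals( p ) );    // equals() is inherited from Object; prints true
  }
}
```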

Inheritance and Substitutability

An object X is substitutable for an object Y if:

we can use X any place we use Y and
the client code cannot tell the difference.

An example from the pinball game construction kit:

The target vector holds any Object. That is how Java Vectors work.
We put Springs, Walls, Holes, ..., into the vector.
When we retrieve objects from the vector, we treat them as PinBallTargets.
The client code cannot tell the difference.
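
In code, the pattern looks roughly like this. The sketch is heavily simplified: this PinBallTarget interface, with its single hitBy() message that returns a String, is an assumption made for illustration and is not the interface from the pinball game.

```java
import java.util.Vector;

// Hypothetical, simplified target interface.
interface PinBallTarget
{
  String hitBy();
}

class Spring implements PinBallTarget { public String hitBy() { return "boing"; } }
class Wall   implements PinBallTarget { public String hitBy() { return "thud"; } }
class Hole   implements PinBallTarget { public String hitBy() { return "gulp"; } }

class TargetDemo
{
  // Client code: treats every element as a PinBallTarget and never asks which kind.
  static String hitAll( Vector<Object> targets )
  {
    StringBuilder result = new StringBuilder();
    for ( int i = 0; i < targets.size(); i++ )
    {
      PinBallTarget target = (PinBallTarget) targets.elementAt( i );
      result.append( target.hitBy() ).append( ' ' );
    }
    return result.toString().trim();
  }

  public static void main( String[] args )
  {
    Vector<Object> targets = new Vector<Object>();    // holds any Object
    targets.addElement( new Spring() );
    targets.addElement( new Wall() );
    targets.addElement( new Hole() );
    System.out.println( hitAll( targets ) );          // prints boing thud gulp
  }
}
```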

Other examples, from the cannon game:

Our "fire" Button expects to be given an ActionListener that watches for button events.
We create FireButtonListener that implements the ActionListener interface.
We add a FireButtonListener in place of an ActionListener.
The AWT framework cannot tell the difference.

The common feature in all of these cases -- and the key to substitutability -- is that the objects share a common interface.

They respond to the same messages.

Inheritance and interfaces are two mechanisms that ensure a common interface.

Why write our programs so that they use substitutable objects? They are easier to extend and modify.

Types of Inheritance

Inheritance can be used to achieve several different kinds of legitimate goals.

Specialization. Sometimes we want to create a more specific version of an existing object. In this approach, we create few or no new methods in the subclass. Instead, the subclass methods override inherited methods. Examples include our game worlds and the BoundedBall class. Specialization is common in frameworks like the AWT.

Specification. In this approach, the superclass provides responsibility, in the form of abstract methods and methods with empty bodies, but no behavior. We can even think of an interface as a superclass that specifies behavior. The subclass implements the interface or extends the (partly) abstract class. Examples include our event listeners, our pinball targets, and the InputStream class.

Extension. Sometimes we want to make an object that adds new behaviors to an existing object. In this approach, the subclass uses most or all of the inherited methods as-is and adds several new methods. Our MovableBall class is an example.

Combination. In some languages, a subclass inherits from two or more classes. This is called multiple inheritance. C++ has multiple inheritance; Java and Smalltalk do not. Java supports combination only through interfaces, as a class can extend one superclass (for any of the purposes described above) and implement one or more interfaces (specification). An example we have seen is the Hole class in the pinball game.

Inheritance can also be used to achieve goals that hurt the design of a program more than they help.

Limitation. In this approach, the subclass consists primarily of methods that override inherited methods, much like specialization. However, the overridden methods are used to restrict an inherited behavior or remove it entirely. An example of the former would be to create a Square class that extends a Rectangle class. An example of the latter is defining a Set as a subclass of Vector. Limitation leads to bad design because it violates the principle of substitutability.
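
Here is a small sketch of how the Square example goes wrong. The Rectangle and Square classes below are hypothetical, written only for this illustration. The client method stretch() is written against Rectangle and assumes that width and height can be set independently. A Square quietly breaks that assumption.

```java
// Hypothetical Rectangle with independently settable sides.
class Rectangle
{
  protected int width;
  protected int height;

  Rectangle( int width, int height ) { this.width = width;  this.height = height; }

  public void setWidth( int w )  { width = w; }
  public void setHeight( int h ) { height = h; }
  public int area()              { return width * height; }
}

// Inheritance for limitation: the overrides restrict what the setters may do.
class Square extends Rectangle
{
  Square( int side ) { super( side, side ); }

  @Override public void setWidth( int w )  { width = w;  height = w; }
  @Override public void setHeight( int h ) { width = h;  height = h; }
}

class LimitationDemo
{
  // Client code that expects a 5-by-4 rectangle when it is done.
  static int stretch( Rectangle r )
  {
    r.setWidth( 5 );
    r.setHeight( 4 );
    return r.area();
  }

  public static void main( String[] args )
  {
    System.out.println( stretch( new Rectangle( 1, 1 ) ) );   // prints 20
    System.out.println( stretch( new Square( 1 ) ) );         // prints 16
  }
}
```

The client can tell the difference, so a Square is not substitutable for a Rectangle.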

Construction. Sometimes, we want to extend a class solely because it provides a lot of code that we would like to reuse. The subclass extends a superclass to inherit code that we don't want to write, but it is not true that an instance of the subclass is interchangeable with an instance of the superclass. We can find a great example of this bad practice in the Java standard library: java.util.Stack, which is also a subclass of Vector. Like limitation, construction violates the principle of substitutability.
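
A short demonstration shows what a Stack inherits along with Vector's code. StackConstructionDemo is a hypothetical class name, but push(), removeElementAt(), and insertElementAt() are real methods of java.util.Stack and java.util.Vector.

```java
import java.util.Stack;

class StackConstructionDemo
{
  // Uses inherited Vector messages to reach below the top of a stack.
  static String tamper()
  {
    Stack<String> stack = new Stack<String>();
    stack.push( "a" );
    stack.push( "b" );
    stack.push( "c" );

    stack.removeElementAt( 0 );         // removes the bottom element without popping
    stack.insertElementAt( "x", 1 );    // slips a new element into the middle

    return stack.toString();
  }

  public static void main( String[] args )
  {
    System.out.println( tamper() );     // prints [b, x, c]
  }
}
```

Any client holding the Stack can break its last-in, first-out discipline this way, because Stack inherits every public Vector method.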

The first four of these techniques help us to design good programs. Use them judiciously.

The last two tempt us to reuse code out of laziness. They lead to dangerous designs. Don't give in to temptation!

An Exercise: Toe the Lines

Suppose that we are implementing a drawing tool for mathematicians. Consider these three geometric objects:

line (infinite in both directions)
ray (fixed on one end)
segment (fixed on both ends)

How would you use inheritance to implement classes for these objects?

Of course, to answer this question, you will need to answer the more fundamental questions:

What behavior does each of these objects need?
What data does each need to do its jobs?

What behavior does each of these objects need? I would guess that we would need the same set of behaviors for each object, such as:

void paint(graphic)
void rotate(angle) -- What is the fixed point of rotation?
void move(distance) -- What is the point of orientation?
boolean intersects(another)

What data does each need to do its jobs? This is tougher. We could use one point and a slope to define rays and lines, but for segments we would need one more piece of information (length). Or we could use two points to define each class. In rays, one point would serve as the distinguished endpoint.

How would you use inheritance to implement classes for these objects? This depends on the answers to the previous two questions. Choosing a data representation will determine how we implement our methods, and the amount of shared behavior is the key to deciding on the proper use of inheritance.

As a mechanism for reusing code, inheritance is an implementation decision. You won't know how to use inheritance for reuse until you have written code.

But remember: Instances of subclasses should be substitutable for instances of superclasses. Inheritance cannot be about *only* code reuse!

Eugene Wallingford ..... wallingf@cs.uni.edu ..... November 27, 2012