Object-Oriented Programming Considered Harmful

My tongue-in-cheek title is a riff on Dijkstra’s 1968 letter, “Go To Statement Considered Harmful.” Dijkstra argued for structured programming over GO TO spaghetti. Recently one of my young friends expressed interest in becoming a computer programmer, so I purchased a copy of Algorithmics: The Spirit of Computing (3rd Edition), 2003, by David Harel with Yishai Feldman, as a gift for him. The second edition of the book (1987) had inspired me, as a novice, with its clear and enthusiastic presentation of the essentials of computer theory. I told my friend that if the book didn’t also inspire him, then maybe he would not ultimately enjoy programming.

As I skimmed the third edition, curious about the updates for 2003, I was dismayed to find a well-known mistake presented as an example of Object-Oriented Programming. On page 71, Harel and Feldman introduce inheritance, which ‘denotes the ability of the programmer to define inclusion relationships between classes.’ They proceed to give the horribly flawed example of making a square shape extend a rectangle shape. This is a perfect example of how very intelligent people can misuse inheritance and even teach incorrect usage. I’m going to argue that inheritance is completely unnecessary, and even harmful, due to the extremely high potential for misuse.

The ‘correct’ use of inheritance

Liskov’s Substitution Principle must be followed. It states that if a program module uses a reference to a Base class, then that reference can be replaced with an instance of a Derived class without affecting the functionality of the program module.

I am not aware of any OO compiler that enforces Liskov’s principle, although automated code inspection software can report some violations of it. Unfortunately, few programmers follow it. So, why can’t a square class be derived from a base rectangle class?

A Square is Not a Rectangle

It seems intuitive.  A square is a special case of a rectangle, where the height and width are equal, right?  Yes, but, by definition, a rectangle has a height and width that can be different.  A rectangle class will have height and width fields, which could also be mutable.

Harel and Feldman say, “From the point of view of the programmer, squares can handle all messages defined for rectangles…” They are wrong! If a square object receives the message setHeight(20), does it also set the width? If so, the behavior is already confusing to any caller that expects a rectangle; if not, the width and height differ and the object is no longer a square. It makes no sense for a square to have separate height and width fields. A square is more constrained than a rectangle. Hence a Square cannot be substituted, in Liskov’s sense, for a reference of type Rectangle. A better idea for an abstract base class for Square might be RegularPolygon, since every regular polygon has a number of sides and a length of side. The method getInternalAngle() would be a fine abstract method for such an abstract base class. It would also be a good idea to make the number of sides immutable! Square could extend RegularPolygon, with a constructor taking only lengthOfSide, internally fixing the number of sides to four and returning 90 degrees from getInternalAngle(). But now you begin to see all the considerations involved in making a class designed for inheritance. I’m not even going to talk about the diamond inheritance problem encountered in languages like C++ which have multiple inheritance. In any case, it’s easy to see that a square is not a rectangle. The Is-A relationship is violated.
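To make the argument concrete, here is a minimal sketch in Python (my choice of language, not the book’s). The method names are snake_case stand-ins for setHeight() and getInternalAngle(). The stretch() function and the RegularSquare class name are hypothetical, introduced only for this illustration; RegularSquare is named that way so it does not clash with the flawed Square in the same listing.

```python
from abc import ABC, abstractmethod


class Rectangle:
    """A mutable rectangle, roughly as in the textbook example."""

    def __init__(self, width: float, height: float) -> None:
        self.width = width
        self.height = height

    def set_width(self, width: float) -> None:
        self.width = width

    def set_height(self, height: float) -> None:
        self.height = height

    def area(self) -> float:
        return self.width * self.height


class Square(Rectangle):
    """The flawed subclass: it stays square by changing both sides at once."""

    def __init__(self, side: float) -> None:
        super().__init__(side, side)

    def set_width(self, width: float) -> None:
        self.width = self.height = width

    def set_height(self, height: float) -> None:
        self.width = self.height = height


def stretch(rect: Rectangle) -> float:
    """Hypothetical client code written against Rectangle.

    It assumes that setting the height leaves the width alone.
    """
    rect.set_width(5)
    rect.set_height(20)
    return rect.area()


class RegularPolygon(ABC):
    """Alternative base class: the number of sides is fixed at construction."""

    def __init__(self, number_of_sides: int, length_of_side: float) -> None:
        self._number_of_sides = number_of_sides
        self.length_of_side = length_of_side

    @property
    def number_of_sides(self) -> int:
        # Read-only: no setter is defined, so assignment raises AttributeError.
        return self._number_of_sides

    @abstractmethod
    def get_internal_angle(self) -> float:
        """Return the internal angle in degrees."""


class RegularSquare(RegularPolygon):
    """A square as a regular polygon with exactly four sides."""

    def __init__(self, length_of_side: float) -> None:
        super().__init__(4, length_of_side)

    def get_internal_angle(self) -> float:
        return 90.0
```

With these definitions, stretch(Rectangle(1, 1)) returns 100, while stretch(Square(1)) returns 400. The same client code gives a different answer when the derived class is substituted, which is exactly the kind of surprise Liskov’s principle rules out.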

Inheritance isn’t needed at all

It’s common to find the advice, ‘prefer composition to inheritance’, given to OO programming students.  Well, then why even have inheritance at all?  Simply because we can!  Somebody thought of it.  And somebody thought of GO TO, as well.  There is no programming problem that requires an Object-Oriented language, let alone inheritance.  As Harel and Feldman are keen to point out, all programming languages are equivalent.  The differences are ‘pragmatic’.

Language aesthetics

It comes down to the fact that languages are chosen for their applicability to specific problems, but also for their aesthetic appeal. A book called Beautiful Code came out a few years ago, but it got mixed reviews and I can’t comment on it, having not read it. The idea, though, is that a programming language has to express the ‘theory’, the mental model, that the programmer is building. It has to support abstractions for data structures and behavior (algorithms). Harel and Feldman ask (3rd ed., p. 58), Why Not an Algorithmic Esperanto? Why not have just one programming language for everything? For one answer to that, all you need to do is look into Domain-Specific Languages (DSLs). A DSL is a purpose-built ‘language’ syntax, often based on an underlying dynamic language, such as Ruby, for simplifying a specific task. If there are hundreds of languages, there may be thousands of DSLs. Each DSL improves the speed at which the specific programming task can be completed, reducing labor, increasing quality, blah, blah, blah. Aesthetics are marshaled in support of productivity. This is why you need to learn new languages and DSLs. It’s pragmatic.

Jettison inheritance

If you’re pragmatic, you will ‘prefer composition to inheritance’. But why not simply avoid inheritance altogether? Even in an OO language, you will then avoid all the problems of masked (hidden) fields, overridden methods, and deep inheritance hierarchies. When choosing frameworks, sniff out inheritance in the libraries. Are you forced to use it to adopt the framework? Then move along to the next one. Why is this pragmatic? Because it will simplify one of the many dimensions along which complexity can develop.
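As a small illustration of what composition buys you, here is a sketch in Python. Both classes are hypothetical examples of mine, not taken from any framework: ListStack inherits from the built-in list, while Stack merely holds one.

```python
class ListStack(list):
    """Inheritance: a 'stack' that is also a full list.

    Every list method comes along for free, including ones such as
    insert() that let callers break last-in, first-out order.
    """

    def push(self, item) -> None:
        self.append(item)


class Stack:
    """Composition: a stack that *has* a list rather than *being* one."""

    def __init__(self) -> None:
        self._items = []  # private collaborator, not part of the interface

    def push(self, item) -> None:
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def __len__(self) -> int:
        return len(self._items)
```

The composed Stack exposes only push, pop, and len. The inherited ListStack silently accepts ListStack().insert(0, "oops"), so its stack discipline depends on every caller behaving well.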

Mutable state considered harmful, too

Inheritance is one of the problems for OO, but mutable state is perhaps worse. Inheritance can impact dynamic runtime behavior (by virtual method selection). But shared, mutable state has become a serious problem for concurrent programming. I’ve previously posted about the challenges of concurrency on the Java Virtual Machine. So, now, why even use in-memory shared mutable state? In fact, sharing mutable state in persistence systems (file systems, databases) is no longer a good idea, either. How about just writing and reading data? Imagine how much is to be gained from immutable data. It is much easier to reason about values that don’t change over time (and possibly even carry time stamp information), and to write correct programs that assume no mutable data.
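Here is a minimal sketch of that idea in Python, using a frozen dataclass from the standard library. The Reading type, its fields, and the updated() helper are hypothetical names chosen for the example. The time stamp is a plain integer so that the sketch stays deterministic.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Reading:
    """An immutable value that carries its own time stamp."""

    sensor: str
    value: float
    recorded_at: int  # e.g. seconds since some epoch (an assumption)


def updated(reading: Reading, value: float, recorded_at: int) -> Reading:
    """Return a new Reading; the original is left untouched."""
    return replace(reading, value=value, recorded_at=recorded_at)
```

Calling updated(first, 21.0, 1060) produces a second Reading and leaves first exactly as it was, so any thread still holding first sees a consistent value. An attempt to assign to first.value raises an exception instead of silently changing shared state.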

Conclusion

We’re stuck with OO languages and compilers, as much as banks are stuck with COBOL. However, that does not mean we have to keep using the ‘bad parts’. And yet, it seems unavoidable, because too many bad examples still exist. And if such bright authors as Harel and Feldman can repeat such a flawed example as ‘square derived from rectangle’, who can help us, but ourselves?

Peter Naur – Programming as Theory Building

I was re-reading appendix B in Alistair Cockburn’s book, Agile Software Development, 2nd edition.  He has posted the entire appendix on his own blog.

Peter Naur thinks it important to consider the sort of activity that programming is.  Because if it is misconceived, we will not be as successful at it as we could be.

… the main point I want to make is that programming in this sense primarily must be the programmers’ building up knowledge of a certain kind, knowledge taken to be basically the programmers’ immediate possession…

Naur calls this kind of knowledge a ‘theory’, in the sense that the philosopher Gilbert Ryle used the term. A theory in this sense is a kind of tacit knowledge that a person can acquire, that – here is the strange part – cannot be expressed, i.e. cannot be put down as so many rules or principles.  However, one in possession of a theory about a program can, by working closely with others, induce them to acquire a similar theory, one that is serviceable for carrying on the development of a computer program in accord with the theory of it.

What does this mean for practice? It is often recommended that a team of programmers who have developed a program remain closely connected with it throughout its life. In Naur’s sense, they possess a theory of the program that other people cannot acquire by reading the program source code and any amount of documentation, no matter how well written. There will be objections to this idea. It once was, and still is in some quarters, a standard practice to hand over a program to a maintenance team after the program sees production. The idea is that the original designers, presumably expert developers, should move on to new projects where their skill in, shall we say, ‘theory building’ can best be utilized. Meanwhile, the maintenance team, usually composed of members with less skill and experience than the original developers, will take responsibility for program extensions and bug fixes. A major point of Naur’s essay is that this is misguided. Without a period of transition, where at least some of the original developers work on the maintenance team, the group picking up a program having only the source and documentation will be trying to ‘resuscitate’ a dead program, a program whose original development team has dispersed and whose theory has been forgotten. They will be attempting to build a new theory of the program and how it relates to the world. And it is almost guaranteed they will get it wrong. This is not a diatribe against documentation. But it must be realized that the point of documentation is to be an aid to memory in the building of a theory of the program. Using a document alone is not sufficient to acquire an adequate theory of the program. Getting that requires speaking directly – and at length – with someone in possession of it.

You will have to digest the Naur essay and see if you agree or not. I believe that Naur is right, and that tacit knowledge, a theory of a program and its relation to the world (domain), is created by the programmer. This is the real activity of what we call computer programming. It is about gaining an understanding of a certain kind and becoming an expert in that program and its domain. This turns the programmer, as a person, into a valuable asset for the employer. Naur concludes that, as such, programmers should be recognized for what it is they actually do, which is to become knowledgeable. This, and not the texts they produce as source code and documents, constitutes their real contribution. This is why programmers should be kept together as working teams and kept working on software they have developed: they are the ones in possession of the program’s theory, and they cannot easily hand it over to others, but can only pass it on over time by working with new team members.