Subtype Proliferation Myth

Updated 5/23/2002

One thing that OO is reasonably well optimized for (probably at the expense of other things) is extension or change by subtype. A typical example given is a Shape class with subtypes such as Rectangle and Ellipse. The problem is that I don't see a large proliferation of clear, stable subtypes at all. This is closely related to the fact that most real-world hierarchies or taxonomies are too dynamic to fit well into classical OO hierarchies. (This often applies equally to any scheme that divides things into mutually exclusive sub-divisions.)

The Lockstep Myth

For the most part, variations don't grow and change in a clean tree-wise way. For example, OO fans sometimes bring up employee subtypes as a business example. The problem is that employees can be subdivided along multiple non-mutually-exclusive facets. This is sometimes referred to as "orthogonal aspects". Subtype division candidates include:
Further, some of these may totally change due to new laws, etc. You can pick one of these sets, but not all. And any one choice is fairly arbitrary. It is like being forced to decide which of your many children you will assign inheritance to in your written will, assuming only one child is permitted.

A "role" or "strategy" pattern may be more appropriate than sub-typing here. Some OO fans have also suggested multiple inheritance. A potential problem with multiple inheritance is that it may not model mutually exclusive features very well. For example, one may risk inheriting both "exempt" and "non-exempt" classes. It may also make persistence tougher. See OOSC2 Chapter 24 Critique for more notes on multiple inheritance and employee examples.

My experience is that management wants features combined in dynamic and unpredictable (at design time) ways. This "feature recombination" view does not fit hierarchical sub-typing very well. (See the Bank example for more discussion on feature recombination versus sub-classing.)
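As a rough illustration of the role-style alternative mentioned above, here is a minimal Python sketch in which facets are stored as plain data rather than as class identity. The facet names and values (`pay_class`, `schedule`, `rank`) and the `Employee` class are assumptions made up for this example; only "exempt" and "non-exempt" come from the text. Values within one facet are mutually exclusive, while the facets themselves stay independent of each other.

```python
from dataclasses import dataclass, field

# Hypothetical facets, chosen only for illustration. Each facet is a group of
# mutually exclusive values; the facets themselves are independent of each other.
FACETS = {
    "pay_class": {"exempt", "non-exempt"},
    "schedule": {"full-time", "part-time"},
    "rank": {"manager", "staff"},
}


@dataclass
class Employee:
    name: str
    # One value per facet, stored as plain data instead of a class identity.
    traits: dict = field(default_factory=dict)

    def set_trait(self, facet: str, value: str) -> None:
        """Assign a facet value, rejecting values the facet does not allow."""
        if facet not in FACETS:
            raise KeyError(f"unknown facet: {facet}")
        if value not in FACETS[facet]:
            raise ValueError(f"{value!r} is not a valid {facet}")
        # Overwriting the old value keeps the values within a facet exclusive.
        self.traits[facet] = value

    def has(self, value: str) -> bool:
        """True if any facet of this employee currently holds the value."""
        return value in self.traits.values()


emp = Employee("Pat")
emp.set_trait("pay_class", "exempt")
emp.set_trait("rank", "manager")
emp.set_trait("pay_class", "non-exempt")  # reclassified; no subclass change needed
```

Because each facet holds a single value, an employee cannot end up both "exempt" and "non-exempt", which is the risk noted above for multiple inheritance. If a law changes a facet, only the `FACETS` data changes.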
1. Good Candidate for Sub-typing:

       a  b  c  d
    1  A  R  K  U
    2  A  R  K  I
    3  C  X  M  U
    4  C  X  M  P

2. Poor Candidate for Sub-typing:

       a  b  c  d
    1  A  R  K  U
    2  C  R  M  I
    3  A  X  K  U
    4  A  X  M  P

In this example, the rows represent variations (subtypes) and the columns represent "features" in the Meyer-ese sense (methods or attributes).

The first set suggests a strong possibility for creating formal subtypes. Note how rows 1 and 2 share a pattern, as do rows 3 and 4. This suggests that rows 1 and 2 can safely belong to one subtype/subclass and rows 3 and 4 to another. Subtypes will generally share several features that change "in lockstep" like this. In this case, columns a, b, and c change in lockstep in the first set. The break-even point for how many lock-stepped features are required to qualify for subtype division depends on many issues, including personal judgment and the likelihood that the lockstep pattern will last. (In the shape example, the lockstep pattern appeared in the form of empty (null, blank, or zero) columns for shapes that do not use certain measurements.)

The second example is what I find much more often in the real world. There is either very little permanent commonality, or if there is commonality, then new variations tend to be rare. If new variations are rare, then the OO sub-classing mechanism is of little help. Being allegedly "change-friendly" is of little help if there is no change. In other words, the more dynamic something is, the less likely it is to fit a hierarchy or mutually-exclusive divisions (sub-types). Thus, if you want a change-friendly system, then abandon or downplay the notion of sub-types.

For example, "real estate" can be subdivided into land, residential housing, and commercial buildings (at least). It is not that likely that a new real estate type will come along in any given year, let alone a decade.
Thus, picking a paradigm just to handle such rare changes might be mostly in vain, especially if the new variation is significantly different from the existing ones. If there is very little commonality, then perhaps the hierarchy is only nominal. See Inheritance or Something Else? for more on this.

Publications almost seem like a clean hierarchy, but not on closer inspection. It may be that some of these things that initially look like a taxonomy are better served with things like the role pattern (including the procedural/relational versions of patterns), where the features are set-based rather than hierarchical or mutually exclusive.

[Even though...] it has been known since 1847 that classifications are dependent on the purpose of the classification, people continue to believe that it is possible to create a classification system that is context-independent. (Haim Kilov on comp.object, 6/01. Note that I consider sub-types to be "classifications".)
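The lockstep test applied to the two feature tables earlier in this section can be sketched in Python. This is only one possible reading of "lockstep", stated as an assumption: two columns count as lock-stepped when the value in one always determines the value in the other. For simplicity the sketch compares every column against column a only; the function names are made up for this example.

```python
from collections import defaultdict

# The two feature tables from the text: rows are variations, columns are features a-d.
GOOD = {1: "ARKU", 2: "ARKI", 3: "CXMU", 4: "CXMP"}
POOR = {1: "ARKU", 2: "CRMI", 3: "AXKU", 4: "AXMP"}


def lockstep_columns(table):
    """Return the column indexes that always change together with column 0.

    Two columns are treated as lock-stepped if knowing the value in one
    tells you the value in the other, for every row of the table.
    """
    width = len(next(iter(table.values())))
    result = [0]
    for c in range(1, width):
        forward, backward = {}, {}
        ok = True
        for cells in table.values():
            a, b = cells[0], cells[c]
            # setdefault records the first pairing seen and returns it afterwards.
            if forward.setdefault(a, b) != b or backward.setdefault(b, a) != a:
                ok = False
                break
        if ok:
            result.append(c)
    return result


def lockstep_groups(table, columns):
    """Group rows that share identical values in every one of the given columns."""
    groups = defaultdict(list)
    for row, cells in table.items():
        key = tuple(cells[c] for c in columns)
        groups[key].append(row)
    return sorted(groups.values())
```

For the first table, columns a, b, and c come out as lock-stepped and the rows fall into the groups {1, 2} and {3, 4}, matching the subtype split suggested in the text. For the second table no other column moves with column a, so there is no cluster of features to hang a subtype on.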
See Also:
Messy Nested Structure Claims (more employee examples)
People Types - A Thought Experiment on Sub-typing
Alternatives to Trees

Presidential Politics Analogy

One of the problems I have with picking a U.S. presidential candidate is that I prefer some of the stances of one candidate, and some of the stances of the other(s). Perhaps it would be less of an issue if I could vote on the issues independently. (California and other states already have issue-based ballots for state-wide issues.) I would then not have to pick a mutually exclusive "lump" that on the average better fits my preferred combination by small margins.

This is essentially the same kind of problem I see with subtyping used for business modeling. Were the founding fathers OO fans by chance?
See Also:
Customer Feature Plans

Dubins and Rucks

Let's take a different look at the lockstep issue.

    class A childof BirdX
      feature 1...  // cell A1
      feature 2...  // cell A2
      feature 3...  // cell A3
      feature 4...  // cell A4
    endClass

    class B childof BirdX
      feature 1...  // cell B1
      feature 2...  // cell B2
      feature 3...  // cell B3
      feature 4...  // cell B4
    endClass
    ...
    // define two instances
    A Duck
    B Robin

Here, "cells" are simply implementations of features. Thus, cell B2 is the implementation of feature/attribute 2 of subclass B. (If you can think of a better name than "cell", please let me know. Meyer's "Feature" seems to apply to the method name, not necessarily the implementation variations themselves.)

Note how something belonging to the BirdX family has to be either A or B. A dichotomy has been created here. (For the sake of discussion, let's assume that class BirdX is only a "template" class, and cannot be instantiated.) This kind of arrangement assumes either all of A's cells, or all of B's cells.

However, most of the real (business) world has "Dubins" and "Rucks" (hybrids of Duck and Robin). A new variation or instance may need cells A1, B2, A3, B4, etc. It does not make much sense to add a new subclass for every possible combination of the 4 features. For numerous methods or variations, the combinations grow astronomically.

The association of features (cells) is quite often temporary in custom business applications. Thus, the "clustering", "linking", or "binding" of features as often done in OO subclasses poorly models the business world for the most part. A subclass is a poor encapsulation of actual relations.
The above diagram shows the difference between common OO thinking and the approach more appropriate to most business organizations. Subclasses improperly assume that features fit into groups.
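To make the "cell" idea concrete, here is a minimal Python sketch in which each variation just lists the cell it uses for each feature, instead of inheriting a fixed bundle. The cell bodies are placeholders that return their own label; `make_cell`, `CELLS`, `VARIATIONS`, and `run_feature` are names assumed for this illustration, and "Dubin" stands for the hybrid described in the text.

```python
# Each "cell" is one implementation of one feature. The bodies are
# placeholders; only the labels A1..B4 come from the text.
def make_cell(label):
    def cell():
        return label
    return cell


CELLS = {label: make_cell(label)
         for label in ("A1", "A2", "A3", "A4", "B1", "B2", "B3", "B4")}

# Each variation simply lists which cell it uses for features 1 to 4.
# "Dubin" is a hybrid; no new subclass is needed for it.
VARIATIONS = {
    "Duck":  ["A1", "A2", "A3", "A4"],
    "Robin": ["B1", "B2", "B3", "B4"],
    "Dubin": ["A1", "B2", "A3", "B4"],
}


def run_feature(variation, feature_number):
    """Dispatch feature 1-4 of a variation to whichever cell is assigned to it."""
    label = VARIATIONS[variation][feature_number - 1]
    return CELLS[label]()


# Reassigning one feature does not disturb its "partner" features.
VARIATIONS["Dubin"][2] = "B3"
```

With 4 features and 2 implementations of each, there are 2^4 = 16 possible combinations. The table above needs one row per combination actually in use, whereas a subclass per combination would need a class for each.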
Note that we are purposely blurring the distinction between instances and "variations" (such as subclasses). The issue of when one becomes the other is a separate issue that will not be addressed here.

I find it better to treat each feature as being independent. This makes it easier to reassign features to instances without worrying about "partner" features. Subclasses just get in the way with their faulty assumptions about long-term feature association. (Note that our coined term "cell" may be more appropriate than "feature", as described above.)

Dicey OO Solution

To "solve" this, many OO fans suggest splitting methods up into tiny pieces and putting most or all of them in the parent class. Thus, subclasses and/or instantiations simply call the pool of feature variations (cells) of the now hefty parent. This may "work" for the most part, but it is simply not superior to procedural/relational approaches, and may create an ongoing chore known as "refactoring", among other problems. For related information on this, please see Boundaries of Change and Variation.

Time and Space

Types can be divided into roughly 4 categories:
Subdividing category #1 is obviously a mistake. Subdividing #2 may seem like a good idea at first, but can lead to messy trees that spread like a vine because one may have to keep overriding parent classes in order to keep up with the changes. This has been tagged inheritance buildup elsewhere on this site. (The link also describes how it is tough and risky to keep starting over or cleaning up for new hierarchies. This was dubbed the "Fragile Parent Problem.")

Now #3 seems like a good use for OO sub-classing. However, if additions are generally rare, then perhaps building sub-classing into the language may not be a productive idea. Regular procedural techniques can handle these just fine. All the fears about accidentally denting nearby methods (code blocks) when adding subtypes are nearly moot since adding is by definition rare. (Related to Meyer's "Single Choice" principle.)

Category #4 seems to be the one that OO is all geared up for. If #4 were common, then OO would be a godsend. Unfortunately, I find it quite rare in the programming I do (small and medium custom business applications). Perhaps if you write device drivers for a living, then OO inheritance would be quite nice. But what about the rest of us? (Certain types of variation proliferation that may or may not qualify as #4 will be discussed later.)
Even things that fit OO sub-typing fairly well, like
GUIs and graphics, are often over-hierarchitized by
overzealous OO fans. For example, I once found a Java
class that generated and re-sized JPEG images.
The class builder had to subclass a GUI panel to get
access to certain graphical operations. However, this was
to be a batch (command mode) process that did not need
a GUI. Since it opened GUI components,
the process did not shut down properly
in some cases, owing to a bug in Sun's GUI classes.
Thus, the JPEG class designer was sucked up onto an irrelevant
branch and subjected to GUI bugs in order to "get at" certain
operations in a non-GUI operation. (I am not really complaining
about the GUI bugs here, other than as an example of the side-effects
of having to swim through irrelevant classes.)
Perhaps it could have been rewritten to avoid opening GUI
objects with some effort, but Java's APIs seem
influenced by an over-eagerness to take "advantage" of inheritance.
Thus, one is forced or heavily encouraged to make and
use subclasses, whether appropriate or not.
One can argue that it is the users (Java designers) and not
the paradigm that is at fault. However, if something is
infrequently needed and often misused, perhaps it should
be yanked from the mainstream.
I wish that computer scientists would do a better job
of documenting real-world subclasses before devising
abstraction gimmicks to prematurely solve poorly
mapped problem spaces. For example, many OO academics completely
ignore operational expansion (new operations for existing
subclasses) in favor of type-wise expansion.
See also the Shape example.
I would recommend more footwork to comb the countryside,
building a taxonomy and analysis of what, where, who, how
often, and under what conditions subtyping occurs.
I realize that this is less intellectually interesting
than devising abstraction games, but this is what is
really needed. Excessive time up in the ivory
tower has perhaps clouded judgement.
The late astronomer Carl Sagan speculated that ancient Greek technology
did not advance as far as pure math did in that culture
because Greek citizens then
believed that manual effort was for slaves and the poor. Thus,
they did not want to get their hands and feet dirty by
experimenting with the actual world. They believed that the
Universe could best be figured out by thinking; not by doing
and observing.
I suspect that software engineering research has run into
a similar "Couch Scholar" phenomenon with regard to OOP.
Scholars seem unwilling to dig through real projects
looking for real patterns. Even when they do look, it
is often in scientific applications instead of the more
common business applications.
When I ask what sort of things OO fans tend to use subclasses with, I usually get very vague answers. However, there are 3 areas that seem to be the most common sources of sub-classing among OOers:

1. Extending library/package classes for an application
2. Internal conversions ("adapters")
3. User Interface

Extending libraries and components can be done in procedural/relational (P/R) programming as well. Rather than "override" a method, one simply uses one's own function in P/R. (Sometimes one may still call a library function within the new variation, but with some pre- or post-processing.) One can perhaps argue that this is inferior self-documentation and less protection; however, the P/R approach does generally provide the same abilities to "extend" libraries. Note that I have not inspected enough samples to see if OO sub-classing is really beneficial here or is simply OO philosophy applied for the sake of conformance. (A related issue is the tying of complex objects or types.)

The second one on our list, adapters, is less of an issue in P/R because the association between data and operations is not as tight. Thus, less formal or no conversion is often needed.

The third stated use, UI building, is a complex topic. P/R GUI building is under-explored IMO. However, one of my favorite approaches to UI organization is Data Dictionaries. They are a subset of Control Tables (introduced above) that store information about UI and database fields. They are especially useful when the fields number in the multiple dozens or more. (Not all screen fields are one-to-one mappable to DB fields, and special techniques are often needed to handle such "virtual" fields.) [Note that I plan on providing more details and suggestions regarding Data Dictionaries soon.]

One advantage of the parameterized (tabled) variation extension approach is that such collections can be shared with other paradigms and languages. Take animal classifications, for example.
If they were hardwired into an OOP language, like many training materials suggest doing, then only that language can use the data. (An OODBMS can perhaps be used, but this may make it hard to share with non-OO paradigms, among other problems.) However, parameterizing that information and putting it into an RDBMS allows many different applications, paradigms, and researchers to share species information. OOP tends to be excessively memory-centric and paradigm-centric with regard to data. It likes to hide its light under a barrel. The melding of algorithms with data can often result in such "selfish" problems.

Are Patterns a Solution?

Sometimes "Patterns" are proposed as a way to solve some of the sloppy propagation of variations (subclasses) found in the real world. However, the OO versions of many of these patterns are unnecessarily complicated. For one, they often have excessive "middlemen" classes. Aside from extra complexity, these middlemen often violate Meyer's "Single Choice Principle". (Note that I consider a list of operations just as important as a list of subtypes when applying this principle. Meyer also hints at operational lists with his text editor menu command example on page 62 of OOSC2.)

Further, some OO fans say that the real benefits of many OO patterns do not show up until several years after the birth of a project. This may conflict with standard investment accounting time discounting theories.

Also, many patterns better relate to the field of data and relational modeling. It is my opinion that data modeling should not be too strongly mixed with algorithm modeling because it reduces the sharing ability of data among multiple paradigms and languages, which is a common need in business. Also, tasks and data (nouns) often change their relationship to each other. Thus, it is not good to heavily couple them.

There has not been enough research and effort put into non-OO versions of many patterns. Thus, there is not enough to compare at this time.
I am the only documenter of such that I know of.

In some ways reliance on OO patterns can have drastic consequences to code structure (Meyerian continuity principle) if you need to change from one pattern to another. Procedural/relational design is often more immune to pattern changes. For example, suppose we have a table with attributes such as "isMgr" and "baseRate".

    if emp.isMgr and emp.baseRate > threshold then
      giveRaise(emp)
    end if

Later, we wish to use a role-like pattern. The code above will not have to be shuffled around; it needs only minor changes.

    if hasRole(emp, "manager") and emp.baseRate > threshold then
      giveRaise(emp)
    end if

Further, patterns may be a particular viewpoint, and given sections may need different orthogonal perspectives. Thus, patterns should perhaps be viewed from a "has-a" perspective instead of an "is-a" perspective.

The procedural/relational viewpoint is often to try to farm out patterns to a temporary query instead of a physical code structure. It is nearly impossible to have multiple code structures overlap to give the needed pattern for different viewpoints, but it is often easily doable via relational queries. It is easier to change a relational or Boolean expression than it is to change the physical arrangement of code. (Some argue that OOP code is also changeable by rearranging class associations, but if you use inheritance in the pattern structure, then you are usually limited to just one parent (aspect); and if you use delegation, then you have nothing over procedural/relational.)

Not all pattern views can be generated via queries, but we still need to have code that can adapt to changing, morphing, and hybrid business patterns. This is where the IF statement expressions come in. Rather than move our code to fit today's pattern, we can often just change the IF criteria without having to move code; at least not a lot of code. It is easier to learn and find code if it does not move around a lot.
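The isMgr-to-role change described above can be sketched in runnable Python. Everything beyond the shape of the IF statement is an assumption for illustration: the `ROLES` set stands in for a relational role table, `THRESHOLD` is an arbitrary value, and `give_raise` is a placeholder.

```python
from dataclasses import dataclass

# Hypothetical role table: (employee id, role) pairs, as rows might appear
# in a relational table. All names here are illustrative.
ROLES = {(1, "manager"), (1, "trainer"), (2, "clerk")}

THRESHOLD = 50_000  # assumed value; the original leaves it unspecified


@dataclass
class Emp:
    emp_id: int
    base_rate: float
    raised: bool = False


def has_role(emp, role):
    """Set-membership lookup standing in for a query against a role table."""
    return (emp.emp_id, role) in ROLES


def give_raise(emp):
    emp.raised = True  # placeholder for whatever a raise actually involves


def review(emp):
    # Same shape as the original IF statement; only the first test changed
    # from the isMgr attribute to a role lookup.
    if has_role(emp, "manager") and emp.base_rate > THRESHOLD:
        give_raise(emp)
```

Note that an employee can hold several roles at once (employee 1 is both "manager" and "trainer" here), which a single isMgr flag or a single parent class would not express, yet the calling code keeps its place and structure.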
It is a philosophy of Table Oriented Programming to control as much of the relationships and behavior dispatching as possible using Boolean expressions, or expressions in general. Tables-Of-Rules are a lot easier to change, manage, and view than programming code in most cases. It is true that dynamic IDEs, such as the ones that made Smalltalk famous, can do some of this; but they are often re-inventing techniques that are already covered by relational table management techniques/tools. Why re-invent one wheel for data and another for code? Why not focus on perfecting table browsers instead?

Another problem with OO patterns is that they encourage over-engineering for potentially fleeting patterns. A pattern fanatic will see 3 items that fit a pattern, and then arrange the code to ease "mass production" of "more of the same". Thus, if these 3 items grow into 20 of the same pattern, then there indeed may be a net gain. However, if the original 3 were mostly a coincidental alignment, then "engineering for more of the same" was a bad investment. Thus, vote K.I.S.S. (simplicity) unless you are sure of the stability and prominence of the pattern. (Prominence is mentioned in case there are overlapping, orthogonal pattern view needs, as described above.)

In some domains where natural laws or slow-changing international standards dictate a stable pattern, code-structure-based patterns may indeed be the way to go. However, in the dynamic business culture, tread with care.
Summary

Object Oriented Programming is well-optimized for managing frequently expanding subtypes. The problem is that features perhaps rarely expand in a clean, clear-cut hierarchical way in the real world.

Although there is unfortunately insufficient research to know how real type variations actually do expand, my impression is that OOP inheritance has added a boatload of unnecessary complexity and other organizational sacrifices in order to solve something that is a small problem to begin with in most business software. OOP sub-typing may be building better umbrellas for fish.
Split Ends

The Driver Pattern
Publications Example
Overlapping or Orthogonal Aspects
Boundaries of Change and Variation

© Copyright 1999, 2000, 2001 by Findy Services and B. Jacobs