Change Patterns

Updated 7/22/2002

To make software more change-friendly, you have to first become a "student of change".

In many debates......um, discussions with OO proponents, one common theme in our differences is real or perceived differences in the patterns of changes. Both our solutions to sample problems are often well-optimized for the future changes as we each perceive them. However, our assumptions about the patterns that future changes take on can vary widely.

Rather than focus on solutions and solution patterns, perhaps we need to back up and study possible change patterns. We will never agree on our solutions if we disagree about what problem we are solving. The "problem" in this case is engineering our application for future changes (i.e., making them "change-friendly"). I often find that there are important tradeoffs when selecting a solution. There is rarely a "free lunch" when choosing which change patterns to prepare for. Thus, assumptions about which ones will be the most common can greatly affect how change-friendly our applications actually turn out to be.

In an attempt to improve communication about techniques for making software more change-friendly, below is a preliminary attempt to classify change patterns. Note that these are not necessarily mutually exclusive nor meant to be an exhaustive catalog. More will likely be added or amended in the future.

Genetic Tree

This is modeled after biological classifications; however, that does not necessarily mean that it is only found in biology. (The names chosen here often reflect a common use, but not necessarily the only use.) A key feature of Genetic Tree is that features usually don't "hop nodes" across branches. For example, a Tiger node is not very likely to suddenly "grab" features from the Bird nodes and grow feathers. (A flying tiger is a frightening thought indeed. I doubt mosquito-repellent would help.)

In biology it is possible for infectious organisms, such as viruses and bacteria, to inadvertently transmit DNA from one species to another very different species. Thus, even in the genetic world, the tree is not fully "clean". It has some graph-like links across branches.

One must be careful not to force this pattern onto something that is not really a fit. Although a tree can be used to represent many patterns if you try hard enough, it is not always the most appropriate fit. (Note that tree-ness is generally a continuous metric, and the threshold for where tree-ness is no longer a good fit may be a bit subjective, being that some people "think in trees".) It is my opinion that trees and "nestedness" is over-used in computer science. They sometimes work well on smaller scales, but not larger, more complex ones. Also, one should treat a tree-view as one of many possible views of the same items, in many cases.

Driver

The name comes from the fact that this pattern is commonly found in device driver implementations. The two common features of this pattern is that first, mostly only the protocol is shared between implementations, not implementation; and second, the protocol does not vary much between the different implementations.

Sub-atomic

This is where variations or changes often happen to chunks smaller than our original divisions. See the Boundaries discussion for more detail. This change pattern is somewhat controversial in that some claim that if you know the domain well enough or "think long enough" that one can usually pick better divisions of items to avoid this pattern. I don't fully agree with that claim. I have seen too much creativity or capriciousness from those who dream up changes.

Cross-Cutting Aspect

This is where a new aspect comes along that affects many parts of the existing system. Examples include adding security features to various operations of an existing system. See Structural Notes for more on this. The key feature is that there is no easy way to isolate most of it to a single spot/unit/module/class.

Random Features

This is where a new thing uses many existing features, but with a different, hard-to-predict pattern to the features. In our tiger example above, a new tiger might indeed have feathers. See customer plans for an example.

Plurality of Relationship

Sometimes the quantity of relationships between things changes. For example, something might go from being one-to-one, to being one-to-many. A "Customers" entity may start out with one contact (address), but later we may need multiple contacts, such as a sales contact, product support contact, billing contact, etc. Thus, there would then be many contacts per customer.

Notes

Most of these patterns involve adding or inserting something new rather than changing or deleting. For some reason, adding seems to end up being the primary focus when comparing paradigms and their behavior under different change scenarios. Why this is so, I don't yet know.

People Types

A Thought Experiment about Sub-typing

Sometimes to study a concept, I envision exaggerated versions of it. The exaggerated version can suggest ways of looking at things that a more mild example may not make obvious. If you find an answer to the extreme version, then often it can be applied to the less extreme version also. (Of course, it takes judgement to know what can be safely extrapolated from the experiment and what can't.) Consider an extreme example of dividing "people" into a bunch of mutually-exclusive or inheritance-based "subtypes". In other words, what is the ideal taxonomy of people? If you think about this for a while, you will run into some sticky problems. For example, you may see that:

There will be multiple, orthogonal or semi-orthogonal (independent) candidate division criteria: gender, age, occupation, education level, marriage status, IQ, home ownership, hair color, blood type, etc. (This is sometimes called "multiple dimensions")
Some of the categories that you originally saw as mutually exclusive will not be so. For example, if you divide by occupation, you may discover some moonlighters who have multiple occupations. And, multi-colored hair may be the new rage. The U.S. Census form once had no category for "multi-racial". Although they now have it, it still does not tell us much. The proper way would be to make the list non-mutually-exclusive. Catch-all categories like "multi" may be a quick fix, but not always the most informative. (Some designs start to list multiple combinations with class names like "BookAndVideo". But, this usually does not scale well, and is a sign of sloppy design.)
The "types" you want or need to see are relative to what you are after at a given time or task. Thus, what are useful types to one person or situation are not relevant to another. If you are running a dating service, then "gender" is an important division criteria. (Often such services are more generous to women because they are less likely to join such services, and there is pressure to keep an even balance of genders.) However, if you are looking for a tax attorney in an occupational directory application, then gender will be (or should be) a useless division.
Traits that you thought were related may turn out not to be related after all. For example, you may at first assume that all sales people (occupation) are paid by commission, because most of your sales friends are paid that way. But later you may discover that some salespeople receive flat pay. (Didn't Saturn car dealerships claim this once?)

Now, I assert that most non-trivial entities (things) in the business world have similar kinds of issues in the longer run. Maybe for simple examples a single list of mutually-exclusive "subtypes" will work. However, as things grow more complex or dynamic, the same issues I list above start to bite, and bust "clean" polymorphism and subtypes.

Some object oriented programming proponents already realize this, but their "fix" are convoluted pattern structures. However, many OOP fans still cling to a subtype-centric view of things.

Typing proponents look at this People experiment and say things like, "well, people are very complex and diverse. Most things are not as complicated as people."

I have yet to find an OO proponent who would use subtypes to model people in general. Some of them may do such for limited contexts, such as employees. But employees suffer similar issues.

That is indeed true. However, I allege that the change patterns for even simple entities still encounter the same kinds of issues that I listed above. The probability that the next given change will "break" clean polymorphism or types does not depend on the complexity or size of the current entity. (The repair effort might, however.)

For example, lets say that the probability that a new important orthogonal division criteria (#1 above) will pop up is a 1 in 10 chance (a fictional number for this example only). Whether our application is a mom-and-pop store, or an international chain does not affect the probability (1 in 10 in this case) that the next change will be of the orthogonal kind.

It is like dropping ink into clear water. The direction of the ink spread is generally not related to the size of the current boundary of the inked area. It follows the changing flow of the water regardless of where it has spread in the past. Entropy rates generally do not depend on the quantity of prior entropy activity.

Generally more factors are added to business rules over time in my experience, so the quantity of factors and rules goes up over time. Sometimes some factors stop being relevant over time, but these are often left in the software code just in case the situation occurs again. For example, a special discounting promotion may have to be wired into a commerce system. When the promotion is over, it is removed or deactivated. (If you have ever read the legal writing on some coupons, you will realize how complex discounting algorithms can be.) However, usually nobody knows for sure if or when it will be later activated. In some rare cases, like Y2K introspection (test) code, one can safely get rid of features after their intended use is over, but these are the exception.

The fact that there is a tendency toward more factors and more complex business rules does not necessarily mean that the business itself is growing more complex. Usually it means that more features are being added to the software to handle situations that were not foreseen in the original specifications. The software is simply fitting the business better and in more detail over time. The uncertainly of removal permanence (above) is also a contributing factor toward increased complexity.

Even in simple applications, if one thinks too much, one often can envision all kinds of extra options and what-if scenarios. The what-if scenarios for small applications are not significantly different in structure than those of large applications, in my observation. (At least the ones that become true.)

Thus, one cannot dismiss the change patterns uncovered during this thought experiment as "extreme". If you disagree, I challenge you to identify the forces of nature or human behavior that alter the probability for more complex entities differently than for less complex ones. In other words, find and articulate the alleged factors that make the "drift" behave differently depending on entity size or complexity.

I will agree that some organizations out of tradition or convention will try to find ways to have a clean list of mutually-exclusive categories for things. However, bosses and owners may retire or quit or sell the company to new owners, or new political or business trends may pop up out of nowhere to scramble the status-quo.
I will also agree that domains that rely on the laws of nature can make more sub-typing assumptions because those laws don't really change. (If they do, you better start praying.) The rules of nature are more stable than rules of humans. I find modeling the minds of managers, owners, and marketers much more unpredictable and dynamic than modeling natural processes.

It is true that sometimes the above change patterns will not happen. Things might indeed stay in a pattern conducive to sub-typing. I won't dispute that. However, the problem is that one realistically cannot know in advanced that such will happen. Being that we don't know, it is safer to bet our design on the assumption that it will happen (unless you can show it rare on average). If you find a mine in a battle-field, it is best not to assume it is a dud.

The extra "cost" of a procedural/relational approach even if change patterns fit into clean subtypes is still usually minimal or non-existing over a sub-typed version, again dependant on what the future beholds. OO fans like to point out that adding new sub-types may require visiting multiple subroutines to add new Case or IF statements for the new subtype. However, the reverse is true if adding new operations (methods) to each subtype. See Meyer Single Choice Principle or Shapes Example for more on this. Generally if there are frequent additions of subtypes, then these are usually factored into tables anyhow, so that new additions are a new record instead of new code. For example, product categories are often managed via a Category table instead of programming code like Case (Switch) statements.

I assert that OOP subtype structures require more code rework ("refactoring") if the clean subtype pattern is violated somewhat often. In procedural/relational programming, the code is not grouped by subtypes for the most part. Thus, a breakdown of the subtype pattern is not going to affect the code structure as much. The changes will tend to be "local", or at least on a smaller structural scale. (The links below provide some illustrative examples.)

Some IF statements may change, or our Join criteria for an SQL statement may change, but generally things remain where they are because things are grouped by task, not by a taxonomy of sub-entities. In sub-typed OO, you often have to move things around from class to class, or even split classes to adjust to the mentioned change patterns.

I count moving code to other named units (methods, classes, modules or routines) as more disruptive than changing IF or SQL formulas/expressions.

Some OO fans have disagreed with this, but were not clear why. The fact that a code processor (IDE) can simplify some of the class and method moving and splitting is not sufficient justification. You still have to live with and memorize a very different structure. Having a machine to easily move the furniture around does not prevent one from tripping over it in the middle of the night because they forgot the new layout. Second, code processors can be made for procedural/relational code also. (See also Machine Reliance and IDE's)

Further, formulas are more adaptable in my observation than physical class and method structures. Multiple factors and dimensions can participate in a formula, but physical code structures tend to have to favor a single dimension over others. Formulas are more virtual, local, and compact in my opinion. In addition, one can use the existing powers of relational engines to link things up as we want to see them. No need to "hand-index" our classes together. (True, they may make some things harder, but average out better overall.) OOP tends to rely on the physical position or "ownership" of something within the code to control behavior. Formulas are simply less position-dependant. If you rely on the physical position for dispatching or behavior, then you will also have to change position more often when the factors coupled to position change or become less relevant.

Some OO fans suggest that p/r increase the quantity of spots that have to be changed. Thus, their criticism is not directed at the magnitude of a given change (such as moving versus changing), but the quantity of different "spots" that have to be changed. However, I have yet to see a realistic example of p/r "spreading related stuff all over the place" more so than competing OOP code. Usually either they are poor p/r programmers, or forget to consider aspect tradeoffs.

Main | Metrics