Abstraction

OOP's flavor of abstraction falls short
Updated: 9/12/2002

The most common definition of "abstraction" is the removal or hiding of irrelevant details. This makes the problem easier to deal with, since there are fewer details in the way to block or slow the management or study of items.

For example, a DNA molecule is made up of chains of nucleotides. One can represent a DNA strand by the individual chemicals that compose it and the 3D helix structure that it forms. However, it greatly simplifies things to factor the repeating chemical units into four bases represented by the letters A, C, G, and T. This is a lot more compact, and it is easier to find patterns in than the chemical helix. DNA can then be represented with string patterns that resemble something like "TCCTCGAGTTTAGCGTGGTACACGTCTCCTCGA...." Such letter sequences are an "abstraction" of the actual DNA molecule. For most uses in the biological industry, this representation is sufficient.

In business modeling, we try to do similar things with software: look for structures and representations that can simplify our view of things in order to help us manage them and see patterns.

I often find that real-world "abstraction" needs to be relative (at least with regard to custom business software). The "irrelevant details" for one "viewer" may be very relevant to another. I see nothing inherent in OOP that assists this process more than other paradigms. If you do "see it", then please point it out. OOP modeling tends to assume that relevancy is global. Procedural/relational, on the other hand, handles this issue by making abstraction more ad hoc through queries, and by generally limiting an abstraction to a particular task or event.

The Tale of the Electrician and the Interior Decorator

An electrician and an interior decorator will each look at the same building in a different way. The decorator will probably view the building primarily as a series of rooms. However, the electrician will view it as a series of walls and floors. Walls are in both their views, but serve a different purpose, have a different priority, and relate in different ways.

Thus, the "abstraction" used by the decorator may be rooms which are composed of walls and floors/ceilings. But the electrician's view may have the walls be the primary abstraction, with room information incidental, perhaps used only to locate the proper wall when walking around.

Further, in the electrician's view, walls are a complex 3D structure with their own little world inside, while the interior decorator may view them mostly as flat panels. Both have walls in their mental abstractions, but view and relate them differently.

OOP's abstractions tend to be global. One-interface-fits-all and/or one-abstraction-fits-all. In procedural/relational, on the other hand, abstractions tend to be "local", in other words, as-needed. There may be a Wall table and a Room table, and the relationships between them in the database. However, the global model does *not* place one over the other. If we need to see rooms composed of walls, then we create that *view* (query) for the task or person that needs such a view. The "base" model only says how rooms are related to walls, but does not "elevate" one over the other. If elevation (ranking or nesting) is needed, then a query is created to temporarily give us (or something) such an elevated view.
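
As a rough sketch of the idea above, here is how the two views might be derived from one neutral base model. It uses Python's built-in sqlite3 module. The table names, the column names (including has_wiring), and the sample rows are all assumptions made up for illustration, not a recommended schema. The base tables only record which walls border which rooms; each nesting is produced by a query when somebody needs it.

```python
import sqlite3

# Hypothetical schema and data: every name here is illustrative only.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE room (room_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE wall (wall_id INTEGER PRIMARY KEY, label TEXT, has_wiring INTEGER);
    -- A wall can border more than one room, so the link is many-to-many.
    -- Neither table is "inside" the other.
    CREATE TABLE room_wall (room_id INTEGER, wall_id INTEGER);

    INSERT INTO room VALUES (1, 'Kitchen'), (2, 'Den');
    INSERT INTO wall VALUES (10, 'W-north', 1), (11, 'W-shared', 1), (12, 'W-south', 0);
    INSERT INTO room_wall VALUES (1, 10), (1, 11), (2, 11), (2, 12);
""")


def nested(rows):
    """Fold (outer, inner) rows into a temporary {outer: [inner, ...]} view."""
    view = {}
    for outer, inner in rows:
        view.setdefault(outer, []).append(inner)
    return view


# Decorator's view: rooms on the outside, walls nested inside.
decorator_view = nested(con.execute("""
    SELECT r.name, w.label
    FROM room r
    JOIN room_wall rw ON rw.room_id = r.room_id
    JOIN wall w ON w.wall_id = rw.wall_id
    ORDER BY r.name, w.label
"""))

# Electrician's view: wired walls on the outside, rooms as a side attribute.
electrician_view = nested(con.execute("""
    SELECT w.label, r.name
    FROM wall w
    JOIN room_wall rw ON rw.wall_id = w.wall_id
    JOIN room r ON r.room_id = rw.room_id
    WHERE w.has_wiring = 1
    ORDER BY w.label, r.name
"""))

print(decorator_view)
print(electrician_view)
```

Neither nesting is stored in the model. If a third kind of viewer comes along later, another query can be added without touching the tables or the other two views.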

OOP will typically start by saying, "Okay, let's see. Rooms are composed of walls and floors," and then build that nested composition into the code model. However, that model may be completely inappropriate for the electrician. The decorator's perspective has been hard-wired into the model because the OOP designer happened to talk to the decorator first. The electrician would possibly have "room" be a small side attribute of "wall". In other words, nest "room" within "wall".

OOP modeling tends to be "nested" and/or hierarchy-based. Procedural/relational abstractions tend to be graph-based (networked) and set-based. Graphs are a more generic modeling tool than nesting. A particular nesting can be a temporary or virtual view under relational or Boolean graphs. In other words, under P/R you can have your cake and eat it too. It makes fewer "global" design decisions about relationships, especially with regard to a ranking such as high or low level or inside/outside. These are usually defined locally, as needed, under P/R. This way we are less likely to prematurely favor the electrician's view or the decorator's view in a global way.

Another Vehicle Example

An example commonly found in OO textbooks and OODBMS promotional material is vehicle part breakdowns.

Car
- Engine
  - Engine Block
    - Cylinders
    - Spark Plug
    - Cam Shaft
    - Etc....
  - Radiator
  - Battery
  - Air Conditioner
- Chassis
- Exterior
- Interior
  - Seats
    - Driver's Seat
    - Front Passenger Seat
    - Etc....
  - Dash Panel
  - Carpet
- Etc....

This kind of relationship breakdown is sometimes called "aggregation". It is indeed a useful relationship for some operations; however, one must be careful not to make it the only viewpoint of "parts".

An engineer or mechanic may indeed enjoy this view, because it maps directly to the physical construction of the car. But there are many tasks or operations that may require a completely different view. For example, an accountant may not really care about the physical nesting of parts. They are concerned with the cost. They may want to sort by cost, price variability, shipping costs, etc.

The Logistics Department may want to group by supplier and/or manufacturer, weight, dimensions, etc. Materials people have to worry about total weight (for fuel economics), hazardous materials regulations and fire reduction standards. Most of their operations may group/search/organize by the composition or chemistry of the parts.

A typical business will need to be able to provide many viewpoints of any given item. The aspect issues that "muck up" the nice inheritance view also muck up the nested aggregation view.
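
To make the many-viewpoints point a bit more concrete, the sketch below keeps parts in a flat table where physical containment (parent_id) is just one column among several. It uses Python's built-in sqlite3 module, and every supplier name and cost in it is invented for illustration; the column layout is an assumption, not a proposed design. The mechanic's nested breakdown is derived on demand with a recursive query, on equal footing with the accountant's and the logistics views.

```python
import sqlite3

# Hypothetical parts table: suppliers and costs are made up for illustration.
con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE part (
        part_id   INTEGER PRIMARY KEY,
        name      TEXT,
        parent_id INTEGER,  -- physical containment: one attribute among many
        supplier  TEXT,     -- NULL for assemblies that are not purchased whole
        cost      REAL
    );
    INSERT INTO part VALUES
        (1, 'Car',          NULL, NULL,   NULL),
        (2, 'Engine',       1,    NULL,   NULL),
        (3, 'Engine Block', 2,    'Acme', 900.0),
        (4, 'Radiator',     2,    'Bolt', 150.0),
        (5, 'Battery',      2,    'Acme', 80.0),
        (6, 'Interior',     1,    NULL,   NULL),
        (7, 'Seats',        6,    'Bolt', 400.0),
        (8, 'Carpet',       6,    'Cory', 60.0);
""")

# Accountant's view: ignore nesting, sort purchased parts by cost.
accountant_view = con.execute("""
    SELECT name, cost FROM part
    WHERE cost IS NOT NULL
    ORDER BY cost DESC
""").fetchall()

# Logistics view: group purchased parts by supplier.
logistics_view = con.execute("""
    SELECT supplier, COUNT(*), SUM(cost) FROM part
    WHERE supplier IS NOT NULL
    GROUP BY supplier
    ORDER BY supplier
""").fetchall()

# Mechanic's view: the nested breakdown, rebuilt by walking parent_id.
mechanic_view = con.execute("""
    WITH RECURSIVE tree(part_id, name, depth, path) AS (
        SELECT part_id, name, 0, name FROM part WHERE parent_id IS NULL
        UNION ALL
        SELECT p.part_id, p.name, t.depth + 1, t.path || '/' || p.name
        FROM part p JOIN tree t ON p.parent_id = t.part_id
    )
    SELECT depth, name FROM tree ORDER BY path
""").fetchall()

for depth, name in mechanic_view:
    print("  " * depth + "- " + name)
```

The tree the mechanic sees is no more "built in" than the other two result sets. Adding a materials or hazmat view would mean adding a column and a query, not reshaping a class hierarchy.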

Some will argue that the physical breakdown structure is still the "primary" structure, and thus should get primary modeling and/or indexing priority. In some domains this may perhaps be the case. However, in my experience with the domain I am most familiar with, one should not hamper the addition, or the raising in importance, of future, unforeseen viewpoints.

Aggregation-based databases were well-tried in the 1960s and 1970s in the form of IBM's IMS database family. They usually ended up with many "cross indexes" in practice, and thus the "tree" essentially became a graph (network) in the end.

It should be noted that the algorithm structure of procedural code tends to be nested. This appears to contradict my anti-nesting stance. I don't know why, but algorithms seem to suffer fewer of the problems that data structures suffer under hard-wired nesting and hierarchies. I would note, however, that I don't subscribe to the heavy top-down nesting that was popular in the late 1970s. The "tighter nesting" tends to be at the lower levels in my designs. The lower levels are often specific to a given task, and thus don't need to be concerned about varying contexts. If it is something that can be widely shared, then it would not be considered "low level" anyhow.

Note that OOP "Encapsulation" is also mostly a nesting operation of a sort, and thus it inherits (pun) many of the problems related to nesting-based abstractions.

There are some caveats to the avoidance of nesting and trees. First of all, I will agree that when nesting or hierarchies are needed, P/R may not handle them quite as slickly as OOP. But being flexible is usually worth that price. An SUV might not ride as smoothly on paved roads, but if you often need to go off-road, then an SUV is a better bet than a road car because it can do both reasonably well. One does not know whether the future of any given feature will continue to be nested or hierarchical, or fly off on some other tangent. The latter is the better bet in my experience. There are very few forces in the real business world which herd structures and patterns into clean nesting and hierarchies. (Sure, you can force things into trees if you try hard enough, but the result leaves much to be desired.)

The second caveat is "grokkability" (understandability). Some people may indeed naturally think better under hierarchies and/or nesting. I won't challenge this because I don't wish to dictate to people how to organize their own neurons, assuming one even has a choice. However, this may bring one to the problem of picking between a design that is more brain-friendly (to an individual) and one that is more change-friendly. I personally have no problem thinking in graphs and sets (although I think I need to take some formal set theory courses, which were oddly lacking from my college curriculum), and feel that these are a better representation of the real world and its change patterns than hierarchies and nesting.

Some OO fans are likely to say, "OOP does not impose hierarchies and nesting on designs and developers. Classes can be related purely by a graph of relations." This may be partly true, but it goes against much of the OOP philosophy as written down.

Further, it is tradition in OOP to recreate such graphs and structures using the programming language, whereas P/R tends to use the *existing* relationships defined in the database. In other words, it uses the database's structure instead of recreating or echoing it within a given application language. Thus, there is duplication and waste in such OOP designs. OOP modeling tendencies and RDBMS tend to fight over territory in such circumstances. (OODBMS are pretty much a dead technology, except for a few niches, so one needs to live with RDBMS whether one likes them or not.)

If you remove hierarchies, nesting, and schema-echoing from OOP, you will probably have a rather procedural/relational design. In other words, fixing the sins of OOP turns it into procedural/relational more or less.

See Also:
Business Modeling
Invoice Detail Example
