Core Philosophical Differences

Getting to the root of the differences between OOP and procedural/relational

Updated: 3/21/2003

Every now and then I try to see if I can better isolate and simplify the core differences between the object-oriented paradigm and relational or relational-centric techniques. This is my latest attempt at this tricky endeavor. It is not my first attempt, and probably will not be the last. Oftentimes these attempts simply create different ways of saying more or less the same thing. However, it is still a useful exercise because it gives different perspectives from which to ponder things.

The fundamental philosophical disconnect between OO modeling and relational-centric modeling creates continuous problems or time-consuming translation layers. In my opinion, one or the other must go, at least as far as domain modeling itself, such as business modeling. (Other OO techniques, such as small utility APIs, can probably live okay in a relational-centric application, since they don't fight over territory as much as business modeling itself does between the two paradigms.)

One of these root differences centers around the
perceived nature of taxonomies, and the other
around the differences between OO "records" and
relational records.
For a long while I could not find any reasoning
for why case statements tend to drift. I just had
to state to OO fans that the duplication goes
against my observation of real-world application
code and the way it changes over time. Then one day I
encountered a passage that said something like,
"Philosophers have generally discovered that
taxonomies are relative to specific needs,
persons, and/or viewpoints." A one-size-fits-all
hierarchical classification of just about
anything is an artificial construct that may not
be appropriate for all uses.
This observation describes the case statement
drift rather well: each procedural "task" is a
specific viewpoint with specific needs. In other
words, the case statement structure describes a local
taxonomy. The best structural decomposition
for one task may not be the best for another. OO
design tends to assume global taxonomies,
or at least taxonomies that are wider in scope
than the "task" scope usually found in
procedural/relational code. A global taxonomy
is generally flawed design thinking in my frank opinion.
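
As a rough sketch of what a "local taxonomy" can look like in code, consider two tasks that each carve up the same records in their own way. The employee list, the field names, and the grouping rules here are all hypothetical and made up purely for illustration.

```python
# Hypothetical employee records; the field names are illustrative only.
employees = [
    {"name": "Ann", "kind": "salaried", "remote": True},
    {"name": "Bob", "kind": "hourly", "remote": False},
    {"name": "Cy", "kind": "contractor", "remote": True},
]


def payroll_group(emp):
    # Payroll task: the split that matters is contractor versus everyone else.
    if emp["kind"] == "contractor":
        return "invoice"
    return "payroll"


def equipment_group(emp):
    # Equipment task: the split that matters is remote versus on-site,
    # and the "kind" field plays no role at all.
    if emp["remote"]:
        return "ship laptop"
    return "desk setup"


for emp in employees:
    print(emp["name"], payroll_group(emp), equipment_group(emp))
```

Neither grouping is "the" classification of employees. Each one belongs to its task, and either can change without touching the other.
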
Relational techniques further help with this
"relativism of viewpoint". The view you create of
the data is governed by relational expressions
(queries) rather than the "shape" of the code
structure. It is a "computed or virtual
structure" rather than one we have to hand-build
in code. (Some say that relational techniques
don't deal with hierarchies very well. Whether
this is a flaw of relational theory or of
specific implementations is hard to say.
Regardless, I often find that trees are over-used
anyhow in many designs.)
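
A minimal sketch of this "computed or virtual structure" idea, using Python's built-in `sqlite3` module. The `product` table, its columns, and its rows are assumptions invented for the example. The same rows are grouped two different ways purely by changing the query, with no change to any code structure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and data, for illustration only.
conn.execute(
    "CREATE TABLE product (name TEXT, category TEXT, supplier TEXT, price REAL)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        ("hammer", "tools", "Acme", 12.0),
        ("saw", "tools", "Bolt", 20.0),
        ("nails", "hardware", "Acme", 3.0),
    ],
)

# Viewpoint 1: the data as seen by category.
by_category = conn.execute(
    "SELECT category, COUNT(*) FROM product GROUP BY category ORDER BY category"
).fetchall()

# Viewpoint 2: the very same rows as seen by supplier.
by_supplier = conn.execute(
    "SELECT supplier, COUNT(*) FROM product GROUP BY supplier ORDER BY supplier"
).fetchall()

print(by_category)
print(by_supplier)
```
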
It is true that some people tend to "think in
trees" and/or in terms of universal sub-type
taxonomies. However, this is not a universal
trait of all humans, and it is somewhat
problematic if the real world does not
self-organize or change in a tree-wise fashion
in practice.
Network Characteristics

The OO approach's basic building block is more or less a dictionary array which has two "columns," often known as the "key" and the "value". In OO lingo the key is the method or attribute name, and the value is one of:
A dictionary array may go by other names such as "associative array" or "record". I will lean toward "record" in this discussion, mostly because it is the shorter name. Note that our definition of record includes fields which may potentially store algorithms (methods), or at least references to algorithms, not just data.

A "class" is kind of a "static object" that can only be changed before compile or run time. Dynamic or interpreted OO languages tend to make little or no distinction between objects and classes. Static (compiled) languages tend to have more rules that restrict the scope and usage of records, but conceptually are the same otherwise. Overloading is just a more complicated form of the "key," where parameter definitions become part of it.

There are generally two ways to view inheritance under this definition of "object". Some OOP languages perform inheritance similar to the way some cellular biological organisms do: by cloning, which is making a duplicate of an organism. Using this technique, inheritance is simply cloning one record to get a second. The attributes or pointers of the copy can then be changed (overridden) as needed. (Sometimes this cloning technique is called "prototyping".)

Other OOP languages perform inheritance by providing a "search path" to find methods or attributes (keys) that are not in the current record. This "search path" can simply be considered a "special" key-value pair in our dictionary array. Think of it as an attribute (key) called "parent" or "parents", depending on whether multiple inheritance is permitted.

An OOP application will thus tend to look like a network of records (objects). Some of the links between records will be due to inheritance (our "parent" key as described above), and others will simply be references to other records, such as one might find in an OO "Strategy Pattern".

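
The record view of objects and the two inheritance styles can be sketched with plain Python dictionaries. This is only an illustration under my own assumptions: the `lookup` helper, the record names, and the use of a single `"parent"` key are hypothetical, and no real OOP language is claimed to work exactly this way.

```python
import copy


def speak(rec):
    # A "method" is just a value stored under a key in the record.
    return f"{rec['name']} makes a generic sound"


# A record: keys are attribute or method names, values are data or functions.
animal = {"name": "animal", "speak": speak}

# Style 1: inheritance by cloning ("prototyping"). Copy, then override.
dog = copy.copy(animal)
dog["name"] = "dog"
dog["speak"] = lambda rec: f"{rec['name']} barks"

# Style 2: inheritance by search path, via a special "parent" key.
cat = {"name": "cat", "parent": animal}


def lookup(rec, key):
    # Follow the "parent" links until some record holds the key.
    while rec is not None:
        if key in rec:
            return rec[key]
        rec = rec.get("parent")
    raise KeyError(key)


# A later change to the original record: the clone does not see it,
# but the record that searches through "parent" does.
animal["legs"] = 4

print(lookup(dog, "speak")(dog))
print(lookup(cat, "speak")(cat))
print("legs" in dog, lookup(cat, "legs"))
```

Note that `cat` holds a reference to `animal`, so the two records already form a tiny network, while `dog` is a free-standing copy.
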
This network of records is similar to the "network databases" (NDBs) of the 1960s, and object databases tend to share many characteristics with them, both the good and the bad. Relational, on the other hand, is based on the concept of "tables". Since tables are a larger-scoped structure than records, you can do more powerful, larger-scale reasoning and operations with them than with a web of dictionaries (OO) in my opinion, at least at this point in history. There has yet to be a Dr. Codd of the NDB world, but I cannot rule out the possibility that some kind of "dictionary algebra" will someday be created or discovered to rival the power of relational algebra. But at this stage, relational appears to have its symbolic manipulation act better together. (Failed OO attempts are described later.)

Navigating objects in the application or in queries often requires following the links in the records one by one. For this reason OO-like and pre-relational databases are sometimes called "navigational databases". You will often have operations like "next", "previous", "first", "up" (parent), "down", etc., or use a "path" along a graph (network) in order to traverse and navigate the structure of records.

On the other hand, relational uses logic expressions to find stuff. You ask, "give me a result set that satisfies the following conditions....". You generally don't have to explicitly iterate or traverse through records or pointer chains to filter, find, and/or cross-reference information. (Although, it is true that you may have to iterate through the result set one result record at a time in application code. But this is using the result, not making it.)

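
To make the navigational-versus-relational contrast concrete, here is a small sketch that answers the same question both ways: "which customers have an order over 100?" The customer and order data, the table names, and the threshold are all assumptions invented for the example. The relational half uses Python's built-in `sqlite3` module.

```python
import sqlite3

# Navigational style: records linked by references, traversed one by one.
alice = {"name": "Alice", "orders": []}
bob = {"name": "Bob", "orders": []}
alice["orders"].append({"item": "lamp", "total": 40, "customer": alice})
alice["orders"].append({"item": "desk", "total": 250, "customer": alice})
bob["orders"].append({"item": "chair", "total": 120, "customer": bob})

big_spenders_nav = []
for customer in (alice, bob):            # "first", "next"
    for order in customer["orders"]:     # follow the links "down"
        if order["total"] > 100:
            big_spenders_nav.append(customer["name"])
            break                        # one qualifying order is enough

# Relational style: state the conditions and let the engine find the rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE orders (customer_id INTEGER, item TEXT, total REAL)")
conn.executemany("INSERT INTO customer VALUES (?, ?)", [(1, "Alice"), (2, "Bob")])
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "lamp", 40), (1, "desk", 250), (2, "chair", 120)],
)
rows = conn.execute(
    "SELECT DISTINCT c.name FROM customer c "
    "JOIN orders o ON o.customer_id = c.id "
    "WHERE o.total > 100 ORDER BY c.name"
).fetchall()

# Iterating here is using the result, not making it.
big_spenders_rel = [name for (name,) in rows]

print(big_spenders_nav)
print(big_spenders_rel)
```

In the first half, the link between a customer and its orders is an actual stored reference. In the second half, it is computed by the join condition at query time.
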
[Figure: the basic differences between tables and objects.] The connection between tables is shown dotted because they are generally "calculated" instead of actual links. Although indexes may be added to speed commonly-used table operations, these indexes are generally not something that users (app developers) see.

Although tables are a higher-level construct, the drawback is perhaps that they do not handle non-consistent "records" as well as the NDBs. These are records in which the fields (keys) may be different per record. Relational requires a kind of "master column list" per table. Any "slot" or "cell" used in a record must be in this master list of columns. Objects generally don't have this requirement. Each object can usually have its own set of keys (methods or attributes) that differ from all other objects. (In static OOP languages, this may not be true of objects, but it will be of classes.)

Trying to use varied columns on tables tends to lead to lots of sparse (empty) columns for records that don't happen to use a given column, or leads to skinny "attribute tables", which are essentially tables that act a lot like a dictionary. They may have an Attribute column and a Value column, for example. (Empty columns do not necessarily take up more space in modern databases. Thus, memory or disk consumption should not be considered a drawback.)

I personally think the higher-power logic overrides the drawbacks of dealing with non-consistent records, but after long debates I have to concede that the preference may be subjective. There is no math or metric that says one is universally better than the other. Relational queries do tend to be shorter than the NDB equivalent for data that fits well into table-shaped structures, but the difference is murkier for queries on more varied records. To me, the benefits that relational offers for the good table-fits outweigh the drawbacks of dealing with poor table-fits.

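
The two workarounds for varied records, sparse columns and a skinny attribute table, might look roughly like this in SQLite via Python's `sqlite3` module. All table names, column names, and rows are hypothetical and chosen only to show the shape of each approach.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Option 1: one wide table. Every possible field is in the master column
# list, and a record that does not use a column simply leaves it NULL.
conn.execute(
    "CREATE TABLE item_wide ("
    "id INTEGER PRIMARY KEY, name TEXT, voltage REAL, page_count INTEGER)"
)
conn.executemany(
    "INSERT INTO item_wide VALUES (?, ?, ?, ?)",
    [(1, "drill", 18.0, None), (2, "manual", None, 64)],
)

# Option 2: a skinny attribute table that acts much like a dictionary,
# with one row per (item, attribute, value) triple.
conn.execute("CREATE TABLE item (id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE TABLE item_attr (item_id INTEGER, attribute TEXT, value TEXT)")
conn.executemany("INSERT INTO item VALUES (?, ?)", [(1, "drill"), (2, "manual")])
conn.executemany(
    "INSERT INTO item_attr VALUES (?, ?, ?)",
    [(1, "voltage", "18"), (2, "page_count", "64")],
)

# The same question asked of each layout: which items have a voltage?
wide = conn.execute(
    "SELECT name FROM item_wide WHERE voltage IS NOT NULL"
).fetchall()
skinny = conn.execute(
    "SELECT i.name FROM item i "
    "JOIN item_attr a ON a.item_id = i.id "
    "WHERE a.attribute = 'voltage'"
).fetchall()

print(wide)
print(skinny)
```

The wide query is shorter, but adding a new kind of field means changing the table. The skinny layout accepts new attributes as plain rows, at the cost of an extra join and of storing every value as text in this sketch.
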
Relational techniques can still be used reasonably well under poor-fit situations.

Perhaps there is a way to get the best of both worlds. Maybe fixed-structure records and variable-structure records can be made to live in harmony somehow. Although I have not seen a decent instance of such a tool or protocol yet, I won't discount the possibility that someday it might grow practical. It appears difficult to optimize performance for both approaches, though. Another difficulty is making it possible for a field to "move" from being a "fixed" column to a variable/dynamic column, and vice versa, without changing existing query code. This is a ripe area for further research. See Dynamic Relational.

Tables just work better for my mind. I really dig the power of relational algebra and the simplicity of presentation that tables provide for me. NDBs just lack a consistent larger-scale structure beyond the granularity of a record (dictionary) to grasp onto to guide me on a forest level instead of just a tree level.

Trees (inheritance) have been tried as a solution to the larger-scale structure gap of OO. But in my opinion trees don't fit the change-patterns of the real world very well, at least not on a larger scale, as described above. GOF-like OO Patterns have also been proposed as a higher-level structure for OO, but they are not formally enforced by the paradigm. Further, the relational equivalent of most such patterns (if there is one) is usually superior in my opinion.

It also is sometimes said that OOP better integrates behavior and data. However, this is mostly due to the overly-long tradition of hierarchy-based file systems. If such file systems were replaced with a relational file system, things would probably be different.

Summary

The base philosophical differences I have with the OO paradigm seem to boil down to the appropriateness of trees, the appropriateness of global taxonomies compared to local or ad-hoc taxonomies, and the network-database-like structure of OO versus relational.

© Copyright 2003 by Findy Services and B. Jacobs