Structural Analysis Notes

More comments and observations on procedural block structures, IF statements, Case/Switch statements, and dispatching.

Updated 7/14/2005

Nested Block Proliferation Claims

Some OO fans have complained that procedural/relational (p/r) programs often "degenerate into long, deeply nested IF statements".

Although I have indeed encountered p/r code that was a mess of nested IF and/or CASE statements, in the vast majority of cases it was due to poor programming or poor design and was not an inherent fault of the paradigms involved. The most common error is the lack of basic repetition factoring. The second most common is failure to break a big routine into smaller routines.

However, even with basic repairs, sometimes there is indeed still some unpleasant nesting of IF statements. One such pattern is combinatorial explosions of dispatching. But, I don't see how OOP is the solution to such cases.

For an example, let's look at the employee classification example again. To keep the example simple, we will look at three Boolean attributes: manager/non-manager, exempt/non-exempt, and part-time/full-time.

The various combinations of these attributes may result in code that looks something like this:

  sub calcPay(emp)    // Calculate pay subroutine
    if emp.isManager     // field reference
       if emp.isExempt
          if emp.isPartTime
             foo_1
          else
             foo_2
          end if
       else  // non-exempt
          if emp.isPartTime
             foo_3
          else
             foo_4
          end if
       end if
    else   // non-mgr
       if emp.isExempt
          if emp.isPartTime
             foo_5
          else
             foo_6
          end if
       else  // non-exempt
          if emp.isPartTime
             foo_7
          else
             foo_8
          end if
       end if
    end if
  end sub

Here we have 8 different "cells" based on all possible combinations of the 3 attributes. (Note that some cells may be error messages for invalid combinations.)

Variations of this structure are not uncommon. Thus the question: how does one "fix" it?

One approach is to convert every IF block into a separate subroutine.

  sub calcPay()
    if emp.isManager
       doManager()
    else   // non-mgr
       doNonManager()
    end if
  end sub
  //-----------------
  sub doManager()
    if emp.isExempt
       doMgrExempt()
    else  // non-exempt
       doMgrNonExempt()
    end if
  end sub
  //-----------------
  sub doMgrExempt()
    if emp.isPartTime
       foo_1
    else
       foo_2
    end if
  end sub
  //-----------------
  sub doMgrNonExempt()
    if emp.isPartTime
       foo_3
    else
       foo_4
    end if
  end sub
  // etc.....

Note that it is assumed that the "emp" table or result-set handle has been made into a regional-scoped variable so that we don't have to keep passing it as a parameter. If and how different procedural languages support regional variables varies greatly.

Such an approach indeed does reduce the nesting levels. If somebody took this approach, then another could not complain about "messy nested if/else blocks". However, it creates a lot of small routines that may require one to jump around in the code in order to follow. Some OO fans suggest breaking OO methods into small chunks to avoid change boundary granularity problems among other potential problems, but I find that approach problematic for reasons described on the cited page.

I do not see how OO techniques solve this dilemma either. The problem can best be described in my opinion as a hyper-cube or hyper-grid where each axis is the aspect or trait that affects the selection of the appropriate behavior (cell). In this example we would have a cube with a manager/non-manager axis (X), an exempt/non-exempt axis (Y), and a part-time/full-time axis (Z).

Although our example uses only Boolean traits, often there may be several mutually exclusive traits (a.k.a. "strategies") per aspect. For example, perhaps instead of full-time/part-time, the labor laws may divide employees into full-time (40 hours), major-part-time (21 to 39 hours), and minor-part-time (less than 21 hours). In that case, one would perhaps use nested case statements instead of nested IF's. We are still going to use the Boolean version for most examples after this diagram.

Anyhow, a hyper-grid can potentially have a lot of cells. There are two approaches to managing these cells. The first is to find ways to manage all the potential cells themselves. The second is to reduce the number of cells that have to explicitly be dealt with. In other words, find patterns for defaults or shortcuts to selecting those that are the same for several cells.

The first approach presents us with the sticky issue of managing multiple dimensions in what is essentially one-dimensional (linear) programming code. I don't see any easy way out of this one in the code domain. OOP is still linear code.

One technique is to either not use program code, or find better ways to browse or view the code. One browsing approach is to convert the structure into a hierarchy, and then use a tree-browser. However, since the real structure is not a hierarchy, this is kind of artificial in my opinion, and our original approach (nested IF's) already is a tree structure of sorts. (One can also envision a "block browser" that would give us a tree-view or collapsible outline view of our original approach. I don't see such if/case browsers in code editors very often, but I see nothing that precludes their existence.)

My favorite is to use a relational table browser and a form of control table to manage the structure. You would have one column per aspect. You can see some examples in the multiple dispatch section of the p/r patterns page. Putting the information into a table allows us to perform all kinds of view and query acrobatics to find and view exactly the cells that we want to.

     Mgr  Exempt  Part-time   Implementation
     
      Y     Y        Y        [.....]
      Y     Y        N        [.....]
      Y     N        Y        [.....]
      Y     N        N        [.....]
      N     Y        Y        [.....]
      N     Y        N        [.....]
      N     N        Y        [.....]
      N     N        N        [.....]

However, not all languages support code-containing control tables very well. There are non-code versions of control tables, but these still require the cells to be in code, giving us the same problem that we started with.
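In languages with first-class functions, the control-table idea can at least be approximated in code. The following is a minimal Python sketch under stated assumptions, not a definitive implementation: the names PAY_TABLE, make_cell, and calc_pay are hypothetical, and each cell simply returns its label in place of a real pay calculation.

```python
def make_cell(name):
    """Build a placeholder routine standing in for foo_1 .. foo_8."""
    def cell(emp):
        return name          # a real cell would compute pay here
    return cell

# One row per combination, mirroring the table layout:
# (isManager, isExempt, isPartTime) -> implementation
PAY_TABLE = {
    (True,  True,  True):  make_cell("foo_1"),
    (True,  True,  False): make_cell("foo_2"),
    (True,  False, True):  make_cell("foo_3"),
    (True,  False, False): make_cell("foo_4"),
    (False, True,  True):  make_cell("foo_5"),
    (False, True,  False): make_cell("foo_6"),
    (False, False, True):  make_cell("foo_7"),
    (False, False, False): make_cell("foo_8"),
}

def calc_pay(emp):
    """Dispatch on all three aspects at once instead of nesting IFs."""
    key = (emp["isManager"], emp["isExempt"], emp["isPartTime"])
    return PAY_TABLE[key](emp)
```

Because the table is ordinary data, it can be filtered like data. For instance, `[key for key in PAY_TABLE if key[0]]` lists the four manager cells.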

But, the layout of such a table does suggest an alternate coding arrangement:

  sub calcPay(emp)
    var mgt = emp.isManager   // make some short-cut variables
    var exempt = emp.isExempt
    var part = emp.isPartTime

    if mgt and exempt and part
       foo_1
    end if
    if mgt and exempt and not part
       foo_2
    end if
    if mgt and not exempt and part
       foo_3
    end if
    if mgt and not exempt and not part
       foo_4
    end if
    if not mgt and exempt and part
       foo_5
    end if
    .....etc.....
  end sub

Note how we used local variables to simplify our condition statements. This technique can greatly help readability. Often times programmers will repeat a long data access expression or some other long expression over and over again, making the structure look more frightening than it actually is. Basic factoring like this for repeated structures and phrases can often clean up code considerably.

This also flattens our nesting. It also allows each block to be treated more-or-less independently. If we move things around, we don't have to be concerned about nesting levels.

One downside is that this approach is not as efficient because every condition of every block must be evaluated.

Now on to the issue of simplifying the selection process. Often times there is a pattern to the structure such that large chunks can use a default value or default behavior. If the pattern of defaults or repeats fits a simple geometry in the hyper-grid, then our first approach can often handle it nicely. For example, if all managers had the same calculation, then the first half of our original example may resemble:

  sub calcPay(emp)
    if emp.isManager
       foo_mgr
    else   // non-mgr
       if emp.isExempt
          if emp.isPartTime
             foo_5
          else
             ....etc....

This would chop our code size nearly in half. However, the patterns of defaults or same-ness are not always this friendly in the real world. (Note that sometimes one has to change the nesting order in order to take advantage of such patterns.)
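Defaults of this kind can also be written as an ordered rule list in which an aspect may be left unspecified. The Python sketch below is one possible arrangement, assuming that None means "any value" and that the first matching rule wins; RULES and select_cell are hypothetical names, and the results are labels rather than real calculations.

```python
# Each rule: (isManager, isExempt, isPartTime, result).
# None in a position means that aspect does not matter for the rule.
RULES = [
    (True,  None,  None,  "foo_mgr"),   # all managers share one calculation
    (False, True,  True,  "foo_5"),
    (False, True,  False, "foo_6"),
    (False, False, True,  "foo_7"),
    (False, False, False, "foo_8"),
]

def select_cell(is_manager, is_exempt, is_part_time):
    """Return the result of the first rule whose pattern fits the employee."""
    actual = (is_manager, is_exempt, is_part_time)
    for *pattern, result in RULES:
        if all(p is None or p == a for p, a in zip(pattern, actual)):
            return result
    raise LookupError("no rule covers this combination")
```

Here five rules cover all eight cells. If managers later stop sharing one calculation, the first rule is replaced by up to four specific ones and nothing else moves.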

Overall, there are plenty of procedural/relational options to deal with combinational dispatch. I will not say which is best because it may be subjective and depend on the nature of the particular problem domain. However, if deeply-nested IF blocks bother you, there are multiple remedies along with their various side-effects.

I do not see how OOP offers any magic solution to this problem type in general. Sub-classing is often not appropriate because the structure is not really a tree. Picking one aspect as the sub-class division criteria may reduce the nesting level by one, but that is more or less what our second procedural solution is capable of. (We did not have to turn all blocks into subroutines in our example.)

Another problem with sub-classing is that you risk artificially elevating the status of one of the aspects by sub-classing it and not others. If the labor laws change, for example, then the criteria that the sub-classing depended on may suddenly disappear, rendering a lot of your code useless.

As far as I know, all of the non-sub-classing OOP solutions resemble the procedural/relational solutions already shown above.

Some suggested using an Expert System for the situations where there are a bunch of relatively independent rules. However, this may only work well if a major portion or cleanly-divided sections fit the pattern. In many cases the pattern is intermixed with more conventional patterns, making it harder to use a dedicated Expert System engine. See also Game Example under Challenges.

Pre-Calculated Type?

Someone suggested that each combination could be turned into a sub-type class or code so that the calculation only has to be done when new employees are added or their classification changed instead of for each paycheck. The biggest problem I see with this is that in many cases the decisions need to be made based on dynamic information. Although in this case the decision criteria information is not very dynamic, we still have to remember to update the type code or classification for employees if the factors it depends on change. This complicates the design and risks errors if the "update triggers" are missing, bad, or accidentally broken during another program maintenance operation. We have to remember to update them also if we add new factors into the "type" calculation. I would tend to only consider a pre-calculated type-code if performance is proving problematic due to the more dynamic approach.
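The staleness risk can be shown with a small Python sketch. It assumes a hypothetical three-letter type code built from the three Boolean attributes; the field name typeCode and the function type_code are illustrative only.

```python
def type_code(emp):
    """Derive a hypothetical type code such as 'MEP' from the three flags."""
    return (
        ("M" if emp["isManager"] else "N")
        + ("E" if emp["isExempt"] else "N")
        + ("P" if emp["isPartTime"] else "F")
    )

emp = {"isManager": True, "isExempt": True, "isPartTime": True}
emp["typeCode"] = type_code(emp)   # pre-calculated when the record is saved

emp["isPartTime"] = False          # later edit that skips the "update trigger"

stored = emp["typeCode"]           # still reflects the old part-time status
fresh = type_code(emp)             # what the dynamic approach would compute
is_stale = stored != fresh         # the two now disagree
```

The dynamic approach never has this mismatch because it has no stored copy to forget about.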

They also suggested that such a type-code could be used for other calculations. However, in my experience a non-trivial "taxonomy" used for one task rarely matches that needed by another task. Philosophers have generally discovered that taxonomies are relative, and this fits my experience also. Further, the uniformity needed for typing is often lacking.

Lack of Uniformity

The actual structuring in practice is usually not nearly as uniform as the above example, but the general pattern is often still there in various forms. For example, some sections might be nested 2 factors deep, but others 5 deep within the same task. A factor not relevant to one portion of the logic tree may become relevant in another. Example:

  if factorA
    if factorB
      blah1
    else // not B
      if factorC
        if factorD 
          if factorE
            blah2
          else
            blah3
            if factorG
              blah4
            else
              blah5
            end if
          end if
        else  // not D
          blah6
        end if
      else // not C
        blah7
      end if
    end if
  else // not A
    if factorH
      if factorI
        blah8
      else
        if factorD  // note repeat usage of factor
          blah9
        else
          blah10
        end if
      end if
    else // not H
      blah11
    end if
  end if

As you can see, some sections are deep, and some are shallow. Also, sometimes factors (D in this case) may repeat in diverse sections of conditional code. Such non-uniform structures make it tough to create higher abstractions to isolate or simplify patterns. Even if a pattern exists at the start of a project, it may disappear on the next change request. I call these "soft patterns" because one cannot heavily rely on them staying around. If one builds their code to take advantage of them, they may be in for a big surprise when the pattern fades or morphs.

If we plot the active versus non-active combinations of factors, our hyper-grid may resemble:

The active combinations shown against inactive combinations will generally be random with some soft patterns here and there. We basically have Swiss Cheese with some grouping of holes or tunnels where soft patterns appear. Visually, these patterns will usually take on familiar geometric shapes such as lines, rectangular blocks, and triangular wedges (diagonal boundaries). Over time some of the soft patterns will stay and some will disappear or grow into new kinds of patterns.

I am not suggesting that one entirely ignore soft patterns in their code; only that they not over-rely on them, which is what polymorphism does. Polymorphism will elevate a soft pattern into something formal and code-intensive such that undoing it or rearranging it is labor-intensive and error-prone.

General Pattern

After observing a vast number of procedural/relational application structures, I generally conclude:

There are often interweaving factors involved in algorithm dispatching. These factors are often orthogonal or semi-orthogonal to each other.
In practice such conditionals often lack uniformity. Some sections may be shallow, and others deep, for example. This lack of uniformity is the main reason template-based or meta-structures cannot be readily applied to significantly clean up code.
The "condition tree" (nested if/case statements) that dispatches behavior in a given task tends not to be consistent from task-to-task (under decent designs). Each task will have a different condition tree. There are often some similarities, but it is rarely identical, at least over the long-run. (If it is identical over the long run, then it is often a sign that some other arrangement is warranted.)

Some OO fans have disagreed with these characterizations. I don't know what to say other than I call them as I see them.

I think interweaving orthogonal factors is the biggest cause of hard-to-manage complexity in the real world. You cannot simply isolate factors into modules, objects, etc. that only handle that particular factor. To get real functionality, the interaction between these factors needs to be specified. Custom business software is heavily tied to such relationship management. This is probably because anything that can be nicely packaged into one spot would become an off-the-shelf package or library. What is left over after this is the nitty-gritty relationship connections.

Security Access Example

Another example of cross-competing aspects resulting in nested or repeated IF statements can sometimes be found in security access management frameworks.

When anything starts to be replicated all over the place, it is time to do some serious re-study of the design. Often such a thing can be factored together somewhere, or at least simplified somehow.

However, sometimes it is unavoidable because there are two (or more) strongly competing aspects. Thus, you have to pull one aspect together at the expense of another.

OO does NOT fix this. Some OO proponents have agreed with me on this. They just claim that their compromise on this issue is better than a procedural/relational compromise (so far supported only by appeals to authority).

For security access program code, you might have to check to see if the user is authorized for a particular operation (or sometimes entity access).

One might have code like:

  if hasAccess(userRef, "billing") then ....

A more complicating variation is to check "levels" or "types" of access, such as read, change, delete, etc. I have indeed seen code with these peppered all over the place.

  if hasAccess(userRef, "billing", "change") then ....

(The different ways to arrange and classify access is a rather complex topic that won't be delved into here.)

If you see these all over, then you might start thinking, "I wonder if I can bring this all together into one spot or collection?"

You can. However, you would probably have to bring what they act on together also. That is not likely desirable. Diverse things may need security, and we would not want to bring them together just because they happen to use security.

One might be able to subclass things based on their security needs in a language that supports multiple-inheritance, but the code needed for each unit is not likely to be significantly smaller than an IF statement. It might also introduce granularity problems if the class or method is not a fine-grained enough unit to package security needs around. A unit size tuned to one aspect may not necessarily be the ideal unit for another aspect.

Thus, managing multiple aspects is going to be a sticky problem in both paradigms. The best I can suggest is to design the API's with great care to make them as flexible, yet non-intrusive as possible. For example, rather than having:

  if Not hasAccess(userRef, "billing") then
     raiseErr("Sorry, you do not have access to billing")
  else
     [regular stuff]....
  end if

You could make the access handler display the error message by itself:

  if hasAccess(userRef, "billing") then
     [regular stuff]....
  end if

If the user does not have access, then the handler can display the error message itself (internal code).

That way you don't have to repeat the message display code all over.

Perhaps have an optional named-parameter switch to turn off the message if you wish to handle it yourself. This is kind of a p/r version of "overriding" the message.

  if Not hasAccess(userRef, "billing", message=off) then
     raiseErr("Sorry, you do not have access to billing")   // custom message
  ....

OR

  if Not hasAccess(userRef, "billing", #noMsg) ....   // 'L' dialect

Also note that providing a user reference may be redundant in some setups. The access routine(s) can perhaps find out the current user on their own. Thus, we may be able to reduce the number of parameters.
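A rough Python sketch of such an access handler follows. It is only an illustration under assumptions: PERMISSIONS is a hypothetical in-memory stand-in for a permissions table, show_error stands in for whatever displays messages to the user, and the keyword argument message plays the role of the "message=off" switch.

```python
# Hypothetical permission store; a real system would likely query a table.
PERMISSIONS = {"alice": {"billing"}, "bob": set()}

messages = []   # collects displayed errors so the sketch is easy to inspect

def show_error(text):
    """Stand-in for the application's error display."""
    messages.append(text)

def has_access(user, area, message=True):
    """Return True if user may use area; report the denial unless told not to."""
    allowed = area in PERMISSIONS.get(user, set())
    if not allowed and message:
        show_error(f"Sorry, you do not have access to {area}")
    return allowed
```

A caller that is happy with the standard message writes only `if has_access(user, "billing"): ...`, while one that wants a custom message passes `message=False` and reports the denial itself.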

Why Case Statements Don't Repeat Often

OO proponents often talk about how case statement lists often end up being duplicated all over the place in procedural programming. Aside from the issue of duplicating method lists in exchange, in reality I don't see rampant case duplication in custom business applications.

The main reason is that there is usually a weak one-to-one correspondence between the execution choices and subclasses or subtypes. I have been saying over and over that business objects divide poorly into subtypes for the most part. Lack of case symmetry is yet another manifestation of this.

Put another way, the choices available per "task" tend to be different for each task. If we mapped this pattern onto the famous shapes example (chosen for familiarity instead of realism), it would resemble:

  sub draw(sh)
     select on sh.approach1   // field of sh
     case A
       ...
     case B
       ...
     otherwise
       ...  // default
     end select
  end sub

  sub rotate(sh)
     select on sh.approach2
     case C
       ...
     case D
       ...
     case E
       ...
     end select

  end sub

  sub findCenter(sh)
     select on sh.approach3
     case F
       ...
     case G
       ...
     otherwise
       ...  // default (aka inheritance)
     end select
  end sub

Notice how the "selectors" are different for each operation. OO versions of procedural examples often make the selector the same for each operation:

  sub draw(sh)
     select on sh.type  
     case A
       ...
     case B
       ...
     case C
       ...  
     end select
  end sub

  sub rotate(sh)
     select on sh.type
     case A
       ...
     case B
       ...
     case C
       ...
     end select
  end sub

  sub findCenter(sh)
     select on sh.type
     case A
       ...
     case B
       ...
     case C
       ... 
     end select
  end sub

Or, more commonly expressed as:

  sub draw(sh)
     select on sh.type
     case "polygon"
       ...
     case "circle"
       ...
     case "rectangle"
       ...
     end select
  end sub

  sub rotate(sh)
     select on sh.type
     case "polygon"
       ...
     case "circle"
       ...
     case "rectangle"
       ...
     end select
  end sub

  sub findCenter(sh)
     select on sh.type
     case "polygon"
       ...
     case "circle"
       ...
     case "rectangle"
       ...
     end select
  end sub

Using the same selector for different operations is simply rarer than many OO proponents believe or imply. The reason is that the selector lists (case values) often do not "tie" tightly with any one "type" or attribute. They are more likely to be related to a specific task (routine) or feature, and thus vary from task to task. As mentioned above, philosophers have generally concluded that taxonomies are relative, not absolute (global). (If you disagree, you are welcome to show me actual examples from business applications.)

The selector fields can be called strategies after the GOF pattern of the same name. Sometimes they are simple Boolean flags, sometimes names, such as "quarterly", "yearly", etc., and sometimes ID numbers of strategies. (Whether to use names or numeric ID's depends on many factors, and will not be covered here. We use mostly names here for readability of examples.)

Countries Example

An OOP fan brought up the example of taxes for different countries. For example, there may be sales taxes and employee taxes. The claim is that each country would need its own code section for both types of taxes. (In many ways, this characterization fits closely to a device driver pattern.)

  // Procedural
  sub calcSalesTax(sale)
     select on sale.country  // field of 'sale'
     case "Mexico" {...}
     case "China" {...}
     case "Germany" {...}
     ...
     otherwise {raiseErr("missing country!")}
     end select
  end sub
  sub calcEmpTax(emp)   // Employee tax
     select on emp.country
     case "Mexico" {...}
     case "China" {...}
     case "Germany" {...}
     ...
     otherwise {raiseErr("missing country!")}
     end select
  end sub

  // OOP
  class Country abstract
     method calcSalesTax() {}   // interface templates only
     method calcEmpTax() {}
  end class
  class Mexico inherits Country
     method calcSalesTax() {...}
     method calcEmpTax() {...}
  end class
  class China inherits Country
     method calcSalesTax() {...}
     method calcEmpTax() {...}
  end class
  class Germany inherits Country
     method calcSalesTax() {...}
     method calcEmpTax() {...}
  end class
  ...

Viewing it their way, it perhaps may be one of those rare cases where OO would do a slightly better job because of the symmetry of case lists.

However, after thinking about it, the symmetry and one-operation-per-type model may not hold much water after all. First, some countries may have such rare sales with our company that writing tax calculation code may not be worth it. Thus, they may have a quick estimation approach, and then a manual calculation at the end of each year. The estimation strategy may be a simple percentage and shared with other rare buyers.

Second, rather than manually write and maintain code for zillions of countries, a firm may choose instead to farm the calculations off to regional specialists. For example, they may hire a French firm to calculate all European taxes and a Thai firm to calculate all Asian taxes. The calculations then become strategy selectors instead of a dedicated code segment per country. Each country would have the name or code of the strategy in a corresponding Country table field.

The contracted firms that do the employee taxes may also be different firms than those that do the sales taxes. Thus, their strategy (case) lists may be different. Having different strategy lists for different aspects is quite common in my experience. (This would also make factoring into TaxCalculator subclasses more difficult.)

  sub calcEmpTax(emp)
    select on emp.empTaxStrategy
    case "emp_firm_1" {....}
    case "emp_firm_2" {....}
    case "emp_firm_3" {....}
    otherwise {raiseErr("no such strategy")}
    end select
  end sub

  sub calcSalesTax(x)
    select on x.salesTaxStrategy
    case "s_firm_2" {....}   // the only shared firm for both
    case "s_firm_4" {....}
    case "s_firm_5" {....}
    case "s_firm_6" {....}
    case "s_estimate_1" {....}
    otherwise {....}
    end select
  end sub

Strategy Table Example

     CountryID  CountryName  SalesTaxStrategy  EmpTaxStrategy  isActive

        23      Argentina    s_firm_2          emp_firm_2      Yes
        17      Bolivia      s_firm_6          emp_firm_2      Yes
        18      China        s_estimate_2      emp_firm_3      No
        94      Mexico       [blank]           emp_firm_1      Yes
        87      Paraguay     s_firm_4          emp_firm_2      No
        21      Uganda       s_estimate_2      emp_firm_3      Yes
        ...     ...          ...               ...             ...

A table makes it easier to see strategy patterns
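The lookup from country to strategy to implementation might be sketched in Python as follows. This is a sketch under assumptions, not the author's code: COUNTRIES holds three rows copied from the table, an empty string stands for the [blank] entry, and the handlers are placeholders that return the strategy name with the sale amount instead of computing a real tax.

```python
# A few rows of the Country table; "" stands for the [blank] entry.
COUNTRIES = {
    "Argentina": {"SalesTaxStrategy": "s_firm_2", "EmpTaxStrategy": "emp_firm_2"},
    "Bolivia":   {"SalesTaxStrategy": "s_firm_6", "EmpTaxStrategy": "emp_firm_2"},
    "Mexico":    {"SalesTaxStrategy": "",         "EmpTaxStrategy": "emp_firm_1"},
}

def via_strategy(name):
    """Build a placeholder handler; a real one would hold the calculation."""
    def handler(sale):
        return (name, sale["amount"])
    return handler

SALES_TAX_STRATEGIES = {
    name: via_strategy(name)
    for name in ("s_firm_2", "s_firm_4", "s_firm_5", "s_firm_6", "s_estimate_1")
}

def sales_tax_otherwise(sale):
    """Plays the role of the 'otherwise' branch of the case statement."""
    return ("otherwise", sale["amount"])

def calc_sales_tax(sale):
    row = COUNTRIES[sale["country"]]    # stands in for the relational join
    handler = SALES_TAX_STRATEGIES.get(row["SalesTaxStrategy"], sales_tax_otherwise)
    return handler(sale)

def countries_using(column, strategy):
    """Query the table: which countries share a given strategy?"""
    return sorted(name for name, row in COUNTRIES.items() if row[column] == strategy)
```

The strategy list for sales taxes is independent of the one for employee taxes, and a question such as "which countries use emp_firm_2?" becomes a one-line query over the rows.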

Finally, even if there are no strategies, if we look at the "forget duplicate" patterns that OO books often dwell on when bashing case statements, the benefits are not so clear cut. (It is said that if you add a new country, you might forget to add the country name to one of the case lists.) In reality, one would put every known country into the list so that they don't have to keep updating if new ones come on line frequently. (In our example we have only two lists, one for CalcSalesTaxes and the other for CalcEmpTaxes.)

Note that this example does not show the relational "joins" to the Country table. Both tax strategy names would be in the Country table most likely.

Thus, only new countries would need list updates (along with the implementation). New countries are not that common, so it is not a major task.

In the OO versions one may have to remember to add all methods for each country. If a new method comes along, then even existing countries may need the new method. Thus, a programmer may have to visit about 250 subclasses to add the new method. Visiting 250 subclasses is obviously a lot more work than visiting 2 subroutines. (Sure, a fancy IDE may automate some of this, but p/r tasks are not immune to automation.)

It is true that some OO languages can potentially warn a programmer at compile time that an implementation is missing. However, you waive your ability to have defaults if you do this, and this ability is language dependent.

Some OO fans claim that they have seen frequent p/r code with lots of duplicate case statements. Without inspecting actual code, it is hard to see if they were justified. It just may be bad p/r code. Bad code can be found in any paradigm. Often the problem is insufficient use of tables. Sometimes the programmer just is not used to using tables properly, or the particular language and/or API's make tables a pain.

The bottom line is that such a one-to-one repeated pattern is rare in my experience. For that OO-book pattern to occur and be worse for p/r updates than for OO updates (such as new methods), these would have to be true at the same time:

1. Multiple duplicate case lists
2. Few or no shared implementations for list items
3. Frequent additions to list but rare additions of new operations (methods)

The chance of all 3 occurring at the same time is low. We cannot verify the first 2 (beyond anecdotes) without looking at actual software, but we know that #3 would not be a problem because new countries are not that historically common; at least not more common than new operations (methods). (Assuming here that the programmer will provide fillers or default case cells for all known countries early on.)

I applaud that OO fan's creativity for dreaming up an example that almost fits the case-bashing pattern in OO books, but this amazing effort still has some "reality holes", like many of the bashing efforts in OO examples. Even if the pattern occurs say one percent of the time, adding OO just to handle those cases flunks the Blue Moon Principle, which basically states, "spend complexity on the common stuff, not the rare stuff".

Tablized Version of Country Taxes

Some complain that having a big case statement makes it harder or riskier for country-specific experts to maintain specific tax calculation code. I don't see how fiddling with methods inside of classes is less evil; blocks is blocks; but for the sake of completeness of debate, I will show a tablized version.

We could have the case statement call separate subroutines, but some complain that this requires two routines to need re-compiling instead of one when a country is added. Note that we could also make the naming be automatic by having the routines be named something like "SalesTaxCountryX" where X is the country ID. In that case we would not need specific fields for the routine names since the name would be automatically generated. I am not showing that approach here. I will leave it as a reader exercise.

One might also claim that adding a new country requires two places to be updated in the table version versus one in the OOP version. However, the tabled version makes it easier to view, list, and query the country table/info. To get the same functionality out of the OOP version, some sort of collection would have to be constructed. This may also require two lists, depending on the OOP language. In the future, the tax program code may actually be stored in the table itself, disarming such criticism altogether. However, such practice is not yet acceptable, partly because infrastructures such as IDE's don't currently recognize this practice.

The Tablized Example:

  Table: Countries
  ----------------
  countryID    (int)
  countryName  (string)
  SalesTaxRoutine  (string)
  EmpTaxRoutine    (string)
  isActive     (yes/no)
  etc....

  sub calcSalesTax(rs)         // rs = recordSet
     var callme = trim(rs.SalesTaxRoutine)
     callme = callme & "(rs)"  // append parameter
     if not eval(callme)       // assume returns status indic.
         raiseErr("Error calling " & callme)
     end if
  end sub

No more case statements!
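Where eval is unavailable or unwelcome, a similar effect can be had by looking the routine name up in a registry. The Python sketch below is one such variation and swaps eval for a dictionary lookup; the decorator routine, the registry ROUTINES, and the sample routine sales_tax_generic are all hypothetical names introduced for illustration.

```python
ROUTINES = {}   # routine name (as stored in the table) -> function

def routine(fn):
    """Register a tax routine under its own name."""
    ROUTINES[fn.__name__] = fn
    return fn

@routine
def sales_tax_generic(rs):
    # Placeholder body; returns a status indicator like the original sketch.
    return True

def calc_sales_tax(rs):
    """Call the routine named in the record's SalesTaxRoutine field."""
    name = rs["SalesTaxRoutine"].strip()
    handler = ROUTINES.get(name)
    if handler is None or not handler(rs):
        raise RuntimeError("Error calling " + name)
```

Adding a country then means adding a row, plus a new registered routine only if no existing routine fits.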

Dispatching Issues

The OO proponent who proposed the above country tax example claimed that his OO design offered better or "automatic" dispatching. However, on further inspection the dispatching issue is very inconclusive.

For the sake of comparison, I will assume putting the country-specific code inside the Country table itself. Chuck (an alias) claimed that I had to do a record look-up to get the Country-specific code and that he didn't. Although technically true, he still has to do a record lookup, as will be described. I am just piggybacking on that lookup since it is a necessary step in both designs (paradigms).

Unless he stores all country information[1] in code (classes), his version has to query the Country table also. Chuck had code in the parent class and one or two lines in each country sub-class to map the sub-class to an "external storage" such as a RDBMS. (He used a "country ID" as the key.)

I claimed that such mapping code is roughly equivalent to my table (and code) lookup. He disagreed, but mostly via the definition of "dispatching", which excludes data lookups in his opinion. To me this is splitting hairs over words.

Since we are comparing apples to oranges, I suggested that we compare the overall maintenance effort of each design instead of trying to say this section of code is equivalent to that section. Looking at the final counts washes out part-to-part comparisons which can get hairy and testy otherwise. In other words, the final effort count is a cleaner metric. Overall, the OO solution is not less code and not less measurable maintenance.

One problem with his design is that he has two similar structures that have to change in lock-step when new countries are added. He has country sub-classes that mirror the Country table rows. In my opinion, this is a violation of the single choice principle. He disagreed in that duplication is sometimes needed as an interface layer, which is what he called his mapping code. He claimed that this was the tradeoff paid to decouple his code from a relational model (if something else is later needed). In my opinion that is not worth it unless better competitors to relational databases gain market share. Preparing to decouple from relational is like planning the decoupling from the OO model, in my opinion.

[1] Chuck also disagreed about the likelihood of needing more country information besides just the tax methods. In my opinion, having more info being attached to such tables is common. For example, the country name and an "isActive" flag can be stored in this table. These can be changed without programmer intervention if they come from the table instead of sub-classes. Storing attributes in program code is often limiting in this way. It might be good job security for programmers, but bad design otherwise. Some argue that languages such as Smalltalk make it easier for attributes to be dynamically changed. However, this tends to limit the access of data to just Smalltalk. Smalltalk is not a full-blown database, and treating it like it is may result in continuity problems if one later switches to a real database system.

Chuck also claimed that my "program code in table" approach is really "OO". See the top section of "P/R Patterns" for a discussion on this issue.

In addition, Chuck said that his approach can have more compile-time checking. I tentatively agree, but perhaps at the expense of needing a re-compile or re-bind per new country.

Another Dispatching Approach - "Calculated Scripts"

Another approach may be to use a "calculated file". We have two dispatching dimensions: country, and operation. Thus, we could have a bunch of scripts with names like x_y.prg where x is the country (ID or name), and y is the operation. The dispatching code might then look something like:

  sub countryOperation(countryID, operationID) {
    scriptFileName = countryID & '_' & operationID & '.prg';
    execute(fileAsText(scriptFileName));
  }

Of course, static languages may have a harder time with such. ("&" is string concatenation in the above example. Error-handling not shown here.)
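In a dynamic language the same idea takes only a few lines. The Python sketch below rests on several assumptions: scripts use a ".py" extension rather than ".prg", each script leaves its answer in a variable named result, and the demo script with its doubling calculation is a made-up placeholder. Executing file text runs whatever the file contains, so this only suits scripts from a trusted location.

```python
import tempfile
from pathlib import Path

def country_operation(script_dir, country_id, operation_id, **params):
    """Run the script named <country>_<operation>.py and return its result."""
    script = Path(script_dir) / f"{country_id}_{operation_id}.py"
    if not script.is_file():
        raise LookupError(f"no script for {country_id}/{operation_id}")
    namespace = dict(params)                      # inputs visible to the script
    exec(compile(script.read_text(), str(script), "exec"), namespace)
    return namespace.get("result")

# Demo with a throwaway script directory and a placeholder calculation.
with tempfile.TemporaryDirectory() as script_dir:
    (Path(script_dir) / "94_salestax.py").write_text("result = amount * 2\n")
    demo = country_operation(script_dir, 94, "salestax", amount=50)
```

Adding a country or an operation then means dropping a new file into the directory, with no change to the dispatching routine.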

Case Statement De-Casing

Another reason why case statements don't cause the problems that OO proponents claim they do is that they often evaporate, or morph into something else.

For example, the choices may grow no longer mutually exclusive, and thus become IF statements. Procedural code usually allows the criteria for the different case or IF blocks to change without having to move code around to different classes. Only the IF criteria changes. See polymorphism notes for more on polyUNmorphism.
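A small Python sketch of such a change, using a made-up reporting rule with hypothetical field names (period, wantsQuarterly, wantsYearly): at first each customer has exactly one reporting period, and later a customer may have both.

```python
def reports_exclusive(cust):
    # Original rule: exactly one period applies, so a case-like chain fits.
    if cust["period"] == "quarterly":
        return ["quarterly report"]
    elif cust["period"] == "yearly":
        return ["yearly report"]
    return []

def reports_overlapping(cust):
    # Revised rule: periods may overlap, so each IF stands on its own.
    reports = []
    if cust["wantsQuarterly"]:
        reports.append("quarterly report")
    if cust["wantsYearly"]:
        reports.append("yearly report")
    return reports
```

The blocks themselves stay where they are; only the conditions guarding them change.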
