OOP Challenge 2 - Generic Printing


Originator: Chris Twiner, et al.

Although this is a small example, I decided to include it because variations of it keep popping up, and it is a good example of how OO looks good on paper but turns out simple-minded or artificial when applied to the real world.

A portion of Chris's original message (lightly edited):

  case statement :
  switch(Type) {
    case 'employee' : printEmployee; break;
    case 'manager' :  printManager;  break;
    case 'director' : printDirector; break;
  }

An OO approach (given person as a base class):
Difficult wasn't it.  The functions still have to be maintained
somewhere. If you add a new type of person the
previous code will [still run]. Only the object (and whatever
creates it) need be created/altered.

[end of quote]

My Response

First of all, one may not need a separate function for "printEmployee" et al. You might be fine doing it within the case blocks. I see case blocks as no more evil than method blocks. Blocks is blocks to me. ("break" is a silly C-specific thing. Other languages don't need "break", BTW.)

Regardless, the equivalent code for both paradigms is going to be roughly the same. Point 1: Case statements are not more code than OOP generally.
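To make the size comparison concrete, here is a minimal sketch in Python rather than the C-like syntax of the quote. The output strings, the function name `print_person`, and the method name `describe` are all invented for illustration; the only point is that each version needs one block per type.

```python
# Procedural version: one dispatch point, one block per type.
def print_person(person_type: str, name: str) -> str:
    if person_type == "employee":
        return f"Employee: {name}"
    elif person_type == "manager":
        return f"Manager: {name}"
    elif person_type == "director":
        return f"Director: {name}"
    raise ValueError(f"unknown type: {person_type}")


# OO version: one method block per subclass of a Person base class.
class Person:
    def __init__(self, name: str) -> None:
        self.name = name

    def describe(self) -> str:
        raise NotImplementedError


class Employee(Person):
    def describe(self) -> str:
        return f"Employee: {self.name}"


class Manager(Person):
    def describe(self) -> str:
        return f"Manager: {self.name}"


class Director(Person):
    def describe(self) -> str:
        return f"Director: {self.name}"
```

Both versions carry three type-specific blocks. What differs is where those blocks live, not how many there are.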

The rest gets into where these blocks are best kept. Enter again the noun-centric proximity versus verb-centric proximity battle. (See the Shapes Example for more. Short version: There are significant tradeoffs to each grouping with no clear objective benefit of one over the other.)

Note that Chris separated "Manager" from "Employee", etc. Such subtyping of employees was found to be suspect in the subtypes document. However, I realize that his example could apply to other things besides employee subclasses.

Back to Reality

In practice, generic printing would only be used for debugging or "quickies", not for something that you give to the boss except in an emergency rush. Usually there is more than one report per primary entity, so a single Print method would not suffice. (I suppose you could designate one of them as the default report per entity, but then you have to crystal-ball the intended usage for multi-entity requests.)

And there are often reports that involve multiple entities. Thus, association with only one entity is a little artificial in many cases. See Aspects Document for more on this. Short version: Encapsulation is not as pure as often made out to be because there are often multiple legitimate association candidates. Does an Employee-By-Department report belong with the Employee class, the Department class, or a class by itself? How much of each entity does a report have to refer to before it gets its own class? There is a lot of potential for continuity problems and rather arbitrary decisions.

One could replace the example with something like:

  sub recordDump(recordHandle)
     for each fld in getFieldNames(recordHandle)  // for each field name
        printLn fld & ": " & recordHandle[fld]
     end for
  end sub

I have actually made such utilities before. The output resembled:
     FirstNameMI: Bob K.
        LastName: Jones
            Dept: 42
        PayGrade: 16A
       WorkPhone: 123-456-7890 x31
       HomePhone: 123-373-8383
           Hired: 12/19/2000
(Right-alignment of the field titles makes it much easier to read than left-alignment in my opinion. However, most vendors seem to prefer left alignment because it looks prettier. Form over function? Note that I have not shown the code for performing alignment of any kind.)
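For what it is worth, the alignment step is small. The following is a rough Python sketch, assuming the record arrives as a plain dictionary of field names to values (an assumption; the pseudocode above uses a record handle). The name `record_dump` and the sample data are illustrative only.

```python
def record_dump(record: dict) -> str:
    """Return one 'Field: value' line per field, with field titles right-aligned."""
    if not record:
        return ""
    # Pad every field title to the width of the longest one.
    width = max(len(name) for name in record)
    lines = [f"{name:>{width}}: {value}" for name, value in record.items()]
    return "\n".join(lines)


sample = {
    "FirstNameMI": "Bob K.",
    "LastName": "Jones",
    "Dept": 42,
}
print(record_dump(sample))
```

The function knows nothing about employees; any record with named fields will do.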

We could also make one that would do something similar with any given SQL statement that returned a result set. That way we can supply it with more complicated lookups (joins) and conditions.

The same function could do both if some sort of switch is given as a parameter.

I suppose an OO fan would prefer polymorphism to select which one, but then you have to create a bunch of cluttery classes and make sure everything is the right type before it can be used by our generic reporting tool. That brings up the question of how something can be generic across applications if it expects a specific type. The chance of diverse applications all sharing the same "SQLstatement" class/type is almost nil. This is a classic fat wire issue, also known as "protocol coupling". Sure you could write adapter classes, but why bother?

I know, Smalltalk can probably do roughly the same thing in some circumstances, but since an RDBMS is already around for most business applications, why not use it rather than hope your OOP language reinvented a half-baked DB from scratch? Note that our tool works no matter which language wrote to the database. You have to pipe everything through CORBA or the like to do such in OOP.

A More Open-Ended Approach

Rather than limit such a device to just one entity/object/class, I often use (and sometimes even make) a tool that can do:
  showQuery("select * from foo, bar where foo.id=bar.id")

  showQuery("select * from Sales where amt > 25000")

  showQuery("select * from Sales where regionID in (14,23,82,99)")

This displays the query result in tabular form. (It usually either creates an HTML table, or fills a Grid control on a GUI form.) It takes only about 20 lines of code to make a basic HTML version.
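As a rough idea of what a basic HTML version might look like, here is a sketch in Python using the standard `sqlite3` and `html` modules. The choice of SQLite, the explicit `conn` parameter, and the sample Sales rows are assumptions made so the example runs on its own; they are not part of the tool described above.

```python
import html
import sqlite3


def show_query(conn: sqlite3.Connection, sql: str) -> str:
    """Run any SELECT and return the result set as an HTML table."""
    cursor = conn.execute(sql)
    # Column names come from the result set itself, not from any class.
    headers = [col[0] for col in cursor.description]
    parts = ["<table>"]
    parts.append("<tr>" + "".join(f"<th>{html.escape(h)}</th>" for h in headers) + "</tr>")
    for row in cursor:
        cells = "".join(f"<td>{html.escape(str(v))}</td>" for v in row)
        parts.append(f"<tr>{cells}</tr>")
    parts.append("</table>")
    return "\n".join(parts)


# Hypothetical sample data, just enough to exercise the function.
conn = sqlite3.connect(":memory:")
conn.execute("create table Sales (regionID integer, amt integer)")
conn.executemany("insert into Sales values (?, ?)", [(14, 30000), (23, 12000)])
print(show_query(conn, "select * from Sales where amt > 25000"))
```

Because the headers are read from the cursor, the same function handles joins and filtered queries without knowing anything about the tables involved.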

An OOP class implementing its own Print or ToString method cannot easily do these kinds of things.

Medical Example

Someone at the comp.object newsgroup brought up a medical application example where there were a series of medical measurement "types". Example output may roughly resemble:
  Data for patient 12345 on 12/22 8:15am (Visit# 146)

  Mulse:     12.33 cc
  Triroid:   14 k (13) 4
  Hamptom:   143cc, 13pp, 312.32rg
  Yardiac:   573 KLM - 5
  Bulse:     428.3, 17, B
  (Hypothetical measurements only)
They bragged that each measurement sub-class "knew how to print itself". One could of course use a case (switch) statement in a procedural version. I see nothing wrong with a case statement so far based on the requirements given. (See Meyer's Single Choice Principle for more on case-statement issues and tradeoffs.)

However, let's explore a Control Table version.

Table: Measurements
  Abbrev  Descript  FmtExpression
  MULS    Mulse     rs.p1 + " cc"
  TRIR    Triroid   rs.p1 + " k (" + rs.p2 + ") " + rs.p3
  HAMP    Hamptom   rs.p1 + "cc, " + rs.p2 + "pp, " + rs.p3 + "rg"
  YAC     Yardiac   rs.p1 + " KLM - " + rs.p2
  BULS    Bulse     rs.p1 + ", " + rs.p2 + ", " + rs.p3

Table: PatientData
  VisitRef  AbbrevRef  p1     p2  p3      p4  p5  p6
  146       MULS       12.33
  146       TRIR       14     13  4
  146       HAMP       143    13  312.32
  146       YAC        573    5
  146       BULS       428.3  17  B

A printing function may then resemble:

  subroutine printMeasurements(visitID)
    sql = "select * from PatientData, Measurements "
    sql += "where abbrev = abbrevRef "
    sql += "and visitRef = " + visitID
    rs = getRecordSet(sql, driver=std)
    while DBgetNext(rs)
       printLine rs.Descript + ": " + evaluate(rs.fmtExpression) 
    end while
  end subroutine

The "evaluate" function executes a string expression as code. Variations of it are found in many scripting languages.

There are other related approaches, but this gives an idea of what can be done. Note that it even allows new measurement "types" to be added without changing a single line of code (except for the formatting expression).
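The control-table idea can be tried out in a few lines of Python. This is only a sketch under several assumptions: SQLite stands in for the database, the p-columns are stored as text, only two of the measurement rows are loaded, and Python's built-in `eval` plays the role of `evaluate`. The name `measurement_lines` is made up for the example.

```python
import sqlite3
from types import SimpleNamespace

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table Measurements (Abbrev text, Descript text, FmtExpression text);
    create table PatientData (VisitRef integer, AbbrevRef text, p1 text, p2 text, p3 text);
""")
conn.executemany("insert into Measurements values (?, ?, ?)", [
    ("MULS", "Mulse", 'rs.p1 + " cc"'),
    ("TRIR", "Triroid", 'rs.p1 + " k (" + rs.p2 + ") " + rs.p3'),
])
conn.executemany("insert into PatientData values (?, ?, ?, ?, ?)", [
    (146, "MULS", "12.33", None, None),
    (146, "TRIR", "14", "13", "4"),
])


def measurement_lines(visit_id: int) -> list[str]:
    """Format every measurement of one visit using the stored expressions."""
    sql = ("select * from PatientData join Measurements on Abbrev = AbbrevRef "
           "where VisitRef = ? order by PatientData.rowid")
    cursor = conn.execute(sql, (visit_id,))
    names = [col[0] for col in cursor.description]
    lines = []
    for row in cursor:
        # Expose the row as "rs" so expressions like rs.p1 resolve.
        rs = SimpleNamespace(**dict(zip(names, row)))
        # eval() runs arbitrary code: only acceptable if the control table is trusted.
        lines.append(rs.Descript + ": " + eval(rs.FmtExpression, {"rs": rs}))
    return lines


print("\n".join(measurement_lines(146)))
```

Adding a new measurement type here means inserting a row into Measurements; `measurement_lines` itself stays untouched.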

An Eval-Free Version

A more formal version that does not need evaluate( ) could be built with tables similar to such:
    Table: MeasurementParts
    AbbrevRef  (f.key to Measurements table)
    P  (int)   (1, 2, 3, etc.)
    Prefix     (text placed before the value)
    Suffix     (" cc" for first row of example)

    Table: PatientData
The key (no pun) to this solution is the "Prefix" and "Suffix" fields. They allow simple string appending to create the result instead of evaluating expressions. The code to put them together may look something like:
    rs = getRecordSet(....)
    while DBgetNext(rs)
       result += rs.Prefix + rs.TheValue + rs.Suffix
    end while
    printLine result

This solution is probably superior from a relational purist viewpoint, but would be harder to set up without a custom user interface.
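A runnable approximation of the prefix/suffix approach, again in Python with SQLite, might look like the following. The PatientData layout used here (one row per part, with columns VisitRef, AbbrevRef, P, and TheValue) is an assumption inferred from the field names in the loop above, and the prefix and suffix values are one possible way to reproduce the Triroid line of the sample output.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    create table MeasurementParts (AbbrevRef text, P integer, Prefix text, Suffix text);
    create table PatientData (VisitRef integer, AbbrevRef text, P integer, TheValue text);
""")
# Parts chosen to rebuild the sample Triroid output: 14 k (13) 4
conn.executemany("insert into MeasurementParts values (?, ?, ?, ?)", [
    ("TRIR", 1, "", " k "),
    ("TRIR", 2, "(", ") "),
    ("TRIR", 3, "", ""),
])
conn.executemany("insert into PatientData values (?, ?, ?, ?)", [
    (146, "TRIR", 1, "14"),
    (146, "TRIR", 2, "13"),
    (146, "TRIR", 3, "4"),
])


def format_measurement(visit_id: int, abbrev: str) -> str:
    """Build one measurement string by plain appending; no expressions are evaluated."""
    sql = """
        select m.Prefix, d.TheValue, m.Suffix
        from PatientData d
        join MeasurementParts m on m.AbbrevRef = d.AbbrevRef and m.P = d.P
        where d.VisitRef = ? and d.AbbrevRef = ?
        order by d.P
    """
    result = ""
    for prefix, value, suffix in conn.execute(sql, (visit_id, abbrev)):
        result += prefix + value + suffix
    return result


print(format_measurement(146, "TRIR"))
```

The trade-off mentioned above shows up in the setup: three rows of parts replace one formatting expression.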

Somebody complained that this would not allow much custom formatting, such as controlling the number of decimal places. I assumed that the value was formatted before being saved (the field is a string). However, if really needed (not likely IMO), then we could still have a "fmtFunc" field in the MeasurementParts table:

        while DBgetNext(rs)
            temp = rs.TheValue 
            if not blank(rs.fmtFunc)
               temp = eval(rs.fmtFunc & "(" & temp & ")")
            end if
            result &= rs.Prefix & temp & rs.Suffix
        end while

Of course, if this were needed, it would bring us back to using Eval(). However, it would mostly be used for rare exceptions. If something grows common, then it should perhaps be turned into a table flag of some sort. (I used "&" for concatenation instead of plus here to avoid confusion with math operations.)

If one wanted something similar, but without using Eval(), then you could do something like this:

        while DBgetNext(rs)
           result &= rs.Prefix & CustomFmt(rs) & rs.Suffix
        end while

        function CustomFmt(rs)
           result = rs.TheValue     // default
           select on rs.abbrevRef & "." & rs.p
           case "GLRG.2"
              result = zork(result)
           case "FLOG.1", "SCCR.3"
              result = dork(result)
           end select
           return result
        end function
The nice thing about this approach is that all the exceptions (oddities) are in one spot. If we did OOP divisions by "subtype", then such oddities would be scattered among the "normal stuff". Grouping by oddities allows one to better see patterns to factor into the mainstream if certain approaches grow more common. We can also see that FLOG and SCCR share a common implementation. Spotting the similarities and moving them together would be tougher in OOP subtype-based grouping.
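The same "all oddities in one spot" grouping can be sketched in Python. Note that this version swaps the select/case block for a dictionary lookup, which is a different mechanism with the same grouping. The bodies of `zork` and `dork` are pure placeholders, since the original never says what they do.

```python
def zork(value: str) -> str:
    # Hypothetical formatter: stands in for whatever "zork" would really do.
    return value.upper()


def dork(value: str) -> str:
    # Hypothetical formatter: stands in for whatever "dork" would really do.
    return f"[{value}]"


# Every exception lives in this one table, keyed by "abbrev.part".
EXCEPTIONS = {
    "GLRG.2": zork,
    "FLOG.1": dork,
    "SCCR.3": dork,
}


def custom_fmt(abbrev_ref: str, p: int, value: str) -> str:
    """Apply the special formatter for this part if one exists, else pass the value through."""
    formatter = EXCEPTIONS.get(f"{abbrev_ref}.{p}")
    return formatter(value) if formatter else value
```

Reading down the table, it is easy to see that FLOG.1 and SCCR.3 share `dork`, which is the kind of pattern that would be harder to spot if each lived in its own subclass.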

Notes and Enhancements to Medical Example
