Reuse Issues

REUSE

Comments on Reuse, Wiki discussion, and "Copy-and-Paste"
Updated: 3/1/2005

The Wiki Discussion Link:
http://c2.com/cgi/wiki?OoHasFailed (subject to change)

Comments and Miscellaneous Extracts From Above Link

My comments are in regular font while the quotes are in italics.

I don't understand the top discussion but what I feel is: OO is good. No one is disputing that. It is a better way to design most software. Its abstractions match much closer to real-world problems.

We have already seen that OOT does not model the real world better.

Where OO has failed (and I and others were making that point in another topic) is re-usability. This is not news. It's been written about for years.

I would like to get references to those studies.

So if you ask me and many others, OO has failed. That's why COM, CORBA, and JavaBeans/RMI/EJB were invented. Many CIOs have NOT seen their OO investments paid off. Many (including me) think that components are the answer. That is true reuse. Plug a component in and have the container deal with it. I believe that the wave we are entering now is Component Based Programming, which will supersede in some ways OO, as well as work with it (you use OO to create many components).

This is one reason why VB-ish stuff sells so well: there are more components for it than anything. (They don't all work well, but there are often alternative vendors.)

Also note that OOP is not the only way to create components. I used to use FORTRAN graphics components (non-OO) quite successfully. (And FORTRAN is not the most flexible procedural language.)

"OO Is Not All That Bad" said my round table colleague (an OO expert). He said, "I just joined a project written by classic C programmers, and... (he related several interesting and horrible experiences)... and it has been a really long time since I have seen code written that way. And there is a lot of code out there written that way. So I just want to remind myself out loud, that OO is Not All That Bad. In fact, I really like it."

This person may be setting up a false dichotomy. The "bad C code" he saw may not be the only way to solve a problem. (And using C as a standard for procedural programming is like using Yugos as a standard for cars, IMO.) I find that many OO fans are insufficiently informed about, or weak at, some procedural techniques. Compare good procedural to good OO, not bad procedural.

Also, the last sentence about "really liking it" sounds more like a personal preference than any concerns about whether OO is delivering benefits to all. It reflects my "mindfit" claim (software engineering is a subjective art, not a science).

I am concerned that the programming community hasn't learned Parnas' 1972 lesson about encapsulating design decisions, and are whacking at the keyboard with the latest rage, Java in this case, programming the way they did before. The failure of OO, to me, is in the minds and hearts of 10s of thousands of programmers, who haven't gotten it, and probably at this point, still won't for at least the next decade. So the industry has "moved on" without incorporating the dominant lesson of OO - encapsulation of design decisions.

This harks back to our "population fit" argument, and "design cars for the average driver, not Richard Petty (a racing whiz)."

Further, there are insufficient feedback mechanisms to reward long-term planning, even if it is "properly learned". Knowledge and execution of knowledge are not the same.

Actually, I think OO is hard. In procedural programming you can sort of start at one end and meander around and sooner or later maybe answers are coming out. Objects make that kind of coding harder, not easier. .... Ordinary COBOL is a good comparison. You might write a subroutine paragraph from time to time, if you think of it, but mostly you just go along from beginning to end until done. Doing the same thing in objects, you get some big method and all the OO guys tell you you have to break it down. You break it down and they tell you you have to break up your class into separate classes. You do that and lo and behold they make no sense (because objects are hard) and you have a program that is harder to write and understand than the big blob would have been. .... It has always been hard to write really good code. Objects make really good code easier to write. IMO, objects make it harder to write decent code than it was in procedural. Most programmers are in the decent range, so OO doesn't help them.

Interesting viewpoint.

Actually, unless a developer has solved a very similar problem before, the first run will almost always have poor abstraction no matter which paradigm is used. Decent abstractions only come about with experience with the subject matter. Thus, unless you are likely to do a similar application again, trying to cleanly abstract the first round beyond a rudimentary level will often be a waste of time. (This does not mean the first round is not functional, just that it will have what turn out to be awkward structures for the given domain.)

Software reuse is a management problem, not a technical problem. Always has been, always will be.

[from "re-use" link]

Amen Brother!

The Reuse Dilemma

Problems with inter-application reuse

Reuse is one of those Holy Grails of software engineering. Everybody agrees that it is good to have or do, but in practice it usually ends up trickier than originally planned.

Reuse can be divided into "copy reuse" and "reference reuse". Copy reuse is where you copy parts of another application or code base and change them to meet local needs. Reference reuse, on the other hand, requires that the same code base be used by all clients (users) of the shared component.

Copy reuse is much simpler to perform than reference reuse (RR). However, it is often frowned upon because fixes or improvements don't automatically propagate to the multiple copies. This article looks at why RR is tough.

One of the reasons is the fragile parent problem. If we accidentally break the shared component, then all clients of our component "inherit" the bug.
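A minimal Python sketch of the difference, assuming a hypothetical shared helper named format_phone (none of these names come from the article). Clients that reference the shared function see every later change to it, whether a fix or an accidental break, while a pasted copy stays as it was on the day it was copied.

```python
# Hypothetical shared component: every reference-reuse client calls this one function.
def format_phone(digits):
    return digits[:3] + "-" + digits[3:]

# Copy reuse: a client pasted the same body into its own code base.
def format_phone_copy(digits):
    return digits[:3] + "-" + digits[3:]

def billing_label(digits):       # reference-reuse client
    return "Billing: " + format_phone(digits)

def shipping_label(digits):      # another reference-reuse client
    return "Shipping: " + format_phone(digits)

def legacy_label(digits):        # copy-reuse client
    return "Legacy: " + format_phone_copy(digits)

clients = (billing_label, shipping_label, legacy_label)
before = [client("5551234") for client in clients]

# Someone edits the shared component and accidentally swaps the two halves.
# Rebinding the name here stands in for changing the shared source file.
def format_phone(digits):
    return digits[3:] + "-" + digits[:3]

after = [client("5551234") for client in clients]

for old, new in zip(before, after):
    print(f"{old!r} -> {new!r}")
```

Both referencing clients pick up the break at once; the copy does not. Had the change been a fix instead, the same two clients would have received it for free and the copy would have missed it.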

Another problem is maintaining backward compatibility. We have to support all existing features regardless of what new features we add. Copy reuse generally does not have this problem because we can alter the copy to suit our new needs.

But, the largest problem seems to be interface bloat. The features used by one client of a component may not be the same features used by another. To satisfy more clients, we have to add yet more features. The component has to carry the total set of all possible features used by any and all of the clients.

Take an email component. Our initial requirements might be satisfied with a simple interface.

   to = "bob@here.com"
   from = "notif@inc.com"
   title = "re: meeting"
   msg = "The meeting has been canceled. Thanks."
   sendMail(to, from, title, msg)

However, future clients of the component may want multiple recipients, CCs, BCCs, attachments, MIME, form letter templates, virus scanning, etc. No single client will want all of these. (Most embedded or mass email messages fit a fixed pattern; medical appointment reminders, for example.)

In fact, the majority may want only the original set of features shown above. But to make it a "generic email component", it has to possess features that may not be used very often.
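As a rough illustration of how such a "generic" interface grows, here is a Python sketch. The function send_mail and every optional parameter on it are hypothetical, chosen only to mirror the feature list above, and nothing is actually sent: the function just builds a record. The article's "from" argument is renamed sender because from is a reserved word in Python.

```python
import inspect

def send_mail(to, sender, title, msg, *, cc=(), bcc=(), attachments=(),
              mime_type="text/plain", template=None, virus_scan=False):
    """Hypothetical generic email component; returns a message record."""
    recipients = [to] if isinstance(to, str) else list(to)
    body = template.format(msg=msg) if template else msg
    return {
        "to": recipients, "from": sender, "title": title, "body": body,
        "cc": list(cc), "bcc": list(bcc), "attachments": list(attachments),
        "mime_type": mime_type, "virus_scan": virus_scan,
    }

# The original client still passes only the four original arguments.
simple = send_mail("bob@here.com", "notif@inc.com", "re: meeting",
                   "The meeting has been canceled. Thanks.")

available = len(inspect.signature(send_mail).parameters)  # what the interface carries
used = 4                                                  # what this client touches
print(f"{used} of {available} parameters used")
```

The simple client is unaffected by the extra options, but everyone who reads or maintains the component now has to deal with all ten parameters.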

Well, to avoid this, one might say, "I just won't include some of the lesser-used features". However, if somebody's requirements demand, say, BCCs ("blind carbon copies"), then they may have to find a completely different email component that does support them. 95 percent is often not good enough. That 5 percent can disqualify it from being considered "generic" in the eyes of the potential client. This is the nice thing about copy reuse: if it is missing something, we can simply add it to the copy without interfering with other users of our code, because there are none at the point of copying.

E-mail is probably not a very good example on a larger scale because it is based on a rather widely-accepted Internet standard with many implementations and is not likely to change significantly any time soon. On the other hand, custom-made software is far more subject to interface overhauls and additions because there is only one implementation and usually fewer committees to go through.

It thus follows that the more generic the component is, the more features it will have. Having lots of features is a mixed blessing. Although the component may support a lot of features that we need without new coding, so many features also make it harder to learn, understand, and/or navigate the interface.

In a popular word-processing package I once accidentally hit a key that put a line (border) into the middle of a simple letter. But, I found out that I could not get rid of that line by simply selecting it with the mouse and pressing Delete or Backspace or Cut. All the common editing approaches did not work. I was almost ready to use white-out on the printed page. After a bunch of digging it turned out to be a "paragraph border". I had to select special, rarely-used (by me) commands to get rid of it. This is an example of being overwhelmed by the size of the interface or feature base.

[Graph legend: AF = Available Features; UF = Used Features Per App.]

This graph shows the approximate relationship between the features available in a component intended to be generic and the features actually used by any given application. To satisfy all the clients (apps using it), we have to put more and more features in as more applications use it over time, or as we target more applications. However, the quantity of features used by any single application is relatively constant. (I assumed it increases slightly because more features are available.)
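The same relationship can be sketched numerically. The client feature sets below are invented for illustration and are not data from the article; the only point is that AF is the union of everything any client needs, while UF is the size of one client's own set.

```python
# Hypothetical feature needs of four client applications (illustrative only).
clients = [
    {"to", "from", "title", "msg"},
    {"to", "from", "title", "msg", "cc"},
    {"to", "from", "title", "msg", "attachments"},
    {"to", "from", "title", "msg", "bcc", "template"},
]

available = set()
rows = []
for app_count, used in enumerate(clients, start=1):
    available |= used  # the shared component must carry the union of all needs
    rows.append((app_count, len(available), len(used)))

for app_count, af, uf in rows:
    print(f"apps={app_count}  AF={af}  UF={uf}")
```

With these made-up sets, AF climbs from 4 to 8 as clients are added, while UF stays between 4 and 6.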

One final problem with reference reuse is that even if there are lots of features, the component may still be missing the one key feature that we really want. Thus, we may still be forced into a big workaround, into using another component, or possibly even into copy reuse.

In summary, cross-application reuse has some inherent drawbacks that should be weighed in the decision of whether to use reference reuse or copy reuse.

On Copy-and-Paste

I don't like the concept of copy-and-paste (copy reuse), but the alternatives require a skill and experience level higher than many companies want to pay for. If the staff's skills cannot handle decent factoring, then one is often stuck with copy-and-paste. One place I worked for got on my case for refactoring a situation with a duplication factor of 8. They had some developers in the past who had problems reading and modifying highly factored code (not mine) and it caused some major stirs. I reluctantly have to agree with them somewhat. There are several risks or downsides to refactoring:

Other readers sometimes cannot handle the higher abstraction or indirection, requiring higher-skilled (more expensive) developers. Companies want Plug Compatible Interchangeable Engineers. Skilled people are hard to find, not necessarily because there are so few, but because they are hard to detect. And once found, a department may have difficulty holding on to them because of HR's wage restrictions and other office politics.
Good abstractions and refactorings are often hard to get right. Factoring can be messy too if not done skillfully. I have to admit to occasionally making software that was hard to modify because I over-factored the "wrong" things. I was unable to anticipate the kind of changes that later kicked the design in the nuts. I've improved my estimation skills to reduce these, but it is hard to train others for such. That knowledge generally comes from experience alone and is hard to document and explain.
The nature of change is that sometimes it will just not favor some kinds of factorings (Life is a Big Messy Graph) creating Discontinuity Spikes (big overhauls). Although it may reduce the average modification effort, the occasional spikes make it more difficult to estimate the time and cost of change. Managers don't like uncertainty.
It may require strong domain knowledge to know what to factor when, but there is the complex issue of Why Is Domain Knowledge Not Valued.
Offshoring has made labor cheap enough that communication is the bigger bottleneck, not the effort to find and make the same change to multiple spots.
Languages or tools may not be conducive to good factoring. For example, languages that lack named parameters may require finding every instance of a function call to add new parameters, or result in the need to pass objects and/or associative arrays instead of the simpler positional function call in order to have parameter flexibility.
Future Discounting. The cost of tomorrow's mess is "discounted". I cannot technically argue against this popular finance principle.
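The point in the list above about named parameters can be sketched in Python. The function names are hypothetical, and the "positional" version only imitates a language without named or default parameters, since Python itself has both.

```python
# Imitating a language without named/default parameters: the new option is a
# required positional argument, so every existing call site must be edited.
def make_report_positional(title, rows, page_size):
    return f"{title}: {len(rows)} rows, {page_size} per page"

try:
    make_report_positional("Sales", [1, 2, 3])   # an old, unedited call site
    old_call_survived = True
except TypeError:
    old_call_survived = False

# The same option added as a named parameter with a default: old call sites
# keep working, and only callers that care mention the new option.
def make_report_named(title, rows, *, page_size=50):
    return f"{title}: {len(rows)} rows, {page_size} per page"

unchanged = make_report_named("Sales", [1, 2, 3])
customized = make_report_named("Sales", [1, 2, 3], page_size=10)

print(old_call_survived)
print(unchanged)
print(customized)
```

Without that flexibility, the choice is between hunting down every call and passing a dictionary or object in place of simple arguments, which is the trade-off the list item describes.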

As a developer, excessive copy-and-paste bothers me. But from a manager's or owner's perspective the cost of dealing with duplication may perhaps still be cheaper than the alternatives. The nature of office politics and limits on staffing flexibility play a role in this. I will generally factor my own code, but will not be militant about insisting on it with others.
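The Future Discounting item in the list above can be made concrete with the standard present-value formula. The dollar amounts, rate, and time span below are invented purely for illustration.

```python
def present_value(future_cost, annual_rate, years):
    """Standard discounting: a cost paid `years` from now, valued in today's money."""
    return future_cost / (1 + annual_rate) ** years

# Hypothetical numbers: refactor today, or clean up a bigger mess in four years.
refactor_now = 10_000
cleanup_later = 14_000
pv_later = present_value(cleanup_later, annual_rate=0.10, years=4)

print(f"cleanup later, in today's money: {pv_later:.2f}")
print("deferring looks cheaper" if pv_later < refactor_now else "refactoring now looks cheaper")
```

Under these assumed figures the later, nominally larger cleanup is worth less than the refactoring cost today, which is how a manager can rationally prefer the duplication.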

See Also:
Software Engineering Stories
Goals and Metrics
