Web Development Stifled

Updated: 5/9/2002


Many B-to-B and intranet applications are being built to replace client/server (C/S) applications that used to be built with the likes of Visual Basic, Delphi, and PowerBuilder. This is mostly because web-based solutions are easier to deploy. However, in my experience they are about two to three times more complicated to develop and manage from a developer's perspective than an equivalent C/S GUI application.

For simple forms, traditional web-centric techniques are fine. However, business is demanding more complex web forms. In my opinion, web programming for data-driven business applications is unnecessarily complex compared with the client/server approaches.

The biggest complication is the infamous "state management". In web-based apps one has to spend a lot of code on state management that C/S apps never needed. Unfortunately, the state management solutions usually supplied are just Band-Aids rather than a total rethink. (More on "Band-Aids" later.)

This makes one think: what if we *kept* the state between web submits (CGI calls)? Why the heck do we have to keep dumping our program state and re-creating it? Well, there are ways to do this. However, the program state is only part of the issue. "View state management" is probably a more important issue than program state. After all, even in the event-driven world of C/S GUIs, you often don't know what the state is when entering an event either. You have to ask the system. Events are not supposed to be context-sensitive.

The "view state" we are concerned with here is the web page as seen by the user. It is the combination of forms, widgets, arrangement, and content. In C/S applications, changes to the view are generally incremental. If you make a change to the view, it *stays there* unless explicitly changed. Web pages usually have to be redrawn in order to show any application-induced changes. (There is the DOM+JavaScript approach, but it is very sensitive to client versions, kludge-filled, and buggy.)

The redraw step is indeed a technical requirement for most web apps. However, why should it be something that the application programmer has to manually manage?

Let's consider a UI situation. Let's say we have a form field that we want to appear (show) only if certain conditions (business rules) are met. Below is pseudo-code for a C/S version of this:

....
if AssigneeType = "Employee" then
   formx.EmpNumBox.Hidden = False
end if
....

This is pretty straightforward. The code for a web version may actually look similar. However, there is one important difference. In the web version, the equivalent code would likely be nested within all the *other* page generation code, while in the C/S version, it would likely be in "event" code, or at least code dedicated to a task other than managing the visual appearance of the form. The web version typically resembles:

....
draw widget 3
draw widget 4
draw widget 5
if AssigneeType = "Employee" then
   draw widget 6
end if
draw widget 7
draw widget 8
....etc....

Another web alternative is to set a flag. It may resemble:

// processing code
....
if AssigneeType = "Employee" then
   EmpNumBox_Hidden = False
end if
....

// display code
....
draw widget 3
draw widget 4
draw widget 5
if Not EmpNumBox_Hidden then
   draw widget 6
end if
draw widget 7
draw widget 8
....etc....

With this approach, a global or semi-global flag is set in one part and read by another part, the display logic. This can keep the code of the display section simpler, but potentially requires managing a bunch of script-wide variables and mode indicators. It is like having to leave a message for the night shift to turn off the lights in a given room instead of just switching them off yourself.

A further difficulty is making sure the same flags or conditions are set when a redraw is done. For example, if there is an error, we display the error message along with the current in-progress form contents, and those same flags and conditions must somehow be communicated to the next script execution. Sometimes this is done by passing the flag as a URL parameter or hidden field; other times it may be done by re-executing the same calculations that set the flags.
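To make the hidden-field variant concrete, here is a minimal Python sketch. The function names `render_form` and `read_flags` and the field name `emp_num_hidden` are hypothetical, chosen only for illustration; the point is that the flag has to be written into every redraw and parsed back out on the next submit.

```python
from html import escape
from urllib.parse import parse_qs

def render_form(emp_num_hidden: bool, error: str = "") -> str:
    """Redraw the whole form; the flag rides along as a hidden field."""
    parts = ["<form method='post'>"]
    if error:
        parts.append(f"<p class='error'>{escape(error)}</p>")
    # The flag must be re-sent on every redraw, or the next script run loses it.
    parts.append(
        f"<input type='hidden' name='emp_num_hidden' value='{int(emp_num_hidden)}'>"
    )
    if not emp_num_hidden:
        parts.append("<input type='text' name='emp_num'>")
    parts.append("<input type='submit'></form>")
    return "\n".join(parts)

def read_flags(post_body: str) -> bool:
    """Recover the flag from the submitted form body on the next execution."""
    fields = parse_qs(post_body)
    # Assumption: when the flag is missing, treat the field as hidden.
    return fields.get("emp_num_hidden", ["1"])[0] == "1"
```

Every extra flag adds another hidden field and another line of parsing, which is the bookkeeping burden described above.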

It is often suggested that one should "separate business logic from presentation". This is good advice, but often easier said than done in web applications. Part of the problem is that generating HTML is pretty much a linear process. You can't easily just jump into the middle of the document and insert or change something already generated. Thus, in practice one ends up either putting the business logic close to the rendering logic, or using messy schemes to have non-display sections communicate with display sections. We need to think about how to make web development more like GUI development. What is the "essence" of the difference?

C/S GUI programming interfaces are usually more "random access" than HTML generation, which is basically a sequential, linear protocol. In good C/S GUI frameworks, you can change just about anything anywhere on any screen at any time. This is key to their power and flexibility. Due to the HTML browser paradigm, we cannot directly do this in a smooth way with web apps (see footnote). However, there are ways to emulate some of this directness.

One approach is to have a persistent "working copy" of the HTML form images or views on the server for each user. (By "image", I don't mean a graphic bit-map, but a high-level model.) Rather than concern ourselves with recreating the form, plus any changes, for each web transaction, we would only have to change our existing (virtually) persistent view and then simply echo the updated version back to the client. In other words, we "only change the changes". Plus, we can "address" the form elements in a more random-access way. A web transaction would be more like a C/S GUI "refresh" operation. Just because HTML has to be sent sequentially does not mean it has to be generated and changed sequentially.

This would allow us to have server-side code resembling:

event submitButton4_pressed 
    ....
    if AssigneeType = "Employee" then
        formx.EmpNumBox.Hidden = False
    end if
    ....
    send(formx)
end event

(I don't actually propose a formal "event" block be added to a language. This is only pseudo-code.)

The expression "formx.EmpNumBox.Hidden" refers to our server-based web form model. We would set up the model upon initialization, instantiation, or development, and then change the user's copy as we go along processing events.
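The pseudo-code above could be approximated in Python roughly as follows. This is only a sketch under stated assumptions: the in-memory dictionary `user_views` stands in for whatever persistence mechanism is actually used, and `FORM_TEMPLATE`, `get_view`, `send`, and `submit_button4_pressed` are hypothetical names, not part of any existing framework.

```python
import copy
from html import escape

# Master template of the form: widget name -> attribute settings (hypothetical layout).
FORM_TEMPLATE = {
    "AssigneeType": {"Hidden": False, "value": ""},
    "EmpNumBox": {"Hidden": True, "value": ""},
}

# Per-user working copies; a plain dict stands in for real persistent storage.
user_views = {}

def get_view(user_id):
    """Return the user's working copy, creating it from the template on first use."""
    if user_id not in user_views:
        user_views[user_id] = copy.deepcopy(FORM_TEMPLATE)
    return user_views[user_id]

def send(view):
    """Render the view sequentially, even though it was changed in random order."""
    lines = []
    for name, widget in view.items():
        if not widget["Hidden"]:
            lines.append(f"<input name='{escape(name)}' value='{escape(widget['value'])}'>")
    return "\n".join(lines)

def submit_button4_pressed(user_id, posted):
    """Event handler: change only what changed, then echo the view back."""
    view = get_view(user_id)
    view["AssigneeType"]["value"] = posted["AssigneeType"]
    if posted["AssigneeType"] == "Employee":
        view["EmpNumBox"]["Hidden"] = False
    return send(view)
```

The handler never redraws widgets one by one; it touches the one attribute that changed and leaves the rendering to `send`.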

There are many benefits to this approach. For example, suppose the user submits a form, and we want to send them a warning message with a Yes/No choice asking, "Do you wish to continue?" If they continue, then we re-display the form (along with any changes). Managing such an interruption of an in-progress form is messy to program the old web way; you have to find a way to save the current form state and then put it back later such that the form does not know that there was an interruption. (One can put the message on the same page as the form, but this is not always the best interface.)
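One way to picture the interruption case is a per-user stack of views, where the prompt is pushed on top of the form and popped off again afterward. This is a hedged sketch, not a prescribed design; `view_stacks`, `show`, `current`, `ask_confirm`, and `answer_confirm` are hypothetical names.

```python
# Per-user stack of views; the top of the stack is what gets rendered next.
view_stacks = {}

def show(user_id, view):
    """Put a view on top of whatever the user currently has."""
    view_stacks.setdefault(user_id, []).append(view)

def current(user_id):
    """Return the view that would be rendered for this user."""
    return view_stacks[user_id][-1]

def ask_confirm(user_id, question):
    """Interrupt whatever is on screen with a Yes/No prompt."""
    show(user_id, {"type": "confirm", "question": question})

def answer_confirm(user_id, yes):
    """Drop the prompt; the form underneath is exactly as it was left."""
    view_stacks[user_id].pop()
    return yes, current(user_id)
```

Because the form's view was never torn down, nothing has to be saved and restored around the prompt.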

There are three top candidates for implementing such a framework. The first approach is to use XML (tag-based) documents as the between-session "images", and then have an API to alter those images as events are processed.
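As a rough illustration of the XML variant, the sketch below uses Python's standard `xml.etree.ElementTree` module. The element and attribute names (`form`, `widget`, `hidden`) and the helper `set_attrib` are assumptions made up for this example, not a proposed schema.

```python
import xml.etree.ElementTree as ET

# A hypothetical stored "image" of one user's form.
image_xml = """
<form name="formx">
  <widget name="AssigneeType" kind="select" hidden="false"/>
  <widget name="EmpNumBox" kind="text" hidden="true"/>
</form>
"""

def set_attrib(root, widget_name, attrib, value):
    """Find a widget by name anywhere in the image and change one attribute."""
    widget = root.find(f".//widget[@name='{widget_name}']")
    if widget is None:
        raise KeyError(widget_name)
    widget.set(attrib, value)

root = ET.fromstring(image_xml)
set_attrib(root, "EmpNumBox", "hidden", "false")

# The serialized image would be stored until the user's next transaction.
updated_xml = ET.tostring(root, encoding="unicode")
```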

The second approach involves somehow persisting (or keeping) a RAM-based object model of the document images between web transactions. This solution is more language-dependent than the first and third approaches.

The third approach, which is my favorite, is using relational tables (or at least multi-index-able tables). A master template is kept of the document construction. We will call this the "virgin document". A copy of the virgin document is made upon log-in or form initialization and stored with the user-ID as one of the table fields so that we can find and change it during later transactions (events) without interfering with other users' page/form images. The final step of a given web transaction is to call a rendering engine to render our table-based image model to an HTML document. This HTML result is what goes to the user's browser. (See Table GUI Workings for a similar description, including a flow diagram. See also Data Dictionary Examples.)
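The table-based flow described above (template, per-user copy, render) might look something like this in Python with the standard `sqlite3` module. The table and column names (`template_widgets`, `user_widgets`, `seq`, `hidden`) and the helpers `init_form` and `render` are hypothetical; a real framework would need more columns and widget types.

```python
import sqlite3
from html import escape

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE template_widgets (form TEXT, seq INTEGER, name TEXT, label TEXT, hidden INTEGER);
CREATE TABLE user_widgets (user_id TEXT, form TEXT, seq INTEGER, name TEXT,
                           label TEXT, hidden INTEGER, value TEXT DEFAULT '');
INSERT INTO template_widgets VALUES
  ('formx', 1, 'AssigneeType', 'Assignee type', 0),
  ('formx', 2, 'EmpNumBox', 'Employee number', 1);
""")

def init_form(user_id, form):
    """Copy the master template into the user's own working rows."""
    db.execute(
        "INSERT INTO user_widgets (user_id, form, seq, name, label, hidden) "
        "SELECT ?, form, seq, name, label, hidden FROM template_widgets WHERE form = ?",
        (user_id, form),
    )

def render(user_id, form):
    """Final step of a transaction: turn the user's table rows into HTML."""
    rows = db.execute(
        "SELECT name, label, value FROM user_widgets "
        "WHERE user_id = ? AND form = ? AND hidden = 0 ORDER BY seq",
        (user_id, form),
    )
    return "\n".join(
        f"<label>{escape(label)} <input name='{escape(name)}' value='{escape(value)}'></label>"
        for name, label, value in rows
    )
```

An event handler would then issue an ordinary UPDATE against the user's rows and call `render` as its last step.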

There is a fourth approach: store the view state in hidden, compressed HTML form fields to be re-loaded into RAM on the next pass. The problem with this approach is that we cannot easily "interrupt" one form or screen with another, and then come back. Therefore, I will not discuss it any further.

With the table-based approach, the app developer may never even have to deal with direct HTML any more (other than text decorations, like bold and italics, perhaps). They may use C/S GUI-like APIs.

Actually, this approach is not that much different from the first approach, except that we are using tables instead of tag-based documents (like XML) as the persistence or state-storing mechanism. With well-designed APIs, the application programmer may never know the difference, and may even switch as needed without changing APIs.

However, with relational tables we can issue powerful database or query commands for certain group actions, like disabling sets of fields based on looked-up user access levels or role settings. Also, one may find it trickier to manage widget ordering and insertions in XML frameworks; you practically have to re-invent a database anyhow to solve such issues. Databases are generally more random-access-friendly to API builders. It is also easier to add custom columns to the rendering database, such as to indicate which fields to save to the database, which to hide from certain user levels or roles, etc. In other words, the field table can be used for more than just display issues. This can reduce the need to repeat field name lists over and over in application code.
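A group action of this kind could be a single SQL statement. The sketch below uses Python's `sqlite3` module; the `users` table and the custom `min_level` and `disabled` columns are assumptions invented for the example.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE users (user_id TEXT, access_level INTEGER);
CREATE TABLE user_widgets (user_id TEXT, name TEXT, min_level INTEGER,
                           disabled INTEGER DEFAULT 0);
INSERT INTO users VALUES ('alice', 3), ('bob', 1);
INSERT INTO user_widgets (user_id, name, min_level) VALUES
  ('alice', 'Salary', 3), ('alice', 'Notes', 1),
  ('bob', 'Salary', 3), ('bob', 'Notes', 1);
""")

# One statement disables every field whose required level exceeds the user's own.
db.execute("""
UPDATE user_widgets
   SET disabled = 1
 WHERE min_level > (SELECT access_level FROM users
                     WHERE users.user_id = user_widgets.user_id)
""")

disabled = db.execute(
    "SELECT user_id, name FROM user_widgets WHERE disabled = 1 ORDER BY user_id, name"
).fetchall()
```

Here only the 'Salary' field in the low-level user's copy ends up disabled; no application code had to loop over field names.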
A procedural API call based on our prior example may resemble:

setAttrib(formx, "empNoBox", "Hidden", False)

Or, with fancier internal parsing:

setAttrib("formx.empNoBox.Hidden", False)

There are many ways to arrange the APIs and their syntax; these are just some example possibilities. One advantage of the non-object approach is that different languages and tools may be able to alter or read the same document specifications, because the "language" of the documents is not tied to any particular application programming language. Your UI designer tool may not need to know anything about your application programming language.

You could copy the document specifications into RAM-based object structures if you really want more OOP, but it is more overhead. One interesting but risky approach is to mirror or model the DOM on the server, and then re-issue the same methods or attribute settings at the client.

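For what it is worth, the two setAttrib call shapes shown earlier could both be served by one small function. This Python sketch assumes the document specifications are held as nested plain data; the `documents` structure and the snake_case name `set_attrib` are illustrative assumptions only.

```python
# Document specifications as plain data: form -> widget -> attribute -> value.
documents = {"formx": {"empNoBox": {"Hidden": True}}}

def set_attrib(*args):
    """Accept set_attrib(form, widget, attrib, value)
    or set_attrib("form.widget.attrib", value)."""
    if len(args) == 4:
        form, widget, attrib, value = args
    elif len(args) == 2:
        path, value = args
        # The "fancier internal parsing": split the dotted path into its parts.
        form, widget, attrib = path.split(".")
    else:
        raise TypeError("expected 2 or 4 arguments")
    documents[form][widget][attrib] = value
```

Since the model is plain data rather than language-specific objects, the same specifications could in principle be read or altered by other tools.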
Regardless of the implementation and API decisions, the key concepts of this alternative approach are:

1. A persistent view model of HTML forms or documents (per user).

2. Random access addressing of the model widgets, content, and settings.

(My pet acronym for this approach is "RAPV", for Random Access Persistent View.)

If this approach is so great and brings back some of C/S's simplicity, then why has it not been widely adopted yet? Am I simply smarter and ahead of the times in thinking this way? Perhaps. But a more palatable and believable reason has to do with the politics of hardware performance.

Here is an excerpt from Microsoft's "Web Applications in the .NET Framework" (ISBN 0-7356-1445-8):

"Having this [.NET] sequence for [web] page processing may seem convoluted compared to GUI [client/server] development. However, this process of minimizing object lifetime makes it possible to create Web applications that are very fast and scale to large numbers of simultaneous users."

Here Microsoft is admitting in black and white that they chose efficiency over programming productivity when designing .NET ("dot-Net"). Microsoft has given much thought to making web development more like Visual Basic development, but stopped short due to performance concerns.

The problem is that many B-to-B, and especially intranet, applications *do not need* the massive performance scaling that, say, Walmart or eBay need. Such corporate applications may peak at, say, 20 transactions per minute, while Walmart may have 2,000 a minute. Even if some parts of business applications by chance do have a lot of traffic, usually these parts are limited to a small subset of the entire application. The speed-sensitive parts could still be coded the old-fashioned way without requiring the entire app to be coded that way.

(Note that this is not to say view-based solutions are necessarily "slow". It is just that they don't scale to thousands of users as well. They will "max out" at a lower level of transactions. 20 "normal" transactions per minute is not going to overwhelm a typical server though. True, if you have a lot of such applications, you may need more servers, but even during the "Tech Recession", programming time is more expensive than more servers.)

Why hold *all* web development to the speed-at-any-cost standard, when many applications would benefit more from frameworks that enhance programmer productivity?

Part of the problem is the "brochure contest". Benchmarks for scalability and speed are more objective and easier to measure than programmer productivity benchmarks. Thus, it is easier for managers to get reliable information on machine performance than programmer performance when making a purchasing or framework decision. In other words, machine performance makes for better brochure points than programmer performance.

But, just because something is easier to measure does not mean it is necessarily more important. For example, in manufacturing it is easier to count the number of products produced and shipped than it is to measure quality. However, poor quality can make a company go bankrupt in the end if you ignore quality in order to increase quantity.

Just because the big-name vendors are trapped in the "brochure contest", does not mean every developer is or should be. The trick is to develop and perfect open-source frameworks for view-based solutions. If enough buzz builds up about programmer productivity and code simplicity, then more shops will allow developers to use RAPV frameworks, and commercial vendors may start to hop on the bandwagon. Microsoft might even release "dot-RAPV". (However, "dot" in any name now brings back memories of sock puppets running out of VC.)

GUI development was also slow to catch on in the business world in the 1980's for hardware and performance reasons. If history is any guide to the future, then you might as well get used to RAPV-like solutions anyhow.

(Note that I think web-based solutions could also be more like client/server GUIs on the client side if protocols like SCGUI, an HTTP-friendly XML GUI-Browser draft protocol, were adopted. But that is another topic for another day. I am mostly addressing the server side here, not the client side.)

