A discussion in the comp.objects newsgroup got me thinking. Can OOP and TOP be merged? A table record actually is very similar to an OOP object. Fields can be seen as properties in many cases.
I am assuming here that the actual (physical) representation of the persistent (stored) objects in OOP uses familiar relational tables and indexes. (If there is a better way, I would like to know about it.) This does not mean that programmers use relational techniques and syntax, but that low-level storage and retrieval mechanisms are basically indexed tables.
Converting table data into objects and back apparently requires some sort of mapping between table records and fields on one side and in-memory objects on the other. These memory objects are organized using OO techniques instead of relational techniques.
This mapping obviously requires extra planning and programming. (Some say it can be made automatic, but have yet to show how.) The big question is whether or not this "mapping" step is worth the alleged organizational improvements handed to us by OO techniques. Supposedly, the mapping allows one to transcend the primitive organization of relational tables.
Before exploring this question more, let's first review the alleged advantages of keeping structures in a more table-oriented organization. Thus, before leaving Alaska, we will review the niceties of Alaska.
Now we are going to look at a typical object-oriented database. Its classes and properties are:
   Class ADDRESS
      Name, type text
      Section, type text (dept., suite, etc.)
      Street_Addr, type text
      City, type text
      State, type text (Assume U.S. for simplicity)
      Zipcode, type text

   Class CONTACT
      Name, type text
      Phone, type PhoneNum
      Address, type Address

   Class CLIENT
      CompName, type text (company name)
      Primary_Address, type Address
      Sales_Contact, type Contact
      Billing_Contact, type Contact

   Class VENDOR
      CompName, type text
      Primary_Address, type Address
      Primary_Contact, type Contact

Assume this is a fairly large company, or better yet, assume that we want our system to be able to scale well for very large companies. Until a better technology comes along, we will need to use indexed tables to represent the physical data.
Although there are many ways to organize the tables, we will make a dichotomy of two organizational types. One type will be "element-oriented" (EO), and the other will be "packet-oriented" (PO). EO tends to break things into logical chunks, while PO tends to keep data in self-contained chunks.
The EO tables would closely mirror our object-oriented structure:
   Table ADDRESS
      Address_ID, type primary key
      Name, type text
      Section, type text
      Street_Addr, type text
      City, type text
      State, type text
      Zipcode, type text

   Table CONTACT
      Contact_ID, type primary key
      Name, type text
      Phone_id, type key to Phone Table
      Address_id, type key to Address Table

   Table CLIENT
      Client_ID, type primary key
      CompName, type text (company name)
      Primary_Address_ID, type key to Address Table
      Sales_Contact_ID, type key to Contact Table
      Billing_Contact_ID, type key to Contact Table

   Table VENDOR
      Vendor_ID, type primary key
      CompName, type text
      Primary_Address_ID, type key to Address Tbl.
      Primary_Contact_ID, type key to Contact Tbl.

Basically all we did was add primary keys to each table and use these as "pointers" to the corresponding entries in the element tables. (Users of relational tables should be familiar with this process.) Note that the "Name" in the Address table is being ignored, or overridden, by the Contact Table.
Our Packet Oriented (PO) Tables would look like:
   Table CLIENT
      Client_ID, type primary key
      CompName, type text
      Primary_Address_Section, type text
      Primary_Address_StreetAddr
      Primary_Address_City
      Primary_Address_State
      Primary_Address_Zipcode
      Sales_Contact_Name
      Sales_Contact_Phone
      Sales_Contact_Section
      Sales_Contact_StreetAddr
      Sales_Contact_City
      Sales_Contact_State
      Sales_Contact_Zipcode
      Billing_Contact_Section
      Billing_Contact_StreetAddr
      Billing_Contact_City
      etc...

   Table Vendor
      Vendor_ID, type primary key
      CompName, type text
      Primary_Address_Section
      Primary_Address_StreetAddr
      etc...

The PO organization has only two tables here. It is more "flat". The EO tables are more structured in that repeating portions are referenced instead of actually repeated. The EO layout provides some nice features. For example, if we wanted to add a fax number field to all the contacts, then we would only have to change the Contact table. With the PO setup, we would have to add at least 3 fields.
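The fax-number comparison can be checked with a small sketch. It uses Python's built-in sqlite3 module and trimmed-down versions of the tables above. The table names eo_contact, po_client, and po_vendor, the column name Fax, and the PO column names such as Sales_Contact_Fax and Primary_Contact_Name are assumptions made for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Trimmed-down EO and PO tables (only a few of the columns listed above).
conn.executescript("""
CREATE TABLE eo_contact (Contact_ID INTEGER PRIMARY KEY, Name TEXT, Address_ID INTEGER);
CREATE TABLE po_client (Client_ID INTEGER PRIMARY KEY, CompName TEXT,
                        Sales_Contact_Name TEXT, Billing_Contact_Name TEXT);
CREATE TABLE po_vendor (Vendor_ID INTEGER PRIMARY KEY, CompName TEXT,
                        Primary_Contact_Name TEXT);
""")

# EO: every contact lives in one table, so one new column covers them all.
eo_changes = ["ALTER TABLE eo_contact ADD COLUMN Fax TEXT"]

# PO: each embedded contact needs its own copy of the new column.
po_changes = [
    "ALTER TABLE po_client ADD COLUMN Sales_Contact_Fax TEXT",
    "ALTER TABLE po_client ADD COLUMN Billing_Contact_Fax TEXT",
    "ALTER TABLE po_vendor ADD COLUMN Primary_Contact_Fax TEXT",
]

for statement in eo_changes + po_changes:
    conn.execute(statement)

def column_names(table):
    # PRAGMA table_info returns one row per column; index 1 is the column name.
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

print(len(eo_changes), "change for EO;", len(po_changes), "changes for PO")
```

Every additional contact role in a PO table would add one more statement to po_changes, while eo_changes would stay at one.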
Is it possible, however, that the EO layout is overly structured? For example, to retrieve "company name of all clients who are billed in Chicago" would require going through three tables. This makes our system more fragile--if there is an indexing or referencing error, we are more likely to get mismatched address and contact info.
Traditional relational techniques are also tougher to use with EO. For example, under PO the SQL syntax for our sample query might be:
   Select compname from client where billing_contact_city = "Chicago"

In our EO setup, this would look more like:
   Select client.compname from address, contact, client
   Where address.city = "Chicago"
   and address.address_id = contact.address_id
   and client.billing_contact_id = contact.contact_id

If SQL were more EO-friendly, it would perhaps handle a syntax more like:
   Select client.compname from client where client.billing_contact.contact.address.city = "Chicago"

The links between the various tables would be automatically followed by this great new OO-SQL. Since the Contact table "inherits" the Address table fields, perhaps this can be shortened a little bit to:
   Select client.compname from client where client.billing_contact.address.city = "Chicago"

This would be much nicer than our four-line, traditional SQL example. Unfortunately, this type of SQL is not (yet) in common use.
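To make the idea more concrete, here is a rough sketch of how such a dotted path might be translated into ordinary joins. It is written in Python with the built-in sqlite3 module. The LINKS map, the path_to_sql function, and the sample rows (other than the company name "Bank of Clouds") are all hypothetical, and a real OO-SQL would need far more than this.

```python
import sqlite3

# Hypothetical link map:
# (table, link name) -> (target table, foreign key column, target key column)
LINKS = {
    ("client", "billing_contact"): ("contact", "billing_contact_id", "contact_id"),
    ("client", "sales_contact"): ("contact", "sales_contact_id", "contact_id"),
    ("contact", "address"): ("address", "address_id", "address_id"),
}

def path_to_sql(select_column, path):
    """Translate e.g. 'client.billing_contact.address.city' into a joined query."""
    root, *links, field = path.split(".")
    joins, current_table, current_alias = [], root, root
    for n, link in enumerate(links):
        target, fk, pk = LINKS[(current_table, link)]
        alias = f"t{n}"  # aliases keep repeated tables apart
        joins.append(f"JOIN {target} AS {alias} ON {current_alias}.{fk} = {alias}.{pk}")
        current_table, current_alias = target, alias
    return (f"SELECT {root}.{select_column} FROM {root} "
            + " ".join(joins) + f" WHERE {current_alias}.{field} = ?")

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (address_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE contact (contact_id INTEGER PRIMARY KEY, name TEXT, address_id INTEGER);
CREATE TABLE client (client_id INTEGER PRIMARY KEY, compname TEXT,
                     sales_contact_id INTEGER, billing_contact_id INTEGER);
INSERT INTO address VALUES (1, 'Chicago'), (2, 'Miami');
INSERT INTO contact VALUES (10, 'Pat', 1), (11, 'Lee', 2);
INSERT INTO client VALUES (100, 'Bank of Clouds', 11, 10), (101, 'Acme', 10, 11);
""")

# The short, path-style request...
sql = path_to_sql("compname", "client.billing_contact.address.city")
generated = conn.execute(sql, ("Chicago",)).fetchall()

# ...and the traditional three-table query it stands in for.
hand_written = conn.execute("""
    SELECT client.compname FROM address, contact, client
    WHERE address.city = 'Chicago'
      AND address.address_id = contact.address_id
      AND client.billing_contact_id = contact.contact_id
""").fetchall()

print(sql)
print(generated == hand_written)
```

The point of the sketch is only that the link information already sits in the key columns, so following it does not have to be spelled out by hand in every query.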
Note that perhaps a compromise between EO and PO could have been made -- put all the address attributes in the Contact table so as to reduce the hierarchy from three to two. Having a separate Address table seems to us a bit carried away.
Our problem is that the common OOP languages do not directly support EO tables (or any tables) to represent objects, even though they can closely, and perhaps automatically, map to OO structures.
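As a rough illustration of how mechanical that mapping could be, the sketch below builds nested objects straight from EO rows by following key columns. It uses Python's built-in sqlite3 module. The FOREIGN_KEYS map, the load function, the "<table>_id" naming convention, and the sample rows are assumptions for this example, not features of any existing OOP language.

```python
import sqlite3
from types import SimpleNamespace

# Assumed convention: these key columns point at a row in the named table,
# and each table's own primary key is called <table>_id.
FOREIGN_KEYS = {
    "sales_contact_id": "contact",
    "billing_contact_id": "contact",
    "address_id": "address",
}

def load(conn, table, key):
    """Build a nested object from one row, following key columns into other tables."""
    cursor = conn.execute(f"SELECT * FROM {table} WHERE {table}_id = ?", (key,))
    row = cursor.fetchone()
    if row is None:
        return None
    obj = SimpleNamespace()
    for (column, *_), value in zip(cursor.description, row):
        target = FOREIGN_KEYS.get(column)
        if target and column != f"{table}_id":
            # 'billing_contact_id' becomes attribute 'billing_contact' holding an object.
            setattr(obj, column[:-3], load(conn, target, value))
        else:
            setattr(obj, column, value)
    return obj

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE address (address_id INTEGER PRIMARY KEY, city TEXT);
CREATE TABLE contact (contact_id INTEGER PRIMARY KEY, name TEXT, address_id INTEGER);
CREATE TABLE client (client_id INTEGER PRIMARY KEY, compname TEXT,
                     sales_contact_id INTEGER, billing_contact_id INTEGER);
INSERT INTO address VALUES (1, 'Chicago'), (2, 'Miami');
INSERT INTO contact VALUES (10, 'Pat', 1), (11, 'Lee', 2);
INSERT INTO client VALUES (100, 'Bank of Clouds', 11, 10);
""")

client = load(conn, "client", 100)
print(client.compname, client.billing_contact.address.city)
```

Note that this still copies the data into memory objects as a separate step, which is the kind of manual mapping a language with built-in table support would presumably hide.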
If the built-in syntax and methods of OOP directly handled these types of tables, then the benefits of table-oriented thinking could perhaps be shared with OOP languages. In addition to the "OO-SQL" syntax example above, commands like:

   client cli = new client("Bank of Clouds")
   cli.address.city = "Miami"

would be equivalent to adding (appending) a new record to a persistent table. I am not sure what the ideal syntax should be. However, some good hard thinking needs to be done to stop seeing tables and objects as separate things.
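One way to picture this behavior is the following sketch of a table-backed object, written in Python with the built-in sqlite3 module. The Client class, its FIELDS list, and the table layout are hypothetical, and the address is flattened into a single city column to keep the example short, so it reads cli.city rather than cli.address.city.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE client (client_id INTEGER PRIMARY KEY, compname TEXT, city TEXT)"
)

class Client:
    """Hypothetical table-backed object: creating one appends a row,
    and assigning to a field updates that row."""

    FIELDS = ("compname", "city")

    def __init__(self, compname):
        cursor = conn.execute("INSERT INTO client (compname) VALUES (?)", (compname,))
        # Bypass __setattr__ so the key itself is not treated as a table field.
        object.__setattr__(self, "client_id", cursor.lastrowid)

    def __setattr__(self, name, value):
        if name not in self.FIELDS:
            raise AttributeError(name)
        conn.execute(
            f"UPDATE client SET {name} = ? WHERE client_id = ?",
            (value, self.client_id),
        )

    def __getattr__(self, name):
        # Only called when normal lookup fails, i.e. for table-backed fields.
        if name not in self.FIELDS:
            raise AttributeError(name)
        row = conn.execute(
            f"SELECT {name} FROM client WHERE client_id = ?", (self.client_id,)
        ).fetchone()
        return row[0]

cli = Client("Bank of Clouds")   # appends a record
cli.city = "Miami"               # updates that record in place

rows = conn.execute("SELECT compname, city FROM client").fetchall()
print(rows)
```

Here the object holds nothing but its key; every read and write goes to the table, so there is no separate "save" step to forget.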
Current OOP languages make persistent storage a separate and manual step. This does not seem necessary to us. The common OOP languages are, it seems, "memory-centric".
Perhaps our beef with OOP is not so much OO features such as inheritance, polymorphism, and encapsulation, but this memory-centricity that we keep seeing.
Give us automatic table mapping, multiple table types, and control tables, and we may just stop bashing the common OOP languages.
It is true that the OODBMS can be "packed" to put the related parts physically together, but this "packing" can happen with relational tables as well so that related items are together in memory cache.
Thus, OODBMS does not solve the 1-to-many insertion problem as some claim.