Codd's Rules in simple language

As the benefits of the relational approach have become more widely recognised, vendors of DBMSs increasingly claim that their products are 'relational', not always with justification. In 1985 Codd produced the following set of 'rules' by which systems should be judged:

4.1. Information.

All information in a relational database is represented explicitly at the logical level and in exactly one way - by values in tables.

4.2. Guaranteed access.

Each datum (atomic value) in a relational database is guaranteed to be logically accessible through a combination of table-name, primary key value and column-name.
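The addressing scheme described above can be sketched with Python's standard sqlite3 module. This is only an illustration under stated assumptions: the employee table, its data and the helper fetch_datum are hypothetical names invented here, and SQLite stands in for any relational DBMS.

```python
import sqlite3

# Hypothetical table and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee (emp_no INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Adams", 30000), (2, "Baker", 42000)],
)


def fetch_datum(conn, table, key_column, key_value, column):
    """Return one atomic value addressed by table name, primary key value and column name."""
    # Identifiers cannot be bound as SQL parameters, so do a basic sanity check.
    for name in (table, key_column, column):
        if not name.isidentifier():
            raise ValueError(f"unsafe identifier: {name!r}")
    row = conn.execute(
        f"SELECT {column} FROM {table} WHERE {key_column} = ?", (key_value,)
    ).fetchone()
    return None if row is None else row[0]


# Table name + primary key value + column name locate exactly one value.
salary = fetch_datum(conn, "employee", "emp_no", 2, "salary")
```

No pointer, record number or storage address appears anywhere in the lookup; the three logical names are sufficient.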

4.3. Systematic treatment of null values.

Null values (distinct from the empty character string or a string of blank characters, and distinct from zero or any other number) are supported in a fully relational DBMS for representing missing information and inapplicable information in a systematic way, independent of data type.
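A small sketch of the distinction drawn above, using Python's sqlite3 module. The reading table and its rows are hypothetical; the point is only that a null is neither an empty string nor zero, and that it behaves the same way for text and numeric columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: row 1 holds an empty string and a zero, row 2 holds nulls.
conn.execute("CREATE TABLE reading (id INTEGER PRIMARY KEY, note TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO reading VALUES (?, ?, ?)",
    [(1, "", 0), (2, None, None)],  # Python None is stored as SQL NULL
)


def ids(sql):
    """Return the id column of every row matched by the query."""
    return [row[0] for row in conn.execute(sql)]


empty_notes = ids("SELECT id FROM reading WHERE note = ''")
null_notes = ids("SELECT id FROM reading WHERE note IS NULL")
zero_amounts = ids("SELECT id FROM reading WHERE amount = 0")
null_amounts = ids("SELECT id FROM reading WHERE amount IS NULL")

# Comparing with NULL using '=' never evaluates to true, so no row matches.
equals_null = ids("SELECT id FROM reading WHERE note = NULL")
```

Here the empty string and the zero are found only in row 1, and the nulls only in row 2, regardless of column type.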

4.4. Database description.

The database description is represented at the logical level in the same way as ordinary data, so that authorised users can apply the same relational language to its interrogation as they apply to the regular data.
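As one concrete illustration, SQLite keeps its database description in a table called sqlite_master that can be queried with ordinary SELECT statements. The sketch below assumes SQLite via Python's sqlite3 module; the department table and view are hypothetical, and other systems expose their catalogue under different names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema objects, created only so the catalogue has something to show.
conn.execute("CREATE TABLE department (dept_no INTEGER PRIMARY KEY, name TEXT)")
conn.execute("CREATE VIEW department_names AS SELECT name FROM department")

# The catalogue is interrogated with the same language used for ordinary data.
catalogue = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()

# Restriction works on the catalogue exactly as it does on a user table.
views_only = conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'view'"
).fetchall()
```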

4.5. Comprehensive sub-language.

A relational system may support several languages and various modes of terminal use (for example the fill-in-the-blanks mode). However, there must be at least one language whose statements are expressible through some well-defined syntax as character strings, and that is comprehensive in supporting all the following items:

Data definition
View definition
Data manipulation (interactive and by program)
Integrity constraints
Authorisation
Transaction boundaries (begin, commit and rollback)
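The sketch below shows several of the items above expressed as plain character strings in one language (SQL), submitted through Python's sqlite3 module. The account table and rich_account view are hypothetical. Authorisation is left out only because SQLite, the engine assumed here, has no GRANT statement; that is a limitation of the example, not of the rule.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# BEGIN and ROLLBACK strings below control the transaction explicitly.
conn = sqlite3.connect(":memory:", isolation_level=None)

statements = [
    # Data definition, with an integrity constraint in the same statement
    "CREATE TABLE account (acc_no INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))",
    # View definition
    "CREATE VIEW rich_account AS SELECT acc_no FROM account WHERE balance > 1000",
    # Data manipulation
    "INSERT INTO account VALUES (1, 500), (2, 2000)",
    # Transaction boundaries: the update is undone by the rollback
    "BEGIN",
    "UPDATE account SET balance = balance + 5000 WHERE acc_no = 1",
    "ROLLBACK",
]
for statement in statements:
    conn.execute(statement)

balances = conn.execute("SELECT acc_no, balance FROM account ORDER BY acc_no").fetchall()
rich = [row[0] for row in conn.execute("SELECT acc_no FROM rich_account")]
```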

4.6. View updating.

All theoretically-updatable views are also updatable by the system. (A view is theoretically updatable if there is a time-independent algorithm for unambiguously determining a single series of changes to the base tables, having as their effect precisely the requested changes in the view.)

4.7. High-level insert, update and delete.

The capability of handling a base table or a derived table as a single operand applies not only to retrieval of data but also to insertion, updating, and deletion. (This allows the system to optimise its execution sequence by determining the best access paths. It may be important in obtaining efficient handling of transactions across a distributed database, avoiding the communications costs of transmitting separate requests for each record obtained from remote sites.)
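A minimal sketch of set-at-a-time insertion, updating and deletion, again assuming SQLite through Python's sqlite3 module with hypothetical staff and sales_archive tables. Each statement names a whole table or derived table as its operand; no row-by-row loop is written by the user.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables and rows, for illustration only.
conn.execute("CREATE TABLE staff (staff_no INTEGER PRIMARY KEY, dept TEXT, salary INTEGER)")
conn.execute("CREATE TABLE sales_archive (staff_no INTEGER PRIMARY KEY, salary INTEGER)")
conn.executemany(
    "INSERT INTO staff VALUES (?, ?, ?)",
    [(1, "sales", 100), (2, "sales", 200), (3, "admin", 300)],
)

# Insertion: the operand is a derived table (the result of a SELECT).
conn.execute(
    "INSERT INTO sales_archive SELECT staff_no, salary FROM staff WHERE dept = 'sales'"
)
# Updating: one statement changes every qualifying row.
conn.execute("UPDATE staff SET salary = salary + 10 WHERE dept = 'sales'")
# Deletion: likewise applied to a set of rows at once.
conn.execute("DELETE FROM staff WHERE dept = 'admin'")

archive = conn.execute("SELECT staff_no, salary FROM sales_archive ORDER BY staff_no").fetchall()
remaining = conn.execute("SELECT staff_no, salary FROM staff ORDER BY staff_no").fetchall()
```

Because each request describes the whole set, the system is free to choose how to locate and process the rows.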

4.8. Physical data independence.

Application programs and terminal activities remain logically unimpaired whenever any changes are made in either storage representations or access methods. (There must be a clear distinction between logical and physical design levels.)
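One common change of access method is the addition of an index. The sketch below, which assumes SQLite and a hypothetical customer table, runs the same query text before and after such a change; the application's query and its result are unaffected.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table and rows, for illustration only.
conn.execute("CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?)",
    [(1, "Jones", "Leeds"), (2, "Smith", "York"), (3, "Ahmed", "Leeds")],
)

# The application's query refers only to logical names.
query = "SELECT name FROM customer WHERE city = 'Leeds' ORDER BY name"
before = conn.execute(query).fetchall()

# A physical-level change: a new access path is added.
conn.execute("CREATE INDEX customer_city ON customer (city)")

# The unchanged query still runs and returns the same rows.
after = conn.execute(query).fetchall()
```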

4.9. Logical data independence.

Application programs and terminal activities remain logically unimpaired when information-preserving changes of any kind that theoretically permit unimpairment are made to the base tables. (This rule permits logical database design to be changed dynamically, e.g. by splitting or joining base tables in ways which do not entail loss of information.)
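The splitting case mentioned above can be sketched as follows, assuming SQLite and hypothetical table names. A base table person is split into two tables without loss of information, and a view with the original name joins them back together, so the application's retrieval query does not change. (Updates through the view are a separate matter, covered by the view updating rule.)

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical original design: one base table.
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO person VALUES (?, ?, ?)",
    [(1, "Ann", "555-0101"), (2, "Bob", "555-0102")],
)

app_query = "SELECT name, phone FROM person ORDER BY id"
before = conn.execute(app_query).fetchall()

# Information-preserving redesign: split the table, then present the
# original shape again as a view over the two new base tables.
conn.executescript(
    """
    CREATE TABLE person_name (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE person_phone (id INTEGER PRIMARY KEY, phone TEXT);
    INSERT INTO person_name SELECT id, name FROM person;
    INSERT INTO person_phone SELECT id, phone FROM person;
    DROP TABLE person;
    CREATE VIEW person AS
        SELECT n.id AS id, n.name AS name, p.phone AS phone
        FROM person_name AS n JOIN person_phone AS p ON p.id = n.id;
    """
)

# The application's query text is untouched and yields the same result.
after = conn.execute(app_query).fetchall()
```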

4.10. Integrity independence.

Integrity constraints must be definable in the relational data sub-language and storable in the catalogue, not in the application programs. Certain integrity constraints hold for every relational database; further application-specific rules may be added. The general rules relate to:

Entity integrity : no component of a primary key may have a null value.
Referential integrity : for each distinct non-null 'foreign key' value in the database, there must exist a matching primary key value from the same domain.
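Both general rules can be declared in the data sub-language rather than coded in an application. The sketch below assumes SQLite through Python's sqlite3 module, with hypothetical dept and emp tables and a hypothetical helper named rejected. Two SQLite-specific details are assumptions of this example: foreign key enforcement must be switched on per connection, and NOT NULL is written explicitly on the text primary key because SQLite does not otherwise forbid nulls there.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# The constraints live in the table definitions, i.e. in the catalogue.
conn.execute("CREATE TABLE dept (dept_code TEXT NOT NULL PRIMARY KEY)")
conn.execute(
    "CREATE TABLE emp (emp_no INTEGER PRIMARY KEY, "
    "dept_code TEXT REFERENCES dept (dept_code))"
)
conn.execute("INSERT INTO dept VALUES ('D1')")


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Entity integrity: a null primary key value is refused.
null_key_rejected = rejected("INSERT INTO dept VALUES (NULL)")
# Referential integrity: a non-null foreign key must match a primary key value.
orphan_rejected = rejected("INSERT INTO emp VALUES (1, 'D9')")
matching_accepted = not rejected("INSERT INTO emp VALUES (2, 'D1')")
# A null foreign key is allowed; the rule covers only non-null values.
null_fk_accepted = not rejected("INSERT INTO emp VALUES (3, NULL)")
```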

4.11. Distribution independence.

A relational DBMS has distribution independence - i.e. if a distributed database is used, it must be possible to execute all relational operations upon it without knowing or being constrained by the physical locations of data. This must apply both when distribution is originally introduced and when data is redistributed.

4.12. Non-subversion.

A low-level (single record-at-a-time) language cannot be used to subvert or bypass the integrity rules and constraints expressed in the higher-level (multiple record-at-a-time) language.