Help Design The New ITOP Programming Language of the Next Era

A Table-Oriented Language

Draft 1.009 - 5/2000

Note: this is intended as only a brainstorming document, NOT a specification.

The OOP fad shafted table and RDBMS-oriented language standards for too long. Now you can help remedy that.


We are designing the syntax and constructs of a new open ITOP language for the next programming era. It borrows heavily from (but improves on) XBase and incorporates many ITOP features. Unlike languages such as Java, C++, and Visual Basic, table access is built into the language instead of stapled on as an API afterthought. It would provide easy access and manipulation of all levels of tables, from small file-based tables to RDBMS. Listed here are only suggestions. We welcome any comments and suggestions you may have. (An E-mail link is at document bottom.)

Note that we are also involved in a parallel effort to categorize features of general programming languages. There may be some overlap with the discussion here.

There is also another approach to a table-friendly language. This is providing a flexible syntax, but not defining (in the language) specifically how to talk to tables. A draft of such a language is dubbed "L". Under this scenario, the following document can serve more as a collection/table API design guide than as a language design guide. I doubt the L approach can ever be as "intimate" with tables as built-in syntax, but it might be a decent compromise. For example, things like "select from tablex where wages > myfunc()" may not be directly possible with L.

Notable improvements over XBase presented here are optional Block Contexting, "alias to" temporary tables that are not limited to memory (unlike most OOP components), integration with Data Dictionaries, better SQL integration, optimistic updates (pressed), and other "gifts". Enjoy!

Note that this language is targeted toward custom software. The requirements for custom software languages are a bit different from those for mass-distribution software, such as word processors. Mass-production software generally needs to be more efficient and strict with type checking and so forth. This is not a pattern that we made up; it has been an industry trend for quite a while. This is why most custom software is written in Visual Basic and most commercial software is written in C++ today, for example. Unlike Java, we are not attempting to be everything to everybody. However, custom business software is a large niche. There are many standard systems programming scripting languages, but not many table-oriented standard languages.

Context Simplification

It is important that a programming language provide mechanisms to reduce redundant code and references, so that only the relevant or significant information need be presented for any given section.

XBase used "Select" and "Set" statements to provide context simplification. We will call these "Forward Contexting" because they set the context for everything from the command and forward (time-wise) until changed. Although this is a very useful method for context simplification, it should be supplemented (not replaced) with a more formal method, which we will call "Block Contexting". Example:

  with select translog  
    .user = curuser
    .application = transact.misc
    transact.lookedat = true
    .time = curtime()
    .date = curdate()
    translog.date = curdate()  // can still use explicit
  endwith
In this example everything between the with...endwith pair is assumed to have "translog" as the default alias (table handle). Note that the dot indicates a field, with an optional table specifier on the left side of the dot. In XBase it was sometimes difficult to distinguish between memory and field variables. Other possible ways of distinguishing will be presented later.

If you want to reference the default alias, then no alias is needed before the period. After the "endwith", the default alias is whatever it was before the "with". Thus, the defaults can actually be nested. The above example would be equivalent to:

  savealias = alias()
  select translog
  .user = curuser
  .application = transact.misc
  transact.lookedat = true
  .time = curtime()
  .date = curdate()
  translog.date = curdate()  // can still use explicit
  select (savealias)       // back to the way it was
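
The save-and-restore pattern above can be sketched in an existing language. Here is a rough Python illustration (our language has no implementation yet). The names with_select, select, and alias are only stand-ins for this sketch, and the starting alias "transact" is an assumption:

```python
from contextlib import contextmanager

# Hypothetical stand-in for the runtime's notion of a "default alias".
_current_alias = ["transact"]


def alias():
    """Return the current default alias, like alias() in the example above."""
    return _current_alias[0]


def select(name):
    """Forward contexting: change the default alias until it is changed again."""
    _current_alias[0] = name


@contextmanager
def with_select(name):
    """Block contexting: switch the default alias, then restore it on exit."""
    saved = alias()
    select(name)
    try:
        yield
    finally:
        select(saved)  # back to the way it was, even after an error


trace = []
with with_select("translog"):
    trace.append(alias())
    with with_select("vendors"):  # contexts can nest
        trace.append(alias())
    trace.append(alias())
trace.append(alias())
print(trace)
```

Each block restores whatever alias was in effect when it started, which is what makes the nesting work.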

Contexts can also be used with "set" commands. Here is an example comparing the forward contexting commands with block contexting:

  set caseignore on     // ignore case when comparing
  with set caseignore on    // block contexting
    dostuff()
  endwith
  set deleted off          // forward contexting
  with set deleted off     // block contexting
    dostuff()
  endwith
  set datadict to "dict_4"
  with set datadict to "dict_4"
    dostuff()
  endwith
Don't you think providing both types of context specifications is such a cool idea? I am proud of it. (If somebody thought of it before, then I am still proud to promote it.)

For some reason OOP proponents seem to prefer this 'With' approach to the 'select' and 'set' approach presented below. We are not quite sure why. They seem to feel that With statements are less abusable or confusing to read than the later approaches.

Another alternative that I am warming up to is to specify the scope of the setting. Example:

  set caseignore on scope routine
  block       // defines a block
    set caseignore off scope block
    select inventory scope block
    set blahblah on scope global
  endblock
There are basically 3 levels: Block, Routine, and Global. (The default is global if scope is not specified.) The scope lasts from the point in time it is specified up to the end of the scope. Thus, if "Set X on scope routine" is specified in the middle of the routine, it would take effect in the middle of the routine, and it would "lose its influence" when the routine is done.

A block can be an 'If' statement, loop, etc. The 'Block' structure shown above can be used if there is no appropriate 'If' or 'Loop' nesting.

Note that unlike the prior approach, this approach allows multiple settings per block. It also leaves open scope possibilities for groups of routines, libraries, etc.
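
One way to picture the scope levels is a stack of setting frames. The following Python sketch is only an illustration under our own assumptions: the Settings class and its method names are made up here, and we assume that the innermost setting wins when the same setting exists at several levels.

```python
class Settings:
    """Toy model of 'set X on scope block|routine|global'."""

    def __init__(self):
        # Each frame is (kind, values); the global frame is never popped.
        self.frames = [("global", {})]

    def enter(self, kind):
        self.frames.append((kind, {}))

    def leave(self):
        self.frames.pop()

    def set(self, name, value, scope="global"):
        # Store the value in the innermost frame of the requested kind.
        for kind, values in reversed(self.frames):
            if kind == scope:
                values[name] = value
                return
        raise ValueError(f"no enclosing {scope} scope")

    def get(self, name, default=False):
        # Innermost setting wins; fall back outward toward global.
        for _, values in reversed(self.frames):
            if name in values:
                return values[name]
        return default


s = Settings()
s.enter("routine")
s.set("caseignore", True, scope="routine")
s.enter("block")
s.set("caseignore", False, scope="block")
s.set("blahblah", True, scope="global")
inside_block = s.get("caseignore")
s.leave()                                # endblock
after_block = s.get("caseignore")
s.leave()                                # routine is done
after_routine = s.get("caseignore")
still_global = s.get("blahblah")
```

The block setting "loses its influence" at endblock, the routine setting when the routine is done, and the global one stays.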

Distinguishing between Fields and Variables

XBase provided very few practical methods to distinguish between field names and variable names. Although I rarely found it to be a problem, it generated a lot of criticism. Here are some possible methods to distinguish between them.

  1. Use a period before any field name. Samples of this have already been presented. Although this is consistent with the alias separator (also a period), it makes nesting a little hard to read because periods tend to make code look indented even if it is not.

  2. Use dollar signs for variables and nothing for field names. This idea is borrowed somewhat from Perl. It could be the other way around, but we prefer that field names be kept simple so that expressions are easier to write and consistent with SQL.

  3. Require that all variables be declared. This is our preferred approach at this time. To facilitate this, variable declarations should also allow an assignment statement (sometimes called an initializer). Examples will be given later under variable types. Also, declarations should be allowed in any part of the routine, not just the beginning.

  4. Require a table name or alias and a period for all fields. Although this may sound cumbersome, simple table aliases like "t" could be used to reduce eye clutter. Examples of this are given later.

  5. Use a SET statement to pick one of the above methods or a programmer-selected character. We believe this to be a little extreme at this point.

Opening Tables

Let's start with an example.
  open "foo" errorto ustat alias table3 access all
We suggest "open" instead of "use" to follow common programming conventions. The optional Errorto clause assigns the open status to the specified variable ("ustat" in this case). If no Errorto clause is specified, then a regular run-time error is triggered if there is a problem. Example:
  open "foo" errorto istat
  if istat > 0
    msgbox "Error on open; error number:" + istat
    cancel
  endif

Thus, unlike Java, trapping I/O errors is optional. For quick, ad-hoc scripts, forced error handling gets in the way. More on error "objects" is presented later.
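
To show the "optional trapping" idea in a language that exists today, here is a small Python sketch. The open_table function and its errorto parameter are hypothetical; a dictionary stands in for the status variable, and the path is chosen so that the open fails.

```python
def open_table(path, errorto=None):
    """Open a file; trap the error only when the caller asks for a status.

    `errorto` is a hypothetical dict that receives the status, standing in
    for the status variable named in the Errorto clause.
    """
    try:
        handle = open(path, "rb")
    except OSError as exc:
        if errorto is None:
            raise                       # no Errorto clause: regular run-time error
        errorto["status"] = exc.errno or 1
        return None
    if errorto is not None:
        errorto["status"] = 0
    return handle


istat = {}
table = open_table("/no/such/dir/foo", errorto=istat)
if istat["status"] > 0:
    print("Error on open; error number:", istat["status"])
```

A quick script can leave errorto off and simply let the run-time error halt it.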

Note that unlike XBase, a new (empty) work area is always assigned to the alias in the Open command. If you want to close a prior work area, then use something like:

  close table3

Just like XBase, the order of the clauses usually does not matter; the Alias clause could have been before the Errorto clause if we chose. The Access clause can have one or more of the following settings separated by commas: "read, change, append, delete, all". This example allows two types of access:

  open "table3" access append, delete
The standard network access clauses (Shared and Exclusive) can also be included, but are not part of the Access clause. Example:
  open x access read, delete exclusive
The open clause can also take SQL calls as table sources using special functions. Example:
  open sql("select * from vendors") alias vendors
This allows almost any type of data source or driver to be specified. Note that even joined tables can be opened this way in order to provide a virtual table. Tables do not have to be individual files or any particular file format (although some features may not be available for some table sources).

Other alternatives for SQL could look like this:

  Open cursor("clients") _alias cursor _where active = "Y" _fields this,that,other
  Open ODBC("blah blah blah") _alias stuff
  s = "select * from x"
  Open  _sql s  _as stuff

The Cursor( ) function returns a temporary file name. The _fields clause is similar to SQL's "Select" clause. An _Orderby clause could also be present. (The underscores for identifying clauses will be described later.)

Another clause of the Open command is the "press on" or "press off" clause. If "press on" is specified, then updates to the current record only happen when the "Pressed" command is issued. The Pressed command is similar to the Update command of Visual Basic; it writes all the pending information to the record. "Press off" makes updates behave more like traditional XBase. Here is an example:

  Open "table3" alias vendors press on
  set datadict to "dict3"
  appendblank()
  .company = "Bigshot Bank"
  .revenue = 1500000
  if not pressed()      // press and validate
    msgbox "Validation Error"
  endif 
  close vendors
This example uses a data dictionary called "dict3" to specify the validation functions (which are stored in the Data Dictionary table). The Pressed() function indicates if there was a validation problem; if not, the record is saved. Perhaps there should also be a separate Validate() function to check validation before pressing. The Validate function should be able to take both an alias and a field name as a parameter to tie it in with GUI builders. Example:
  if not validate(table3,"field2")
    ...
  endif
  // or perhaps this:
  if not validate(table3.field2)
    ...
  endif
  if not validate(table3)   // checks all fields
    ...
  endif
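
The press-and-validate behavior can be mocked up as a buffer of pending assignments. This Python sketch is only a guess at the mechanics: the PressedRecord class and the "revenue must not be negative" rule are invented for illustration, and a real data dictionary would hold the validators in a table.

```python
class PressedRecord:
    """Buffers field assignments until pressed() validates and commits them."""

    def __init__(self, stored, validators):
        self.stored = stored            # the record as it exists in the table
        self.validators = validators    # field name -> validation function
        self.pending = {}               # assignments not yet written

    def assign(self, field, value):
        self.pending[field] = value

    def validate(self, field=None):
        # Check one field, or every pending field when none is named.
        fields = [field] if field else list(self.pending)
        return all(
            self.validators.get(f, lambda v: True)(self.pending[f])
            for f in fields if f in self.pending
        )

    def pressed(self):
        # Press and validate: write only if every pending field passes.
        if not self.validate():
            return False
        self.stored.update(self.pending)
        self.pending.clear()
        return True


# Hypothetical dictionary entry: revenue must not be negative.
rules = {"revenue": lambda v: v >= 0}
rec = PressedRecord({}, rules)
rec.assign("company", "Bigshot Bank")
rec.assign("revenue", -5)
first_try = rec.pressed()               # validation fails, nothing is saved
saved_after_first = dict(rec.stored)
rec.assign("revenue", 1500000)
second_try = rec.pressed()              # passes, record is saved
```

Nothing reaches the stored record until a press succeeds, which is the point of "press on".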

Commands that return lists should be able to place the list into a temporary table. Example:

  Directory to alias dirtable   // generate a directory list
  ? "File Count: " + reccount()    // number of files in directory
  copy to file "dirfile.txt" delim
  copy to alias anotherone         // A temporary table
  copy dirtable to alias anotherone  // same as prior in this case
Commands with a "to alias" clause open a temporary table. (It may actually be physically stored on disk, but under a unique name that is not normally known by the user or programmer. It will also be automatically erased when its alias area is closed.) This provides an easy way to generate temporary tables without worrying about if, where, and when to store them. An SQL query result can also be stored this way (since query results are often not kept). Is this "tably cool" or what?
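
Something close to this already exists in SQLite, which makes for a handy illustration. In the Python sketch below, connecting with an empty file name gives a private temporary on-disk database that SQLite deletes when the connection closes. The table name dirtable comes from the example above; the three sample files and the column names are our own assumptions.

```python
import os
import sqlite3
import tempfile

# An empty file name asks SQLite for a private temporary on-disk database
# that is deleted automatically when the connection closes.
conn = sqlite3.connect("")
conn.execute("CREATE TABLE dirtable (name TEXT, size INTEGER)")

with tempfile.TemporaryDirectory() as folder:
    # Make a small sample directory so the listing has known contents.
    for name in ("a.txt", "b.txt", "c.txt"):
        with open(os.path.join(folder, name), "w") as f:
            f.write(name)
    # "Directory to alias dirtable": load the listing into the table.
    rows = [(e.name, e.stat().st_size) for e in os.scandir(folder)]
    conn.executemany("INSERT INTO dirtable VALUES (?, ?)", rows)

file_count = conn.execute("SELECT COUNT(*) FROM dirtable").fetchone()[0]
print("File Count:", file_count)
conn.close()   # the temporary table disappears with its "alias area"
```

The programmer never picks a file name or cleans anything up, and the listing is not limited by memory size.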

The Open command can optionally specify a data dictionary that would over-ride the default dictionary when fields for that table are referenced. (If a field is not found in the data dictionary, the next-higher level dictionary is looked into. Perhaps a Java-like "super" marker can be used in the table items to specify deference to next level.)

Other tables would still use the default dictionary unless they have selected an explicit dictionary. Here are some example commands related to selecting data dictionaries.

  set datadict to "thing.dic"  // set default dictionary
  open "foo.db" alias tablex datadict "pig.dic"
  select tablex     // just to be clear
  set datadict to "another.dic"   // new default
  // The tablex dictionary does not change
  set datadict alias tablex to "thisone.dic"
  set datadict alias to "thisone.dic"  // same thing in this case
  ? showset("datadict")  // shows "another.dic"
  ? showset("datadict alias")  // shows "thisone.dic"
Thus, "Set Datadict" and "Set Datadict Alias" are two independent commands.

Since this can get somewhat complicated, perhaps a data dictionary should be associated with an opened alias instead. You would either associate the DD with an alias handle when you open a table, and/or assign it to a handle later on.
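
The "look in the next-higher dictionary" rule can be pictured with Python's ChainMap. This is only a sketch: the dictionary contents and the length property values are made up, and real dictionaries would be tables rather than in-memory maps.

```python
from collections import ChainMap

# Hypothetical dictionary contents: field name -> field properties.
default_dict = {"name": {"length": 30}, "rate": {"length": 8}}
pig_dict = {"rate": {"length": 12}}           # overrides only "rate"

# A table opened with its own dictionary looks there first, then defers
# to the next-higher (default) dictionary for fields it does not define.
tablex_dict = ChainMap(pig_dict, default_dict)

rate_length = tablex_dict["rate"]["length"]   # found in the table's dictionary
name_length = tablex_dict["name"]["length"]   # falls through to the default
print(rate_length, name_length)
```

Tying the chain to the alias handle, as suggested above, would mean each opened table simply carries its own chain.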

Control Structures

Although control structures are secondary to the importance of table access features, we will still deal with them.

Block statements will generally follow the XBase convention of having the "end" clause use the first word of the starting clause. Thus, we have pairs like if...endif, with...endwith, loop...endloop, and so forth. In addition, curly brackets will be allowed as an alternative for those used to them, although we think the Endx approach is superior to brackets. (We will not use that silly semi-colon at the end of each statement, although it will still serve as a continuation marker.) Examples:

  loop while x < 12
    dostuff()
  endloop

  loop while x < 12 {    // bracket variation
    dostuff()
  }

  loop until x >= 12 
    dostuff()
  endloop

  loop for i = 0 to 20 step 2
    dostuff()
  endloop
  
  with select vendors {
    .title = "Bank of Mars"
    .revenue = 1200000.00
  }
We suggest using a colon or a colon in parentheses to put multiple statements on one line:
  i = 3 : j = 2 : mytime = time() : dostuff()
  loop for i = 1 to 3 (:) dostuff() (:) endloop
  // or perhaps
  i = 3 (:) j = 2 (:) mytime = time() (:) dostuff()  
Which do you prefer? Also, should multiple statements on a single line be limited only to assignments and function calls (restricting control structures)? We think they should not be allowed with loops, but still be allowed with if...endif structures. The naked colons may also get confused with Data Dictionary calls (described later).

The way that XBase passes parameters is generally very simple and flexible. If a parameter is missing from the calling routine, the receiving parameter is simply given the value of a new variable (boolean "false"). This can be used for a sort of polymorphism with regard to parameters. However, it would be nice if there was a Paramcount() function to more easily determine the number of parameters sent. It would also be nice to be able to specify that a parameter is passed by value. Here is one possible way to do this:

  Sub routinex( thingy, <dingy>, ringy)
    // do stuff
    return result    // optional
  EndSub
The angle brackets indicate that the parameter is to be passed by value. We also recommend that "Sub" be used in place of Function or Procedure, similar to Perl. The "Parameter" clause should still be an option.

There may be a simple way to allow procedure calls with "clauses", like the "exclusive" clause of the Open command. Let's start with an example:

  Eat "x.db" _trigger on _meal dinner, 3  _veget

  sub Eat
    parameter filename
    if clause("trigger")
      ongot = clause("trigger",1)   // 1st parameter of clause
      if empty(ongot)
        ? "Missing parameter for Trigger statement!"
      endif
    endif
    mealtype = clause("meal",1)  // "dinner" in this case
    peoplecount = clause("meal",2)  // "3" in this case
    if clause("veget")
      ? "No Meat, we are vegetarians"
    endif
  endsub
Or, for an example a bit more familiar to XBasers:
  mycopy _from table1 _to table2 _for ".rate > .maxrate"
  mycopy _for ".rate > .maxrate" _to table2 _from table1  // same

  sub mycopy   && copy from one table to another using a criteria
    sourcetable = clause("from",1)
    destination = clause("to",1)
    criteria = clause("for",1)
    if empty(criteria)
      criteria = "true"  // default is to copy all
    endif
    if not empty(sourcetable) and not empty(destination)
      dostuff(sourcetable, destination, criteria)
    else
      reporterror("missing parameters")
    endif
  endsub
This allows one to build routines with easier-to-remember parameter setups. This is especially useful for building groups of routines with similar parameter types. For example, in XBase you often knew to use the "for" clause (similar to the SQL "Where" clause) to specify a record selection criteria, but you did not have to remember the order of the clauses because it did not matter.

If the Clause( ) function has one parameter, it returns True or False based on whether the clause was specified. If Clause( ) has two parameters, then it returns the clause parameter specified by the position number of its second parameter. A simple, yet powerful idea, wouldn't you say?
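
To check that the one-parameter and two-parameter forms of Clause( ) hang together, here is a rough Python mock-up. The parse_call and make_clause helpers are hypothetical, and the parsing is deliberately naive: it assumes every clause starts with a space and an underscore, and that no quoted string contains that sequence.

```python
def parse_call(text):
    """Split 'args _clause p1, p2 _flag' into positional text and clauses."""
    # A leading space lets a call that starts with a clause split cleanly.
    head, *parts = (" " + text).split(" _")
    clauses = {}
    for part in parts:
        name, _, rest = part.strip().partition(" ")
        clauses[name.lower()] = [p.strip() for p in rest.split(",") if p.strip()]
    return head.strip(), clauses


def make_clause(clauses):
    """Build a clause() function bound to one parsed call."""
    def clause(name, position=None):
        params = clauses.get(name)
        if position is None:
            return params is not None           # was the clause given at all?
        if params is None or position > len(params):
            return ""                           # empty, so empty() would catch it
        return params[position - 1]
    return clause


filename, given = parse_call('"x.db" _trigger on _meal dinner, 3  _veget')
clause = make_clause(given)
print(filename, clause("trigger", 1), clause("meal", 1), clause("meal", 2))
```

Because the clauses land in a lookup keyed by name, their order in the call does not matter, just as with the mycopy example.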

In XBase the child routine generally inherits all the variable scope of its parent. Although this is often quite useful, there are times when one wants to build generic routines that do not depend on or want to be influenced by the parent's scope. An optional clause, such as "Isolate" should be available to provide such isolation:

  sub foo(this, that) isolate    // don't inherit parent's scope
    // do stuff
  endsub
Or, perhaps "set isolate on scope routine" will do the trick.

Miscellaneous

Relevant functions should accept alias names as optional parameters. If no alias is given, then the current default alias is assumed. Examples:
  gototop(table3)    // to first record of table3 alias
  appendblank(table3)
  skip(table5)
  skip()         // acts on default area

Function and procedure calls don't need parentheses unless there are no parameters given.

  routine1()
  routine1 stuff, this, that    // 3 parameters
  routine1(stuff,this,that)     // same

Rather than naming each field in clauses that require field names, the group names from the current data dictionary can optionally be used. Example:

  copy to alias tablex fields group "mr"
Exactly how the groups work, and whether they can be combined or excluded with fancier set theory manipulators, needs to be thought about a bit more. If single letters are used to specify groups, then the language would be much simpler. However, letters are obviously less descriptive than names. In the example above, the groups "m" and "r" are specified (a set join). Perhaps a minus sign could indicate that a group is to be excluded. Example:
  copy to alias tablex fields group "m -r"
This would include all the fields in group "m" unless they are also in "r".
  copy to alias tablex fields group "-r"
The above would include all fields except those in group "r". Grouping eliminates the need to type a long list of field names in commands that deal with multiple fields. This would really be great with Browse-type functions.
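
The group arithmetic is easy to try out with ordinary sets. The Python sketch below is one possible reading, not a settled rule. It assumes group names are separated by spaces (so it would not split the single-letter "mr" form), and the field list and group memberships are invented for the example.

```python
def fields_in(spec, groups, all_fields):
    """Resolve a group spec such as 'm r', 'm -r', or '-r' to field names."""
    tokens = spec.split()
    included = [t for t in tokens if not t.startswith("-")]
    excluded = [t[1:] for t in tokens if t.startswith("-")]
    # With no included groups named, start from every field in the table.
    if included:
        keep = set().union(*(groups[g] for g in included))
    else:
        keep = set(all_fields)
    drop = set().union(*(groups[g] for g in excluded))
    # Keep the table's own field order in the result.
    return [f for f in all_fields if f in keep and f not in drop]


# Hypothetical table layout and data dictionary groups.
all_fields = ["name", "phone", "rate", "hours", "notes"]
groups = {"m": {"name", "phone", "rate"}, "r": {"rate", "hours"}}

both = fields_in("m r", groups, all_fields)
m_not_r = fields_in("m -r", groups, all_fields)
all_but_r = fields_in("-r", groups, all_fields)
print(both, m_not_r, all_but_r)
```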

Data dictionary functions and properties could be accessed using the following syntax:

  tablex.fieldx:prevalid()     // call function
  tablex.fieldx:length         // property
  .fieldx:length               // a variation
  .fieldx:alias()              // returns field's alias
  tablex.fieldx:alias()        // useless but valid
This provides a kind of polymorphism that could simplify class select code. The last example is not tied to the Data Dictionary, but uses the same syntax. (Some suggest using two colons "::" for these to match C++ and make them more visible.) I have not quite figured out how to provide this syntax for any Control Table, not just Data Dictionaries. The macro or Eval( ) fallback can always be used.

Perhaps a set of parentheses can be used to execute the expression contained in the field. Example:

   foo = tablex.fieldx()

is equivalent to:

   foo = eval(tablex.fieldx)

The "Where" clause will be used instead of the XBase "for" to reflect SQL syntax. Example:

   append from file "foo.db" where .rate > 25

Something similar to XBase macros should be provided, but with some variations. Macros should not just be allowed for any syntactical specification because it makes compilation too slow and difficult. Rather, an approach similar to Clipper should be taken, except be allowed in most clause parameters. Field and variable names, alias names, clause parameters, routine names, and boolean expressions (such as after Where clauses) can be macro-tized. Examples:

  %aliasmacro%.fieldx = "12"
  aliasx.%fieldmacro% = "12"
  amacro = ".rate > 25"
  append from file "foo.db" where %amacro%
Macro names need to be surrounded by percent signs on each side. This prevents some of the ambiguities of XBase macros. Similarly, an "Eval()" function will evaluate an expression. Eval can optionally take two parameters if validation testing is needed. Example:
  if not eval(result,".rate * .hours")
    msgbox "Error, invalid expression"
  else
    msgbox "Paycheck is "+trans(result,"####.##")
  endif
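
The two-parameter Eval idea (a success flag plus a result) looks like this in Python. This is a sketch only: try_eval is a made-up name, fields are written without the leading dot because Python would not parse ".rate", and Python's own eval is not safe for untrusted expressions even with the builtins withheld.

```python
def try_eval(expression, record):
    """Return (ok, value); ok is False when the expression is invalid."""
    try:
        # Fields are exposed as plain names; builtins are withheld.
        value = eval(expression, {"__builtins__": {}}, dict(record))
    except Exception:
        return False, None
    return True, value


record = {"rate": 12.5, "hours": 40}
ok, result = try_eval("rate * hours", record)
bad_ok, bad_result = try_eval("rate * ", record)   # incomplete expression

if not ok:
    print("Error, invalid expression")
else:
    print("Paycheck is " + format(result, ".2f"))
```
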
Note that aliases can be assigned to each other, reducing the need for macros. (An alias is really just a numeric integer value.) Example:
  Open "foo.dbf" alias panda
  appendblank()
  panda.name = 'Toogie'
  bears = panda    // alias clone
  bears.name = 'Toogie Tay'  // same record and field
  a = bears     
  a.name = "Moogie's a Panda"
  a.location = 'China'
A zero valued alias represents an unassigned or unopened area, usually returned from a function.
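
Since an alias is just an integer, cloning one is nothing more than copying a number. A tiny Python sketch of that idea follows. The work_areas table and the open_area helper are invented for illustration, and each work area holds a single record to keep things short.

```python
# Work areas keyed by integer handle; handle 0 is reserved for "unopened".
work_areas = {}
_next_handle = [1]


def open_area(record):
    """Assign a new work area and return its integer handle (the alias)."""
    handle = _next_handle[0]
    _next_handle[0] += 1
    work_areas[handle] = record
    return handle


panda = open_area({"name": ""})
work_areas[panda]["name"] = "Toogie"
bears = panda                            # alias clone: copies the handle, not the data
work_areas[bears]["name"] = "Toogie Tay" # same record and field
print(panda, bears, work_areas[panda]["name"])
```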

The keyword "off" can be used to deactivate a clause parameter. Example:

  open "foo" errorto istat   // specify status variable
  open "foo" errorto off     // specify none for runtime halt
  open "foo" exclusive off
"Off" can be used on clauses that need not be present. This helps make them macro-able. (We will have to change the XBase "off" usage with "List".)

Like XBase, our language will not have null variables. A newly declared variable has a boolean value of "false" (replacing .f.) by default. My experience is that nulls in other languages cause more problems than they solve. If you really want nulls in order to seem "modern", then perhaps a compromise can be reached: A variable can have a null result, but still evaluate to zero, empty blank (zero length string), or false if evaluated (instead of a runtime error). A function such as "isnull()" could perhaps test for nullness.

By the way, I really like the Empty() command of some XBase dialects. It returns true if a variable or field is zero, blank (including multiple blanks), or false. I also like the Alltrim() command.

Should some syntactical constructs be eliminated? Using an earlier example:

  set caseignore on
  caseignore(true)       // equivalent?
Perhaps we can do away with the XBase-derived Set commands. However, using "Set" tells us that it is a block-contextable command (described earlier). We are not sure what to do about this.

Variables can be declared with the following types:

  public variable = x   // global
  private variable = x  // private, but visible to child routines
  local variable = x    // only visible to routine declared in
  var variable = x      // same as private
'Public' and 'Private' function like their XBase counterparts, except they can be overridden by the 'Isolate' routine specifier presented earlier. The assignment statement (initializer) is optional.

Although the XBase WHILE and FOR statements should perhaps be combined into one WHERE command to better fit SQL conventions, the While command often assumed that an index was available for ranges. For reasons such as this, a way to test if an index is available should be made available to the program. Here is an example that processes active clients between Client Numbers 2500 and 2600:

   Var istat
   Loop Scan Clients where (between(clientnum,2500,2600) ;
         and active=true) _istatto istat _bailnone
     // loop process body goes here
   EndLoop
   if istat.canindex = false
     msgbox "Not practical to scan without an index"
   endif

The "_bailnone" clause tells the scan loop to abort looping (jump out) if the Where expression cannot use an existing index. Unlike XBase, we are assuming that the database engine may pick which indexes to use, not necessarily the programmer. ( _bailnone and _istatto do not have to both be present.)

The interesting part here is that Istat is not a simple status code, but an alias to a small (temporary) table with index information. Thus, CanIndex is a field, although it (intentionally) resembles a property of an "error object" of other languages. There may also be a field that contains a list of which portions of the expression can use indexes. Example:

  Field       Meaning

  Canindex    Is some of the Where expression indexable? (boolean)
  I_expr      Indexable expression (ex: "between()")
  Cantell     Is information on indexability available? (boolean)
Notice that I_expr perhaps contains only one clause or function. If more are available, they would be in other records of this temporary alias. The other fields would simply be repeated for each record. (When created, the default pointer is at the first record.) Another alternative is to make "I_expr" be an alias to a separate alias (table).

Thus, such system-generated tables are very similar to error and status objects of OOP languages. However, there are advantages to them over OOP objects: they do not have to fit in memory. For example, in the "Directory to alias X" example we could get lists of very large directories, regardless of memory size. The Index Status example may be overkill, but suggests some interesting concepts. Isn't thinking tabley wonderful? The power!

Note that if the alias name is not given in Scan, then the default alias is assumed. Perhaps an asterisk should be required in case the programmer forgets to add the alias name. Thus you would use "Loop Scan * where ...". Also note that we added parentheses around the Where expression in the example for clarity. Parentheses are not really necessary in this case. Further, whether or not _Bailnone should be triggered if no index info is available (related to CanTell) makes for an interesting discussion. Perhaps also have a "_BailUnkn"?

Perhaps a way to easily embed table data into source code should be made. Perhaps something like:

  Open "people" alias tablex 
  data tablex _fields name, rank, listed   // append stuff
    "Poly Shore", 34.2, false
    "Monica Lawinski", 12.0, true
    "Themtal Adams", 40.1, false
  enddata

I have decided that the keyword "as" is a better alternative to "alias". Thus, you would say:

  Open tablex as t where rate < 23.5

I have also decided that having the word "loop" in a "while" loop is not really needed. Only "while" and "endwhile" are needed. A space between the two words, as in "end while", should be allowed for those familiar with VB.

There should be some optional way to have a table automatically close at the end of a routine. Example:

  sub foo
    Open tablex _as  t  _autoclose
    blah
    blah
  endsub   // t will get closed at end of routine
If the "autoclose" clause is specified, then the table will automatically be closed when the routine finishes.

Perhaps by default all tables should have only routine-level scope unless explicitly specified as global. A _global clause could be supplied for this purpose. This may make tables a bit "safer". Note that all subroutines (children routines) of the opener routine should still have access to the table, unless some sort of isolation mode is chosen for the children.

A "Scan" loop may also be operable on single statements:

  myRoutine _scan t _where t.x > 7 _orderby t.y
This is equivalent to:
  scan t  _where t.x > 7 _orderby t.y
    myRoutine
  endscan
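
For those who want to try the equivalence, here is a Python sketch of a scan with a where filter and an order-by key. The scan function, the sample rows, and my_routine are all assumptions made up for this illustration; a real engine would use indexes rather than sorting a list.

```python
def scan(rows, where=lambda r: True, orderby=None):
    """Return the rows that pass `where`, optionally sorted by `orderby`."""
    selected = [r for r in rows if where(r)]
    if orderby is not None:
        selected.sort(key=orderby)
    return selected


# Hypothetical table "t" with fields x and y.
t = [{"x": 9, "y": 2}, {"x": 3, "y": 1}, {"x": 8, "y": 0}]
visited = []


def my_routine(row):
    visited.append(row["y"])


# myRoutine _scan t _where t.x > 7 _orderby t.y
for row in scan(t, where=lambda r: r["x"] > 7, orderby=lambda r: r["y"]):
    my_routine(row)
print(visited)
```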

GUI's and Visual Development Tools

We are not really addressing GUI's here because tying in a UI tends to date a language and makes Web applications more difficult to build. For one possibility, see our SCGUI specification. We will also leave it to the market to build interactive development tools the same way that there are several Java development tools out there.
 

Openness

This new language will be an open standard. However, to commercially sell variations of it we request that you describe variations from the standard in your documentation, including areas that are not (yet) clear in the standard. A simple footnote or marker is sufficient. We just want to make it easy to distinguish between the standard and proprietary extensions. Tricking a programmer into using proprietary extensions is a common way to get companies dependent on one vendor.




Think Tablely!

Document Copyright 1998, 1999 by Findy Services and B. Jacobs.