Note: this is intended as only a brainstorming document, NOT a specification.
The OOP fad shafted table and RDBMS-oriented language standards for too long. Now you can help remedy that.
We are designing the syntax and constructs of a new open ITOP language for the next programming era. It borrows heavily from (but improves on) XBase and incorporates many ITOP features. Unlike languages such as Java, C++, and Visual Basic, table access is built into the language instead of stapled on as an API afterthought. It would provide easy access and manipulation of all levels of tables, from small file-based tables to RDBMS. Listed here are only suggestions. We welcome any comments and suggestions you may have. (An E-mail link is at document bottom.)
Note that we are also involved in a parallel effort to categorize features of general programming languages. There may be some overlap with the discussion here.
There is also another approach to a table-friendly language. This is providing a flexible syntax, but not defining (in the language) specifically how to talk to tables. A draft of such a language is dubbed "L". Under this scenario, the following document can serve as more a collection/table API design guide instead of a language design guide. I doubt the L approach can ever be as "intimate" with tables as built-in syntax, but it might be a decent compromise. For example, things like "select from tablex where wages > myfunc()" may not be directly possible with L.
Notable improvements over XBase presented here are optional Block Contexting, "alias to" temporary tables that are not limited to memory (unlike most OOP components), integration with Data Dictionaries, better SQL integration, optimistic updates (pressed), and other "gifts". Enjoy!
Note that this language is targeted toward custom software. The requirements for custom software languages are a bit different from those for mass-distribution software, such as word processors. Mass-production software generally needs to be more efficient and strict with type checking and so forth. This is not a pattern that we made up; it has been an industry trend for quite a while. This is why most custom software is written in Visual Basic and most commercial software is written in C++ today, for example. Unlike Java, we are not attempting to be everything to everybody. However, custom business software is a large niche. There are many standard systems programming scripting languages, but not many table-oriented standard languages.
XBase used "Select" and "Set" statements to provide context simplification. We will call these "Forward Contexting" because they set the context for everything from the command and forward (time-wise) until changed. Although this is a very useful method for context simplification, it should be supplemented (not replaced) with a more formal method, which we will call "Block Contexting". Example:
    with select translog
      .user = curuser
      .application = transact.misc
      transact.lookedat = true
      .time = curtime()
      .date = curdate()
      translog.date = curdate()   // can still use explicit
    endwith

In this example everything between the with...endwith pair is assumed to have "translog" as the default alias (table handle). Note that the dot indicates a field, with an optional table specifier on the left side of the dot. In XBase it was sometimes difficult to distinguish between memory and field variables. Other possible ways of distinguishing will be presented later.
If you want to reference the default alias, then no alias is needed before the period. After the "endwith", the default alias is whatever it was before the "with". Thus, the defaults can actually be nested. The above example would be equivalent to:
    savealias = alias()
    select translog
    .user = curuser
    .application = transact.misc
    transact.lookedat = true
    .time = curtime()
    .date = curdate()
    translog.date = curdate()   // can still use explicit
    select (savealias)          // back to the way it was
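The save-and-restore pattern above can be sketched in an existing language. The following is only a rough Python illustration, not part of the proposed language: `Workspace`, `select`, and `with_select` are hypothetical names invented here, and the sketch tracks nothing but the name of the default alias.

```python
from contextlib import contextmanager


class Workspace:
    """Tracks which alias is the current default (hypothetical helper)."""

    def __init__(self, default_alias=None):
        self.default_alias = default_alias

    def select(self, alias):
        # Forward contexting: stays in effect until changed again.
        self.default_alias = alias

    @contextmanager
    def with_select(self, alias):
        # Block contexting: remember the old default, restore it on exit.
        saved = self.default_alias
        self.default_alias = alias
        try:
            yield self
        finally:
            self.default_alias = saved


ws = Workspace()
ws.select("transact")                 # forward contexting
with ws.with_select("translog"):      # block contexting
    inner = ws.default_alias
    with ws.with_select("vendors"):   # blocks can nest
        nested = ws.default_alias
    after_nested = ws.default_alias
after_block = ws.default_alias
```

Because the restore happens in a `finally` clause, the previous default comes back even if the block body fails, which is one thing a hand-written `savealias` pair can easily miss.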
Contexts can also be used with "set" commands. Here is an example comparing the forward contexting commands with block contexting:
    set caseignore on          // ignore case when comparing
    with set caseignore on     // block contexting
      dostuff()
    endwith

    set deleted off            // forward contexting
    with set deleted off       // block contexting
      dostuff()
    endwith

    set datadict to "dict_4"
    with set datadict to "dict_4"
      dostuff()
    endwith

Don't you think providing both types of context specifications is such a cool idea? I am proud of it. (If somebody thought of it before, then I am still proud to promote it.)
For some reason OOP proponents seem to prefer this 'With' approach to the 'select' and 'set' approach presented below. We are not quite sure why. They seem to feel that With statements are less abusable or confusing to read than the latter approaches.
Another alternative that I am warming up to is to specify the scope of the setting. Example:
    set caseignore on scope routine
    block   // defines a block
      set caseignore off scope block
      select inventory scope block
      set blahblah on scope global
    endblock

There are basically 3 levels: Block, Routine, and Global. (The default is global if scope is not specified.) The scope lasts from the point in time it is specified up to the end of the scope. Thus, if "Set X on scope routine" is specified in the middle of the routine, it would take effect in the middle of the routine, and it would "lose its influence" when the routine is done.
A block can be an 'If' statement, loop, etc. The 'Block' structure shown above can be used if there is no appropriate 'If' or 'Loop' nesting.
Note that unlike the prior approach, this approach allows multiple settings per block. It also leaves open scope possibilities for groups of routines, libraries, etc.
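One way to picture the three scope levels is as a stack of frames on top of a global table. The Python sketch below is only an illustration under assumptions of my own: the `Settings` class and its method names are hypothetical, an unset setting reads as false, and a block-scoped setting made directly inside a routine (with no enclosing block) is simply attached to the routine's frame.

```python
class Settings:
    """Scoped settings: global table plus a stack of routine/block frames."""

    def __init__(self):
        self.globals = {}
        self.frames = []  # each entry is (kind, values); kind is "routine" or "block"

    def enter(self, kind):
        self.frames.append((kind, {}))

    def leave(self):
        # Everything set at this level loses its influence here.
        self.frames.pop()

    def set(self, name, value, scope="global"):
        if scope == "global":
            self.globals[name] = value
            return
        for kind, values in reversed(self.frames):
            # "block" binds to the innermost frame, "routine" to the nearest routine.
            if scope == "block" or kind == "routine":
                values[name] = value
                return
        raise RuntimeError("no enclosing " + scope + " to attach the setting to")

    def get(self, name):
        for _kind, values in reversed(self.frames):
            if name in values:
                return values[name]
        return self.globals.get(name, False)  # unset settings read as false


s = Settings()
s.enter("routine")
s.set("caseignore", True, scope="routine")
s.enter("block")
s.set("caseignore", False, scope="block")
s.set("blahblah", True, scope="global")
in_block = s.get("caseignore")
s.leave()                           # end of block
after_block = s.get("caseignore")
s.leave()                           # end of routine
after_routine = s.get("caseignore")
still_global = s.get("blahblah")
```

Additional frame kinds (for groups of routines or libraries) would fit the same stack without changing the lookup.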
    open "foo" errorto ustat alias table3 access all

We suggest "open" instead of "use" to follow common programming conventions. The optional Errorto clause assigns the open status to the specified variable ("ustat" in this case). If no Errorto clause is specified, then a regular run-time error is triggered if there is a problem. Example:

    open "foo" errorto istat
    if istat > 0
      msgbox "Error on open; error number:" + istat
      cancel
    endif
Thus, unlike Java, trapping I/O errors is optional. For quick, ad-hoc scripts, forced error handling gets in the way. More on error "objects" is presented later.
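The "optional trapping" idea can be approximated in Python as follows. This is a sketch only: `open_table`, `TableOpenError`, the `catalog` dictionary standing in for real storage, and the specific status codes are all assumptions made for illustration; the text above only says that a status above zero means failure.

```python
class TableOpenError(Exception):
    """Raised when an open fails and the caller did not ask for a status code."""


# Hypothetical status codes for this sketch.
OK, NOT_FOUND = 0, 1


def open_table(name, catalog, errorto=False):
    """Return (rows, status). `catalog` maps table names to lists of rows."""
    if name in catalog:
        return catalog[name], OK
    if errorto:
        return None, NOT_FOUND                      # caller inspects the status
    raise TableOpenError("cannot open " + repr(name))  # default: run-time error


catalog = {"foo": [{"id": 1}]}

rows, istat = open_table("foo", catalog, errorto=True)
missing, bad_stat = open_table("nope", catalog, errorto=True)

try:
    open_table("nope", catalog)   # no errorto: the failure is not silently ignored
    raised = False
except TableOpenError:
    raised = True
```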
Note that unlike XBase, a new (empty) work area is always assigned to the alias in the Open command. If you want to close a prior work area, then use something like:
close table3
Just like XBase, the order of the clauses usually does not matter; the Alias clause could have been before the Errorto clause if we chose. The Access clause can have one or more of the following settings separated by commas: "read, change, append, delete, all". This example allows two types of access:

    open "table3" access append, delete

The standard network access clauses (Shared and Exclusive) can also be included, but are not part of the Access clause. Example:

    open x access read, delete exclusive

The open clause can also take SQL calls as table sources using special functions. Example:

    open sql("select * from vendors") alias vendors

This allows almost any type of data source or driver to be specified. Note that even joined tables can be opened this way in order to provide a virtual table. Tables do not have to be individual files or any particular file format (although some features may not be available for some table sources).
Other alternatives for SQL could look like this:
    Open cursor("clients") _alias cursor _where active = "Y" _fields this,that,other
    Open ODBC("blah blah blah") _alias stuff
    s = "select * from x"
    Open _sql s _as stuff
The Cursor( ) function returns a temporary file name. The _fields clause is similar to SQL's "Select" clause. An _Orderby clause could also be present. (The underscores for identifying clauses will be described later.)
Another clause of the Open command is the "press on" or "press off" clause. If "press on" is specified, then updates to the current record only happen when the "Pressed" command is issued. The Pressed command is similar to the Update command of Visual Basic; it writes all the pending information to the record. "Press off" makes updates behave more like traditional XBase. Here is an example:
    Open "table3" alias vendors press on
    set datadict to "dict3"
    appendblank()
    .company = "Bigshot Bank"
    .revenue = 1500000
    if not pressed()   // press and validate
      msgbox "Validation Error"
    endif
    close vendors

This example uses a data dictionary called "dict3" to specify the validation functions (which are stored in the Data Dictionary table). The Pressed() function indicates if there was a validation problem; if not, the record is saved. Perhaps there should also be a separate Validate() function to check validation before pressing. The Validate function should be able to take both an alias and a field name as a parameter to tie it in with GUI builders. Example:

    if not validate(table3,"field2")
      ...
    endif
    // or perhaps this:
    if not validate(table3.field2)
      ...
    endif
    if not validate(table3)   // checks all fields
      ...
    endif
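To make the "press on" behavior concrete, here is a small Python sketch of a record whose changes stay pending until they are pressed. Everything in it is hypothetical: the `PressedRecord` class, the idea that the data dictionary boils down to a mapping from field names to check functions, and the choice to keep failed changes pending rather than discard them.

```python
class PressedRecord:
    """Buffers field changes until pressed(); validators stand in for the data dictionary."""

    def __init__(self, validators):
        self.validators = validators  # field name -> function returning True if valid
        self.saved = {}               # what has actually been written
        self.pending = {}             # changes waiting for pressed()

    def set(self, field, value):
        self.pending[field] = value

    def validate(self, field=None):
        # Check one pending field, or all pending fields when none is named.
        names = [field] if field is not None else list(self.pending)
        for name in names:
            check = self.validators.get(name)
            if check is not None and name in self.pending:
                if not check(self.pending[name]):
                    return False
        return True

    def pressed(self):
        # Validate first; write only if everything passes.
        if not self.validate():
            return False
        self.saved.update(self.pending)
        self.pending.clear()
        return True


rec = PressedRecord({"revenue": lambda v: v >= 0})
rec.set("company", "Bigshot Bank")
rec.set("revenue", 1500000)
first_ok = rec.pressed()

rec.set("revenue", -5)
second_ok = rec.pressed()          # fails validation, nothing is written
field_ok = rec.validate("revenue")
```

The single-field `validate` call is the hook a GUI builder could attach to an input box, while `pressed` covers the whole record.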
Commands that return lists should be able to place the list into a temporary table. Example:
    Directory to alias dirtable         // generate a directory list
    ? "File Count: " + reccount()       // number of files in directory
    copy to file "dirfile.txt" delim
    copy to alias anotherone            // A temporary table
    copy dirtable to alias anotherone   // same as prior in this case

Commands with a "to alias" clause open a temporary table. (It may actually be physically stored on disk, but under a unique name that is not normally known by the user or programmer. It will also be automatically erased when its alias area is closed.) This provides an easy way to generate temporary tables without worrying about if, where, and when to store them. An SQL query result can also be stored this way (since query results are often not kept). Is this "tably cool" or what?
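A disk-backed temporary table of this kind can be imitated today with Python's standard `sqlite3` and `tempfile` modules. The sketch below is an illustration, not a proposal for the implementation: `TempTable` and `directory_to_alias` are hypothetical names, and SQLite is just a convenient stand-in for whatever table engine the language would use.

```python
import os
import sqlite3
import tempfile


class TempTable:
    """A table stored in a uniquely named temp file, erased when closed."""

    def __init__(self, columns):
        fd, self.path = tempfile.mkstemp(suffix=".db")
        os.close(fd)
        # isolation_level=None puts the connection in autocommit mode.
        self.conn = sqlite3.connect(self.path, isolation_level=None)
        self.conn.execute("CREATE TABLE t (" + ", ".join(columns) + ")")

    def append(self, row):
        marks = ", ".join("?" for _ in row)
        self.conn.execute("INSERT INTO t VALUES (" + marks + ")", row)

    def reccount(self):
        return self.conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]

    def close(self):
        self.conn.close()
        os.remove(self.path)  # the caller never had to know the file name


def directory_to_alias(folder):
    """Rough analogue of 'Directory to alias X': one row per file."""
    table = TempTable(["name", "size"])
    for entry in os.scandir(folder):
        if entry.is_file():
            table.append((entry.name, entry.stat().st_size))
    return table


with tempfile.TemporaryDirectory() as folder:
    for name in ("a.txt", "b.txt", "c.txt"):
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("x")
    dirtable = directory_to_alias(folder)
    file_count = dirtable.reccount()
    temp_path = dirtable.path
    dirtable.close()
```

Since the rows live in a file rather than in a list, the size of the directory listing is not bounded by available memory.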
The Open command can optionally specify a data dictionary that would over-ride the default dictionary when fields for that table are referenced. (If a field is not found in the data dictionary, the next-higher level dictionary is looked into. Perhaps a Java-like "super" marker can be used in the table items to specify deference to next level.)
Other tables would still use the default dictionary unless they have selected an explicit dictionary. Here are some example commands related to selecting data dictionaries.
    set datadict to "thing.dic"                  // set default dictionary
    open "foo.db" alias tablex datadict "pig.dic"
    select tablex                                // just to be clear
    set datadict to "another.dic"                // new default
    // The tablex dictionary does not change
    set datadict alias tablex to "thisone.dic"
    set datadict alias to "thisone.dic"          // same thing in this case
    ? showset("datadict")                        // shows "another.dic"
    ? showset("datadict alias")                  // shows "thisone.dic"

Thus, "Set Datadict" and "Set Datadict Alias" are two independent commands.
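The fallback rule described earlier (look in the table's own dictionary first, then in the next-higher one) maps neatly onto Python's standard `collections.ChainMap`. In this sketch the dictionary contents and the `length` property are made-up sample data, and each data dictionary is reduced to a plain mapping from field name to properties.

```python
from collections import ChainMap

# Made-up sample dictionaries: field name -> properties.
default_dict = {"name": {"length": 30}, "rate": {"length": 8}}
table_dict = {"rate": {"length": 12}}          # the table-specific override

# Lookups try table_dict first, then fall back to default_dict.
lookup = ChainMap(table_dict, default_dict)
rate_length = lookup["rate"]["length"]         # found in the table dictionary
name_length = lookup["name"]["length"]         # falls through to the default

# Switching the default dictionary leaves the table's own entries alone.
another_default = {"name": {"length": 40}}
lookup = ChainMap(table_dict, another_default)
rate_after_switch = lookup["rate"]["length"]
name_after_switch = lookup["name"]["length"]
```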
Since this can get somewhat complicated, perhaps a data dictionary should be associated with an opened alias instead. You would either associate the DD with an alias handle when you open a table, and/or assign it to a handle later on.
Block statements will generally follow the XBase convention of having the "end" clause use the first word of the starting clause. Thus, we have pairs like if...endif, with...endwith, loop...endloop, and so forth. In addition, curly brackets will be allowed as an alternative for those used to them; although, we think the Endx approach is superior to brackets. (We will not use that silly semi-colon at the end of each statement, although it will still serve as a continuation marker.) Examples:
    loop while x < 12
      dostuff()
    endloop

    loop while x < 12 {   // bracket variation
      dostuff()
    }

    loop until x >= 12
      dostuff()
    endloop

    loop for i = 0 to 20 step 2
      dostuff()
    endloop

    with select vendors {
      .title = "Bank of Mars"
      .revenue = 1200000.00
    }

We suggest using a colon or a colon in parentheses to put multiple statements on one line:

    i = 3 : j = 2 : mytime = time() : dostuff()
    loop for i = 1 to 3 (:) dostuff() (:) endloop
    // or perhaps
    i = 3 (:) j = 2 (:) mytime = time() (:) dostuff()

Which do you prefer? Also, should multiple statements on a single line be limited only to assignments and function calls (restricting control structures)? We think they should not be allowed with loops, but still be allowed with if...endif structures. The naked colons may also get confused with Data Dictionary calls (described later).
The way that XBase passes parameters is generally very simple and flexible. If a parameter is missing from the calling routine, the receiving parameter is simply given the value of a new variable (boolean "false"). This can be used for a sort of polymorphism with regard to parameters. However, it would be nice if there was a Paramcount() function to more easily determine the number of parameters sent. It would also be nice to be able to mark a parameter as passed by value. Here is one possible way to do this:

    Sub routinex( thingy, <dingy>, ringy)
      // do stuff
      return result   // optional
    EndSub

The angle brackets indicate that the parameter is to be passed by value. We also recommend that "Sub" be used in place of Function or Procedure, similar to Perl. The "Parameter" clause should still be an option.
There may be a simple way to allow procedure calls with "clauses", like the "exclusive" clause of the Open command. Let's start with an example:
    Eat "x.db" _trigger on _meal dinner, 3 _veget

    sub Eat
      parameter filename
      if clause("trigger")
        ongot = clause("trigger",1)    // 1st parameter of clause
        if empty(ongot)
          ? "Missing parameter for Trigger statement!"
        endif
      endif
      mealtype = clause("meal",1)      // "dinner" in this case
      peoplecount = clause("meal",2)   // "3" in this case
      if clause("veget")
        ? "No Meat, we are vegetarians"
      endif
    endsub

Or, for an example a bit more familiar to XBasers:

    mycopy _from table1 _to table2 _for ".rate > .maxrate"
    mycopy _for ".rate > .maxrate" _to table2 _from table1   // same

    sub mycopy   && copy from one table to another using a criteria
      sourcetable = clause("from",1)
      destination = clause("to",1)
      criteria = clause("for",1)
      if empty(criteria)
        criteria = "true"   // default is to copy all
      endif
      if not empty(sourcetable) and not empty(destination)
        dostuff(sourcetable, destination, criteria)
      else
        reporterror("missing parameters")
      endif
    endsub

This allows one to build routines with easier-to-remember parameter setups. This is especially useful for building groups of routines with similar parameter types. For example, in XBase you often knew to use the "for" clause (similar to the SQL "Where" clause) to specify a record selection criteria, but you did not have to remember the order of the clauses because it did not matter.
If the Clause( ) function has one parameter, it returns True or False based on whether the clause was specified. If Clause( ) has two parameters, then it returns the clause parameter specified by the position number of its second parameter. A simple, yet powerful idea wouldn't you say?
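The two behaviors of Clause( ) can be mocked up in a few lines of Python. This is a deliberately naive sketch: `parse_clauses` and `make_clause` are hypothetical helper names, the text is split on whitespace (so a quoted parameter containing spaces, such as the "for" expression above, would not survive), and a missing parameter is returned as an empty string so that an Empty()-style test would catch it.

```python
def parse_clauses(text):
    """Split 'args _name a, b _flag' into (positional args, {name: [params]})."""
    positional, raw, current = [], {}, None
    for token in text.split():
        if token.startswith("_"):
            current = token[1:].lower()   # an underscore starts a new clause
            raw[current] = []
        elif current is None:
            positional.append(token)      # anything before the first clause
        else:
            raw[current].append(token)
    parsed = {}
    for name, parts in raw.items():
        pieces = " ".join(parts).split(",")
        parsed[name] = [p.strip() for p in pieces if p.strip()]
    return positional, parsed


def make_clause(parsed):
    """Build a clause() function bound to one call's parsed clauses."""
    def clause(name, position=None):
        if position is None:
            return name in parsed         # one argument: was the clause given?
        params = parsed.get(name, [])
        if 1 <= position <= len(params):
            return params[position - 1]   # two arguments: the Nth parameter
        return ""
    return clause


positional, parsed = parse_clauses('"x.db" _trigger on _meal dinner, 3 _veget')
clause = make_clause(parsed)
```

With this call, `clause("veget")` is true even though the clause has no parameters, and the order in which the clauses were written makes no difference to the lookups.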
In XBase the child routine generally inherits all the variable scope of its parent. Although this is often quite useful, there are times when one wants to build generic routines that do not depend on or want to be influenced by the parent's scope. An optional clause, such as "Isolate" should be available to provide such isolation:
    sub foo(this, that) isolate   // don't inherit parent's scope
      // do stuff
    endsub

Or, perhaps "set isolate on scope routine" will do the trick.

    gototop(table3)       // to first record of table3 alias
    appendblank(table3)
    skip(table5)
    skip()                // acts on default area

Function and procedure calls don't need parentheses unless there are no parameters given.

    routine1()
    routine1 stuff, this, that    // 3 parameters
    routine1(stuff,this,that)     // same
Rather than naming each field in clauses that require field names, the group names from the current data dictionary can optionally be used. Example:
    copy to alias tablex fields group "mr"

Exactly how the groups work, and whether they can be combined or excluded with fancier set theory manipulators, needs to be thought about a bit more. If single letters are used to specify groups, then the language would be much simpler. However, letters are obviously less descriptive than names. In the example above, the groups "m" and "r" are specified (a set join). Perhaps a minus sign could indicate that a group is to be excluded. Example:

    copy to alias tablex fields group "m -r"

This would include all the fields in group "m" unless they are also in "r".

    copy to alias tablex fields group "-r"

The above would include all fields except those in group "r". Grouping eliminates the need to type a long list of field names in commands that deal with multiple fields. This would really be great with Browse-type functions.
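The three group specifications above can be checked against a small Python sketch. It rests on assumptions the text leaves open: groups are single letters, each field's group letters are stored as a string in the data dictionary, letters without a minus sign are combined as a union, and a specification with only exclusions starts from all fields. The function name and sample fields are invented for the example.

```python
def fields_in_groups(field_groups, spec):
    """Return the field names selected by a group spec such as 'mr' or 'm -r'."""
    include, exclude = set(), set()
    for token in spec.split():
        if token.startswith("-"):
            exclude.update(token[1:])   # letters after a minus are excluded
        else:
            include.update(token)       # other letters are included (union)
    selected = []
    for field, groups in field_groups.items():
        member = set(groups)
        wanted = bool(member & include) if include else True
        if wanted and not (member & exclude):
            selected.append(field)
    return selected


# Made-up data dictionary excerpt: field name -> group letters.
field_groups = {"name": "m", "rate": "mr", "hours": "r", "notes": ""}

both = fields_in_groups(field_groups, "mr")
m_not_r = fields_in_groups(field_groups, "m -r")
all_but_r = fields_in_groups(field_groups, "-r")
```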
Data dictionary functions and properties could be accessed using the following syntax:
    tablex.fieldx:prevalid()   // call function
    tablex.fieldx:length       // property
    .fieldx:length             // a variation
    .fieldx:alias()            // returns field's alias
    tablex.fieldx:alias()      // useless but valid

This provides a kind of polymorphism that could simplify class select code. The last example is not tied to the Data Dictionary, but uses the same syntax. (Some suggest using two colons "::" for these to match C++ and make them more visible.) I have not quite figured out how to provide this syntax for any Control Table, not just Data Dictionaries. The macro or Eval( ) fallback can always be used.
Perhaps a set of parentheses can be used to execute the expression contained in the field. Example:

    foo = tablex.fieldx()

is equivalent to:

    foo = eval(tablex.fieldx)
The "Where" clause will be used instead of the XBase "for" to reflect SQL syntax. Example:
append from file "foo.db" where .rate > 25
Something similar to XBase macros should be provided, but with some variations. Macros should not just be allowed for any syntactical specification because it makes compilation too slow and difficult. Rather, an approach similar to Clipper should be taken, except be allowed in most clause parameters. Field and variable names, alias names, clause parameters, routine names, and boolean expressions (such as after Where clauses) can be macro-tized. Examples:
    %aliasmacro%.fieldx = "12"
    aliasx.%fieldmacro% = "12"
    amacro = ".rate > 25"
    append from file "foo.db" where %amacro%

Macro names need to be surrounded by percent signs on each side. This prevents some of the ambiguities of XBase macros. Similarly, an "Eval()" function will evaluate an expression. Eval can optionally take two parameters if validation testing is needed. Example:

    if not eval(result,".rate * .hours")
      msgbox "Error, invalid expression"
    else
      msgbox "Paycheck is "+trans(result,"####.##")
    endif

Note that aliases can be assigned to each other, reducing the need for macros. (An alias is really just a numeric integer value.) Example:

    Open "foo.dbf" alias panda
    appendblank()
    panda.name = 'Toogie'
    bears = panda                 // alias clone
    bears.name = 'Toogie Tay'     // same record and field
    a = bears
    a.name = "Moogie's a Panda"
    a.location = 'China'

A zero-valued alias represents an unassigned or unopened area, usually returned from a function.
The keyword "off" can be used to deactivate a clause parameter. Example:

    open "foo" errorto istat   // specify status variable
    open "foo" errorto off     // specify none for runtime halt
    open "foo" exclusive off

"Off" can be used on clauses that need not be present. This helps make them macro-able. (We will have to change the XBase "off" usage with "List".)
Like XBase, our language will not have null variables. A newly declared variable has a boolean value of "false" (replacing .f.) by default. My experience is that nulls in other languages cause more problems than they solve. If you really want nulls in order to seem "modern", then perhaps a compromise can be reached: A variable can have a null result, but still evaluate to zero, blank (zero length string), or false if evaluated (instead of a runtime error). A function such as "isnull()" could perhaps test for nullness.
By the way, I really like the Empty() command of some XBase dialects. It returns true if a variable or field is zero, blank (including multiple blanks), or false. I also like the Alltrim() command.
Should some syntactical constructs be eliminated? Using an earlier example:
    set caseignore on
    caseignore(true)   // equivalent?

Perhaps we can do away with the XBase-derived Set commands. However, using "Set" tells us that it is a block-contextable command (described earlier). We are not sure what to do about this.
Variables can be declared with the following types:
    public variable = x    // global
    private variable = x   // private, but visible to child routines
    local variable = x     // only visible to routine declared in
    var variable = x       // same as private

'Public' and 'Private' function like their XBase counterparts, except they can be overridden by the 'Isolate' routine specifier presented earlier. The assignment statement (initializer) is optional.
Although the XBase WHILE and FOR statements should perhaps be combined into one WHERE command to better fit SQL conventions, the While command often assumed that an index was available for ranges. For reasons such as this, a way to test if an index is available should be made available to the program. Here is an example that processes active clients between Client Numbers 2500 and 2600:
    Var istat
    Loop Scan Clients where (between(clientnum,2500,2600) ;
         and active=true) _istatto istat _bailnone
      // loop process body goes here
    EndLoop
    if istat.canindex = false
      msgbox "Not practical to scan without an index"
    endif

The "_bailnone" clause tells the scan loop to abort looping (jump out) if the Where expression cannot use an existing index. Unlike XBase, we are assuming that the database engine may pick which indexes to use, not necessarily the programmer. ( _bailnone and _istatto do not have to both be present.)
The interesting part here is that Istat is not a simple status code, but an alias to a small (temporary) table with index information. Thus, CanIndex is a field, although it (intentionally) resembles a property of an "error object" of other languages. There may also be a field that contains a list of which portions of the expression can use indexes. Example:

    Field      Meaning
    Canindex   Is some of the Where expression indexable? (boolean)
    I_expr     Indexable expression (ex: "between()")
    Cantell    Is information on indexability available? (boolean)

Notice that I_expr perhaps contains only one clause or function. If more are available, they would be in other records of this temporary alias. The other fields would simply be repeated for each record. (When created, the default pointer is at the first record.) Another alternative is to make "I_expr" be an alias to a separate alias (table).
Thus, such system-generated tables are very similar to error and status objects of OOP languages. However, there are advantages to them over OOP objects: they do not have to fit in memory. For example, in the "Directory to alias X" example we could get lists of very large directories, regardless of memory size. The Index Status example may be overkill, but suggests some interesting concepts. Isn't thinking tabley wonderful? The power!
Note that if the alias name is not given in Scan, then the default alias is assumed. Perhaps an asterisk should be required in case the programmer forgets to add the alias name. Thus you would use "Loop Scan * where ...". Also note that we added parentheses around the Where expression in the example for clarity. Parentheses are not really necessary in this case. Further, whether or not _Bailnone should be triggered if no index info is available (related to CanTell) makes for an interesting discussion. Perhaps also have a "_BailUnkn"?
Perhaps a way to easily embed table data into source code should be made. Perhaps something like:
    Open "people" alias tablex
    data tablex _fields name, rank, listed   // append stuff
      "Poly Shore", 34.2, false
      "Monica Lawinski", 12.0, true
      "Themtal Adams", 40.1, false
    enddata
I have decided that the keyword "as" is a better alternative to "alias". Thus, you would say:
Open tablex as t where rate < 23.5
I have also decided that having the word "loop" in
a "while" loop is not really needed. Only "while" and
"endwhile" are needed. A space between the two,
such as
There should be some optional way to have a table automatically close at the end of a routine. Example:
    sub foo
      Open tablex _as t _autoclose
      blah blah
    endsub   // t will get closed at end of routine

If the "autoclose" clause is specified, then the table will automatically be closed when the routine finishes.
Perhaps by default all tables should have only routine-level scope unless explicitly specified as global. A _global clause could be supplied for this purpose. This may make tables a bit "safer". Note that all subroutines (children routines) of the opener routine should still have access to the table, unless some sort of isolation mode is chosen for the children.
A "Scan" loop may also be operable on single statements:
    myRoutine _scan t _where t.x > 7 _orderby t.y

This is equivalent to:

    scan t _where t.x > 7 _orderby t.y
      myRoutine
    endscan
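For readers more at home in a general-purpose language, the scan form corresponds roughly to "filter, sort, then call the routine once per row". The Python sketch below assumes a table is just a list of dictionaries held in memory; `scan`, `my_routine`, and the sample rows are made up for the illustration.

```python
def scan(table, where=None, orderby=None):
    """Return the rows that pass `where`, sorted by `orderby` when given."""
    rows = [row for row in table if where is None or where(row)]
    if orderby is not None:
        rows.sort(key=orderby)
    return rows


# Made-up sample table standing in for alias t.
t = [{"x": 9, "y": 2}, {"x": 3, "y": 1}, {"x": 8, "y": 0}]

visited = []


def my_routine(row):
    # Stand-in for myRoutine: record which rows were visited, in order.
    visited.append((row["x"], row["y"]))


# Rough equivalent of: myRoutine _scan t _where t.x > 7 _orderby t.y
for row in scan(t, where=lambda r: r["x"] > 7, orderby=lambda r: r["y"]):
    my_routine(row)
```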