"Washing tons full of bad apples." ==> "Washington's full of bad apples"
"Bin Laden is rotten to the core and that's a fact." ==> "Bin Laden is rotten to the Koran. That's a fact." (Stanczak error)
The task of level 2 is to prevent permutation errors like the ones shown in the examples above. A speech recognition system should have no trouble understanding human speech. The concept at work here is poka-yoke, the manufacturing term for mistake-proofing.
The word generator creates the Basic English equivalent data set so quickly that it is very difficult to capture the status bar in action before the process completes. The picture above is a rare screenshot of the status bar while it is still progressing.
Although the titles in the program use the term Word Generator, what it actually produces is roots for words. A root plus a one-letter ending constitutes a word. The text boxes show the number of words produced at each length, counting the root plus the ending. The Average text box shows the average word length.
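The figures shown in the text boxes can be reproduced with a few lines of code. The following is a minimal sketch, not the program's actual implementation: the function name `length_stats`, the sample roots and endings, and the assumption that every root is combined with every ending are all hypothetical.

```python
from collections import Counter


def length_stats(roots, endings):
    """Count words by length and compute the average word length.

    A word is a root followed by a one-letter ending. Pairing every root
    with every ending is an assumption made for this illustration.
    """
    words = [root + ending for root in roots for ending in endings]
    counts = Counter(len(word) for word in words)
    average = sum(len(word) for word in words) / len(words) if words else 0.0
    return counts, average


# Hypothetical sample data written in Latin letters; the real generator
# uses the Gorbiel alphabet.
sample_roots = ["ba", "tako", "miru", "seno"]
sample_endings = ["a", "i"]

counts, average = length_stats(sample_roots, sample_endings)
for length in sorted(counts):
    print(f"{length}-letter words: {counts[length]}")
print(f"Average length: {average}")
```

Each entry in `counts` corresponds to one per-length text box, and `average` corresponds to the Average text box.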
Random samples of words from most European languages average approximately 6 letters per word. The words in the Basic English equivalent dataset produced by the word generator are slightly more efficient in terms of size.
Although Basic English has approximately 850 words, the word generator creates many more roots, 8500, because they are needed by higher level 3 applications. English uses far fewer words in its Basic dataset because many words have multiple meanings. Level 3 enforces the "one word one meaning" quality assurance standard, so more words are required to express the same number of meanings.
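The reason the root count grows can be illustrated with a small sketch. The words and senses below are made-up examples, and `split_senses` is a hypothetical helper, not part of the word generator: it gives every sense of a multi-meaning word its own entry, which is what "one word one meaning" requires.

```python
def split_senses(senses_by_word):
    """Give every (word, sense) pair its own entry, so each entry has one meaning."""
    return [
        (word, sense)
        for word, senses in senses_by_word.items()
        for sense in senses
    ]


# Illustrative data only; not taken from the Basic English word list.
sample = {
    "bank": ["river edge", "money institution"],
    "light": ["not heavy", "illumination", "pale"],
    "apple": ["fruit"],
}

entries = split_senses(sample)
print(f"{len(sample)} source words need {len(entries)} single-meaning roots")
```

In this toy case three source words expand to six single-meaning entries, and each entry would need its own root.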
The Oxford English Dictionary has approximately 300,000 words. The equivalent data set is created within seconds.
Studies have shown that an educated adult knows and recognizes only about 20,000 words over a lifetime, so the full capacity of a larger dictionary is never completely used. A 300,000 word level 3 compliant dataset will be able to cover the full dexterity, scope, and depth of a 30,000 word Oxford English style dictionary. Once a level 3 mapping is completed, it may well be possible for an adult to fully comprehend and recognize all of the base roots of such a dataset.
Database manager view of roots created by Level 2 compliant word generator. The letters used are from the Gorbiel V1.0 Level 1 compliant alphabet.
Nota bene. It is no coincidence that our president Banana is the first level 2 compliant president. Central African languages are based on interchanging series of vowel-consonant-vowel-consonant...
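The alternating pattern mentioned in the note above can be sketched in code. This is only an illustration under stated assumptions: the text does not spell out the level 2 rule, so strict consonant-vowel alternation is assumed here, and the Latin letter sets stand in for the Gorbiel alphabet. The names `alternating_roots` and `is_alternating` are hypothetical.

```python
from itertools import product

# Placeholder letter sets (assumption); the real generator uses the Gorbiel alphabet.
VOWELS = "aeiou"
CONSONANTS = "bdgklmnprst"


def alternating_roots(length, start_with_consonant=True):
    """Yield every string of the given length that alternates consonants and vowels."""
    pools = []
    for position in range(length):
        # Even positions take the starting letter class, odd positions the other.
        consonant_slot = (position % 2 == 0) == start_with_consonant
        pools.append(CONSONANTS if consonant_slot else VOWELS)
    for letters in product(*pools):
        yield "".join(letters)


def is_alternating(word):
    """True if no two neighbouring letters are both vowels or both consonants."""
    return all((a in VOWELS) != (b in VOWELS) for a, b in zip(word, word[1:]))


three_letter_roots = list(alternating_roots(3))
print(len(three_letter_roots), three_letter_roots[:5])
print(is_alternating("banana"), is_alternating("apple"))
```

Because every position is drawn from a fixed letter class, swapping two neighbouring letters in a generated root always produces a string that fails `is_alternating`, which is one way a permutation error could be detected.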
US Copyright 2007