General Intelligence :

Knowledge representation (KR)

version 2.0

Just as the goal in our vision project is to recognize "everything under the sun", the goal of our knowledge representation is to be able to represent "all possible thoughts".

To cut a long story short, our KR is based on first-order predicate logic (FOPL) which I'm sure anyone familiar with Prolog will be familiar with it.

eg:
loves(John,Mary) "John loves Mary"
father(John,Peter) "John is Peter's father"

FOPL is also basically the KR language used by Cyc, known as CycL, except that it uses a Lisp-like notation:

loves(John,Mary) --> (loves John Mary)

As a quick way to get started on this project, you can simply think of our KR as FOPL / Prolog / CycL.

But there is one minor issue...

First-order vs higher-order logic

Cyc actually uses "higher-order logic" in its inference engine. In the above examples, "John" and "Mary" are called first-order objects or variables because they can appear inside predicates and be quantified with the quantifiers "for all" and "exist"; but in first-order logic the predicates (eg "loves") cannot be treated as variables or be quantified.

Sometimes we need to express something like:

is-blind(love) "Love is blind"

I won't go into the details, but there are many situations where we need to treat predicates as variables in order to express general rules. Cyc's solution is to use higher-order logic (HOL) but HOL algorithms are much more complex than FOL ones, and HOL has much fewer ready-made algorithms than FOL such as induction, abduction, truth maintence, etc.

My solution is to change the KR and retain the FOL format. All sentences will be expressed using a generic predicate "R":

loves(John,Mary) --> R(Loves,John,Mary)

so now all objects are first-order.

The rest of this page contains some old material which are still under development. For one, our KR needs to be augmented with probabilistic or fuzzy logic, which is still an open research topic. Other than that, the KR would be strictly within FOPL.

In a nutshell, the knowledge representation scheme has 3 parts:

a subset of first order predicate logic

some examples
reification
representing time
why "standard" predicate logic is inadequate

enhancements borrowed from term logic

composition functor
representing embedded clauses
quantifiers, numbers, and doing arithmetics

uncertain logic

The resulting logic is called geniform.

Appendix: knowledge representation in Cyc, Soar, and ACT-R

First order predicate logic

We build our knowledge representation upon a framework of first-order predicate logic because of 3 reasons. Firstly, most people are familiar with it (eg Prolog), so there is no need to learn anything new. This is desirable when convincing people to join a new project. Secondly, from my analysis of computer vision it seems that predicate logic is sufficient for that purpose, so we "don't fix it unless it's broken". Thirdly, many efficient algorithms and important theoretical results have been obtained for first-order and propositional logic, including:

Satisfiability in propositional logic (ie, determining if a statement can be true under some variable assignments) is NP-complete [It was the first problem proved to be NP-complete, by Stephen Cook in 1971]
Entailment in propositional logic (ie, given a set of statements [the knowledgebase], determining if a conclusion is implied) can be converted to satisfiability, and is co-NP-complete (as hard as NP-complete).
Entailment in propositional logic restricted to Horn clauses (ie, statements of the form P→Q₁^Q₂^Q₃^... where the Q's are literals) can be done in time linear in the size of the knowledgebase.
Entailment in first-order logic is semidecidable — an algorithm can decide if a statement is entailed but is not gauranteed to terminate if the statement is not entailed [a result proved separately by Turing and Church in 1936].
Entailment in Datalog knowledgebases (ie, consisting entirely of Horn clauses with no functions) by forward-chaining can be done in polynomial time provided that the sizes of rules and the arities of predicates are bounded by a constant.

{ We will probably use Datalog or Horn logic so that inference can be more efficient. }

John McCarthy first advocated the use of first-order logic in knowledge representation, and the idea that "almost everything" can be expressed in it.

Some examples

People acquainted with Prolog should be at ease with first-order logic. The important point to bear in mind is that the "world" is composed of objects and their properties and relations. Objects are represented as constants or variables. Properties of objects and relations between objects are represented by unary and n-ary predicates, respectively.

There exists a person John who is male.

∃john:person male(john)

John hits Mary.

hit(john,mary)

Reification

John hits Mary softly.

This one requires modifying hit() with softly() which is not allowed in predicate logic (Even second-order predicate logic does not allow this, it merely permits existential/universal quantification of predicates). The trick to work around this problem is called reification. This means creating a new variable, an "abstract individual", to represent "the act of hitting", and then applying softly() to it:

isa_Hit(hit₁) from(hit₁,john) to(hit₁,mary) softly(hit₁)

The problem with reification is that the original relation, hit(john,mary), becomes embroiled in several parts, making it difficult to recognize. Currently our remedy for this is to represent a fact that needs to be reified, redundantly in both its simple and reified forms.

Representing time

{ To do: situation calculus and event calculus... }

Without reification:

hit(john,mary,t₁)

With reification:

time(hit₁,t₁)

Why "standard" FOL is inadequate for general intelligence

If "John loves Mary" is loves(john,mary), what about the statement that "Loving Mary is foolish"?

One obvious solution is to create a hypothetical "person" variable and say:

for all x, loves(x,mary) → foolish(x)

which seems to solve the problem because it enables us to use the 2-place predicate love(). But this has the undesirable effect that anyone who loves Mary would immediately become a fool. A normal person may do foolish things occasionally. Sometimes we just want to say that an act is foolish without saying anything about the person who does it. That would be extremely awkward to express in predicate logic.

Historically, predicate logic got its name because Aristotle divided a simple sentence like "John loves Mary" into 2 parts: "John" is called the subject and "loves Mary" is called the predicate. Therefore we have to represent "love" as a predicate that takes on a subject variable (John) and an additional object variable (Mary). This becomes a problem when we want to talk about "love" as an abstract entity, as the above example illustrates. Predicate logic is inadequate when we want to talk about things inside predicates. Therefore we will augment it with term logic (see next section).

Another problem with modern predicate logic is the quantifiers "for all" and "exists" introduced by Gottlob Frege (1848-1925). While they can deal with many quantification problems (especially in mathematics), they cannot represent fuzzy concepts like "a few" or "many". Ironically, expressing numbers in sentences is also difficult, as in "3 mangos are stolen". These problems can be solved by the use of term logic.

Enhancement with term logic

Term logic is revived in recent decades by Fred Sommers and his student Englebretsen. See this summary page of their Term Functor Logic. I am indebted to them, as well as to Ben Goertzel and Pei Wang for introducing me to term logics, who also developed their own versions.

The idea we borrow from term logic is the notion of the "term" which is smaller than the predicate. A term is a unit of meaning close to a "word" in ordinary language. Terms can form composites with other terms to create new meanings, for example "green" + "apple" yields "green apple".

Composition functor

The composition functor * is introduced into the new logic. (x*y) is the composition of 2 terms x and y, such as green*apple meaning a thing that is both green and an apple.

Special algorithms are needed to compute the meaning of x*y [ see the page on geniform ]. In some sub-domains such as low-level vision processing, there is no use for the composition functor, so we can use a subset for which processing is more efficient.

But there are 2 reasons why term logic is useful for higher cognition:

Representing embedded clauses

Firstly, it provides a way to represent partial predicates (see the last section why this is a problem). Now we can express "Loving Mary is foolish" as:

foolish(love*mary).

Terms allows us to reason about practically everything.

Quantifiers, numbers, and doing arithmetics

Secondly, composition enables us to do arithmatics in a natural way.

"All ravens are black" = ( black(all*raven) )

"Some apples are green" = ( green(some*apple) )

"3 mangos are stolen" = ( stolen(3*mango) )

You may be dubious about expressing 3*mango in the same way as green*apple. But if we think of 3 as the concept of "all things that number 3" then 3*mango is just the intersection of the set "3" and the set of mangos. The following diagram illustrates this:

What is special about this is that the concept "3" is treated just like any other concept, and we can define "3" in the Pattern Recognizer: "3" has instances such as "3 musketeers" and "3 vertices of a triangle"; it is the successor of "2"; etc.

In this way, establishing the concepts of numbers mimics the way human infants learn numbers. And more importantly, we achieve this in the same way as learning other concepts.

One caveat is that, since we dispensed with Frege's quantifiers, we have to explicitly check for syllogistic inferences.

[ For more on how this can be incorporated in first-order predicate logic, see geniform. ]

Knowledge representation in Cyc, Soar, and ACT-R

{ I think Soar and ACT-R's knowledge representation may be equivalent to first-order logic... }

{ Ben Goertzel and Pei Wang developed different forms of probabilistic term logic intended for GI. "Term logic" is a modern form of scholastic logic which predates predicate logic. More on this... }

You may provide some instant feedback with this form:

Reference

| Home | Table of Contents |

5/May/2007, 23/Oct, 10/May/2006 (C) GIRG [ Notice about intellectual property on this page ]

Name:
Comment / suggestion / question: