| Home | Table of Contents |

General Intelligence:

Inductive learning for vision

Inductive logic programming seems to be the harder part of our project (indeed, ILP is often considered one of the hardest areas in computer science).


Background to ILP (inductive logic programming)

Books and internet resources

  1. Machine Learning [Mitchell 1997] has a chapter on "Learning Sets of Rules" which is a good introduction.
  2. AI: A Modern Approach discusses ILP in chapter 19.
  3. This 1994 book by N. Lavrac and S. Dzeroski is free for download: Inductive Logic Programming: Techniques and Applications.
  4. Stephen H Muggleton's website on ILP: www.doc.ic.ac.uk/~shm/
  5. Peter A Flach's website: www.cs.bris.ac.uk/~flach and his PowerPoint on Knowledge Representation and ILP.
  6. many more...

A simple ILP example

The goal is to learn the relation Daughter(x,y).

A database may have the following facts:

Male(mary) = false
Female(mary) = true
Mother(mary, louise) = true
Father(mary, bob) = true
Daughter(bob, mary) = true
etc......

After presenting a large number of such facts, this rule may be learned:

IF Father(y,x) ^ Female(y) THEN Daughter(x,y)
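The learned rule can be checked against the facts with a short sketch. The tuple-set representation below is an assumption for illustration, not the format of any particular ILP system:

```python
# Hedged sketch: facts stored as (predicate, args) tuples -- an assumed
# representation, not any particular ILP system's format.
facts = {
    ("Female", ("mary",)),
    ("Mother", ("mary", "louise")),
    ("Father", ("mary", "bob")),
    ("Daughter", ("bob", "mary")),
}

def daughter(x, y, facts):
    """Learned rule: Daughter(x, y) <- Father(y, x) AND Female(y)."""
    return ("Father", (y, x)) in facts and ("Female", (y,)) in facts

print(daughter("bob", "mary", facts))  # True
```

The rule's body is just two membership tests, so it covers the positive example Daughter(bob, mary) while rejecting, say, Daughter(mary, bob).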

Existing ILP software

  1. FOIL
  2. CIGOL ('logic' spelt backwards)
  3. GOLEM
  4. PROGOL
  5. LINUS
  6. etc...

Special concerns related to vision

The "fovea" focuses on a particular area or scale of an image. Low-level feature extraction produces a graph representation of the scene. The nodes and links of this graph are then converted to a set of logical statements, and this set is taken to be the "model" (possible world) of the logic.

Low-level feature extraction has to use the searchlight mechanism to discover the spatial relations among features, and then outputs a graph / logical representation for the next level. The "universe" of the logic is distinct at each level.
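A sketch of the graph-to-logic conversion step. The node/link structure and the predicate names below are assumptions for illustration, not the project's actual representation:

```python
# Hedged sketch: each node of the scene graph becomes a unary fact,
# each link a binary fact.  The example graph is invented.
nodes = {"e1": "edgel", "e2": "edgel", "L1": "line"}
links = [("L1", "contains", "e1"), ("L1", "contains", "e2"),
         ("e1", "adjacent", "e2")]

def graph_to_facts(nodes, links):
    """Flatten a scene graph into ground logical statements (as strings)."""
    facts = [f"{kind.capitalize()}({n})" for n, kind in nodes.items()]
    facts += [f"{rel.capitalize()}({a},{b})" for a, rel, b in links]
    return facts

for fact in graph_to_facts(nodes, links):
    print(fact)
```

The resulting set of ground facts -- Edgel(e1), Line(L1), Contains(L1,e1), and so on -- is what the next level of the hierarchy treats as its "universe".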


Why use logic?

Our scheme is to reduce 3D to 2D to 1D to 0D. Along this route, many concepts need to be defined:

  1. First, we have the image decomposed into edgels, which are the lowest-level features.
  2. Then we define "lines" and "curves" as collections of contiguous edgels.
    (From 0D to 1D: line = conjunction of edgels)
  3. Then we define "polygons", "parallelograms", "squares", etc as collections of lines.
    (From 1D to 2D: surface = conjunction of lines)
  4. Then we can define "cylinders", "blocks", etc as collections of surfaces.
    (From 2D to 3D: volume = conjunction of surfaces)
  5. Then we define objects such as "bicycles", "wheels", etc as composed of primitive geons.

If we can enter all these things as rules in a knowledgebase, we have a vision system that operates by the rules. Our strategy is to encode these rules as logical formulas.
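One step of this rule-based hierarchy can be sketched as forward derivation over ground facts. The predicate names Edgel, Contiguous, and Line are illustrative assumptions, not the system's actual vocabulary:

```python
# Illustrative sketch of one 0D -> 1D rule applied to ground facts.
facts = {("Edgel", "a"), ("Edgel", "b"), ("Contiguous", "a", "b")}

def derive_lines(facts):
    """Rule: Line(x, y) <- Edgel(x) AND Edgel(y) AND Contiguous(x, y)."""
    new = set()
    for fact in facts:
        if fact[0] == "Contiguous":
            _, x, y = fact
            if ("Edgel", x) in facts and ("Edgel", y) in facts:
                new.add(("Line", x, y))
    return facts | new

print(("Line", "a", "b") in derive_lines(facts))  # True
```

Each level of the 0D-to-3D chain would apply its own such rules to the facts produced by the level below.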

Why not neural networks?

The use of logic to represent rules (definitions) has a tremendous advantage over neural networks. For example, a quadrilateral is defined as any figure with 4 sides. Quadrilaterals can appear in all sizes and shapes, yet we have no problem recognizing any of them. With predicate logic, we can easily define a quadrilateral in terms of edges and vertices (an example is given in Vision scheme). The resulting formula is very succinct and covers all possible cases (i.e. it has low algorithmic complexity).

On the other hand, in neural networks we have to learn the concept of quadrilaterals in a high-dimensional space populated with many instances of quadrilaterals. Such a statistical learning method continually "molds" the classification space into the right shape in incremental steps, which is extremely inefficient and prone to errors. In other words, it fails to reduce algorithmic complexity significantly.

The essence of logic is that it can represent objects and their relations. Objects are variables and relations are predicates. The use of variables allows logic to represent many things succinctly. For example, I can use the predicate Kicks(x,y) to mean "x kicks y". The same predicate can denote "boy kicks ball", "girl kicks dog", or "John kicks Mary", even though these events are very dissimilar. This is the source of predicate logic's great expressiveness. Most artificial neural networks nowadays cannot express things with variables.
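A toy matcher makes this concrete: one pattern with variables covers many dissimilar ground events. The '?'-prefix convention for variables is an assumption for this sketch, and the matcher is not a full unifier (it does not check that a repeated variable binds consistently):

```python
def matches(pattern, fact):
    """Bind '?'-variables in a pattern against a ground fact.
    Returns the variable bindings, or None if the fact does not match."""
    if pattern[0] != fact[0] or len(pattern) != len(fact):
        return None
    binding = {}
    for p, f in zip(pattern[1:], fact[1:]):
        if p.startswith("?"):
            binding[p] = f       # note: no consistency check for repeats
        elif p != f:
            return None
    return binding

events = [("Kicks", "boy", "ball"), ("Kicks", "girl", "dog"),
          ("Kicks", "john", "mary")]
for e in events:
    print(matches(("Kicks", "?x", "?y"), e))
```

The single pattern Kicks(?x, ?y) binds to all three events, which is exactly the succinctness that fixed-input networks lack.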

You may argue that the brain is a neural network, so neural networks must be capable of achieving vision and cognition. My guess is that the brain probably uses neurons to perform some sort of symbolic/logical computation, rather than the statistical learning that many neuroscientists now assume. Unfortunately, how the cortex performs computation remains terra incognita.


What's the use of inductive learning?

For a very simple vision system, inductive learning is not needed. All we need is to patiently enter definitions of shapes / objects into a logical knowledgebase. Then the system can recognize those things.

Inductive learning is needed for 2 purposes:

  1. teaching by "show and tell";
  2. automatically discovering useful concepts.

"Show and tell"

It is far easier to show the computer a number of examples of an object (eg "pineapple") than to explicitly define the general appearance of that object. The inductive learner will automatically generalize from the exemplars.
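One classical way an inductive learner generalizes from exemplars is Plotkin-style least general generalization (lgg): constants shared by the examples are kept, and each differing pair of constants is replaced by a variable. Below is a minimal sketch over single atoms; the attribute encoding of the exemplars is invented for illustration:

```python
def lgg(atom1, atom2):
    """Least general generalization of two atoms with the same predicate:
    keep shared constants, replace each differing pair with one variable."""
    assert atom1[0] == atom2[0] and len(atom1) == len(atom2)
    seen, out = {}, [atom1[0]]
    for a, b in zip(atom1[1:], atom2[1:]):
        if a == b:
            out.append(a)
        else:
            out.append(seen.setdefault((a, b), f"X{len(seen) + 1}"))
    return tuple(out)

# Two "pineapple" exemplars generalize into one pattern with a variable:
print(lgg(("Fruit", "pineapple1", "spiky"), ("Fruit", "pineapple2", "spiky")))
# -> ('Fruit', 'X1', 'spiky')
```

After a few exemplars, the shared structure (here, "spiky") survives while the incidental identities of the individual pineapples are abstracted away.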

Automatic discovery of useful concepts

This is an advanced topic, but of great practical value. A "useful" concept is one that meaningfully characterizes a broad class of objects. For example, the concept "leaf" may characterize the leaves often attached to many types of fruit, and may also apply to flowers and trees, so it is a useful concept.

Another useful concept is "parallelogram" which often occurs in the views of rectangular blocks. If we do not use the concept of parallelogram then the description of rectangular blocks may become cumbersome.

In other words, a useful concept reduces description lengths. But if we are minimizing the total description length of a knowledgebase, then the discovery of "good" concepts requires global search. This is therefore a computationally hard problem.
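A back-of-the-envelope illustration of the trade-off (all literal counts below are invented): naming a concept costs a one-time definition, but each subsequent use then costs one literal instead of spelling the shape out in place.

```python
# Invented literal counts, purely for illustration of the MDL trade-off.
INLINE_COST = 8       # literals to spell out a parallelogram in place
NAMED_USE_COST = 1    # one literal per use of the named concept
DEFINITION_COST = 8   # one-time cost of defining "Parallelogram"

def total_length(uses, inline):
    """Total description length of a knowledgebase mentioning
    `uses` parallelograms, with or without the named concept."""
    if inline:
        return uses * INLINE_COST
    return DEFINITION_COST + uses * NAMED_USE_COST

for uses in (1, 3, 10):
    print(uses, total_length(uses, True), total_length(uses, False))
```

With a single use the named concept does not pay off (8 vs. 9 literals), but at ten uses it clearly does (80 vs. 18) -- which is why "parallelogram" earns its place in a knowledgebase that describes many rectangular blocks.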


Inductive learning methods

We are currently looking for researchers in ILP to collaborate with us to develop the Inductive Learner...


Outstanding questions

Do we need some special inference operations outside of predicate logic, such as stochastic logic?

When doing higher-level recognition, does the Recognizer need to be aware of the constituents of high-level features, e.g. that a line is a collection of edgels?



22/Jun, 17/Jun/2006 (C) General intelligence corporation limited