INTRODUCTION TO ARTIFICIAL INTELLIGENCE AND EXPERT SYSTEMS
Copyright 1993, 1994, 1995 by Carol E. Brown and Daniel E. O'Leary
All rights reserved. Reproduced by special permission.
Artificial intelligence can be viewed from a variety of perspectives.
- From the perspective of intelligence, artificial intelligence is making
machines "intelligent" -- acting as we would expect people to act.
- The Turing test asks whether an observer can distinguish computer responses
from human responses; a machine passes when the observer cannot tell them
apart.
- Intelligence requires knowledge
- Expert problem solving - restricting the domain makes it possible to
include significant relevant knowledge
- From a research perspective, "artificial intelligence is the study of how
to make computers do things which, at the moment, people do better" [Rich and
Knight, 1991, p.3].
- From a business perspective, AI is a set of very powerful tools, together
with methodologies for using those tools to solve business problems.
- From a programming perspective, AI includes the study of symbolic
programming, problem solving, and search.
- Typically AI programs focus on symbols rather than numeric processing.
- Problem solving - AI programs work toward achieving goals.
- Search - a solution can seldom be accessed directly, so the program must
search for it. Search may include a variety of techniques.
- AI programming languages include:
- LISP, developed in the 1950s, is the early programming language
strongly associated with AI. LISP is a functional programming language
with procedural extensions. LISP (LISt Processor) was specifically
designed for processing heterogeneous lists -- typically a list of
symbols. Features of LISP that made it attractive to AI researchers
included run-time type checking, higher order functions (functions that
have other functions as parameters), automatic memory management (garbage
collection) and an interactive environment.
- The second language strongly associated with AI is PROLOG. PROLOG was
developed in the 1970s. PROLOG is based on first order logic. PROLOG is
declarative in nature and has facilities for explicitly limiting the
search space.
- Object-oriented languages are a class of languages more recently used
for AI programming. Important features of object-oriented languages
include:
- concepts of objects and messages
- objects bundle data and methods for manipulating the data
- sender specifies what is to be done; receiver decides how to do it
- inheritance (object hierarchy where objects inherit the attributes
of the more general class of objects)
Examples of object-oriented languages are Smalltalk, Objective C, and C++.
Object-oriented extensions to LISP (CLOS - Common LISP Object System) and
PROLOG (L&O - Logic & Objects) are also used.
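The object-oriented features listed above can be illustrated with a short
Python sketch. Python is used here only for illustration; it is not one of the
languages named in these notes, and the class names and values (Account,
CurrentAsset, "Land", "Cash") are hypothetical.

```python
class Account:
    """General class: bundles data (name, balance) with methods."""

    def __init__(self, name, balance):
        self.name = name
        self.balance = balance

    def category(self):
        return "Asset"

    def describe(self):
        # The receiver decides how to answer the "describe" message.
        return f"{self.name}: {self.balance} ({self.category()})"


class CurrentAsset(Account):
    """More specific class: inherits the data and describe() from Account."""

    def category(self):
        # Overrides the inherited answer with a more specific one.
        return "Current Asset"


accounts = [Account("Land", 5000), CurrentAsset("Cash", 1200)]
# The sender only says what it wants done; each object decides how.
descriptions = [account.describe() for account in accounts]
print(descriptions)
```

The same message (describe) is sent to both objects. CurrentAsset inherits
everything from Account except the one method it overrides.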
Definitions of expert systems vary. Some definitions are based on function.
Some definitions are based on structure. Some definitions have both functional
and structural components. Many early definitions assume rule-based reasoning.
What the system does (rather than how)
"... a computer program that behaves like a human expert in some useful
ways." [Winston & Prendergast, 1984, p.6]
- Problem area
- "... solve problems efficiently and effectively in a narrow problem
area." [Waterman, 1986, p.xvii]
- "... typically, pertains to problems that can be symbolically
represented" [Liebowitz, 1988, p.3]
- Problem difficulty
- "... apply expert knowledge to difficult real world problems" [Waterman,
1986, p.18]
- "... solve problems that are difficult enough to require significant
human expertise for their solution" [Edward Feigenbaum in Harmon & King,
1985, p.5]
- "... address problems normally thought to require human specialists for
their solution" [Michaelsen et al, 1985, p. 303].
- Performance requirement
- "the ability to perform at the level of an expert ..." [Liebowitz, 1988,
p.3]
- "... programs that mimic the advice-giving capabilities of human
experts." [Brule, 1986, p.6]
- "... matches a competent level of human expertise in a particular
field." [Bishop, 1986, p.38]
- "... can offer intelligent advice or make an intelligent decision about
a processing function." [British Computer Society's Specialist Group in
Forsyth, 1984, pp.9-10]
- "... allows a user to access this expertise in a way similar to that in
which he might consult a human expert, with a similar result." [Edwards and
Connell, 1989, p.3]
- Explain reasoning
- "... the capability of the system, on demand, to justify its own line of
reasoning in a manner directly intelligible to the enquirer." [British
Computer Society's Specialist Group in Forsyth, 1984, pp.9-10]
- "incorporation of explanation processes ..." [Liebowitz, 1988, p.3]
How the system functions
- Use AI techniques
- "... using the programming techniques of artificial intelligence,
especially those techniques developed for problem solving" [Dictionary of
Computing, 1986, p.140]
- Knowledge component
- "... the embodiment within a computer of a knowledge-based component,
from an expert skill ..." [British Computer Society's Specialist Group in
Forsyth, 1984, pp.9-10]
- "a computer based system in which representations of expertise are
stored ..." [Edwards and Connell, 1989, p.3]
- "The knowledge of an expert system consists of facts and heuristics. The
'facts' constitute a body of information that is widely shared, publicly
available, and generally agreed upon by experts in the field." [Edward
Feigenbaum in Harmon & King, 1985, p.5]
- "Expert systems are sophisticated computer programs that manipulate
knowledge to solve problems" [Waterman, 1986, p.xvii]
- Separate knowledge and control
- "... make domain knowledge explicit and separate from the rest of the
system" [Waterman, 1986, p.18].
- Use inference procedures - heuristics - uncertainty
- "... an intelligent computer program that uses knowledge and inference
procedures" [Edward Feigenbaum in Harmon & King, 1985, p.5]
- "The style adopted to attain these characteristics is rule-based
programming." [British Computer Society's Specialist Group in Forsyth, 1984,
pp.9-10]
- "Exhibit intelligent behavior by skillful application of heuristics."
[Waterman, 1986, p.18].
- "The 'heuristics' are mostly private, little rules of good judgment
(rules of plausible reasoning, rules of good guessing) that characterize
expert-level decision making in the field." [Edward Feigenbaum in Harmon
& King, 1985, p.5]
- "incorporation of ... ways of handling uncertainty ..." [Liebowitz, 1988,
p.3]
- Model human expert
- "... can be thought of as a model of the expertise of the best
practitioners of the field." [Edward Feigenbaum in Harmon & King, 1985,
p.5]
- "... representation of domain-specific knowledge in the manner in which
the expert thinks" [Liebowitz, 1988, p.3]
- "... involving the use of appropriate information acquired previously
from human experts." [Dictionary of Computing, 1986, p.140]
- They create categories
- Cash is a Current Asset
- A Current Asset is an Asset
- They use specific rules, a priori rules
- E.g., tax law . . . so much for each deduction
- Rules can be cascaded
- "If A then B" . . .
- "If B then C"
- A--->B--->C
- They Use Heuristics --- "rules of thumb"
- Heuristics can be captured using rules
- "If the meal includes red meat then choose red wine"
- Heuristics represent conventional wisdom
- They use past experience --- "cases"
- Particularly evident in precedent-based reasoning, e.g. law or choice of
accounting principles
- Similarity of current case to previous cases provides basis for action
choice
- Store cases using key attributes, e.g. cars may be characterized by year
of car, make of car, speed of car, etc.
- What makes good argumentation also makes good reasoning
- They use "Expectations"
- "You are not yourself today" - if we differ from expectations then it is
recognized
- "Patterns of behavior"
Computer models are based on our models of human reasoning
- Frames
- frame attributes called "slots"
- each frame is a node in one or more "isa" hierarchies
- They use rules A--->B--->C
- Auditing, tax . . .
- Set of rules is called knowledge base or rule base
- They use cases
- Tax reasoning and tax cases
- Set of cases is called a case base
- They use pattern recognition/expectations
- Credit card system
- Database security system
- a network of nodes and relations
- in some ways very similar to a traditional database and in other ways very
different
- attributes called "slots"
- value can be stated explicitly
- a method for determining the value rather than the value itself
- each frame is a node in one or more "isa" hierarchies
- higher levels general concepts - lower levels specific
- unspecified value can be inherited from the more general node
- concept: prototypical representation with defaults that may be
overridden
- Example: to describe a thing growing in my back yard: an elm is a
deciduous tree, a deciduous tree is a tree, a tree is a plant, a plant is a
living organism.
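The frame ideas above (slots, "isa" hierarchies, inherited defaults that can be
overridden, and slots that hold a method rather than a value) can be sketched
in Python using the elm example. This is a minimal sketch, not a standard
frame system: the Frame class, the slot names and the slot values are all
assumptions made for illustration.

```python
class Frame:
    """A node in an "isa" hierarchy with named slots."""

    def __init__(self, name, isa=None, **slots):
        self.name = name
        self.isa = isa        # the more general frame, or None at the top
        self.slots = slots

    def get(self, slot):
        # Look locally first, then inherit from more general frames.
        frame = self
        while frame is not None:
            if slot in frame.slots:
                value = frame.slots[slot]
                # A slot may hold a method for determining the value.
                return value(self) if callable(value) else value
            frame = frame.isa
        raise KeyError(slot)

    def ancestors(self):
        chain, frame = [], self.isa
        while frame is not None:
            chain.append(frame.name)
            frame = frame.isa
        return chain


# Slot values below are illustrative assumptions, not from the notes.
living = Frame("living organism", alive=True)
plant = Frame("plant", isa=living,
              description=lambda f: f"{f.name} is a kind of {f.isa.name}")
tree = Frame("tree", isa=plant, sheds_leaves_annually=False)  # default
deciduous = Frame("deciduous tree", isa=tree,
                  sheds_leaves_annually=True)                 # override
elm = Frame("elm", isa=deciduous)

print(elm.ancestors())
print(elm.get("alive"), elm.get("sheds_leaves_annually"))
print(elm.get("description"))
```

The elm frame stores no slots of its own. Every value it reports is inherited
from a more general frame, and the default set at the tree level is overridden
at the deciduous tree level.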
Currently, the most common form of expert system
Structure of a Rule-based Expert System
- User Interface
- Friendly
- Maybe "Intelligent"
- Knowledge of how to present information
- Knowledge of user preferences...possibly accumulate with use
- Databases
- Contains some of the data of interest to the system
- May be connected to on-line company or public database
- Human user may be considered a database
- Inference Engine
- general problem-solving knowledge or methods
- interpreter analyzes and processes the rules
- scheduler determines which rule to look at next
- the search portion of a rule-based system
- takes advantage of heuristic information
- otherwise, the time to solve a problem could become prohibitively long
- this problem is called the combinatorial explosion
- expert-system shell provides customizable inference engine
- Knowledge Base (rule base)
- contains much of the problem solving knowledge
- Rules are of the form IF condition THEN action
- condition portion of the rule is usually a fact - (If some particular
fact is in the database then perform this action)
- action portion of the rule can include
- actions that affect the outside world (print a message on the
terminal)
- test another rule (check rule no. 58 next)
- add a new fact to the database (If it is raining then roads are
wet).
- Rules can be specific, a priori rules (e.g., tax law . . . so much for
each exemption) - represent laws and codified rules
- Rules can be heuristics (e.g. If the meal includes red meat then choose
red wine). "rules of thumb" - represent conventional wisdom.
- Rules can be chained together (e.g. "If A then B" and "If B then C" give
A--->B--->C, so "If A then C").
(If it is raining then roads are wet. If roads are wet then roads are
slick.)
- Certainty factors represent the confidence one has that a fact is true
or a rule is valid
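The way an inference engine chains IF-THEN rules over a database of facts can
be sketched as below. The strategy shown is usually called forward chaining,
a term these notes do not use. It is a minimal sketch that assumes each rule
has a single condition and a single conclusion, and it leaves out scheduling,
heuristics and certainty factors. The function name forward_chain is
hypothetical.

```python
def forward_chain(facts, rules):
    """Fire rules until no rule adds a new fact; return all known facts."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for condition, conclusion in rules:
            # IF the condition is a known fact THEN add the conclusion.
            if condition in known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Rule base: each rule is a (condition, conclusion) pair.
rules = [
    ("roads are wet", "roads are slick"),
    ("it is raining", "roads are wet"),
]

derived = forward_chain({"it is raining"}, rules)
print(sorted(derived))
```

The rules are deliberately listed out of order. The engine, not the rule
author, works out that "it is raining" leads to "roads are slick".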
Knowledge engineering - the discipline of building expert systems
The Role of the Knowledge Engineer
- Knowledge acquisition
- the process of acquiring the knowledge from human experts or other
sources (e.g. books, manuals)
- can involve developing knowledge to solve the problem
- knowledge elicitation
- coaxing information out of human experts
- Knowledge representation
- Method used to encode the knowledge for use by the expert system
- Common knowledge representation methods include rules, frames, and
cases.
- Putting the knowledge into rules or cases or patterns is the knowledge
representation process
The Case-based Reasoning Process
- Uses past experiences
- Based on the premise that human beings use analogical reasoning or
experiential reasoning to learn and solve complex problems
- Particularly evident in precedent-based reasoning (e.g. tax law or choice
of accounting principles)
- Useful when little evidence is available or information is incomplete
- Cases consist of
- information about the situation
- the solution
- the results of using that solution
- key attributes that can be used for quickly searching for similar
patterns of attributes
- Elements in a case-based reasoning system
- the case base - set of cases
- the index library - used to efficiently search and quickly retrieve
cases that are most appropriate or similar to the current problem
- similarity metrics - used to measure how similar the current problem is
to the past cases selected by searching the index library
- the adaptation module - creates a solution for the current problem by
either modifying the past solution (structural adaptation) or creating a new
solution using the same process as was used in the similar past case
(derivational adaptation).
- Learning
- If no reasonably appropriate prior case is found then the current case
and its human created solution can be added to the case base thus allowing
the system to learn.
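The retrieve-and-learn cycle described above can be sketched as follows. This
is a minimal sketch under several assumptions: the similarity metric is a
simple fraction of matching attribute values, the case base is searched
linearly with no index library, and a retrieved solution is reused unchanged
with no adaptation module. The car repair cases, the 0.5 threshold and all
names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Case:
    attributes: dict            # key attributes describing the situation
    solution: str
    outcome: str = "unknown"    # result of using that solution


def similarity(a, b):
    """Fraction of attributes (over all keys seen) with equal values."""
    keys = set(a) | set(b)
    if not keys:
        return 0.0
    return sum(1 for k in keys if a.get(k) == b.get(k)) / len(keys)


def retrieve(case_base, problem):
    """Return (most similar case, score), or (None, 0.0) if no cases."""
    best, best_score = None, 0.0
    for case in case_base:
        score = similarity(case.attributes, problem)
        if best is None or score > best_score:
            best, best_score = case, score
    return best, best_score


def solve(case_base, problem, ask_human, threshold=0.5):
    best, score = retrieve(case_base, problem)
    if best is not None and score >= threshold:
        return best.solution
    # No reasonably appropriate prior case: store the human's solution.
    solution = ask_human(problem)
    case_base.append(Case(dict(problem), solution))
    return solution


case_base = [
    Case({"make": "Ford", "year": 1990, "symptom": "will not start"},
         "replace battery", "fixed"),
    Case({"make": "Honda", "year": 1992, "symptom": "overheats"},
         "replace thermostat", "fixed"),
]

known = solve(case_base,
              {"make": "Ford", "year": 1991, "symptom": "will not start"},
              ask_human=lambda p: "ask a mechanic")
novel = solve(case_base,
              {"make": "Saab", "year": 1988, "symptom": "brakes squeal"},
              ask_human=lambda p: "replace brake pads")
print(known, "|", novel, "|", len(case_base))
```

The first problem matches a stored case on two of three attributes and reuses
its solution. The second matches nothing, so the human-supplied solution is
added to the case base.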
(artificial neural networks and connectionist models)
- Based on pattern recognition - used for credit assessment and fraud
detection
- A set of interconnected relatively simple mathematical processing elements
- Looks for patterns in a set of examples and learns from those examples by
adjusting the weights of the connections to produce output patterns
- Input to output pattern associations are used to classify a new set of
examples
- Able to recognize patterns even when the data is noisy, ambiguous,
distorted, or has a lot of variation
- Neural network construction and training
- the architecture used (e.g. feed-forward)
- how the neurons are organized (e.g. an input layer with five neurons,
two hidden layers with three neurons each, and an output layer with two
neurons.)
- the state function used (e.g. summation function)
- the transfer functions used (e.g. sigmoid squashing function)
- the training algorithm used (e.g. back-propagation)
- Architecture
- How the processing elements are connected
- Commonly used architectures:
- Layers (also called levels, fields or slabs)
- Organized into a series of layers
- input layer
- one or more hidden layers
- output layer
- Some consider the number of layers to be part of architecture
- Others consider the number of layers and nodes per layer to be
attributes of the network rather than part of the architecture
- Neurons - the processing elements
The vocabulary in this area is not completely consistent; different authors
tend to use one of a small set of terms for a particular concept.
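The example organization mentioned above (an input layer with five neurons, two
hidden layers with three neurons each, and an output layer with two neurons)
can be sketched as a feed-forward pass in Python. This is a minimal sketch: it
assumes a summation state function, a sigmoid transfer function, a bias input
fixed at 1 for every neuron, and random untrained weights. The function names
and input values are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_inputs, n_neurons, rng):
    # One weight per input plus one bias weight (last item) per neuron.
    return [[rng.uniform(-1.0, 1.0) for _ in range(n_inputs + 1)]
            for _ in range(n_neurons)]


def make_network(layer_sizes, seed=0):
    rng = random.Random(seed)
    # The input layer only passes values on, so it has no weights.
    return [make_layer(n_in, n_out, rng)
            for n_in, n_out in zip(layer_sizes, layer_sizes[1:])]


def feed_forward(network, inputs):
    activations = list(inputs)
    for layer in network:
        extended = activations + [1.0]   # bias input fixed at 1
        activations = [
            sigmoid(sum(w * x for w, x in zip(neuron, extended)))
            for neuron in layer
        ]
    return activations


network = make_network([5, 3, 3, 2])
outputs = feed_forward(network, [0.2, 0.4, 0.6, 0.8, 1.0])
print(outputs)
```

Because the weights are random, the two outputs carry no meaning yet. Training
is what adjusts the weights so that outputs match known targets.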
Structure of a Neuron
- consists of
- a set of weighted input connections
- a bias input
- a state function
- a nonlinear transfer function
- an output
- Input connections have an input value that is either received from the
previous neuron or in the case of the input layer from the outside
- Bias is not connected to the other neurons in the network and is assumed
to have an input value of 1 for the summation function
- Weights
- A real number representing the strength or importance of an input
connection to a neuron
- Each neuron input, including the bias, has an associated weight
- State function
- The most common form is a simple summation function
- The output of the state function becomes the input for the transfer
function
- Transfer function
- A nonlinear mathematical function used to convert data to a specific
scale
- Two basic types of transfer functions: continuous and discrete
- Commonly used continuous functions are Ramp, Sigmoid, Arc Tangent and
Hyperbolic Tangent
- Continuous functions sometimes called squashing functions
- Commonly used discrete functions are Step and Threshold
- Discrete transfer function sometimes called activation function
- Training
- The process of using examples to develop a neural network that
associates the input pattern with the correct answer
- A set of examples (training set) with known outputs (targets) is
repeatedly fed into the network to "train" the network
- This training process continues until the difference between the
network's outputs and the target outputs for the training set reaches an
acceptable value
- Several algorithms are used for training networks; the most common is
back-propagation
- Back-propagation is done in two passes
- First the inputs are sent forward through the network to produce an
output
- Then the difference between the actual and desired outputs produces
error signals that are sent "backwards" through the network to modify the
weights of the inputs.
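The neuron structure and the two training passes described above can be
sketched for a single neuron. With no hidden layer this reduces to what is
often called the delta rule, the simplest case of the back-propagation weight
update. A full back-propagation implementation would also pass error signals
back through the hidden layers. The learning rate, the number of epochs, the
OR training set and all names are assumptions made for illustration.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def forward(weights, inputs):
    # State function: weighted sum of the inputs plus the bias weight
    # (weights[-1]) times the bias input, which is fixed at 1.
    state = sum(w * x for w, x in zip(weights, inputs)) + weights[-1] * 1.0
    # Transfer function: sigmoid squashes the state into (0, 1).
    return sigmoid(state)


def train(examples, n_inputs, rate=0.5, epochs=5000):
    weights = [0.0] * (n_inputs + 1)
    for _ in range(epochs):
        for inputs, target in examples:
            output = forward(weights, inputs)        # forward pass
            error = target - output                  # desired minus actual
            # Error signal scaled by the slope of the sigmoid.
            delta = error * output * (1.0 - output)
            for i, x in enumerate(inputs):           # backward pass
                weights[i] += rate * delta * x
            weights[-1] += rate * delta              # bias input is 1
    return weights


def total_error(weights, examples):
    return sum((t - forward(weights, x)) ** 2 for x, t in examples)


# Training set: inputs with known outputs (targets) for logical OR.
examples = [([0, 0], 0), ([0, 1], 1), ([1, 0], 1), ([1, 1], 1)]
weights = train(examples, n_inputs=2)
predictions = [round(forward(weights, x)) for x, _ in examples]
print(predictions, total_error(weights, examples))
```

Training stops here after a fixed number of passes over the training set. A
more careful version would stop once the total error reaches an acceptable
value.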
V-1. Advantages of Expert Systems
- Permanence - Expert systems do not forget, but human experts may
- Reproducibility - Many copies of an expert system can be made, but
training new human experts is time-consuming and expensive
- If there is a maze of rules (e.g. tax and auditing), then the expert
system can "unravel" the maze
- Efficiency - can increase throughput and decrease personnel costs
- Although expert systems are expensive to build and maintain, they are
inexpensive to operate
- Development and maintenance costs can be spread over many users
- The overall cost can be quite reasonable when compared to expensive and
scarce human experts
- Cost savings:
- Wages (elimination of a room full of clerks)
- Other costs (minimize loan loss)
- Consistency - With expert systems, similar transactions are handled in
the same way. The system will make comparable recommendations for like
situations.
Humans are influenced by
- recency effects (most recent information having a disproportionate
impact on judgment)
- primacy effects (early information dominates the judgment).
- Documentation - An expert system can provide permanent
documentation of the decision process
- Completeness - An expert system can review all the transactions; a
human expert can only review a sample
- Timeliness - Fraud and/or errors can be prevented. Information is
available sooner for decision making
- Breadth - The knowledge of multiple human experts can be combined
to give a system more breadth than a single person is likely to achieve
- Reduce risk of doing business
- Consistency of decision making
- Documentation
- Achieve Expertise
- Entry barriers - Expert systems can help a firm create entry
barriers for potential competitors
- Differentiation - In some cases, an expert system can differentiate
a product or can be related to the focus of the firm (XCON)
- Computer programs work best in situations where a structure already
exists or can be elicited
- Common sense - In addition to a great deal of technical knowledge,
human experts have common sense. It is not yet known how to give expert
systems common sense.
- Creativity - Human experts can respond creatively to unusual
situations; expert systems cannot.
- Learning - Human experts automatically adapt to changing
environments; expert systems must be explicitly updated. Case-based reasoning
and neural networks are methods that can incorporate learning.
- Sensory Experience - Human experts have available to them a wide
range of sensory experience; expert systems are currently dependent on
symbolic input.
- Degradation - Expert systems are not good at recognizing when no
answer exists or when the problem is outside their area of expertise.
- Expert Systems in Finance
Daniel E. O'Leary and Paul R. Watkins (Editors), 1992,
Amsterdam: North-Holland
- Expert Systems in Business and Finance: Issues and Applications
Paul R. Watkins and Lance B. Elliot (Editors), 1993, Chichester:
John Wiley and Sons.
- Artificial Intelligence in Accounting and Auditing: Using Expert
Systems, Volume II
Miklos A. Vasarhelyi and B.N. Srinidhi (Editors), 1993,
Princeton, NJ: Markus Wiener Publishing.
- Anonymous. Dictionary of Computing, 1986, New York: Oxford
University Press.
- Bishop, Peter. Fifth Generation Computers Concepts, Implementations
& Uses, 1986, Chichester, England: Ellis Horwood Ltd.
- Brule, James F. Artificial Intelligence: Theory, Logic and
Application, 1986, Blue Ridge Summit, PA: TAB Books.
- Edwards, Alex and Connell, N.A.D. Expert Systems in Accounting,
1989, Hertfordshire, UK: Prentice Hall International (UK) Ltd.
- Forsyth, Richard, Expert Systems: Principles and Case Studies,
1984, London: Chapman and Hall Computing.
- Harmon, Paul and King, David. Expert Systems: Artificial Intelligence
in Business. 1985, New York: Wiley.
- Liebowitz, Jay, Introduction to Expert Systems, 1988, Santa Cruz,
CA: Mitchell Publishing, Inc.
- Michaelsen, Robert H.; Michie, Donald and Boulanger, Albert. "The
Technology of Expert Systems" Byte; April 1985; 10(4): pp. 303-312.
- Rich, Elaine and Knight, Kevin. Artificial Intelligence Second
Edition. 1991, New York: McGraw-Hill.
- Waterman, Donald A. A Guide to Expert Systems, 1986, Reading, MA:
Addison-Wesley.
- Winston, Patrick H. and Prendergast, Karen A. (Editors). The AI
Business: Commercial Use of Artificial Intelligence, 1984, Cambridge, MA:
The MIT Press.
Copyright 1993, 1994, 1995 by Carol E. Brown, Oregon State University
(brownc@bus.orst.edu) and Dan E. O'Leary, University of Southern California
(Oleary@rcf.usc.edu)
All rights reserved. Reproduced by special permission.
Page last updated December 17, 1994