Lecture by Professor Hubert Dreyfus Presented at UH 1/27/98
Why Symbolic AI Failed: The Commonsense Knowledge Problem
I. Turing predicted that there would be a machine that would behave
intelligently enough to be indistinguishable from a human by the year 2000.
Given the state of things today, it is highly unlikely that the prediction
will be met.
A. Artificial Intelligence (AI) in the 50s and 60s suggested that the promise was not unrealistic. Newell and Simon (working at RAND) showed with concrete programs that computers can do more than calculate with numbers; they could represent things with symbols, and programs that operate on these symbols could display aspects of intelligence. This symbolic information-processing model of the mind produced programs that solved puzzles, played games, and proved theorems in logic and mathematics.
B. In the late 60s and early 70s the really interesting work shifted to
MIT. One notable program, SHRDLU (by Terry Winograd) was able to take commands
and answer questions concerning a "micro-world" of blocks, pyramids,
and other simple objects presented on a TV screen. At that time there was
optimism that human intelligence could be duplicated by building larger
and larger worlds. Minsky claimed AI research would solve the problem in
a generation.
II. But problems began to surface...
A. See for example, Charniak's problems with programs for language processing.
He was interested in developing a program that could respond as well as
4-year-olds do to questions posed about a story presented to them. Consider
the following story:
Today was Jack's birthday. Penny and Janet went to the store. They were going to get presents. Janet decided to get a kite. "Don't do that," said Penny. "Jack has a kite. He will make you take it back."
1. Note the things about the story that are not explicit but which we know nonetheless. The presents were for Jack. The kite was a present. Etc. An intelligent story understander should figure this out. These problems could be partly resolved by storing information in the computer in a data structure called a birthday party frame. The frame incorporates information such as: at birthday parties people give presents to the person being honored; people generally buy presents, typically at stores; and so on.
2. This helps solve part of the problem, but what about the word "it" in the last sentence? Grammatically it should refer back to the last-mentioned kite, namely the kite Jack already has. But we know that this kite is not the one Jack will make Janet take back; it is the new kite that goes back if he already has one. Any 4-year-old will get this right. But how is the language-understanding computer going to know?
3. Perhaps we can begin by adding the information that people do not want to receive more than one thing of the same kind. But what frame does this go into? It doesn't seem to be about birthday parties, household objects, or rules about gifts. Even worse, the rule at issue isn't even true: it is false in the case of dollar bills, marbles, cookies, etc. And even these exceptions have exceptions, for example someone with too many marbles or a giant cookie, and those exceptions have exceptions in turn, as in the case of a cookie monster.
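The frame idea above can be sketched as a simple data structure. Note that the slot names and defaults below are invented for illustration; they are not Charniak's or Minsky's actual representation, only a minimal sketch of how a frame fills in facts the story leaves unstated.

```python
# A minimal sketch of a frame as a data structure with default slots.
# Slot names and defaults are illustrative assumptions, not the actual
# representation used in the AI programs discussed above.

birthday_party_frame = {
    "honoree": None,          # the person being celebrated
    "guests": [],
    "defaults": {
        "guests_bring_presents": True,
        "presents_are_for": "honoree",
        "presents_bought_at": "store",
    },
}

def infer_present_recipient(frame):
    """Fill in an unstated fact from the frame's defaults."""
    if frame["defaults"]["presents_are_for"] == "honoree":
        return frame["honoree"]
    return None

frame = dict(birthday_party_frame)
frame["honoree"] = "Jack"
frame["guests"] = ["Penny", "Janet"]
print(infer_present_recipient(frame))  # Jack
```

The frame lets the program "know" the presents are for Jack even though the story never says so, which is exactly the first half of Charniak's problem; the pronoun-resolution half resists this treatment.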
B. Minsky had thought that what we needed to program in common sense was
about 100,000 facts. A generation later he is of the opinion that the AI
research problem is the hardest science has ever undertaken.
III. The reaction of the AI community to problems of these kinds
is to investigate micro-worlds, i.e. areas where the solution does not
depend on common sense knowledge.
A. Feigenbaum is famous for proposing that we abandon the project of capturing general intelligence and instead focus on knowledge engineering, that is, the development of systems that act like experts in narrow domains, or expert systems.
B. Expert systems for diagnosing blood diseases (MYCIN) and for spectrographic analysis (DENDRAL) were constructed and proved successful, but only because they deal with circumscribed areas where common sense knowledge is not required.
C. The picture behind this work is that to provide intelligence in an area, we need to extract rules from the experts. The problem is that experts are rarely aware of how they function.
D. The difficulty is that expertise may amount to a massive number of special cases, so the process is never ending.
E. For example, Samuel wrote a program to play checkers. People say it played championship checkers. This is not really true: after 35 years of work on formulating good rules, Samuel now says that the project is essentially stalled. The program can give a good amateur a workout, but does not do as well as an expert.
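The never-ending proliferation of special cases can be made concrete with the earlier gift rule. The sketch below is illustrative only (it is not MYCIN's or any real expert system's rule syntax): each extracted rule soon needs exception clauses, and the exceptions need exceptions of their own.

```python
# Illustrative sketch (not an actual expert-system rule language):
# every rule extracted from an "expert" accumulates exception clauses.

def wants_duplicate_gift(item, context):
    # Base rule: people do not want two of the same thing.
    wanted = False
    # Exception: some items are welcome in quantity.
    if item in ("dollar bill", "marble", "cookie"):
        wanted = True
    # Exception to the exception: already has too many, or item is huge.
    if context.get("has_too_many") or context.get("giant"):
        wanted = False
    # ...and even that has an exception: a cookie monster wants any cookie.
    if context.get("recipient") == "cookie monster":
        wanted = True
    return wanted

print(wants_duplicate_gift("kite", {}))                 # False
print(wants_duplicate_gift("cookie", {}))               # True
print(wants_duplicate_gift("cookie", {"giant": True}))  # False
```

The point of the sketch is Dreyfus's point: no amount of patching terminates, because each patch presupposes further common sense about when it applies.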
IV. The history of philosophy has explored exactly the same issues
and presuppositions.
A. Socrates thinks that to understand (say) piety you need to know the rules for distinguishing the pious from the non-pious. (See the Euthyphro.) But the participants in the dialogues are unable to do so. As we would put it today, we are unable to give the necessary and sufficient conditions for piety.
B. Plato's reaction to this was to believe that though we do not know the rules, they are part of the unconscious equipment provided by the soul - ideas we forgot when the soul became embodied. The role of philosophy is to recover these lost rules.
C. AI researchers are attempting to do the same. To find the rules by learning them or guessing at them in order to duplicate human intelligence in rule-governed machines.
V. We need a new look. It may seem that beginners begin with cases and work up to general principles. But couldn't it be the other way around? Perhaps we start with crude rules and then learn to apply them intelligently to cases. Let us consider the examples of chess and driving. Here is a sequence of stages revealed by actually looking at expertise in the real world.
1. Novice: Objective Features; Strict Rules
The novice learns some basic rules that mention simply observable features:
Car: Shift when speedometer reads 10.
Chess: Exchange pieces whenever the numerical value of the trade is in your favor.
2. Advanced Beginner: Situational Aspects; Maxims
The more advanced learner appreciates features (like engine sounds of strain or over-revving) that are harder to distinguish. He applies rules that are not hard and fast, but which have exceptions depending on the nature of the situation.
Car: Shift up if the engine is straining
Chess: Avoid a cramped pawn structure
3. Competent: Relevance; Salience; Planning; Goal Setting
The competent person invents goals and plans to achieve them, rather than simply following rules or maxims. There is an appreciation of what is relevant in a given situation or context. Here the person displays an involvement with the task, for he or she has taken responsibility for success/disaster, so that concepts of risk, hope, opportunity and threat are in play.
Car: On an off ramp, road surface is relevant, passenger comfort is not
Chess: In a king side attack, loss of material on the other side of the board is irrelevant.
4. Proficiency
The person immediately appreciates the goal and an appropriate plan; intuitive behavior replaces explicit rule following. The goal is not calculated, it is obvious. However, a portion of the process, deciding exactly what to do, is still reasoned out.
Car: You know by the seat of your pants when you are going too fast on an on-ramp.
Chess: Now is the time to attack the king side.
5. Expertise
The true expert just does it without problem solving and without reasoning. There is a "flow" in the experience, and consciousness retreats. The grand master in chess, for example, simply recognizes thousands of basic patterns and can play good chess even when reasoning is blocked by doing a parallel mathematical task.
Chess: I only consider one move, the right one.
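The Novice-level chess rule from stage 1 ("exchange pieces whenever the numerical value of the trade is in your favor") is exactly the kind of rule a program can follow. A minimal sketch, assuming the conventional piece values:

```python
# A minimal sketch of the stage-1 strict rule: trade whenever the
# numerical value of the exchange is in your favor. The piece values
# are the conventional ones; the rule knows nothing about position.

PIECE_VALUES = {"pawn": 1, "knight": 3, "bishop": 3, "rook": 5, "queen": 9}

def should_exchange(piece_given, piece_taken):
    """Strict novice rule: exchange iff you gain material."""
    return PIECE_VALUES[piece_taken] > PIECE_VALUES[piece_given]

print(should_exchange("knight", "rook"))   # True: give 3, win 5
print(should_exchange("rook", "bishop"))   # False: give 5, win 3
```

An expert violates this rule routinely (every sacrifice does), which is precisely the gap between stage 1 and stage 5 that the lecture goes on to exploit.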
VI. Suppose this 1-5 stage account is right. Then we have an explanation for why symbolic AI failed.
A. In symbolic AI we are asked to formalize the rules. What this does is force the expert to retreat to level 1 or 2. The result is a program that displays abilities at the Novice (1) or Advanced Beginner (2) level, and so a program that does not display true expertise.
B. Common sense knowledge is knowing how rather than knowing that. The source of this kind of knowledge is in a negotiation with the real world.
Example. Even very young children know a great deal about water. Trying to codify this knowledge in a theory of water would be a massively complex task. The child manages because he or she is actively involved with manipulating water. Children are learning the 10,000 water prototypes. They are developing not factual knowledge so much as a set of skills for getting around in the world. They do not have to go through rules to manage the cases. They simply experience the world and learn appropriate action.