Syllabus for Phil 3395 Introduction to Cognitive Science 1998
Jim Garson x 3208 Office Hours: MWF 11-12 e-mail:garson@menudo.uh.edu
Texts: M Mind, Paul Thagard
CS Cognitive Science, Stillings et al.
Packet #56 at the UC Copy Center or the Bookstore

Week 1 Jan 21-26 What is Cognitive Science?
M Ch. 1 CS Ch. 1
Top-Down: Classical Cognitive Science
Week 2 Jan 28-Feb 2 Logic
M pp. 23-34 of Ch 2.
Week 3 Feb 4-9 Reasoning
M pp. 34-41 (rest of Ch. 2) CS pp. 168-174
Week 4 Feb 11-16 Rules
M Ch 3. CS pp. 139-151 (skim if you like) pp. 155-159, 164-168
Week 5 Feb 18-23 Concepts and Learning
M Ch. 4. CS pp. 159-164, 192-213
Week 6 Feb 25-27 Review and QUIZ Feb 27
Bottom-Up: Neurally Inspired Cognitive Science
Week 7 Mar 2-6 Neuroscience
CS Ch. 7 (skim 270-275; 282-289; 291-298; 321-325)
Week 8-9 Mar 9-23 Vision
M Ch. 6. CS Ch. 12 (skim 467-479; 487-490; 506-512)
Week 9-10 Mar 25- Apr 3 Connectionism
M Ch. 7. CS pp. 63-83; 324-325; 92-93; 114-116; 121-124
Week 11 Apr 6 - Apr 10 Review and QUIZ Apr 10
In the Middle: Cognitive Psychology, Linguistics and Philosophy
Week 12 Apr 13-17 Cognitive Psychology
CS Ch. 2.1-2.3 M Ch. 6. CS 2.7
Week 13 Apr 20-24 Linguistics
CS Ch 6.1, 6.2, 6.3
Weeks 14-15 Apr 27-May 1 Philosophical Issues
M Ch. 9 CS Ch. 8.3 Qualia
May 4 Review

Final or Project MAY 8

Quizzes 30% each. Final/Project 40%.


Cognitive Science Notes Week 1
What is Cognitive Science?
A. Course Mechanics

  1. Introductions
  2. Books. There are two:
    a. Stillings et al. will be used for many reading selections: Chs. 1-4, 6.1-6.3, 7, 8.3, 12
    b. Thagard is our text. We will cover all of it except (perhaps) Ch. 5
  3. Your Duties: Read the week's assignment before the first class. Be prepared to answer my oral questions about its content.
  4. Evaluation: 2 Quizzes and a Final/Project. The project may be a paper, a website or a computer program, or you may opt to take a final exam. If you choose to do a project, a proposal of 2 pages explaining your plans must be provided to me by April 1. If I do not receive it on time you will take the final. All quizzes 60%, Final/Project 40%. These notes contain an annotated list of resources: books, websites, etc. to help you with projects. For more details see me.

B. Introduction: What is Cognitive Science?

  1. Cognition = Thinking

(So Cognitive Science = the Science of Thinking or the Science of the Mind)

  2. So is it just Psychology?
    True, there is Cognitive Psychology.
    This brand of psychology is a new movement that has to some extent displaced the Behaviorist School.
  3. But there are also essential contributions from other fields:
    Linguistics, Computer Science, Neuroscience, Anthropology, and Philosophy.
  4. So any course in Cog Sci is multidisciplinary.
    a. Problem: Since each professor is schooled in a given discipline, each teacher will have gaps and prejudices. Here is an account of mine. Strong areas: Philosophy (Logic), Linguistics, and Computer Science. Weak ones: Neuroscience, Psychology, and Anthropology.
    b. We will have a few visitors to help fill some of my gaps.
  5. In each of the disciplines the contributions are only from a subgroup:
    a. Philosophy: Philosophy of Mind, Logic
    b. Psychology: Cognitive Psychology
    c. Computer Science: A wing of Artificial Intelligence
    d. Linguistics: Computational Linguistics
    e. Neuroscience: Functional neuroscience
    f. Anthropology: Cultural or Cognitive Anthropology
  6. How to conceive of this interdisciplinary overlap: each is about the same topic, but each differs markedly in its methods. (See the discussion in CS section 1.6.)
    a. Philosophy: More abstract issues about the nature of thinking.

C. Applications of Cognitive Science

  1. What could be more exciting than to cross the last major intellectual frontier?
  2. Understand reading so as to treat reading problems.
  3. Speech therapy for stroke victims.
  4. Understand memory so as to predict reliability of legal witnesses
    (Witness blending: At what speed did the car smash into the truck? vs.
    At what speed did the car run into the truck?)
  5. Imagery for training athletes
  6. Expert systems and robotics

D. Roots of Cognitive Science

  1. History
    a. Cog Sci (like sciences in general) has its roots in philosophy, especially the philosophy of nature. All other sciences related to cog sci branched out of philosophy:

Philosophy: 300 BC
Neurology: mid 1800s
Psychology: 1860s
Anthropology: 1920s
Linguistics: 1920s
Computer Science: 1960s
(A PhD is a doctor of philosophy; in Library of Congress book codes, B is philosophy and BF is psychology.)
b. So at the roots we can expect to see philosophers and a key philosophical issue: Is the Mind appropriate for study by science? This question is related to the fundamental struggle between dualists and naturalists.

E. A Controversy: Top-Down vs. Bottom Up

  1. Top-Down. The classical paradigm: CRUM (Computational-Representational Understanding of Mind), also known as GOFAI (Good Old-Fashioned Artificial Intelligence)
    a. Characterize what the mind does using logic and computer science. Once these abilities are simulated on a computer you can then understand the general principles that underlie human intelligence.
    b. Thinking = processing of representations.
    c. Some analogies:
    d. Understanding cognition does not require detailed information about how the brain actually implements information processing.
    The instruction 'cream the butter' can be carried out in various ways; it doesn't matter (much) to the end product exactly how you do it.
    e. After all, you can do software science without knowing anything about hardware.
    f. The project: Define the task. Theorize about what are likely representations and procedures to get the job done. Model the theory as a data structure and algorithm. Write actual computer code, and run it on a computer platform to test the ideas.
    g. When the project is done we will understand the Mind as the result of computation in the brain.

  2. Bottom-Up: Connectionism
    a. We need to look more carefully at the actual neurology of the brain to understand what it does and how it does it.
    b. The symbolic computer is a very bad model of what the brain does and how it does it.
    c. A search for alternative mechanisms guided by what we find out about neural structure will be more fruitful in explaining thought.

Reading
Von Eckardt, Barbara What Is Cognitive Science? A good discussion of the nature of the discipline at the turn of the 90s. Discusses worries that cognitive science may not count as a cohesive science.



Cognitive Science Notes Weeks 2-3 (M Ch 2, CS pp. 168-173): Logic and Reasoning
I. Logic

A. The Attraction of Logic to Cognitive Science

The fundamental problem was that such a system couldn't handle relations, for example: 'is a tail of'.

PROLOG simplifies things by only allowing sentences of the form:
A&..&B -> C
You can think of this as a rule saying that if you can prove A, .., B then you will have C. Or to put it another way, to prove C, you should prove A, .., B. Note also that
-> C
says that C is proven. It can also be shown that
B ->
(where the right hand side of the -> is empty) amounts to saying that not B is proven. So in PROLOG notation you express 'not' by putting a sentence on the other side of the arrow. Finally, in PROLOG you leave off universal quantifiers (Ux), as they are understood.

The idea is that two matching sentences (C(x) and C(a) in the first example) on different sides of the -> can be cancelled, in which case the variable x takes on the value it would get in the match in the result. (So in the first example, we cancel the C(x) in the first line, and set x to a, obtaining A(a) ->.) It can be shown that the resolution rule is the only rule needed for a correct system of predicate logic!


This data can be expressed in PROLOG in the following set of sentences, where 'C' abbreviates 'connects to', and where extra parentheses are omitted for legibility.
1. -> Caf
2. -> Cab
3. -> Cbe
4. -> Cbc
5. -> Ccd
6. Cxy&Cyz -> Cxz
Note the last trivial claim about connectedness: if x is connected to y and y to z, then x is connected to z.
a. Suppose we want to solve the problem of how to get from a to d. It turns out that Cad can be proven in predicate logic from 1-6, and the proof provides instructions for getting from a to d.
b. Let us see how the resolution rule turns up the proof.
c. Begin by adding NOT Cad to steps 1-6. Our strategy is to prove Cad by showing that NOT Cad leads to a contradiction.
7. Cad ->
8. Cay&Cyd -> 6 7 Resolution
9. Cbd -> 2 8 Resolution
10. Cby&Cyd -> 6 9 Resolution
11. Ccd -> 4 10 Resolution
12. -> 5 11 Resolution
Line 12 is an arrow with nothing on the right or the left. This indicates a contradiction in PROLOG. So line 7 has led to the contradiction on line 12, and we know Cad follows from 1-6. Note also that the data used in the proof (2. -> Cab, 4. -> Cbc, and 5. -> Ccd) amounts to a solution for the problem. In proving there is a connection between a and d, PROLOG has found a pathway from a to d, and so solved the problem.
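
To see the mechanics in ordinary code, here is a minimal backward-chaining sketch of the same problem in Python. It is my own illustration of the search that resolution performs, not PROLOG's actual machinery; the fact set mirrors clauses 1-5 and the recursive case mirrors clause 6.

    # Backward-chaining sketch of the connection problem (clauses 1-6).
    # FACTS mirrors the unit clauses -> Caf, -> Cab, -> Cbe, -> Cbc, -> Ccd;
    # the recursive call mirrors the rule Cxy & Cyz -> Cxz.

    FACTS = {('a', 'f'), ('a', 'b'), ('b', 'e'), ('b', 'c'), ('c', 'd')}

    def connects(x, z, visited=frozenset()):
        """Return a list of hops proving C(x, z), or None if the goal fails."""
        if (x, z) in FACTS:                  # a unit clause matches the goal
            return [(x, z)]
        for (a, y) in sorted(FACTS):         # try Cxy, then set up subgoal Cyz
            if a == x and y not in visited:
                rest = connects(y, z, visited | {x})
                if rest is not None:
                    return [(x, y)] + rest
        return None

    print(connects('a', 'd'))   # [('a', 'b'), ('b', 'c'), ('c', 'd')]

As in the resolution proof, the answer doubles as the route: the hops returned are exactly the data (clauses 2, 4, and 5) used on lines 9, 11, and 12.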

II. Strengths and Weaknesses of Logic for Cognitive Science
A. Good News: Predicate Logic is sound and complete. A completely rigorous and correct system for predicate logic can be computerized so that any correct pattern of reasoning in the language can be discovered by a computer.
B. Good News: Turing's Thesis: There is excellent evidence that any process that can be expressed with a finite set of rules can be processed by a digital computer that operates on representations in the language of predicate logic. So it would seem that any coherent reasoning is something that can be captured by logic programming.
C. Bad News: A fully correct mechanism for logic problem solving may spend way too much time solving problems. (There is strong evidence that logic problem solving requires exponential processing times in the worst case.) For this reason PROLOG has had to sacrifice correctness to obtain efficiency.
D. Bad News: Turing's Theorem: Predicate Logic has no decision procedure. Although good reasoning can always be discovered (eventually) by a logic problem solver, there is no guarantee that bad reasoning can be identified as such in a finite amount of time.
E. Bad News: Godel's Theorem: There is no correct finite system of rules for mathematics.
F. Bad News: Many sentences of English resist representation in the language of predicate logic: John believes that all men are mortal on Sunday but not Monday.
G. Good News: Systems to handle belief, time and other so-called intensional concepts have been developed, and are being adopted by AI researchers. However, many of these systems are controversial, and there are few standards as there are with predicate logic and PROLOG.
H. Bad News: Predicate Logic doesn't let you take it back. If I say that all tigers are striped in the data, then asserting that Tigger is an albino tiger causes a contradiction. (I assume the data includes a rule that says albino means not striped.) From a contradiction anything follows in predicate logic. The problem is that predicate logic uses monotonic reasoning, which means that the more information you have the more you can prove from it. What is needed is a way to add data that removes previous lines of reasoning.
I. Good News: Non-monotonic logics have been developed that are modifications of predicate logic. In these systems you can say 'All birds fly', and then assert 'Penguins don't fly' without causing contradictions. Arrangements are made so that the data 'Penguins don't fly' automatically creates an exception to the rule: 'All birds fly'.
J. Bad News: Predicate Logic doesn't (conveniently) let you handle matters of degree. If I write 'Tall(John)' then I have said that John really is tall. There is no way to say he is sort of tall, or somewhat tall. Similarly you can't (easily) say that the probability that John is tall is 90%.
K. Good News: Many-Valued Logics, Logics of Probability, and Fuzzy Logics allow expression of matters of degree. Fuzzy logics have been found to be quite useful in AI especially in controlling machines.
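
To make the fuzzy idea concrete, here is a toy sketch in Python (entirely my own illustration; the cutoffs are invented, and real fuzzy logic systems offer much more):

    # Toy fuzzy-logic sketch: 'tall' as a matter of degree between 0 and 1.

    def tall(height_cm):
        """Degree of membership in 'tall': 0 below 160 cm, 1 above 190 cm."""
        if height_cm <= 160:
            return 0.0
        if height_cm >= 190:
            return 1.0
        return (height_cm - 160) / 30.0      # linear ramp in between

    def fuzzy_and(a, b):
        return min(a, b)                     # one common choice of conjunction

    print(tall(175))                         # 0.5: John is 'sort of tall'
    print(fuzzy_and(tall(175), 0.9))         # degree of 'tall and athletic'

Instead of Tall(John) being simply true or false, John's height maps to a degree of tallness, and connectives like 'and' operate on degrees.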

III. The Psychological Plausibility of Logic
A. Probably the most damaging complaint about the use of logic as a foundation for cognitive science is that it is a poor model of human reasoning. Logic may be a fine standard for good reasoning but is a bad picture of what human beings actually do when they reason. There are two kinds of evidence for this claim.

Further Reading for Weeks 2-3
Predicate Logic H. Pospesel A good introduction to Predicate Logic. (If you don't already know propositional logic, read his Propositional Logic .)
Logic for Problem Solving R. Kowalski The classic text on logic programming.
Minimal Rationality C. Cherniak Argues that logic is too expensive for humans to use.



Cognitive Science Notes Week 4: Rules
M Ch 3, CS 4.1 (AI), 155-159 (Semantic Nets), 164-168 (Rule Based Representations)

I. Introduction: Rules and Problem Solving in AI
A. Logic programming is too limiting. Why model the human mind as predicate logic data plus a logic "engine" when there are so many other perfectly good programming techniques the mind might use?
B. Take a more general approach. A mind is a data structure along with a set of rules (a program) that operates on that data.
C. The basic idea of rule representation was implicit in PROLOG: have rules of the form: If A&..&B then C. But instead of having these as data about the world, think of these as cognitive processes that intelligent agents use: the rules of cognition. Also, why restrict yourself to rules of the if..then form? Any representations of procedures programmable in a computer language can be used.
D. Problem solving is taken to be the application of rules (programs) to explore a search space, i.e. a space of all the possible actions that might lead to a solution.
(Example: What is the search space for a combination lock?)
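
For the combination lock, the search space is just the set of all possible settings, and blind search must be prepared to try them all. A minimal sketch (my own, with invented numbers):

    # The search space for a 3-dial lock with digits 0-9 is every triple of
    # digits: 1000 states. Blind search may have to visit all of them.
    from itertools import product

    def crack(lock_opens):
        for combo in product(range(10), repeat=3):   # enumerate the space
            if lock_opens(combo):
                return combo
        return None

    secret = (3, 1, 4)
    print(crack(lambda c: c == secret))   # (3, 1, 4), after up to 1000 tries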
E. Note on this week's topic: It is "Rules". This can be narrowly construed so that we would be studying just rule-based representations and so-called production systems. Or we can construe this more broadly to include any system that uses programs (sets of rules) to operate on symbolic representations.

II. Some Historical Roots
A. The Logic Theorist (Newell and Simon)
Newell and Simon interviewed subjects to determine the strategies they use to solve problems (protocol analysis). In the Logic Theorist, the task was finding proofs in logic. The program was built around strategic methods found in their protocols.
B. General Problem Solver (GPS) (Newell and Simon)
Newell and Simon felt that there are basic features common to all problem solving: define the goal, the resources (initial state), and a "search space" with a measure of "distance" from resources to goal. Fundamental strategy: find subgoals (with shorter distances from the initial state) that contribute to the ultimate goal. Then apply the method over again to each of the subgoals.
C. CHECKERS (Samuel) One of the very first programs able to learn a task. It plays championship-grade checkers and learned to do so by changing how the board is evaluated on the basis of past successes and failures.
D. SHRDLU (Winograd)
The program converses with the user about a block world. It can describe the world and carry out commands.
E. Expert Systems MYCIN (Shortliffe)
Expert systems are computer programs that simulate an expert's reasoning ability. For example, MYCIN diagnoses bacterial blood infections on the basis of symptoms and test results. It compares well with doctors. A rank-ordered list of diagnoses is produced along with suggestions for further tests and observations. MYCIN is strongly rule based: If the stain is negative and the organism is rod-shaped and anaerobic, then the probability is .6 that the organism is Bacteroides.
F. SOAR (Newell, Laird, and Rosenbloom)
Uses chunking or knowledge compiling to define collections of rules that resolve smaller problems. These are created and then stored for use in future problem solving. This provides a kind of rule based learning. SOAR has been able to duplicate the kind of reasoning reported by subjects on the same problems; it also models data on learning rates fairly well.
G. (Schank and Abelson) Developed a system that uses frames and scripts to interpret newspaper clippings.
H. HYPO (Ashley and Rissland) A reasoning system designed to relate a new case to legal precedents developed in the past. It is capable of measuring the similarity of one case to another along various legal dimensions.
I. CYC (Lenat) A massive reasoning system designed to simulate the world knowledge of a human being. Lenat's theory is that failures in previous systems to model cognition are simply due to the lack of enough data. (Gravity is everywhere. Gravity was over the river. Gravity was not supported. Therefore Gravity fell in the river. Gravity is not a sea creature. Gravity cannot swim. So gravity drowned.)

III. Fundamentals of AI
The fundamental issues are these:
A. Representing the Data (other than with predicate logic)

B. The Interpreter Cycle for a Production System
a. Forward Chaining
Examine the context to locate the rules that apply.
Decide which rules to fire
Fire the rules and calculate their effects on the context
Repeat.
b. Backward chaining:
Examine goal and locate rules that would produce it.
Decide which rules are good bets.
Set their antecedents (the if parts) as new goals.
Repeat.
c. Both: the two directions can be combined in a single system, working from the data and the goal at once; a minimal sketch of the forward-chaining cycle follows.
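
Here is a minimal sketch of the forward-chaining cycle in Python. It is my own toy illustration (real production systems such as OPS5 are far more elaborate): rules whose 'if' parts match the context fire and add their 'then' parts, and the cycle repeats until nothing new happens.

    # Minimal forward-chaining production system sketch.
    # A rule fires when all its condition facts are in the context
    # (working memory); firing adds its conclusion to the context.

    RULES = [
        ({'has_feathers', 'lays_eggs'}, 'is_bird'),
        ({'is_bird', 'cannot_fly'}, 'is_penguin'),
    ]

    def forward_chain(context):
        changed = True
        while changed:                        # one pass of the interpreter cycle
            changed = False
            for conditions, conclusion in RULES:
                if conditions <= context and conclusion not in context:
                    context.add(conclusion)   # fire the rule
                    changed = True
        return context

    print(forward_chain({'has_feathers', 'lays_eggs', 'cannot_fly'}))
    # adds 'is_bird' on the first cycle and 'is_penguin' on the second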

V. An Assessment of AI
A. Bad News: Representational Power
As we explained in the case of PROLOG, rules do not have the full expressive power of predicate logic.
B. Bad News: Correctness
In logic we can certify that a reasoning process will be correct. There are no similarly straightforward ways to certify the correctness of rule-based systems. For example, the project of certifying programs (of reasonable size) is still far beyond us.
C. Good News: Flexibility
You can program rules to be as flexible as you like, for example you can build them with defaults. 'If x is a bird then x flies' can live in harmony with 'If x is a penguin then x does not fly'. The secret is to build a reasoning system that keeps track of which rules override other rules. (Penguins are a kind of bird. So all rules about birds are overridden by any information you have about penguins.) You can also allow the formulation of rules about how to use rules ... rules about rules about rules etc.
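
A sketch of the override idea (again my own illustration): order the rules from most specific to most general and let the first match win, so penguin information silently defeats the bird default instead of contradicting it.

    # Default rules with specificity-based overrides: the first applicable
    # rule wins, so the penguin rule overrides the general bird default.

    RULES = [
        (lambda x: 'penguin' in x, 'does not fly'),   # specific rule first
        (lambda x: 'bird' in x, 'flies'),             # default for birds
    ]

    def conclude(properties):
        for applies, verdict in RULES:
            if applies(properties):
                return verdict
        return 'no conclusion'

    print(conclude({'bird'}))              # flies
    print(conclude({'bird', 'penguin'}))   # does not fly, and no contradiction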
D. Good News: Speed through Heuristics
In logic programming, the system blindly searches through every possible solution. In rule-based systems it is a lot easier to formulate heuristics: rules of thumb that help shorten the search.
F. Good News: Rules Seem Ideal for Explaining Language
The grammatical structure of language can be expressed as a set of rules (although no one claims to have an account of the full grammar of English). Chomsky has argued that there must be a set of innate rules that govern language abilities called Universal Grammar. If this is right then rule based approaches correctly model a central cognitive accomplishment.
G. Bad News: Combinatorial Explosion
Dennett's account of the robots R1, R1D1, and R2D1
H. Bad News: Common Sense is Elusive. Why Computers Can't Understand Language. (Terry Winograd's work on airline reservation systems shows how complex language can be. When someone asks for a flight a little closer to 7, they don't want one a little closer, they want one as close as possible to 7!)
I. Bad News: AI has no theory. What have I learned when I create a computer model that does something? All the AI researcher has done is transfer his intelligence to the computer in the form of a program. This may tell us something about what computers are capable of doing, but it certainly does not reveal the details or even the principles of human intelligence.

Further Reading
Winston, P. Artificial Intelligence A clear text on artificial intelligence with thorough accounts of such topics as representation and search.
Winograd, T. Language as a Cognitive Process Provides a good starting point for understanding the classical approach to natural language processing.
Forsyth, R. Expert Systems A collection describing a number of expert systems.
McCorduck, P. Machines Who Think An excellent survey of the history of AI.




Cognitive Science Notes Week 5
M Ch 4, CS 4.1 (AI), 159-164 (Scripts and Frames), 192-213 (Learning)
you may skim 203-213
I. Concepts in Cognitive Science (I-III after a presentation by Eric Margolis)
A. Concepts are a mystery but cognitive science can now help us resolve some age old philosophical problems, for example: How much of the mind is innate?
  B. Common sense and cognitive science may look at innateness in different ways. The ordinary man's questions might be: what innate talent did Mozart have that made him so good at music (and me so awful)? Or what innate characteristics differ from one race to another?

Consider 'He ran up the mountain' vs. 'He called up his friend'. 'Up' goes with the phrase 'up the mountain' in the first, but with 'called up' in the second. We detect such structural facts without thinking.

II. Concepts: What are They?
  A. Philosophers have always analyzed concepts. Plato: What is justice?
B. And wondered whether they are inborn.

III. Puzzles about the Classical View of Concepts.
A. We never seem to be able to give the definition (rules) for any concept. Try: A tiger is a large, striped feline. But there are well known counterexamples to the definition: the albino tiger. It seems that what makes something a tiger is not the presence or absence of any one feature but something more like a cluster of features (Consider Wittgenstein's thoughts about family resemblance.)
B. Prototypes vs. Rules.

IV. Learning
  A. Which concepts are innate, or to what extent are they innate? And how do we explain the acquisition of those that are not innate?
  B. Techniques for acquiring new concepts: a) combinations of old concepts b) combination of features c) extraction of patterns in raw data (like inductive generalizations)
  C. Learning is the acid test for AI models, for often good performance is due simply to the fact that the desired ability is programmed in. To display the flexibility of human intelligence, you need a model that can generalize from what it has already achieved to resolve novel problems.
D. Classical learning techniques. A historical overview.

  • Further Reading
    Schank, R. and Abelson, R. Scripts, Plans, Goals and Understanding The classic work on AI from the frames and scripts perspective.



    Week 6. Review and Quiz
    Cognitive Science Study Questions for Quiz 1
    Reading: M Chs. 1-4, CS 139-173 (skim case studies 142-151), 192-212 (skim 203-212)
    I. Identify the following terms in a sentence or a phrase.
    Artificial Intelligence; Cognitive Psychology
    La Mettrie; Leibniz
    Wundt; Skinner
    Behaviorism; The Mind-Body Problem
    CRUM; GOFAI
    Syllogism; Predicate Logic
    Universal Instantiation (Specification); Modus Ponens
    Modus Tollens; Logic Programming
    PROLOG; Resolution Rule
    sound and complete; Turing's Thesis
    Godel's Theorem; non-monotonic logic
    Fuzzy Logic; The Logic Theorist
    General Problem Solver; Newell and Simon
    Minsky; McCarthy
    Winograd; Blocks World
    Search Space; Semantic Net
    Frames; The Frame Problem
    Blackboard Models; CYC
    ARCH-LEARNER; Winston
    MYCIN; CHECKERS
    HYPO; SHRDLU
    Expert system; Scripts
    ID3; Quinlan
    selection task; Wason
    Tversky and Kahneman; Rips
    Schank and Abelson; PTRANS, ATRANS
    Connectionism; Piaget
    Spelke; Baillargeon
    Theory-Theory; Prototype
    Rosch; Chomsky
    Lexicon; Miller

    II. Sample Questions (from 2-3 sentences to one paragraph)
    1. What is the difference between cognitive science and cognitive psychology?
    2. What parts of what disciplines make up cognitive science?
    3. It has been suggested that the disciplines that make up cognitive science all study the same question, but employ different methodologies to obtain an answer. Discuss.
    4. Discuss the contributions of each of the following disciplines to cognitive science, citing the name of the subdiscipline that is especially relevant: Philosophy, Psychology, Computer Science, Linguistics, Neuroscience, Anthropology.
    5. Cite some of the practical applications of cognitive science mentioned in your readings.
    Describe the debate between dualists and naturalists. Explain the role of the following figures in the debate: Plato, Descartes, Nagel, McGinn, Aristotle, La Mettrie, Hobbes, Leibniz, Wundt
    6. Briefly describe the difference between top-down and bottom-up methodologies in cognitive science. Note only the main points of difference.
    7. Mind : Brain :: Software : Hardware. Explain how this analogy might characterize the classical approach to cognitive science.
    8. Explain the formula: mental representations + computational procedures = thinking.
    9. Explain CRUM in detail.
    10. What are five ways of evaluating theories of mental representation suggested by Thagard?
    11. Describe the fundamental assumptions of the classical school in cognitive science.
    12. Give reasons that a classical cognitive scientist might cite to argue that the science of cognition is not the science of the brain.
    13. What is the difference between deductive and inductive logic?
    14. Express the following thoughts in Predicate Logic: Albert loves Betty. Albert's father is rowdy. Everything is rowdy. All humans are mortal.
    15. In PROLOG all data is expressed in clauses of the form A&..&B -> C. How can we say 'not'?
    16. Explain the resolution rule in about four sentences.
    17. From clauses 1-6, derive Cad using the rule of resolution (add Cad -> and derive a contradiction).
    1. -> Caf
    2. -> Cab
    3. -> Cbe
    4. -> Cbc
    5. -> Ccd
    6. Cxy&Cyz -> Cxz

    18. What difficulties surface in trying to represent in Predicate Logic such sentences as: 'John believes that Mary believes that philosophers are right on Thursdays under a full moon.'
    19. Why is 'Tigger is an albino tiger' a problem for theories of intelligence based on predicate logic? How might the use of defaults or non-monotonic logic help resolve these problems?
    20. Describe advantages and disadvantages of using predicate logic to explain cognition.
    21. Cite psychological experiments that suggest that predicate logic is not a good model of how humans actually reason.
    22. People do not perform very well on Wason's 4-card task. What factors tend to improve performance? What does this suggest about the nature and origin of human reasoning abilities?
    23. Tversky and Kahneman's work shows that many people (given information that Mary was a leftist in college) will judge that it is more likely that Mary is a feminist and a bank teller than that she is a bank teller. This of course violates a fundamental rule of probability: A and B can never be more probable than A. What does this show about the nature of human reason? Does it demonstrate that probability theory is irrelevant to cognitive science? What would a classical theorist say about this?
    24. What are some good reasons for abandoning logic programming and resorting to more general rules and representations in cognitive science?
    25. Evaluate the predicate logic approach to cognitive science along the following 5 dimensions: representational power, computational power, psychological plausibility, neurological plausibility, and practical applicability.
    26. What was Dennett's point in describing the robots R1, R1D1, R2D1?
    27. Describe the abilities and limitations of the following AI programs: CHECKERS, SHRDLU, MYCIN, SOAR, HYPO, CYC.
    28. How does MYCIN (or HYPO or SOAR) work?
    29. Describe semantic nets and frames, citing points of similarity and difference. What are limitations of these forms of representation?
    30. What is the difference between forward and backward chaining? Explain with an example.
    31. What is heuristic search? Explain with an example.
    32. What are strengths and weaknesses of the rules approach to artificial intelligence? Evaluate the approach along Thagard's 5 dimensions.
    33. A major problem in artificial intelligence research is to provide methods for usefully storing vast amounts of relevant knowledge about the world. Explain this problem with an example.
    34. Why would exploration of differences in innate abilities in the population be a misguided approach to the topic of innateness in cognitive science?
    35. Cite some of the things that we all know and take for granted about language, which nevertheless correspond to very sophisticated cognitive abilities.
    36. Describe some of the basic aspects of the notion of an object that have been explored by psychological experiments with infants.
    37. Describe one of the experiments given in class which suggests that the concept of an object is innate. How does the data support the conclusion? Can you think of any criticisms of the experiment?
    38. Piaget believed that the concept of an object was pretty much absent in infants for the first 9 months. His evidence was that infants do not display an ability to manually search in the right direction once an object is occluded. What alternative explanation of this behavior can you give?
    39. What is the theory-theory of concepts? Explain with reference to how the concept of mass is defined in science by such formula as F=ma.
    40. What reasons are there for rejecting the idea that concepts are like definitions in the dictionary?
    41. What evidence supports the idea that concepts are prototypes? What evidence undermines the idea?
    42. Explain the major problem classicists face in explaining how we learn genuinely new concepts.
    43. Explain three different methods whereby cognition might acquire new concepts.
    44. Give an account of the history of the development of artificial intelligence programs that learn, explaining the variety of methods that have been explored. You should cite the contributions of at least three of the following programs: CHECKERS, ARCH-LEARNER, ID3, COBWEB, SOAR.
    45. What are the hardest questions to resolve in developing artificial intelligence programs that can learn? Give an account of at least four different approaches to the problem.
    46. Steven Pinker says that artificial intelligence reveals that the easy problems are hard and the hard problems are easy. Provide examples to illustrate his point.



    Cognitive Science Notes Week 7: Neuroscience
    CS Ch. 7 (skim 270-275, 282-289, 291-298, 321-325)
    I. Why Study The Nervous System?
    A. It is intrinsically interesting
    B. To test models in cognitive science
    (e.g. there is a distinction between short-term and long-term memory in just about every psychological theory of memory. The model would be confirmed if we could locate different neural structures that support long- and short-term memory.)
    C. Knowledge about neurology can help suggest new hypotheses about how cognition might work. We may discover genuinely new cognitive mechanisms we hadn't thought of yet. Later I will argue that this is true in the case of connectionist models.
    II. The Brain
    A. Brain Cells

    III. How Neurons Work
    A. Ion pumps (sodium-potassium) in the cell wall maintain a difference in charge (called a membrane potential) across the cell membrane, so that the inside is negative.
    B. But there are channels that can open to let positive ions back into the cell (or let negative ions out) and cancel the negative inside charge near the channel. This is called depolarization of the cell wall.
    C. When channels open at the root of the axon (the axon hillock), the reduction of the charge difference (membrane potential) causes neighbor channels to open as well. This causes a cascade of openings (depolarization) down the axon all the way down to its synapses.
    D. At the synapse, the change in charge causes little sacs full of neurotransmitters (called synaptic vesicles) to open through the cell wall, exposing the receptor sites on the neighboring neuron to the neurotransmitter. Depending on the neurotransmitter and receptor site, this may inhibit or sensitize the neighbor neuron to possible future depolarization.
    E. The effects at all the synapses of the neighbor neuron add together. If there is enough overall activity at its axon hillock, the channels there will depolarize and the neighbor cell will fire.
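
    A standard way to caricature steps A-E in a few lines is the integrate-and-fire model (a textbook simplification, not something from the readings; all numbers are invented): weighted synaptic inputs are summed, and the cell fires when the total at the axon hillock crosses a threshold.

        # Integrate-and-fire caricature of steps A-E: sum the weighted synaptic
        # inputs; the neuron fires if the total depolarization at the axon
        # hillock reaches threshold. Negative weights model inhibitory synapses.

        THRESHOLD = 1.0

        def neuron_fires(synaptic_inputs):
            """synaptic_inputs: list of (activity, weight) pairs."""
            total = sum(activity * weight for activity, weight in synaptic_inputs)
            return total >= THRESHOLD     # step E: the summed effects decide

        print(neuron_fires([(1.0, 0.6), (1.0, 0.7)]))              # True
        print(neuron_fires([(1.0, 0.6), (1.0, 0.7), (1.0, -0.5)])) # False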

    IV. Neural Plasticity
    A. During early development neural structure is often formed by the elimination of excess neurons and synapses.
    B. The development of structure depends on the stimulation the brain receives, and when it occurs. If a sighted child is blindfolded during the critical period for creation of sight structures, the ability to see will have great difficulty developing. The same sort of critical period appears in the case of the recognition of phonemes (language sounds) and the ability to process grammatical structure.
    C. If a child loses cortex normally devoted to such functions before the critical period there is a good chance another region of cortex will take over. So the brain is plastic at an early age.
    D. However, after the critical period, loss of the relevant part of the brain means that the ability cannot be restored, or is restored only with great difficulty.
    V. Brain Regions and Topographic Maps
    A. In a normal brain, there are standard locations in cortex for the basic functions (although there are some variations as well). Here is a crude picture of a left hemisphere:
    [Figure: left hemisphere with the major functional regions labeled]
    B. Motor, sensory, auditory and visual cortex are all arranged in topographic maps. This means that regions in cortex correspond to regions of the body, the retina, or the cochlea. For example, parts of sensory cortex respond to stimulation of the palm, and nearby ones to the thumb, etc. (See p. 299 for a sample map.) In auditory cortex, some neurons are devoted to low pitch, their neighbors to slightly higher pitch, their neighbors to pitches higher still, and so on. An area such as visual cortex may have many different topographic maps devoted to different functions, such as general shape detection, motion, and color.
    To some extent, the specific regions dedicated to a given sensory region vary depending on how much stimulation is received there. So the brain is still somewhat plastic at the micro-level.
    VI. Neural Representation
    A. Is there a grandmother neuron, a neuron that fires when I see a grandmother? For that matter, is there a neuron that fires when I see a pure blue visual field? Almost certainly not.
    B. Brain representations are distributed across many neurons. So the representation of my grandmother is no doubt the combination of many, many neurons coding for lots and lots of features that make up my grandmother experience: color of hair, facial shape, gait, sound of voice, etc.
    C. Neural representation often uses what is called coarse coding. We illustrate this in the case of color vision. You might think that there are neurons that are responsive to particular wavelengths of light, say neurons for 500 nanometers, for 510 nanometers, etc.
    D. But color vision depends on the fact that we have 3 different kinds of cones (sensory neurons), called S, M, and L, that respond somewhat differently to color. These cones have a very large region of wavelength overlap, so that for most colors all 3 kinds of cones are active at least to some degree. The representation of the color red, for example, corresponds to a characteristic amount of activity on the S, M, and L cones. (So there really aren't any red, green, or blue cones, as some popularizations would have it.) Green has its own pattern of activity, and so on for the other colors. This means that a color sensation is represented as a triple of numbers indicating the activity of S, M, and L. This kind of coarse coding is surprisingly efficient.
    E. A similar representation is used to code tastes, but here there are 4, not 3, types of taste neurons (roughly for salt, sweet, sour, bitter).
    F. For another example, the direction of a target object is coarse-coded as a collection of activities on various neurons. How can this information be used by the brain to control the arm to grab the object? Is it ever averaged together in one place in the brain? Probably not. We just send the raw collection of directions in parallel to motor output to control the position of the arm. The slightly contradictory muscle movements will average out in the arm, and you will get the job done.
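
    A sketch of coarse coding for color (my own illustration: the tuning curves and peak values are invented, and real cone sensitivities are more complicated). Each wavelength is represented not by a dedicated neuron but by the triple of activities it produces on the broadly tuned S, M, and L cones.

        # Coarse-coding sketch: a wavelength is represented by the triple of
        # activities on three broadly tuned cone types (S, M, L).
        import math

        CONE_PEAKS = {'S': 440.0, 'M': 535.0, 'L': 565.0}   # rough peaks in nm
        WIDTH = 60.0                                        # broad, overlapping

        def cone_response(peak, wavelength):
            return math.exp(-((wavelength - peak) / WIDTH) ** 2)

        def coarse_code(wavelength):
            """Represent a wavelength as an (S, M, L) activity triple."""
            return tuple(round(cone_response(p, wavelength), 2)
                         for p in CONE_PEAKS.values())

        print(coarse_code(470))   # bluish light: strong S, weaker M and L
        print(coarse_code(600))   # reddish light: weak S, strong L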
    VII. Neuropsychology
    A. Neuropsychology is the psychological study of how the brain carries out specific cognitive functions. Most of this is still unknown, but new methodologies offer hope, and some older ones have already yielded information.

    Further Reading

    Mind and Brain Scientific American Special Issue, September, 1992. [An excellent collection of introductory articles on the brain. The papers on vision (p. 68) language (p. 88) and neural net learning (p. 144) are especially relevant.]

    Churchland, P. The Engine of Reason, the Seat of the Soul [An enjoyable book. See especially Ch. 2 on sensory coding, and Ch. 7 on brain defects]

    Hardin, C. Color for Philosophers [This is an excellent introduction to the surprising facts of color vision, with interesting philosophical morals as well.]

    Kosslyn, S. and Koenig, O. Wet Mind [This is an excellent and accessible account of cognitive science with a strong emphasis on neurology and neuropsychology. Very good on brain lesions and corresponding cognitive deficits.]

    Kuffler, S. et al. From Neuron to Brain [A fine textbook on neurology, though not much on cognition]



    Cognitive Science Notes Week 7: Video: Pieces of Mind

    A. The Fundamental Issue
    Alda, holding a brain in his hands, says that this brain learned a language (or even two or more). It felt rage, love, and lust, and for a brief moment it felt death. It seems astonishing that a mere blob of glop like a brain could be responsible for intelligence and experience. The job of the scientist who works from the bottom up is to make a plausible story about how this could be so.
    B. Split Brains
    We visited Gazzaniga's lab at Dartmouth and got to see the odd results concerning Joe, whose corpus callosum was severed to control seizures. The result is two hemispheres that operate pretty much independently. Alan Alda (and other normals) cannot draw two different pictures, one with each hand, at the same time. But Joe can do so easily.
    When 'phone' is presented to his right hemisphere, his left hand (which is controlled by the right hemisphere) can draw a phone. When asked to explain his performance, his left hemisphere, which controls his verbal output and could not see the presentation of 'phone', does not know why he drew a phone. The communication between the two hemispheres is by paper, not in the brain itself. The only way for the left hemisphere to know what the right hemisphere saw is to see what it drew.
    When 'Bell' is presented to the right hemisphere and 'Music' to the left, and the left hand picks out a picture of a bell, the left hemisphere has an explanation: I picked the bell because I heard music from bells recently. It ignores the fact that there were other pictures more appropriate for 'Music'. We know the real reason was that the right hemisphere got 'Bell'. Oddly, his left hemisphere tells itself stories (confabulates) to try to convince itself that it is really in control.
    When paintings by Arcimboldo that make images of faces out of pieces of fruit are presented to the left and right hemispheres, Gazzaniga predicts that the right hemisphere, where face recognition is located, will see a face, while the left hemisphere will see pieces of fruit. That's what we saw.
    C. Memory and Emotion
    We visited James McGaugh at UC Irvine, who works on memory in rats and humans. When a rat learns something (where a platform is in murky water) while under the influence of adrenaline (the chemical produced during a "fight or flight" experience), it remembers it much longer. If beta blockers (which block the effects of adrenaline) are given to a rat in a stressful situation, it no longer gets the memory advantage associated with the presence of adrenaline. [The Chronicle has recently run stories about other drugs that improve memory that may be a great help in slowing the effects of Alzheimer's Disease.]
    Similar results can be produced in people. When subjects are asked to remember the contents of a video that is a bit gory (severed legs), their emotional response correlates with how well they remember details that had nothing to do with the gory part (the occupation of the father). When subjects are put in the emotional condition but given beta blockers, they still rate the emotional content as high, but the improved memory effects are lost.
    Brain scans of the amygdala (which is a center of emotional control) show that subjects with brighter (more active) amygdalas remember better.
    D. False Memory
    We visited Dan Schacter (Harvard), where Alan was treated to the boring sight of a picnic in the park. By having Alan view pictures of the scene, Dan was able to insert two false memories of the event. [So we should be quite careful about the testimony of witnesses in court.] Dan then went on to discuss the question: where is memory located? Probably all across cortex, with sights, sounds, and smells memorized in roughly the areas of cortex where these things are perceived. The hippocampus seems to be central in the process of laying down memories. Can we tell when a person is having a false memory with a brain scan? Maybe. For example, for subjects who are remembering words they have heard, the normally active part of auditory cortex does not light up when they are having false memories, but it does light up when they are having true memories.
    E. The Purpose of Dreaming
    We learned about some work by Carlyle Smith when Alan Alda visited his dream lab. It turns out that Alda's ability to detect whether associated symbol strings are true words or not is enhanced if he tries the task just after REM (dreaming) sleep. Could REM be used to help us tighten up or exercise associations between our concepts?
    We also learned that people who are intellectually active (taking a lot of exams) have lots more REM sleep. What kind of associating does REM dreaming help us with? We saw that students learning the logic game Wff'N Proof in a room with a loudly ticking clock did much better if a similar sound was played during their REM sleep. Could REM sleep facilitate logical reasoning, where we review and connect things in our dreams? Maybe.
    F. The Location and Development of Language
    In adults, grammar words like 'of', 'if', 'all', 'not', 'for', 'the' are processed in a specific area of the left hemisphere (Wernicke's area). Is this true of children? Not at all. For children aged about 4, language seems to be processed all over the brain. [Specialization of function in development seems to correspond to focusing activities in separate parts of the brain.]
    There is a window of opportunity when pronunciation and grammar of a language can be learned by the child easily. If a language is learned after this period, the learner will almost certainly end up with an accent, and will have a hard time learning the grammar. This is true even of people who learn sign language. The brain is extremely plastic in youth, but becomes "set in its ways" by adolescence.

    Cognitive Science Notes Week 8-9: Vision
    CS Ch. 12 (skim pp 467-479; 487-490; 506-512 )
    I. Why the Problem of Vision is Hard
    A. Information from the retina is an array of values, one from each rod and cone. This representation is a far cry from what cognition needs: a representation of a three-dimensional world filled with objects.
    B. The representation cognition needs would allow us to distinguish one object from another, to appreciate their positions, motions, sizes, shapes, and textures, despite the fact that lighting in the environment is variable, we ourselves may be moving, and we must recognize objects from many different points of view.
    C. The problem of vision is to explain the mechanism that transforms the retinal array into an object-level representation that can be stored in memory and processed by other cognitive systems.
    II. The Nature of Low-Level Visual Processing
    A. A fundamental question about vision is the extent to which higher cognitive processes, such as goals, expectations, attention, reasoning, and conceptual structures, influence the transformation from retinal to cognitive-level representations. Although these factors are clearly important, a lot of visual processing can be safely studied from the bottom up, leaving aside these top-down considerations.
    B. Common Assumptions about low-level vision

    III. David Marr and the Primal Sketch
    A. Edge Detection. On Marr's theory, the first job the vision system has to do is detect edges or object boundaries. Usually an object boundary corresponds to a quick change in intensity; however, there can be intensity changes within the object boundary, and sometimes boundaries do not correspond to a noticeable intensity change (a green book on a green table). So we will need to supplement this idea in the long run.


    [Figure: on-center cell responses: a = normal; a > normal; a < normal]
    There are off-center cells that have the exact opposite function. They have the feature that if an area is suppressed across the brim but not the center, the activity will be less than normal, and when the center and part of the brim is suppressed, their activity is greater than normal.
    Off-Center Cells
    [Figure: off-center cell responses: a = normal; a < normal; a > normal]

    1. Detecting an oriented edge. Now imagine whole rows of off-center and on-center cells lined up. If a zero-crossing (a line with activity on one side and inactivity on the other) lines up just right, all the cells in that row will fire, and no other cells will. We have a perfect oriented line detector!


    2. By using the same trick for larger or smaller sized patches we can determine how sharp the edge is that we are detecting. (Smaller patches detect sharper edges.)
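
    A one-dimensional sketch of the zero-crossing idea (my own illustration; the [-1, 2, -1] weighting is a crude stand-in for a center-surround receptive field): a row of such cells responds strongly exactly where intensity jumps.

        # Each 'cell' weights its center positively and its surround negatively.
        # Responses are near zero on uniform regions and flip sign at an edge.

        def center_surround_responses(image_row):
            return [-image_row[i - 1] + 2 * image_row[i] - image_row[i + 1]
                    for i in range(1, len(image_row) - 1)]

        row = [0, 0, 0, 0, 10, 10, 10, 10]      # a step edge in intensity
        print(center_surround_responses(row))
        # [0, 0, -10, 10, 0, 0]: the sign change marks the zero-crossing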



    IV. Marr's Theory of Higher Level Representation
    A. The task is to explain how the brain can go from information about contrast, 2-D velocity, disparity, and 2-D orientation to a 3-D world of objects involving color, texture, 3-D shape, distance, and 3-D trajectory.
    B. Levels of the Theory

    V. Treisman's Theory of Attention and Primitive Processing
    A. The thesis is that there are basic visual processes that are computed in parallel that feed information to higher level processes responsible for binding features together. These second processes are calculated in series by an attention mechanism.
    B. We can develop evidence for this theory by presenting images with target shapes surrounded by distractors. If we measure the reaction time for identifying the targets and discover that it is fast and does not depend on the number of distractors, then we assume it is a basic parallel process. If the reaction time grows with the number of distractors, then we assume the process is serial and involves attending to one thing after another in the scene.
    C. For example, the letters L and T have the same elements in the same orientation, and differ only in how the elements are conjoined to each other. Recognition of these targets depends on attention. The differences between them do not just obviously "pop out". However if you examine a field of |s and /s, where the only difference is the orientation, the difference is immediately and easily apparent.
    D. Basic features include orientation, brightness, and curvature. A discrimination that requires conjoining features (white triangles and black squares vs. black triangles and white squares) is extremely difficult and takes tedious one-by-one inspection.

    VI. Biederman's Theory of Higher Level Processing
    A. Biederman's theory describes the features of something like Marr's 2.5-D level, but the primitives he postulates (called geons) are more flexible and varied: cones, cylinders, cubical shapes, etc., and distortions of these, such as changing the length-to-width ratio and the shape of the center line.
    B. Biederman believes that objects are identified by a process of recognition by components. Each of the components is recognized, and their identity and arrangement allow us to tell what kind of object we have.
    C. But how are the components recognized? By the boundaries between them, which are typically concave and rather sharply sloped inwards. (Consider the "joints" of the Michelin man, for an exaggerated version of the idea.)
    D. Some evidence for these ideas is found in how well we recognize line drawings where parts of the scene are obscured. If the points of connection between parts (apexes of triangles for example) are removed, the scene is hard to recognize. If other parts are removed, but the connecting parts left, recognition is fairly easy.
    E. The fact that objects viewed from strange angles where the "joints" are obscured are hard to recognize (p. 499) is a variant of the same line of evidence.

    VII. Top-Down vs. Bottom-Up
    A. Marr and many other researchers have tried to create theories of vision where the processing from retina to brain does not require higher-level information to identify the object. (For example, concepts like animals have 4 legs, or that the sky is above us and is blue or grey, etc.)
    B. Clearly there are instances where higher-level information is required to resolve the ambiguities in a scene. For example, the same shape (N) can be read horizontally as an en and vertically as a zee.
    C. But to what extent does vision rely on top-down processing? Consider the Kanizsa Triangle. Here we see an image of a triangle hovering above the scene, but there are no luminance differences on the two sides of its edge to allow us to pick out the boundary. Why do we perceive the edge? Perhaps conceptual information about how some things blot out other shapes helps us. However, there is some evidence that this phenomenon is very low-level. For example, we have evidence that the "edges" are already processed in V2, the next processing area after the earliest one, V1. So maybe a bottom-up explanation is more likely.
    D. Another piece of evidence that low level processing is doing most of the work is the fact that a zero-crossing representation of a set of letter Bs obscured by blobs allows us to recognize the Bs much better than a blob representation. Perhaps the zero-crossing is really doing most of the representational work.

    VIII. Further Topics on Vision
    A. The Up-down axis is used as a major default assumption about how objects are aligned. Violations of this alignment cause errors in identification. (A problem faced by NASA designers of space stations, where expectations about up-down are more often violated.)
    B. The alignment approach to object recognition is a competitor to theories based on recognition by components. One basic idea is that objects have conspicuous alignment points. These can be used to rotate and rescale the image to see if it matches stored representations from a "standard" point of view.
    C. There is a major division in visual pathways between the what system (object recognition) and the where system (object location). These correspond to different topographic maps on cortex. It is possible that the recognition system has a narrower view-window than the whole visual field, so that an attention mechanism must be used to move access to this processing from one element to another in the scene. We typically do not recognize all objects in a scene at once.
    D. Eye motions (saccades) that flit from one spot to another in the scene are essential to effective vision. The detection of the motion of the visual scene across the retinal array is suppressed during a saccade. This is one form of visual attention. There is another form that operates even if our direction of gaze is fixed. This serial process may be involved with binding properties together in the same object. Binding errors can be induced by presenting scenes very quickly.




    Cognitive Science Notes Week 9-10 (Connectionism)
    (M. Ch. 7; CS pp. 63-83; 324-325; 92-93; 114-116; 121-124)
    I. Radical vs. Implementational Connectionism
    A. The fundamental connectionist idea is to build models of cognition that are guided by the nature of neural processing, but to abstract away from irrelevant neural features.
    B. There are three different ideas about how the classical or symbolic processing account relates to connectionist theories.

    II. The Basics
    A. Connectionist models are known by many names: (artificial) neural nets, parallel distributed architectures, subsymbolic models.
    B. Units. Connectionist models are connected networks of simple processors called units. The units are supposed to model the basic behavior of neurons.
    C. Weights. The synapses which regulate signals between neurons are modeled by values called weights. Weights can be positive (indicating that activity at the synapse encourages the neighbor neuron to fire) or negative (indicating that activity at the synapse inhibits firing by the neighbor neuron).
    D. Activation Function. It is assumed that all units calculate the same very simple function. The fundamental idea is that the unit i sums the signals it receives from each of the neurons connected to it. The signal aj coming from each unit j connected to i is multiplied by the weight wij between i and j. The sum of these values for each connected unit is calculated. This value might be any positive or negative number. But a neuron's activity is best modeled as a number between 0 (inactive) and 1 (maximum firing rate). So we adjust this sum so that it lies between 0 and 1 with sig, the logistic (or sigmoid) function. (See p. 69 (b) for its graph.) Putting these ideas together, we get the basic activation function for units.
    ai = sig(Σj wij aj), where sig is the logistic function: sig(n) = 1/(1 + e^(-n))
    This says that the activity of unit i is the result of multiplying the activity aj of each input neuron by the weight connecting it to i, and then applying the sigmoid function to this sum. Connectionists assume that all cognitive processing results from the behavior of many units all of which compute this function or a minor variant of it. Note that any possible arrangement of connections of such units can be expressed by simply setting wij to zero for any two units that are not connected. Therefore the architecture and behavior of the neural net is defined entirely by the weights between the units.
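
    In code, the activation function is a few lines (a direct transcription of the formula above; the weights and activities in the example are invented):

        # Basic connectionist unit: a_i = sig(sum_j w_ij * a_j),
        # with sig the logistic (sigmoid) squashing function.
        import math

        def sig(n):
            return 1.0 / (1.0 + math.exp(-n))

        def activation(weights, activities):
            """Activity of unit i from incoming weights w_ij and activities a_j."""
            net_input = sum(w * a for w, a in zip(weights, activities))
            return sig(net_input)

        print(activation([0.5, -1.0, 2.0], [1.0, 0.5, 0.25]))   # about 0.62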
    III. Standard Feed-Forward Architecture
    A. Many connectionist models conform to a standard configuration called feed-forward nets. There is a bank of input units which contain the signals coming into the system, a bank of output units, recording the system's response, and usually one or more banks of hidden units that are waystations in the processing. In a connectionist model of a whole brain, the input units model the sensory neurons, the hidden units the interneurons, and the output units the motor neurons.
    B. The astonishing thought behind this model is that all the brain does is simply the result of massively many units calculating the activation function according to the settings of the weights (the synaptic connections). Could such a simple-minded calculation really do the job? There is a lot of intriguing evidence that it may.
    IV. Recurrence
    A. Feed-forward Architectures are limited in what they can do. The signal flows directly from input to output. However we know that the brain contains recurrent pathways, that is, pathways that loop back to earlier levels.
    B. Winner-Take-All Arrays. One use of recurrence in connectionist models is to provide for mutually inhibiting banks of neurons. Each bank sends inhibiting connections to the other bank, with the result that only one of the banks can be active. These arrays can be applied to problems that involve parallel constraint satisfaction. Such nets can model decisions between incompatible alternatives (M p. 116), for example, the two ways of viewing the Necker cube. Marr and Poggio have used the idea to model how the brain matches up images from the two eyes to facilitate stereoscopic vision. The same kind of models can be used to understand decision making, planning, and explanation (M pp. 115-117).
    C. Simple Recurrent Architectures. In simple recurrent architectures, information on the hidden units is sent back to the input level, so that information about the hidden units at time t-1 is available at the inputs at time t. This provides for a kind of short-term memory, and is essential for processing where the net needs to respond to the history of the inputs. Such nets have been shown to be capable of simple grammatical processing.
    V. Learning
    A. The success or failure of a neural net model depends on the selection of the right weights. But how can we determine which weights we need to accomplish a certain task? One solution to the problem is to let the net figure it out: let the presentation of the inputs and the net's responses adjust the weights. There are two basic styles of learning in connectionist models: unsupervised learning, where the net simply adjusts the weights on the basis of the inputs it receives, and supervised learning, where the adjustment is driven by feedback about the desired output. Descriptions of the most famous unsupervised (Hebbian) and supervised (backpropagation) learning methods follow.
    B. Hebbian Learning. The idea goes back to Donald Hebb. Put information at the input units, and calculate the activity of all the units. Then increase the weights between active units, and decrease those between inactive units. Do this for all the inputs that the net will encounter. This process will make the net sensitive to regularities found in the input. For example, imagine that the inputs code for different features of animals: fur/feathers, 2/4 legs, forward/sideways facing eyes, sharp/blunt teeth, wings/no wings, carnivore/herbivore. Now train the net with features found in animals at the zoo. Weights between such features as carnivore, forward facing eyes, and sharp teeth will get strengthened; so will those between feathers, 2 legs, and wings. The net has "discovered" the concepts "bird" and "predator". When features for a new animal are presented, it will activate the units that represent the closest category to which those features belong. It is almost as if the net has extracted some prototypes from the data which it can apply to novel inputs.
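    Here is a toy Python sketch (mine) of the pure Hebbian rule, on which the weight between two units grows in proportion to their joint activity; the variant that also weakens weights between inactive units is left out for brevity, and the animal features are invented.

        def hebbian_update(weights, acts, rate=0.1):
            # Pure Hebb rule: dw_ij = rate * a_i * a_j, so weights between
            # units that are active together get strengthened.
            n = len(acts)
            for i in range(n):
                for j in range(n):
                    if i != j:
                        weights[i][j] += rate * acts[i] * acts[j]
            return weights

        # Features: [fur, feathers, two-legs, wings]; two birds and one mammal.
        w = [[0.0] * 4 for _ in range(4)]
        for animal in ([0, 1, 1, 1], [0, 1, 1, 1], [1, 0, 0, 0]):
            w = hebbian_update(w, animal)
        print(w[1][3])   # feathers<->wings, strengthened by both birds: 0.2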
    C. Backpropagation. Backpropagation is the most popular form of supervised learning. We will illustrate with the example of a net trained to pronounce English words. The spelling of a word is put on the inputs, and a code for its correct pronunciation is to be produced on the outputs. This task is hard because of the irregularities of English pronunciation: 'have' does not rhyme with 'came' and 'same'; 'though' does not rhyme with 'rough' or even 'tough'. The training set will consist of a list of words together with their correct pronunciation codes. Training proceeds as follows. Start with random weights. Now present the first word in the training set, and calculate the activities of all the units. The output units will almost certainly not match the desired code for that word. For each output unit, trace the source of the error back through the network. Adjust weights (slightly) in the direction that will correct the error. Now do the same thing for the next item in the training set, and so on.
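    Here is a compressed Python sketch (mine) of backpropagation on a much smaller problem than pronunciation: learning XOR, a task that a net with no hidden units cannot solve. The layer sizes, learning rate, and bias trick are illustrative choices, not from the text; with enough sweeps the net usually settles on the right answers, though this depends on the random start.

        import math, random

        def sig(n):
            return 1.0 / (1.0 + math.exp(-n))

        def forward(x, w1, w2):
            x = x + [1.0]                                    # constant bias unit
            h = [sig(sum(w * a for w, a in zip(row, x))) for row in w1] + [1.0]
            o = [sig(sum(w * a for w, a in zip(row, h))) for row in w2]
            return x, h, o

        def train_step(x, target, w1, w2, rate=0.5):
            x, h, o = forward(x, w1, w2)
            # Output error, scaled by the sigmoid's slope o*(1-o).
            d_out = [(t - oi) * oi * (1 - oi) for t, oi in zip(target, o)]
            # Trace each error back: every hidden unit gets its share of blame.
            d_hid = [h[j] * (1 - h[j]) *
                     sum(d * w2[k][j] for k, d in enumerate(d_out))
                     for j in range(len(h) - 1)]
            # Adjust every weight slightly in the direction that reduces error.
            for k, d in enumerate(d_out):
                for j in range(len(h)):
                    w2[k][j] += rate * d * h[j]
            for j, d in enumerate(d_hid):
                for i in range(len(x)):
                    w1[j][i] += rate * d * x[i]

        random.seed(0)
        w1 = [[random.uniform(-1, 1) for _ in range(3)] for _ in range(3)]
        w2 = [[random.uniform(-1, 1) for _ in range(4)]]
        data = [([0.0, 0.0], [0.0]), ([0.0, 1.0], [1.0]),
                ([1.0, 0.0], [1.0]), ([1.0, 1.0], [0.0])]
        for _ in range(5000):                # many sweeps through the training set
            for x, t in data:
                train_step(x, t, w1, w2)
        for x, t in data:                    # outputs should now be near 0 or 1
            print(x, round(forward(x, w1, w2)[2][0], 2))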
    VI. Connectionist Representation
    A. In local representation, single units are devoted to recording a concept. (Think grandmother neuron.)
    B. In distributed representation, the representation of an item consists of a pattern of activity across all of the units. Nets trained with backpropagation and Hebbian learning spontaneously generate distributed representations of the concepts they are learning. For example, a cluster analysis of the activation patterns on the hidden units of NETtalk shows a hierarchy of clusters and subclusters corresponding to phonetic distinctions. There is a main clustering into two: vowel and consonant. And within the consonants there are subclusters for voiced and unvoiced, etc. In learning the task, the network has acquired the concepts that it needs to process the inputs correctly.
    C. Distributed representations in connectionist models correspond to extremely complex arrays of values across many units. Therefore the representation for a concept like [cat] can code lots of features of the concept such as mammal, pet, furry, aloof, stalks-mice, and other features (like how it looks) that we would be hard pressed to describe in language. This so-called subsymbolic form of representation allows the symbol to carry its own information about what it is about. The symbol is not arbitrary and atomic the way a word in a language is. By analysing the symbol, you can find out what it "means".
    VII. Famous Connectionist Models
    A. Connectionist models have been used for such diverse tasks as recognizing submarines, deciding bank loans, and predicting protein folding, to name just a few. What follows are a few of the better known connectionist models trained by backpropagation.
    B. TRACE: Rumelhart and McClelland (1986) Predicting Past Tense of English Verbs
    *Input: Phonetic code of present tense verb (sing)
    *Desired Output: Phonetic code of the past tense of that verb (sang)
    *Architecture: Feedforward net without hidden units
    *Training Set: Phonetic codes of present and past tense of 460 English verbs
    *Results: The net learned the past tenses of the 460 verbs in 200 rounds of training, and it generalized fairly well to new verbs, with good appreciation of "regularities" to be found among the irregular verbs (send / sent, build / built; blow / blew, fly / flew). During learning, as the system was acquiring more regular verbs, it overregularized (break / broked). This was corrected with more training. Children are known to exhibit the same tendency to overregularize. Whether this is a good model of how humans process verb endings is a matter of hot debate (Pinker & Prince 1988).
    C. NETtalk: Sejnowski and Rosenberg (1987) Pronouncing Written Text
    *Input: 7 letters of the text (including space) in a moving window
    *Desired Output: Phonetic code for the letter at the center of the window, which is sent to a speech synthesizer
    *Architecture: Standard 3 layer feed-forward net. (80 hidden units)
    *Learning: A large training set of text coupled with its phonetic transcription.
    *Results: During learning the system goes through stages of babbling, double-talk, and finally intelligible speech, (with some accent). Generalization to novel text is good. Statistical analysis shows that hidden units use a distributed representation of basic phonological features.
    D. Elman (1991) Predicting the Next Word in a Sentence
    * Input: Words drawn from a small set of English words (23 words plus End-of-Sentence) coded in 1s and 0s.
    * Output: One output unit for each word in the set.
    * Architecture: Simple Recurrent Net
    *Training Set: Grammatical sentences formed from this vocabulary, for a brand of English restricted to a small subset of its grammatical rules. The grammar did, however, provide for a hard test of grammatical awareness: subject-verb agreement across arbitrarily long relative clauses:
    Any man who hates women who hate men ... also hates feminists.
    *Desired Output: When a word from the sentence is applied to the inputs, the desired output is the next word in the sentence. (Of course the net can't possibly succeed perfectly at this task, since the next word is not fully determined by the words so far.)
    *Results: Success was measured in the following sense: on the presentation of a sequence of words, all and only the words that would be legal continuations at that point should be active beyond a certain threshold at the output, and when a word is presented that violates the rules of grammar, no words should reach threshold. The trained net came very close to this desired performance.
    VIII. Attractions of Connectionist Models
    A. Biological Plausibility. Neural net models "look like" the processing that we find in a brain, especially when we look at the processing we know about: sensory input and motor output. There is evidence for Hebbian learning at synapses. The 100 step rule would suggest that the brain's processing, unlike the usual classical models, is highly parallel.
    B. Soft Constraints. Nets can learn to appreciate subtle statistical patterns that would be very hard to express as hard and fast rules. This allows them to avoid the brittleness displayed by classical models.
    C. Fast Processing of Multiple Constraints. Nets can quickly resolve in parallel the complex set of conflicting forces to make a decision.
    D. Graceful Degradation. When units are lost, the net behaves almost as well. In classical systems the loss of a circuit typically causes a fatal processing error.
    E. Flexible Response to Noise. When the inputs are noisy (if part of the input is inaccurate or obscured by some other signal) nets respond appropriately (though somewhat less accurately).
    F. Vector Representation. There is evidence that the brain is deeply committed to representations in the form of vectors (arrays of values). For example, coding for color and taste are by vectors of 3 and 4 values respectively. Neural net architectures are perfectly designed to handle vector processing.
    G. Prototypes. Connectionist classification methods remind us of the idea of matching inputs to prototypes. So evidence for prototypes helps sway us toward connectionism.
    H. Unified Theory of Learning. Classical accounts employ a variety of different learning techniques. Connectionists have a simple and fairly unified theory of learning based on backpropagation and Hebbian processes.
    I. No Programming. We do not need to hire programmers to make a system do some complex task. Just build a training set for the desired behavior and let the model learn what to do.
    IX. Weaknesses of Connectionist Models
    A. Biological Implausibility

    Further Reading
    Clark, A. (1993) Associative Engines, MIT Press. [Provides a nice review of connectionist work with special emphasis on the problem of representation in language.]
    Elman, J. L. (1991) "Distributed Representations, Simple Recurrent Networks, and Grammatical Structure," in Touretzky, D. (ed.) Connectionist Approaches to Language Learning, Kluwer, Dordrecht, 91-122. [The classic experiment on teaching nets to learn syntactic structure.]
    Pinker, S. and Prince, A. (1988) "On Language and Connectionism: Analysis of a Parallel Distributed Processing Model of Language Acquisition," Cognition, 23, 73-193. [A thorough criticism of Rumelhart and McClelland (1986). Undermines claims that connectionist models correctly simulate the process of language learning, especially the acquisition of regular verbs.]
    Rumelhart, D. and McClelland, J. (1986) "On Learning the Past Tenses of English Verbs," in Parallel Distributed Processing, vol. I, MIT Press, Ch. 18, pp. 216-271. [Classic article describing TRACE.]
    Sejnowski, T. and Rosenberg, C. (1987) "Parallel Networks that Learn to Pronounce English Text," Complex Systems, 1, 145-168. [For more on NETtalk.]



    Cognitive Science Study Questions Quiz 2

    CS Ch. 7 (skim 270-275, 282-289, 291-298, 321-325)
    CS Ch. 12 (skim pp 467-479; 487-490; 506-512)
    M Ch. 7. CS pp. 63-83; 324-325; 92-93; 114-116; 121-124

    Identify Terms
    Neuron, Glial Cell, Soma, Dendrite, Axon, Synapse, (Cerebral) Cortex, Myelin, Neurotransmitter, Forebrain, Midbrain, Hindbrain, Left/Right Hemisphere, Limbic System, Amygdala, Hippocampus, Thalamus, Lateral Geniculate Nucleus, Cerebellum, Corpus Callosum, Grey Matter, White Matter, axon hillock, Membrane Potential, Ion Channels, Depolarization, Synaptic Vesicles, Receptor Sites, (Somato-) Sensory Strip, Motor Strip, Wernicke's Area, Broca's Area, Visual Cortex, (Pre-)Frontal Cortex, Association Cortex, Auditory Cortex, Topographic Map, Retina, Cochlea, Coarse Coding, Distributed Representation, Cones (S, M, L), Grandmother Neuron, Neuropsychology, MRI (magnetic resonance imaging), PET (positron emission tomography), ERP (event-related potentials), Dissociations, Adrenaline, Beta-Blockers, REM, David Marr, Grey Scale Representation, Primal Sketch, 2.5 D Sketch, 3D Sketch, Zero Crossing Map, Mexican Hat Function, On/Off-Center Cells, Treisman, Biederman, geons, recognition by components, illusory contours, radical connectionism, implementational connectionism, (artificial) neural nets, parallel distributed architectures, subsymbolic models, input units, hidden units, output units, activation function, sigmoid function, weight, feed-forward net, winner-take-all net, recurrence, simple recurrent net, Hebbian learning, backpropagation, training set, parallel constraint satisfaction, unsupervised/supervised learning, local/distributed representation, subsymbolic representation, Rumelhart and McClelland, TRACE, Sejnowski and Rosenberg, NETtalk, Elman, Graceful Degradation, Soft Constraints, Multiple Constraints, Systematic Processing

    Questions
    1. Approximately how many neurons/glial cells does the brain contain?
    2. Explain how neurons fire. Explain the role of the following items in your account: axon hillock, ion channels, depolarization, synapses, synaptic vesicles, receptor sites, neurotransmitters.
    3. In youth, the brain shows immense plasticity. What are the limits of this plasticity as the brain matures?
    4. Cite at least two examples of cognitive functioning where early plasticity disappears.
    5. To what extent does brain development depend on stimulation from the outside world? Explain with at least two examples.
    6. Diagram the cortex, explaining which regions are devoted to which kinds of processing.
    7. Explain how topographic maps for hearing, vision and touch are laid out in cortex.
    8. Explain how colors and tastes are represented in the brain.
    9. Why is it a distortion to say that the retina contains red, green and blue cones? Explain with reference to the way in which color is calculated in the brain.
    10. What are the principal methods used in neuropsychology? Explain what we have learned about memory from the case of patient HM.
    11. The human brain's left and right sides are usually specialized for certain functions. Give details on these asymmetries.
    12. Explain the odd effects that result from a patient's loss of the corpus callosum.
    13. Joe is a split brain subject studied by Gazzaniga. Describe at least two different experiments on Joe and explain their significance for our understanding of the brain.
    14. Describe at least two experiments that suggest that ability to remember is related to emotion.
    15. Explain how false memories of an event were inserted in Alan Alda's brain.
    16. How might we tell when a person is having a false memory with a brain scan?
    17. What did the experiment with the ticking clock show about the possible function of dreams? How did the experiment show that?
    18. What does Carlyle Smith think is a possible function of dreaming?
    19. What is the problem faced by the visual processing system in the brain?
    20. Distinguish top-down from bottom-up theories of visual processing. Cite clear cases of visual processing where the processing is likely to be (at least partly) top-down. Now cite evidence that tends to show that visual processing is primarily bottom up. What does the phenomenon of illusory contours have to do with the issue?
    21. Describe two assumptions about neural processing that guide theories of how visual processing works.
    22. What is the zero crossing map and how do neurons compute it?
    23. Given a bank of on-center and off-center cells which process the zero-crossing map, explain how to construct an (oriented) edge detector.
    24. Marr's theory of vision involves a number of levels. Name four of them and explain what representations at each level are like.
    25. Explain Marr's theory of higher level visual processing (3D sketch).
    26. Explain how experiments can be used to determine which basic features of the visual system are processed in parallel, and which are serially processed by an attention mechanism. Give examples of discriminations of each kind.
    27. Compare Biederman's theory of object recognition using recognition by components with the alignment approach.
    28. On Biederman's theory, how are the divisions between the components that make up objects recognized? What experiments have been performed that tend to confirm this view?
    29. Describe the two main pathways in the visual system and explain what each one does. How does the phenomenon of attention play a role in these two systems?
    30. How are attention and visual binding related?
    31. What is the difference between those who believe in radical and those who believe in implementational connectionism?
    32. What fundamental assumptions do connectionists make to simplify their models of the brain?
    33. Describe the items to be found in a generic connectionist model. Explain the range of numerical values that is allowed for each.
    34. What is the activation function? Give the equation, then explain what the equation says about how a unit behaves.
    35. What is recurrence? Give examples of two different kinds of recurrent architectures and explain what they can do.
    36. What is the difference between unsupervised and supervised learning? Give examples of learning methods of each kind.
    37. Explain how Hebbian learning works, and how nets trained with Hebbian learning behave. Use an example to illustrate.
    38. Explain how backpropagation works.
    39. Explain the difference between local and distributed representation. What kind of representations are typically generated by connectionist learning methods?
    40. What is subsymbolic representation?
    41. Connectionist models have been applied to a wide variety of different tasks. Name some of them.
    42. Describe TRACE or NETtalk. Mention the task to be accomplished, the inputs, the desired outputs, the architecture, the training set, and the results achieved.
    43. Describe Elman's work on simple recurrent nets that process simple grammars.
    44. Give reasons for and against the view that neural nets are reasonable models of a human brain.
    45. List 9 reasons for thinking that neural nets are good models of cognition.
    46. Why would neural net theory be especially well adapted to prototype theory?
    47. What are the major weaknesses of connectionist models?
    48. It has been suggested that multiple constraints and soft constraints are reasons for preferring connectionist over classical models. Discuss the issue.
    49. Explain why the systematic nature of cognitive processing and the mind's ability to generalize pose difficult problems for connectionist models. For which kind of connectionists are the problems most serious?



    Cognitive Science Notes Week 12 (Cognitive Psychology - Imagination)
    (CS 2.1-2.3; M Ch. 6; CS 2.7)
    I. Imagination in Cognitive Psychology
    A. Imagination: the Historical Context

    II. The Imagery Debate

    1. Imagination presents an interesting alternative to the classical approach to cognitive science. Since the classical view takes cognition to be the result of symbolic processing similar to what goes on in a computer, either imagination must be understood as a form of symbolic processing, or imagination must be considered to play no interesting cognitive role.
    2. The classical theory is that cognitive science's job is to discover what is common to the human cognitive architecture. This amounts to something like discovering the functional structure of a computer. Humans are remarkably alike in their basic cognitive abilities, so it is a good guess that there is a structure here that we all have in common.
    3. The basic picture of this architecture is to divide it in three: Sensory systems, Motor systems, and Central systems. The central systems include thinking, attention, memory, learning, and language. The idea is that sensory systems are modular. They do their jobs of delivering information to the central system independently from each other and from direction by the central system. The central system receives information from the senses and controls motor output, but it is not driven by sensation. A lot of thought and reasoning can proceed without any help at all from sensation. (In fact sensation may hinder thought; why else do we stare at the ceiling when we think hard?) A computer processing away without the help of any new input is a good model of this feature.
    4. Classicists presume that the central system is highly dependent on symbolic processing. There are several reasons for believing this.
      a. Turing machines are a model of computation, and are known to be capable of carrying out any function that can be specified by a set of rules, including presumably all possible tasks in thinking, reasoning, attention, memory, learning, and language. Turing machines are the simplest kind of symbolic processors (computers). Although the brain almost certainly doesn't implement such a simple and cumbersome machine, variants of the same basic design (von Neumann architectures) have proven to be extremely useful and powerful information processing devices. What better model of intelligence could we find?
      b. Abilities in reasoning and language appear to require processing that is sensitive to the structure of symbol strings. For example, our abilities in language and thought are productive, which means that we are capable of recognizing and understanding an unlimited variety of sentences. The computational model explains how we are capable of this amazing feat. We compute language meaning much the way that a calculator computes arithmetic - not by storing up large quantities of information (the answers), but by following simple rules (similar to the routines you learned in grade school to add, subtract, multiply, and divide). Furthermore, our abilities in language and thought are systematic, which means that we use the same processing methods to deal with structures that have the same form. Humans who can understand 'John loves Mary' also understand 'Mary loves John' and every other sentence of the form: Name Verb Name. Humans do not learn the meanings of sentences piecemeal. They compute meanings on the basis of a sentence's structure. The symbolic processing hypothesis would explain this fact. A calculator deals with 3x(5+7) using fundamentally the same operations as it does to calculate 7x(5+3), for it too bases the calculation on the symbolic structure in each case.
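    As a toy Python illustration (mine) of systematicity: a compositional procedure that computes the meaning of any Name-Verb-Name sentence from its structure handles 'John loves Mary' and 'Mary loves John' by exactly the same operations, with no sentence-by-sentence lookup.

        def meaning(sentence):
            # Compute a crude 'meaning' (a predicate with its arguments) from
            # the Name-Verb-Name structure, not from a stored list of answers.
            agent, verb, patient = sentence.split()
            return (verb, agent, patient)

        # The same rule handles every sentence of the form Name Verb Name.
        print(meaning("John loves Mary"))   # ('loves', 'John', 'Mary')
        print(meaning("Mary loves John"))   # ('loves', 'Mary', 'John')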

    III. The Advantages of Imagination

    1. Now let us turn to the facts of imagery. Kosslyn has championed the view that imagery is a separate cognitive ability that involves feedback to the visual system. The idea that imagery is important to cognition provides an alternative to the view that central processing is essentially symbolic. Brains might contain a special graphic processor along with a symbolic processor. What advantages would this graphic processor bring?
      a. One important idea is that visual imagery carries much more information than symbolic representations are capable of. A picture is worth a thousand words.
      b. Consulting an image makes things obvious that we would otherwise have to think out (e.g. the chair is close to the couch in figure 6.2, M p. 96).
      c. There are a number of skills such as finding things, planning errands, trying out ways of building things such as bridges, explaining continental drift, etc., where an ability to imagine the various objects, actions, and likely outcomes is extremely helpful. We can literally see in our mind's eye the things we should avoid doing when we imagine a course of events. Imagination gives us foresight. It also allows us to adapt ahead of time. For example, just by imagining a task, the athlete can train herself to improve.
      d. There is excellent evidence that much of language understanding is based on metaphors which are in turn founded on visual imagery. For example, 'top', 'up', etc. mean better, stronger (top of his game); 'down', 'bottom' mean worse, weaker (in the pits).
      e. We also know that imagery is useful in solving problems (the architect's diagrams) and improving memory. There is even some new interest in imagery in artificial intelligence research.

    IV. Experiments on Imagery

    1. Kosslyn had people imagine moving attention from one point to another on a map. The time to move attention was proportional to the distance on the map, suggesting that attention actually "moves" from point to point in an imagined space.
    2. Mental rotation experiments support the same basic idea. Subjects asked whether two images matched apparently used a mental rotation technique to solve the task, for the time it took for solution depended on the angle through which the image would have to be rotated to align the two. However, work by Pylyshyn shows that angle of rotation is not the only feature that affects speed on this task.
    3. Experiments with PET and rCBF scanning show that visual imagination and other cognitive tasks differ in that the former involve activation of visual areas of the brain. Furthermore work with brain-damaged patients shows that brain damage can selectively impair abilities at mental rotation.



    Cognitive Science Notes Weeks 13-14 (Linguistics) (CS 6.1, 6.3, 6.4)
    I. What is Linguistics?

    A. Linguistics is the study of language. The most important thing to explain is how the phonetic input (which is just a sound wave) is converted into meaning. (This is analogous to the fundamental question about vision: how we convert the visual input into a 3-D world of objects.) Other questions include how knowledge about language is represented, and how it is possible for a child to learn language.
    B. Traditionally linguists have taken a less cognitive view of their discipline. The idea is that first we need to characterize the regularities in language with a grammar: a formal theory of exactly how the meaningful units of language are formed. Since meaning of a sentence or phrase depends on grammatical structure, a grammatical account might serve as a framework for understanding how sound is converted into meaning.
    C. Prescriptive vs. Descriptive. Cognitive linguistics, where the concern is with human cognitive activity, should provide a descriptive rather than prescriptive account of human speech. A prescriptive theory would explain how we ought to speak with such rules as: A preposition is something you should never end a sentence with. A descriptive theory explains how we actually do speak: ending prepositions are common in English speech, so the grammar must accommodate that construction.
    D. Competence vs. Performance. Nevertheless, even cognitive linguists do not intend to study all the gory details of actual language use, with all its ahs and ums, false starts, sentence fragments, etc. The theory is not intended to explain our mistakes in the use of language in actual performance. The idea is that each of us has a basic linguistic competence in the language we speak. This competence is reflected in the fact that we all know that 'he loves she' is not grammatical, even though we might actually utter this sentence by mistake on occasion. So linguistics studies the fundamental rules of the various human languages (our competence), but does not try to account for violations of these rules in our everyday performance.
    E. A fundamental presupposition of computational linguistics is that competence can be represented with a set of rules called a grammar. These rules are somewhat different from the grammatical rules you learned in English class because their role is not to preach proper usage, but to reflect linguistic structure. This assumption is an attractive one, for the rule based approach seems ideally suited to explaining productivity (or unlimited ability to understand novel sentences) and systematicity (the fact that we process phrases with similar grammatical form in the same way) of language.
    F. Building a grammar for a language is a very difficult task. We are far from having a grammar for English, or any other natural language. To start to appreciate what some of the obstacles might be, consider the pattern of cases where 'that' is optional rather than required: 'The pain (that) I feel is unpleasant' (optional 'that'); 'The dog that bit me has rabies' (non-optional 'that'). Everybody who knows English knows this, but what is the rule here? This is just one of thousands of regularities of English that need to be explained by a grammar.
    G. Another fundamental question is language learning. Every normal human learns a language effortlessly. After a certain age this ability to acquire language is lost, and learning a new language takes great effort. What is the mechanism that makes language learning possible? This question seems especially pressing when we consider how complicated the task of learning language is. (A person with an average size vocabulary has learned 5-10 words a day during childhood, and has mastered scores and scores of complicated rules such as when 'that' is optional.) Many linguists believe that the data that a child has to go on carries far too little information for the child to learn language from scratch. Some special mechanism, part of our genetic endowment, must exist that explains our spectacular performance in language learning.
    H. This suggests that there must be linguistic universals, that is regularities common to all languages that the child can bank on to make the right guesses about the structure of the language she is learning.
    I. The process of language understanding can be divided into 3 levels. At the phonological level the brain extracts words (morphemes) from the pattern of sound at the ears. At the syntactic level, the sequence of words is analysed into a grammatical structure. At the semantical level, rules are used to convert the syntactic structure into the meaning. Unfortunately we won't have time to cover phonology. Since the semantical theory is not as well understood, we will concentrate our study of linguistics on the syntactic level.
    II. Syntax
    A. The same sentence can have more than one syntactic structure: Time flies like an arrow. They talked over the noise. The differences in structure can be revealed with a constituency test. 'They talked the noise over' has one meaning: they talked-over (discussed) the noise. This is explained by a rule that allows a verb-particle cluster to move the particle to the end of the sentence: John called-up the candidate -> John called the candidate up. But you can't do this with prepositions. John called up the stairs -> *John called the stairs up. So this reveals that there are two readings of 'They talked over the noise', one where 'over' is a particle (they talked-over, i.e. discussed, the noise), the other where it is a preposition (they talked over, i.e. above, the noise).
    B. How can we account for such regularities with a grammar (set of rules)? Almost all theories of syntax appeal at some point to the notion of a phrase structure grammar. Here is a simple phrase structure grammar for a fragment of English:
    S -> NP TENSE VP (Sentence -> NounPhrase, TENSEmarker, VerbPhrase)
    NP -> DET N (NounPhrase -> DETerminer, Noun)
    TENSE -> {PRES, PAST} (TENSEmarker -> either PRESent or PAST)
    VP -> V AP (VerbPhrase -> Verb, AdjectivePhrase)
    AP -> A (Adjective Phrase -> Adjective)
    Rules such as these may be used to create Phrase Markers into which words can be inserted to create sentences. (See CS p. 245, p. 247). Class exercise. What else would we need to add to accommodate the facts of English structure? Some answers: relative clauses, prepositional phrases, ...
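    To see how rules like these generate sentences, here is a Python sketch (mine) of a random generator for a toy grammar in this style; the vocabulary and the relative clause rule are my additions, and TENSE is folded into the verb for brevity.

        import random

        # A toy phrase structure grammar, loosely modeled on the fragment above.
        grammar = {
            "S":     [["NP", "VP"]],
            "NP":    [["DET", "N"], ["DET", "N", "RELCL"]],
            "RELCL": [["who", "VP"]],
            "VP":    [["V"], ["V", "NP"]],
            "DET":   [["the"]],
            "N":     [["man"], ["booze"]],
            "V":     [["drank"], ["visited"]],
        }

        def generate(symbol):
            # Expand a symbol with a randomly chosen rewrite rule; symbols
            # with no rule of their own are words and are returned as they stand.
            if symbol not in grammar:
                return [symbol]
            return [w for part in random.choice(grammar[symbol])
                      for w in generate(part)]

        print(" ".join(generate("S")))   # e.g. "the man who visited Sally drank"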
    C. Linguists presume that information on individual words is stored in memory in what is called the lexicon. The lexicon contains information about each word's pronunciation, meaning, and grammatical features. For example, the entry for 'eat' would say this is a verb that takes a direct object, and the one for 'dine' says this is a verb that cannot take a direct object. The entry for 'rice' would say that this is a mass noun, and the one for 'steak' that this is a count noun. This information can then be used to check whether combinations of words that form phrases and sentences are legal or not. 'John eats steak' is legal since 'eat' takes a direct object and 'steak' is a noun that can fill that role. But 'John dines steak' is illegal since 'dine' takes no direct object. 'John ate two steaks' is legal because 'steaks' is a count noun, and 'two' is a counting adjective. 'John ate two rices' is illegal since 'rice' is a mass noun and does not take the plural or combine with counting adjectives.
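    A minimal Python sketch (mine; the feature names are invented) of how lexical entries can be used to check such combinations:

        # Toy lexical entries: grammatical features only.
        lexicon = {
            "eat":   {"cat": "V", "takes_object": True},
            "dine":  {"cat": "V", "takes_object": False},
            "steak": {"cat": "N", "count": True},
            "rice":  {"cat": "N", "count": False},
        }

        def legal_vp(verb, obj=None):
            # A verb phrase is legal only if the presence or absence of a direct
            # object matches what the verb's lexical entry allows.
            return lexicon[verb]["takes_object"] == (obj is not None)

        def legal_plural(noun):
            # Only count nouns take the plural (or counting adjectives).
            return lexicon[noun]["count"]

        print(legal_vp("eat", "steak"), legal_vp("dine", "steak"))  # True False
        print(legal_plural("steak"), legal_plural("rice"))          # True False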
    D. The theory of transformations was a central feature of linguistics in the 60s and 70s. Ideas in the theory are still incorporated in some way or another in all theories today. The basic idea is that phrase markers generated from a phrase structure grammar are then modified by a set of transformations. Examples include transformations that do the work of generating questions or converting active voice into passive voice. For example, to get a question from the marker S(NP(DET N) TENSE(PRES) VP(V NP)), we insert the question pronoun 'what' into the last NP and swap it with the first NP: S(NP('what') TENSE(PRES) VP(V NP(DET N))), as in 'What is the problem?'. A similar transformation takes us from the marker for 'John loves Mary' to the marker for 'Mary is loved by John'. The question transformation also explains the form of the sentence 'I know who Bill insulted'. It is derived from a phrase marker for the sentence 'I know Bill insulted who', which is then transformed. Transformations also explain how sentences like 'Moe and Curly added salt' are derived by deletion from markers for the sentence 'Moe added salt and Curly added salt.'
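    A sketch (mine) of how such a transformation might be implemented on phrase markers represented as nested tuples, following the wh-question example above:

        def wh_question(marker):
            # S(NP TENSE VP(V NP)) -> S(NP('what') TENSE VP(V NP)): put 'what'
            # in the subject slot and move the old subject NP into the VP.
            s, subj, tense, (vp, v, obj) = marker
            return (s, ("NP", "what"), tense, (vp, v, subj))

        marker = ("S", ("NP", "DET", "N"), ("TENSE", "PRES"),
                  ("VP", ("V",), ("NP",)))
        print(wh_question(marker))
        # ('S', ('NP', 'what'), ('TENSE', 'PRES'), ('VP', ('V',), ('NP', 'DET', 'N')))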
    E. To illustrate how a linguistic theory might be applied let us examine a regularity concerning case. When a noun or pronoun appears in the subject of a sentence we say it is in the nominative case. When it is in the object of a verb, it is in the accusative case. In English pronouns are marked for nominative and accusative cases. We say 'He ran' (not 'Him ran') because 'he' is the nominative case pronoun. We say 'Mary loves him' (not 'Mary loves he') because 'him' is the accusative case pronoun. We would like to explain the following puzzling phenomena. (The star: * indicates a sentence that is not well formed in English.)
    (1) I believe she is a spy
    (2) *I believe her is a spy
    (3) *I believe she to be a spy
    (4) I believe her to be a spy
    (5) *It was believed she to be a spy
    (6) *It was believed her to be a spy
    (7) *It was believed Brenda to be a spy
    This phenomenon may be explained with some simple ideas: nominative case (she) is assigned by TENSE; the verb 'to be' is tenseless; and every NP must be assigned some case. Why is it 'she' in (1), not 'her'? Because present TENSE assigns nominative case. Why is it 'her' in (4), not 'she'? Well, there is no TENSE in 'to be', so nominative is not assigned. Instead, 'believe' takes accusative case in its second argument ('John believes her'), so the same case gets set by default in 'John believes her to be a spy'. Why do neither (5) nor (6) work? First, (5) would be wrong because there is no TENSE to assign nominative case. For (6), note that 'It was believed' does not take a direct object: *'It was believed Mary'. So the explanation for why (4) worked won't work here. Sentence (7) is harder to explain, because if you substitute 'Brenda' for the pronoun in (1)-(4) they work. The reason 'Brenda' doesn't work in (7) is that 'Brenda' must be assigned some case in every good sentence in which it appears, even though the case is not marked explicitly in the word 'Brenda'. Since neither nominative (5) nor accusative (6) is a possible assignment in (7), the sentence does not work.
    III. Universals
    A. The Case for and Against Linguistic Universals



    Cognitive Science Notes Week 14 (Linguistics) (After a talk by Justin Leiber)
    I. How Did Linguistics Get Started?

    A. Language is so ubiquitous, it drops out of view. Therefore we believe that it is somehow easy or obvious. When (say) the French encounter people who speak another language, there is an irresistible tendency to think their failure to speak "properly" betrays an inability to reason. (Why, after all, would they be so perverse as to call everything by the wrong name?)
    B. Study of language prior to the 1950s tended to be diachronic or historically oriented. The concern was with how languages developed. There was also a concern with written text. And the concern was prescriptive: this is how language ought to be used, rather than descriptive: here is how people actually speak.
    C. So linguistics was ripe for correction. In the 1920s, there arose new interest in non-European languages and cultures. Cultural anthropology is born. There is new emphasis on linguistics as a synchronic study, a study of how a language actually works at a given moment in time. There is a move away from prescriptive linguistic study and toward a descriptive account of how people actually talk and how this integrates with the rest of their social doings.
    II. Structural Linguistics
    This was the school (1920-1955) that grew out of these new trends. The two major works in the movement were Bloomfield's Language, and Zellig Harris' Methods of Structural Linguistics. Some features of the school:

    1. Strict behaviorism. We are not supposed to ask people what they mean. Just look at verbal behavior. (Bloomfield does talk a bit about meaning, but he characterizes phrase meaning as the influence a phrase has on others' behavior when uttered.)
    2. Interest in languages outside Europe, especially American Indian languages.
    3. Spoken language is emphasized over text. Language is seen as a stream of sound.
    4. To understand a language you need to understand a corpus (set of utterances) used in a culture. Linguistics is the study of "sonic behavior".
    5. All higher level structure in language is determined by phonetic features (bottom up), so for example...
    6. Words are merely the collecting together of sounds that often are uttered together.
    7. Harris was interested in discourse analysis (the analysis of whole groups of sentences). Another influential idea of his was the idea that sentences could be reduced to normal or kernel form. For example, passive sentences are converted to active, and relative clauses into separate sentences: 'The visible world was created by invisible God.' ==> 'The world is visible'. 'God is invisible'. 'God created the world'. This inspired the theory of transformations.

    III. Chomsky's Work
    A. Chomsky has had an immense influence. He is the most cited living author. Chomsky proofread Harris' book, and with the support of Nelson Goodman went off to Harvard. There he created a massive work, The Logical Structure of Linguistic Theory, which laid the groundwork for a whole new approach to linguistics. A piece of this called Syntactic Structures (1957) had a strong influence on linguistics, and can be thought of as a major influence on the birth of cognitive science.
    B. Chomsky borrowed ideas from the theory of formal systems (logic), and applied them to language. In such a formal theory we define what we mean by a well-formed formula (wff) using a set of recursive rules. Like this: p, q, r are wffs. If X is a wff, then so is ~X. If X and Y are both wffs, then so are (X&Y), (X->Y), (XvY), and (X<->Y). These allow the creation of a potentially infinite class of wffs: p, q, ~p, (~p&q), ~(~p&q), (~p&q)->~p, etc... Chomsky's idea was that the same was true of language. There is a set of rules that can be used to generate all (and only) the well-formed sentences of English.
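    These recursive rules are easy to turn into a program. Here is a Python sketch (mine) that builds every wff up to a given nesting depth:

        def wffs(depth):
            # Depth 0 gives the atomic formulas; each further level applies ~
            # and the binary connectives to everything built so far.
            if depth == 0:
                return {"p", "q", "r"}
            smaller = wffs(depth - 1)
            result = set(smaller)
            for x in smaller:
                result.add("~" + x)
                for y in smaller:
                    for op in ("&", "->", "v", "<->"):
                        result.add("(" + x + op + y + ")")
            return result

        print(len(wffs(1)))   # already 42 formulas; the class grows without bound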
    C. Chomsky also borrowed the notion of a transformation from Harris, an idea also inspired by the logical notion of applying rules to a set of sentences to create good reasoning. One such rule is Modus Ponens: from X and X->Y, deduce Y. Correct logical rules have the feature that they preserve truth. (If the premises are true then so must be the conclusion.) This logical idea is the analogue of the Katz-Postal Hypothesis that transformations preserve meaning.
    D. On Chomsky's view the goal of linguistics is to produce a grammar, i.e. a set of recursive rules that generate the whole host of possible sentences of a language. Language is productive and it is essential to capture this fact in linguistics. The linguist's role is not to study a corpus (set) of actual utterances. Instead the linguist studies human competence to produce an unlimited variety of well formed sentences.
    E. So methodology in linguistics may be very different from methodology in psychology, where we must do experiments on many subjects. In linguistics, the linguist can use her own linguistic competence to make judgments about what is or is not a sentence of her language. Since this competence is presumably shared, there is no need to construct experiments across different speakers. (This methodology makes psychologists very uneasy.)
    F. The grammar is what the baby learns when it learns language, so the linguist's job is to learn explicitly what is learned unconsciously by the child.
    G. By studying grammars we study the rules that children acquire when they learn language. Since the language heard by children does not contain enough clues to determine the grammar, there must be some innate mechanism that helps the child learn the linguistic rules.
    H. The goal is to provide formal theories of language at three possible levels (the higher the better). 1. Observational Adequacy: (Weak generative capacity) Rules that generate all and only the sentences of the language. 2. Descriptive Adequacy: (Strong generative capacity) Rules that also assign the right structures to those sentences. 3. Explanatory Adequacy: A theory that explains how the child could select the right grammar from the data available to her.

    IV. The Hierarchy of Grammars
    A. Finite State Machines. These are devices that merely move from one box or node to another, making a selection from each box and continuing along any arrow from a box. Chomsky showed that no finite state machine can generate all and only strings of the form: aaaa...bbbb... where the number of as and bs is the same. But language has structures with similar complexity, notably wherever there are rules of agreement (say) between noun phrase and verb phrase: 'John loves Mary', but 'Men love Mary'.
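    Here is a quick Python sketch (mine) of the point: recognizing strings of the form a^n b^n requires an unbounded count, which is precisely the memory that a machine with a fixed, finite set of states lacks.

        def matched_ab(s):
            # Accept exactly the strings a^n b^n. Since n can be arbitrarily
            # large, no fixed finite set of states can keep track of it.
            n = s.count("a")
            return s == "a" * n + "b" * n

        print(matched_ab("aaabbb"), matched_ab("aabbb"))   # True False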
    B. Phrase Structure Grammars. Phrase structure grammars allow the introduction of rewrite rules with variables referring to grammatical types. This vastly improves the power of the grammar to account for the structure of language. Rules of language must be expressed not by how one transitions from one 'box' to another, but by the understanding of grammatical categories: NP, VP, Auxiliary Verb, etc. How else did Justin's daughter generalize to the sentence 'I am going, am'nt I?' She "knew" the rule that tag negation works with auxiliary verbs ('I have it, haven't I', but not regular verbs: '*I walk, walkn't I'), so this construction makes sense. This kind of over-generalization is a basic feature of language learning: children learn irregular plurals like 'mice', but as they become more familiar with the regular plural rule, they regress to mistakes like 'mouses'. Click experiments also suggest that we are aware of the phrase structure of sentences that we hear, for the time at which clicks are heard subjectively moves toward the major grammatical boundaries.
    C. Transformational Grammars. These introduce yet another innovation: rules that transform phrase structures into alternative forms. Transformations provide especially economical explanations for the formation of questions and passive voice, and also account for deletions ('John and Mary like Jill' instead of 'John likes Jill and Mary likes Jill'), which may serve as a kind of chunking that helps overcome the 7-plus-or-minus-2 constraint on short term memory.



    Cognitive Science Notes Weeks 14-15 (Philosophy) (M Ch. 9; CS 8.3 but not pp. 355-362)
    I. The Mind Body Problem
    A. What is the Mind? How is the Mind related to the Brain? Some possible answers:

    II. Some Problems for Functionalism and Nearby Brands of Materialism
    A. Emotions. It does not seem that emotions could be computational states of anything. Suppose a computer-robot were to mimic my brain's computational states exactly. Still it wouldn't feel anything. There seems to be an important divide between thought and rationality on the one hand and the emotions on the other. Perhaps functionalism is a good theory of the former, but it seems much less able to account for the latter. Here are some responses to the challenge.

    Further Reading on Consciousness and Qualia

    Chalmers, D. The Conscious Mind (Presents a dualist theory. Has a good bibliography)
    Chalmers, D. "The Puzzle of Conscious Experience," Scientific American 273, (1995) pp. 80-86 (A very accesible review of Chalmer's dualism)
    Churchland, Paul "The Rediscovery of Light" Journal of Philosophy , 1995
    Churchland, Paul and Patricia "Could a Machine Think?" Scientific American (Jan. 1990) pp. 32-37
    Crick, F. and Koch, C. "Towards a Neurobiological Theory of Consciousness," Seminars in the Neurosciences 2 (1990) pp. 263-275 (A neurophysiological view: consciousness is a certain synchronized activity of neural bundles oscillating at about 40 hertz)
    Damasio, A. Descartes' Error (Argues that emotion is central to cognition)
    Dennett, D. Consciousness Explained (An instrumentalist account of consciousness. Fun reading but challenging nonetheless.)
    Searle, J. "Is the Brain's Mind a Computer Program?" Scientific American (Jan. 1990) pp. 26-31 (An easy entry to some of the arguments against functional accounts of qualia)



    Cognitive Science Study Questions Final

    Part I
    Cognitive Psychology - Imagination (CS 2.1-2.3; M Ch. 6; CS 2.7)
    Linguistics (CS 6.1, 6.3, 6.4)
    Philosophy of Mind (M Ch. 9; CS 8.3 but not pp. 355-362)

    Identify Terms
    John Locke, William James, Principles of Psychology, Behaviorism, Central Systems, Motor Systems, Sensory Systems, The Imagery Debate, Turing Machine, von Neumann architectures, Productivity, Systematicity, Symbolic Processing, Kosslyn, Descriptive vs. Prescriptive Linguistics, Performance vs. Competence, Grammar, Finite State Machine, Phrase Structure Grammar, Transformational Grammar, Universal Grammar, Phonology, Phoneme, Morpheme, Syntax, Semantics, Discourse Analysis, Constituency Test, DETerminer, Lexicon, Mass vs. Count Noun, Active vs. Passive Voice, Nominative vs. Accusative Case, Principles and Parameters, Binding Theory, Bounding Theory, X-bar theory, Agent vs. Patient, Structural Linguistics, Bloomfield, Zellig Harris, Normal Form, Kernel Form, Chomsky, Syntactic Structures, Well-Formed Formula (wff), Katz-Postal Hypothesis, Corpus, Observational Adequacy, Weak Generative Capacity, Descriptive Adequacy, Strong Generative Capacity, Explanatory Adequacy, The Mind-Body Problem, Materialism, Physicalism, Naturalism, Reductive Materialism, Identity Theory, Functionalism, Eliminative materialism, Instrumentalism, Damasio, amygdala, dopamine, serotonin, qualia, quale, Absent Qualia Objection, Chinese Nation Objection, thalamus, argument from lack of imagination, zombies, ersatz qualia

    Questions
    Cognitive Psychology
    1. Give an account of the shifting attitudes towards imagination that have developed in the last 100 years in cognitive psychology. Give the names of major influential figures, and explain the cultural events that affected these intellectual trends.
    2. Why was research on imagination relatively unpopular in cognitive psychology until recently? What has encouraged new interest in imagination?
    3. Give reasons for thinking that a complete cognitive science should study imagery.
    4. What evidence do we have that the ability to imagine is important to thought and reasoning?
    5. What happens to a man blind from birth due to cloudy corneas (but whose eyes are otherwise normal) when his sight is restored? Explain your answer.
    6. Cite evidence for each side in the imagery debate. A good answer will explain what the debate is about and how the evidence is relevant.
    7. "The classical theory is that cognitive science's job is to discover what is common to the human cognitive architecture." Explain in detail.
    8. According to the classical theory what are the main divisions in the human cognitive architecture? Where in this architecture is symbolic processing carried out?
    9. Cite reasons for thinking that the brain depends on symbolic processing.
    10. How do the phenomena of productivity and systematicity support the symbolic processing approach to cognitive science?
    11. What advantages do images have over symbolic forms of representation?
    12. Describe two of Kosslyn's experiments on imagery and explain their significance.

    Linguistics
    1. What is the fundamental question to be answered by linguistics?
    2. Explain the difference between prescriptive and descriptive linguistics. How would you classify linguistics today and why?
    3. What is the distinction between competence and performance theories in linguistics? Explain how the decision to create a competence theory affects methodology in linguistics.
    4. What is a grammar and why have grammars been important to linguistics in the last 40 years?
    5. Why is it so hard to provide a grammar for English? Cite examples to make your case.
    6. Many linguists think that the human brain contains a special device especially dedicated to language. What evidence is there to support this view?
    7. What are linguistic universals? Why are they important in theories of language learning?
    8. Use an example to explain how a constituency test may be used to reveal the linguistic structure of a sentence.
    9. Create a phrase structure grammar that is capable of generating the following sentences (and sentences like them). Sally drank booze. The man drank. The man who visited Sally drank booze.
    10. What is the lexicon? Give an example of how it plays a role in determining which sentences are grammatically well-formed.
    11. Explain what the theory of transformations says and give examples of some phenomena it explains.
    12. Give a linguistic theory that explains the following data (* means the sentence is ungrammatical):
    (1) I believe she is a spy (2) *I believe her is a spy (3) *I believe she to be a spy (4) I believe her to be a spy (5) *It was believed her to be a spy (6) *It was believed Brenda to be a spy.
    13. What are linguistic universals? Give some examples. Why do linguists tend to believe in them?
    14. What is Principles and Parameters Theory? Illustrate with an example or two.
    15. What are binding theory, theta theory and X-bar theory about?
    16. Linguistics prior to 1950 tended to be diachronic rather than synchronic. Explain with examples.
    17. How did the birth of cultural anthropology in the 1920s influence the development of linguistics?
    18. Describe the structural linguistics movement. Name its most influential authors, and explain the fundamental assumptions made by this school.
    19. How was Chomsky's work different from work done by structural linguists?
    20. How did ideas drawn from logic and the theory of formal systems influence Chomsky's ideas about how to proceed in linguistics?
    21. What is the goal of linguistics according to Chomsky?
    22. Chomsky felt that the linguist's role is not to study a corpus (set) of actual utterances. Then what is the linguist's role, and how does it affect the linguist's methods?
    23. Why is learning so important to Chomsky's theory of language?
    24. Why are finite state grammars inadequate for explaining language? How does the use of phrase structure grammars improve matters?
    25. What evidence do we have that the human brain actually processes a phrase structure grammar?
    26. Name an advantage that transformational grammars might provide for human language processing.

    Philosophy of Mind
    1. What is the Mind-Body Problem? Why is it so difficult?
    2. Give at least 4 different positions one might take in the philosophy of mind and explain how they differ from each other.
    3. Why do you think that most cognitive scientists who have a worked out philosophical position tend to be functionalists?
    4. Why do emotions challenge the functionalist account of the mind? Explain a number of different strategies one might take to meet the challenge.
    5. Give the basic outlines of a functionalist theory of the emotions.
    6. Give the basic outline of a psychophysical account of the emotions.
    7. What are qualia? Why is accounting for qualia such a difficult problem for functionalist theories of the mind?
    8. Describe at least two different strategies for providing a scientific account of qualia.
    9. What is the absent qualia objection to functionalism? Explain with reference to zombies and the Chinese Nation. What responses to the problem of absent qualia have been explored?
    10. "I just cannot imagine how the activity of neurons could amount to conscious experiences." Give reasons for an against the idea that this counts as a good objection to functionalist theories of consciousness.

    Part II The Whole Course
    (I will select items from the study questions for quizzes 1 and 2. I will also include a section where I will ask more general questions such as these:)
    Our course in cognitive science has been drawn from the following academic areas: logic, artificial intelligence, neuroscience, neuropsychology (especially vision), cognitive psychology (especially concepts and imagery), connectionism, linguistics, and the philosophy of mind. Select as many of these areas as are relevant and explain how research in these disciplines throws light on the following questions:

    1. Whether top-down or bottom-up theories of cognition are more apt.
    2. What is the nature of reasoning?
    3. Whether a symbolic processing model of cognition is adequate.
    4. What are concepts?
    5. What is consciousness?
    6. How do we understand language?
    7. How are objects recognized?
    8. What are perceptions, for example, the sensation of seeing the color red?
    9. What features of cognition are innate?
    10. How does the brain learn?
    11. What is human memory?