Nina's Research Interests
{This page is still being written}

My primary interest is in how people produce language. The average person's spoken vocabulary is approximately 30,000 words (not counting inflections, like plurals and verb tenses, as different words). The question is: how do we select so quickly and (for the most part) accurately from among all of these words when we are speaking?

I have investigated this question using various methods and subject populations:

My dissertation (and one of the papers I am currently writing) involved a detailed investigation of both word substitution errors and tip-of-the-tongue (TOT) states. The purpose was to gain a better understanding of how word forms are organized in our brains in such a way that we are able to select them when we want to speak.

Word Substitution errors

A word substitution error occurs when a person intends to say a particular word and accidentally inserts another word in its place. For example:

"I got whipped cream on my mushroom"
  instead of:
"I got whipped cream on my mustache"
This type of word substitution error is referred to as a "form-related" error because the target and error are similar in form, but not in meaning. There are also such errors that overlap with their target in meaning, but not in form. Click here to see a description of a particular example. I have done extensive investigations into a large set of these errors. Specifically, I have looked at the phonological relationships that obtain between the targets and the errors. These relationships may tell us something about the way that words are organized in our heads. So, if there were no relationship between targets and errors, we might think that words are organized randomly and if we make a mistake, we just get some other random word. In fact, there are quite strong relationships between targets and errors. They almost always share the same number of syllables and grammatical category, and they often begin with the same sound(s), as in the example above.
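The kinds of form overlap described above (same number of syllables, same initial sounds) can be made concrete with a small sketch. This is only an illustration, not the coding scheme used in the actual analyses: the function names are made up for this example, and the transcriptions are rough, hand-made approximations.

```python
def flatten(syllables):
    """Turn a list of syllables (each a list of segments) into one segment list."""
    return [segment for syllable in syllables for segment in syllable]


def shared_initial_segments(target, error):
    """Count how many segments two words share, starting from the first one."""
    count = 0
    for t, e in zip(target, error):
        if t != e:
            break
        count += 1
    return count


def form_overlap(target_syllables, error_syllables):
    """Summarize two simple measures of form overlap between target and error."""
    return {
        "same_syllable_count": len(target_syllables) == len(error_syllables),
        "shared_initial_segments": shared_initial_segments(
            flatten(target_syllables), flatten(error_syllables)
        ),
    }


# Rough, hand-made transcriptions (assumptions for illustration only).
mustache = [["M", "AH", "S"], ["T", "AE", "SH"]]
mushroom = [["M", "AH", "SH"], ["R", "UW", "M"]]

result = form_overlap(mustache, mushroom)
print(result)
```

With these transcriptions, the target and the error have the same number of syllables and share their first two segments.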

Many researchers have noticed that there is a tradeoff between semantic and phonological relationships between targets and errors. So, if there is a strong phonological relationship, there is less likely to be a semantic relationship and vice versa. Coding semantic relationships is extremely difficult. One way that researchers have dealt with this issue is to have people (usually college freshmen taking "Introduction to Psychology") rate the relationships between two words. This is actually quite a good method. However, a large portion of the errors in the set I am using were collected by linguists and psycholinguists. This means that they were often made by people with a high level of education who do a lot of reading and writing and who also have vocabularies that are specific to their field of study. As a result, most raters would be unfamiliar with many of the terms.

In order to address this issue, I first coded all of the relationships by hand and confirmed them with another rater. This proved to be more difficult than one would think and there was not always agreement. As an improvement, I wanted to get some sort of objective representation of meaning. Two sources were available at the time. One is a large project at Princeton called WordNet. This is a hand-coded, connected listing of many different types of semantic relationships. The problem with using WordNet for this purpose is that it does not code certain types of relationships that tend to occur in speech errors. Specifically, some errors bear an associative relationship to their targets, like "feet" for "shoes." These types of relationships are not currently defined in WordNet. The other option is HAL, the Hyperspace Analogue to Language. This system uses co-occurrence frequencies as a measure of relatedness. The problem with using HAL for this purpose was that although it does an excellent job, for the most part, of estimating semantic distance, it has some built-in biases. For example, here are some pairs that it thought were distantly related:

589: "cent" and "dollar"
614: "bib" and "napkin"
769: "spider" and "fly"
Here are some pairs that it thought were closely related:
174: "teach" and "take" (presumably because they both occur with the word "class" - which is itself a relatively high frequency word - very often)
138: "high" and "full" (perhaps they both frequently occur with "price")
145: "build" and "bring"
As noted, HAL relies on co-occurrences. This means that if two words both frequently occur with another word, particularly if that word is relatively high frequency, it will consider these two words to be highly related. It also considers words to be more related if they are the same part of speech, which may or may not be accurate in terms of how people think of semantic relationships. HAL does this because if two words occur in close proximity to the very frequent word "the," they come out as related. Pretty much any noun is therefore somewhat related to any other noun. The creator of HAL has noted in a few papers that it is really more a measure of syntactic relationships than of semantic ones. Still, some semantic relationships do fall out of it and it is quite interesting to see just how far it can go.
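To see how shared neighbors can make two words look related, here is a minimal sketch of a co-occurrence counter in the spirit of HAL. It is not HAL itself: the tiny corpus, the window size, the distance weighting, and the use of cosine similarity are all assumptions made for this illustration, and sentence boundaries are ignored.

```python
from collections import defaultdict
import math


def cooccurrence_vectors(tokens, window=2):
    """Build one weighted co-occurrence vector per word.

    Words that are closer together get a larger weight
    (window - distance + 1), and counts are stored symmetrically.
    """
    vectors = defaultdict(lambda: defaultdict(float))
    for i, word in enumerate(tokens):
        for distance in range(1, window + 1):
            j = i + distance
            if j >= len(tokens):
                break
            weight = window - distance + 1
            vectors[word][tokens[j]] += weight
            vectors[tokens[j]][word] += weight
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(value * v.get(key, 0.0) for key, value in u.items())
    norm_u = math.sqrt(sum(value * value for value in u.values()))
    norm_v = math.sqrt(sum(value * value for value in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# A toy corpus (an assumption for illustration only).
tokens = "they take the class we teach the class you spend a dollar".split()
vectors = cooccurrence_vectors(tokens, window=2)

teach_take = cosine(vectors["teach"], vectors["take"])
teach_dollar = cosine(vectors["teach"], vectors["dollar"])
print(teach_take, teach_dollar)
```

In this toy corpus, "teach" and "take" come out as related only because both occur near "the" and "class," while "teach" and "dollar" share no neighbors at all.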

There are several alternatives emerging now. They use a similar principle to HAL, but are more specific about semantics. They fall under the class of models using Latent Semantic Analysis, or LSA. There is a demo of one on the web, here. Also, see my bookmarks for some others.

Because we had some ideas of our own about how semantics could be represented using a computer analysis of a corpus, and because we wanted specific information contained in our corpus, we set out to develop our own model. It uses a parser to tag parts of speech and then relates members of the same grammatical category to each other by the way that they relate to words in other grammatical categories. For instance, the words "car" and "truck" would presumably frequently occur as objects with the word "drive" as the verb. Furthermore, they might both be described with similar adjectives, such as "fast" and "ugly." This reduces the problem that simple co-occurrence tabulators encounter with words being associated with high frequency function words (like "the" and "to"). This project has been halted temporarily due to a technical computer problem that we could use some help with. Go here to read about the project and about the problem that is holding us up.
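The general idea can be sketched as follows. This is only an illustration and not our actual model: the parser output is replaced by a handful of made-up (word, relation, other word) triples, and the relation labels, function names, and the cosine comparison are assumptions chosen for this example.

```python
from collections import Counter, defaultdict
import math

# Hypothetical parser output: (noun, relation, related word) triples.
triples = [
    ("car", "object_of", "drive"),
    ("truck", "object_of", "drive"),
    ("car", "modified_by", "fast"),
    ("car", "modified_by", "ugly"),
    ("truck", "modified_by", "ugly"),
    ("soup", "object_of", "eat"),
    ("soup", "modified_by", "hot"),
]


def relation_profiles(triples):
    """Count, for each word, how often it occurs in each (relation, word) context."""
    profiles = defaultdict(Counter)
    for word, relation, other in triples:
        profiles[word][(relation, other)] += 1
    return profiles


def cosine(u, v):
    """Cosine similarity between two Counter-based context profiles."""
    dot = sum(count * v[feature] for feature, count in u.items())
    norm_u = math.sqrt(sum(count * count for count in u.values()))
    norm_v = math.sqrt(sum(count * count for count in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


profiles = relation_profiles(triples)
car_truck = cosine(profiles["car"], profiles["truck"])
car_soup = cosine(profiles["car"], profiles["soup"])
print(car_truck, car_soup)
```

Because only content words in specific grammatical relations are counted, function words like "the" never enter the profiles, so they cannot make two nouns look similar.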

TOT states

A TOT state occurs when a person is attempting to say a particular word and is temporarily unable to recall how to say the word, although they usually know precisely what the word means. Everyone has experienced this awful sensation, which has been likened to being on the brink of a sneeze. I collect instances of this sort of experience, so if you have one (or many) please feel free to email me at the address at the bottom of this page.

Often when people experience TOTs, they can think of other words that are not the one they are searching for. Part of my work involves describing the relationships between these other words, or alternates, and the target words. Further, I am interested in what the information that people have available while they are in a TOT state can tell us about the normal language production process. One way to address this question is to compare the overlap between TOT target words and their alternates with the overlap between word substitution targets and the errors that are produced in their stead. I am currently writing a manuscript describing such a comparison.

Also click here for more information about current research on TOT and what it tells us about "the mind's language machinery."

Or here for another article about TOT experiences and research on them.

I have found another recent article about Deborah Burke's TOT research.

Daniel Schacter recently spoke on NPR about memory lapses, including TOT states. Click here to hear the interview.


Another part of my dissertation was an attempt to take advantage of the fact that people have TOTs more often for names than for any other type of word. This experiment involved teaching people names of "aliens" by showing pictures with a name written below them. I used aliens so that the objects would not be familiar and so that I could control phonological characteristics of the names.

Participants saw the names paired with the pictures in a "study" session. This was followed by a "test" session in which they only saw the pictures and had to try to recall the names. They went through this study/test procedure several times until they learned the names of all 18 aliens. A week later, they returned and tried to recall the names when looking at the pictures. I recorded the information they could provide when they could not recall the names of the aliens. Half of the subjects saw the items presented as names, as in "This is Hork" and the other half of the subjects saw them as kinds, as in "This is a Spagligon." We expected to find more difficulty in learning and recalling the items that were presented as names.

Several interesting findings came from this study. One was that there was no difference between the items learned as names versus as kinds. It is possible that the simple difference of the single word "a" is not sufficient. However, other studies have used this technique and found differences between the two conditions. One possible way to elicit a difference might be to present the aliens as a group for the "kinds" condition. Another interesting finding was that the types of responses people gave and the proportions of correct answers of various types were similar to what is found for words and names that people have known for a long time. This suggests that this technique is useful for eliciting TOTs in the laboratory.


I actually got my start in psycholinguistics by working in Myrna Schwartz's lab at Moss Rehabilitation Institute. Ever since then, I have been fascinated by the various forms of aphasia that can occur. The most common form is anomia, in which a person is frequently unable to recall words that they used to know. This situation is very similar to the TOT state that happens to all people, but in these cases, it happens for very familiar words. Some of my work is aimed at determining whether people in this situation have more information about the target word than they are able to express. If so, it may be possible to use this fact to help them to retrieve the word more easily.

Another type of aphasia is "deep dyslexia." This is a disorder of reading in which people are inaccurate at reading function words (like "the," "and," "to," and "of"), make many errors on inflectional affixes (like reading "cats" as "cat") and have tremendous difficulty reading nonwords (like "chint"). One interesting characteristic of this disorder is that people with it make semantic errors in reading words, e.g., reading "snow" as "winter." This is thought to occur because they are unable to make use of the normal way of reading using correspondences between letters and sounds (this inability is why they are unable to read nonwords). Therefore, they must rely on a different way of reading, by accessing the meaning of the word and then pronouncing it based on the meaning. Some of the time, this is successful, but it results in the inability to read words with less meaning (like the function words and nonwords) and also in semantically related words being produced in place of some concrete nouns.
My Master's Thesis was an investigation of three people with deep dyslexia. We were interested in determining whether their ability to read some of the function words, usually presented in list format by researchers and speech pathologists, might be improved by presenting them in sentence contexts. In all three patients, the sentence contexts did improve their reading of the function words.


A morpheme is the smallest unit of language that has meaning on its own. Traditional studies of morphology have assumed that there are at least three major types of morphemes: bases, inflections and derivations. An inflected word, like "dogs," has two morphemes - "dog" and "-s" - with the "-s" being the inflectional marker for the plural. Verb tenses are also inflections (-ed, -ing, and -s). A derived word, like "eruption," also has two morphemes - the base "erupt" plus the derivational suffix "-tion," which makes a verb into a noun.

One of the major issues with respect to morphology has been whether the morphologically complex forms have their own representations in our list of words in our heads (the lexicon), or whether they are formed by rules each time we use them. Irregularly inflected words (like "mice" and "brought") are thought to have their own unique representations (since they cannot be formed by rules); whereas regularly inflected words (like "houses" and "jumped") are represented in our lexicon as the base form and when we want to make a plural or past tense, we simply use the rule. According to Chomsky and others, this is the source of children's errors in which they overgeneralize a rule they have just learned, creating words like "bringed."

The issue becomes less clear when we seek to determine whether derived forms are represented as base plus affix or as a whole. Furthermore, many people do not agree with this traditional interpretation. One alternative is that our brains pick up on the regularities present in the language we use. Thus, the regularity between the pronunciation, spelling and meaning of the past tense "-ed" is represented by the fact that all of these interconnections will correspond. According to these types of theories, "connectionist" theories, although the morphological relationship between certain pairs of words may be the same, the strength of the relationship between them might differ depending on the semantic, orthographic and phonological similarity within the pair.

I am involved in two different projects that investigate the role of morphology in language. One looks at the relationships between targets and errors in word substitution errors (described above) and the other looks at the types of errors that can result from a particular type of aphasia (also described above), phonological dyslexia.

Laura Gonnerman and I are collaborating on an investigation of the role of morphology in speech errors. There are several questions we are interested in. One is, if someone makes an error involving movement of affixes, does the ability of the affix to move depend on its semantic transparency? For example, an actual recorded error was the following:

"Sure they could try to get other jobs, but it would be hard since the uninflation rate... uh, unemployment rate is so high."
In this case, a meaningful prefix is moving, but all sorts of other errors are possible.

The other project related to morphology that I am working on involves an investigation of two people who had brain damage that resulted in a problem with reading called "phonological dyslexia." The most prevalent symptom of this disorder is that people with it are unable to read nonwords (like "plig"). In addition, many phonological dyslexics have problems reading morphologically complex words. The two people in this study were able to read pseudosuffixed words like "corner," which has an ending that might be thought of as a morpheme ("-er") but which is not performing that function in this word (a parallel example, this time of a pseudoprefix, is the RE in "relish," which is not performing the usual function of the prefix RE-). However, these patients made more errors on words with actual suffixes, like "officer." We (Branch Coslett, Eleanor Saffran and I) are developing an account of this performance based on this and many other aspects of their reading pattern.

Bilingual and Cross-linguistic work

In collaboration with Tamar Gollan, I have investigated TOT states in bilingual Hebrew/English speakers. To see the abstract, click here. This work has been accepted for publication in the journal Bilingualism: Language and Cognition. If you are interested, you can email the first author, Dr. Gollan.

I am also beginning to collaborate with Bennett Schwartz in an investigation into the question of whether having some partial information about a word is at least part of what gives one the sensation of being in a TOT state. We intend to use bilingual subjects for this investigation to help us match words presented by the experimenter with words evoked within the subject. This project is still in the planning stages.

A third project using bilingual subjects will investigate the role of a first language on the types of syntactic constructions created by a second language speaker. Ines Anton-Mendez and I are planning an experiment that uses a speech elicitation paradigm to elicit specific types of utterances in English from native Spanish speakers.

Send me an email: at
