Garrett Mitchener's Research:
Mathematical approaches to linguistics
|
Abstract
|
|
My mathematical interests are in dynamical systems and
probability. My application is to linguistics: I'm
interested in language change, and understanding why it
occurs, how it spreads, and what this can tell us about
the human mind. Much of language change happens inside a
black box, because the mental and social processes that
drive it are not directly accessible. Often, the only
available information is a limited corpus of the old
manuscripts that happened to survive the ages.
Mathematical models and simulations can help fill in the
missing information and allow researchers to test and
improve theories about how we speak, write, learn, and
interact. Math can also help us make the most of the
limited data we have and properly connect it to theory.
On this page, I give background in linguistics and
describe the problems I hope to solve.
|
|
What is mathematical linguistics?
My research focuses on the rather unusual combination of
mathematics and linguistics. Linguistics is in many ways a
social science, involving interviews and studies of documents
for example, but it's also mathematical, in that one of the
goals of the field is to understand the brain by finding
abstract computational machines that approximate its neural
machinery. And until recently, the term mathematical
linguistics primarily meant exactly that: the study of
abstract grammars and abstract machines, for example, regular
languages, context-free languages, finite-state automata, Turing
machines, and so forth. Many of these abstract theories of
language are not actually useful for understanding human
language, and they usually wind up in theoretical computer
science rather than linguistics. But there's more.
More recently, mathematics has become increasingly useful for
other areas of linguistics. For example, speech processing is
very mathematical. Signal processing (which is based on
functional analysis) and statistical learning theory are used to
study speech, produce automated transcription of speech, and
build human-computer interfaces based on speaking rather than
typing. I've dabbled in some of this, particularly automated
transcription, and believe me, it's harder than it looks.
Getting a computer to transcribe an utterance into phonetic
symbols is extremely difficult; getting it to recognize words
from a particular language under limited circumstances is
easier.
The areas I'm most interested in are learning processes and
population dynamics, and how these relate to actual human
language, not the ideal abstractions of theoretical computer
science. These areas are related to historical linguistics,
which is the study of how languages change and what caused those
changes. For example, here's a little bit of Old English
(spoken till 1100 AD or so) from Ælfric's Homilies:
On twam þingum hæfde God þæs manes
sawle gegodod.
Here's some Middle English (spoken from about 1100 to about 1500) from the
Rule of St. Benet:
In þa dais sal we here sumþing of godes seruise.
The pronunciation, syntax, morphology and spelling of English
have all changed dramatically over the centuries. And this is
typical of all languages: For some reason, children at some
point learn a language that's just a little different from their
parents', and over the years those little changes add up into
transformations of the entire language. Changes seem to occur
because of complex interactions between usage patterns, child
and adult learning processes, and population structure. At the
moment, my main line of research is to develop mathematical
tools for understanding this change process.
Another question of interest is: Assuming an evolutionary origin
for the human species, how might language have evolved in humans
from a non-linguistic ancestor species? Initially, it seems
perfectly sensible to say that language provides tremendous
survival benefit, and to attribute its appearance within humans
to mutation and natural selection. But to be satisfied with
so brief an explanation is naive. A few thought
experiments show how much deeper the question goes: If language
is so useful, why did it evolve only once? (Many other species
communicate but, as far as we know, not with anything as complex
as human syntax.) There are also bootstrapping problems. If a
mutation enabling language appears in one individual, its
benefit can't be realized because there's no one to
talk to, so there's no selection for the mutation and
it's effectively invisible to evolution. Then there's
the complex system problem. Language is an extremely complex
system, involving structured meanings, speech production,
parsing, and modeling of the speaker's mental state. One
part alone bears no obvious benefit without the rest. For
instance, structured speech is pointless if no one can parse it,
and the ability to parse structured utterances is useless if no
one produces structured speech. However, it's
astronomically unlikely that all the parts appeared at once
through a massive mutation. In short, there's a lot left
to be explained, and that's just for the origin of
language.
We'd also like to know how evolutionary forces may have
influenced the form of language, and questions like these come
up: Is there some reason that nouns have gender in many
languages? Why do we have contractions, irregular forms, etc.?
Why is it difficult for adults to learn second languages? And
all sorts of theories can be proposed to explain these in terms
of some survival benefit, but initially they all lie in the
realm of pure speculation and “just-so stories.”
Unless there is strong evidence linking an aspect of language to
a proposed survival benefit, and strong evidence that the
proposed benefit was actually helpful to survival at some point,
there is no reason to accept that as the correct explanation.
My dissertation began to address some
of the issues surrounding language and evolution. Most of it
centers around the simplest non-trivial mathematical models of
natural selection with imperfect learning and genetic variation
in the language faculty. And despite the simplicity, the
results are strikingly complex. For example, one version of the
model results in chaotic oscillations among grammars. Other
instances show that the traditional cartoon of evolution, in
which a “better” variant of a species takes over
from its ancestor, doesn't necessarily apply to language.
A mutation that gives an individual greatly improved language
skills that happen to be incompatible with the existing language
will most likely die out because its benefit can't be
realized. Another instance of the model has a property called
accidental stability, because it shows how a mutation
can spread or die out depending on which grammar a population
chooses before the mutation appears. In short, an essentially
accidental choice determines the genetic makeup of future
generations, and any straightforward notion of
“fitness” breaks down. These models indicate
that, for most features of human language, there is probably
no explanation of the form
“Human language has feature X because it provides survival
benefit Y,” and
that we must step back and re-think what it means to have an
evolutionary explanation for something.
Why math?
So where does math come in? Historical linguistics and
biological anthropology are for the most part observational
sciences rather than experimental sciences. We don't get
to do experiments such as resetting England to its state in 200 AD,
letting history repeat itself, and seeing whether it takes the same path.
Nor can we interview people from 8000 years ago to see what
their language was like. We also don't get to resurrect
pre-human ancestor species and see if we can teach them language
(like experiments to teach gorillas sign language). That's
not to say that no experiments can ever be done, but for the
most part, investigators in these fields have to stick to
manuscripts and fossil records and just deal with the limited
data. And here's where math comes in: Mathematics can
provide models, such as the differential equations from my
dissertation, that can fill in some of the experiment gap. If
nothing else, a model of a hypothesis can improve the precision
with which it is stated, check it for consistency, and perhaps
uncover predictions that might be testable. Also, powerful
tools from probability and statistics allow us to get more and
more information from limited data (for example, the constant
rate effect in how a change spreads in different parts of a
grammar). Detailed simulations are also becoming popular, and
they have the side effect of drawing together theories from
different parts of linguistics (historical studies, abstract
theory, and child and adult learning) and trying to improve each
area through considerations from the others.
About my Middle English project
|
Terminology
|
|
syntax: Part of grammar dealing with how
words are organized into larger structures.
Ex: “John has read those books”
is thought of as a sentence built from
John[noun, 3rd person singular, agent,
nominative], has[inflection, perfect
auxiliary, present], read[verb, past
participle], those[determiner,
demonstrative], and books[noun, 3rd person,
plural, theme, accusative]. Also those books
is a phrase within the sentence as a whole.
morphology: Part of grammar dealing with
how words are assembled from morphemes. Morphemes are
stems, prefixes, suffixes, clitics and such.
Ex: “shouldn't've”
(spoken, but not written in formal English) is formed
from the stem should, the clitic
-n't derived from the negative particle
not, and the clitic 've derived
from the auxiliary have.
phonology: Part of grammar dealing with the
sound system. Ex: The English plural suffix
for nouns is pronounced [s] after voiceless sounds
(“cats”), [z] after voiced sounds
(“dogs”), and [əz]
after sibilant sounds (“foxes”).
|
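The plural rule in the phonology entry can be written as a small function. This is only a rough sketch: it assumes the [əz] form is triggered by sibilant (hissing) sounds such as the final [s] of “fox”, the sound classes are simplified ASCII labels rather than a real phonemic inventory, and the name `plural_allomorph` is invented here for illustration.

```python
# Simplified sound classes; the labels are ad hoc ASCII stand-ins, not IPA.
SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}
VOICELESS = {"p", "t", "k", "f", "th"}

def plural_allomorph(final_sound: str) -> str:
    """Return the pronunciation of the plural suffix after a stem-final sound."""
    if final_sound in SIBILANTS:
        return "əz"  # foxes, churches
    if final_sound in VOICELESS:
        return "s"   # cats, books
    return "z"       # dogs, bees: every other sound is treated as voiced

for word, final in [("cat", "t"), ("dog", "g"), ("fox", "s")]:
    print(word, "->", plural_allomorph(final))
```

The sibilant check comes first because sounds like [s] are voiceless and would otherwise wrongly receive plain [s].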
At the moment, I'm trying to understand the word order of
Middle English, and how it changed to the modern order.
I'm working closely with linguist Anthony
Kroch at the University of Pennsylvania. There are a number
of reasons for picking this change as a topic of study. First,
there is a parsed corpus of Middle English manuscripts that
provides data for testing hypotheses. This written record is
thought to reflect the spoken language fairly well, as opposed
to written Old English which seems to have become a literary
standard maintained in monasteries and divergent from spoken Old
English. (Something similar seems to have happened in the case
of written Latin, which was maintained in scientific and
religious communities long after it ceased to be spoken, and
never fully reflected the spoken Latin dialects that eventually
gave rise to Italian, Spanish, French, etc.) At any rate,
parsed corpora are simply not available yet for other languages
over long periods of time. Second, Middle English underwent
several changes that appear to be primarily syntactic: The
verb-second rule and object-verb order both changed. This means
that a model can probably be developed for understanding these
changes without including morphology and phonology. Third, the
loss of verb-second is particularly interesting, as it
didn't occur in similar languages (such as Icelandic) and
there are several proposals for what caused the change in Middle
English.
|
What is Verb-Second?
|
|
The verb-second rule causes the verb that agrees with
the subject to move to the front of the sentence, and
something else must move in front of that. That is,
it's a combination of verb fronting followed by
topic fronting. Modern German and Icelandic use
different forms of this operation. It's used in
Modern English only to form questions (“What did
you see?”) but it was used to form declarative
statements in Old and Middle English. This rule was
lost in favor of the current subject-verb-object (SVO)
word order.
|
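The description of verb-second as verb fronting followed by topic fronting can be illustrated with a toy function. This is a sketch under strong assumptions: a clause is treated as a flat list of constituents in a base subject-verb-object order with any adverbial last, and the names `verb_second`, `constituents`, `verb`, and `topic` are invented for illustration. It is not a syntactic analysis and is not taken from the planned simulation.

```python
def verb_second(constituents, verb, topic):
    """Reorder a flat clause so `topic` comes first and the finite `verb` second.

    Everything else keeps its original relative order.
    """
    rest = [c for c in constituents if c not in (verb, topic)]
    return [topic, verb] + rest

# Modern English gloss, with the adverbial chosen as topic.
base = ["we", "shall", "hear", "something", "in those days"]
print(" ".join(verb_second(base, verb="shall", topic="in those days")))
# in those days shall we hear something

# With the subject as topic, the output matches plain SVO order.
print(" ".join(verb_second(["John", "saw", "Mary"], verb="saw", topic="John")))
# John saw Mary
```

When the subject itself is the topic, the result is indistinguishable from plain subject-verb-object order, so a listener cannot tell from such a sentence alone which rule produced it.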
Specifically, these facts about Middle English seem to be keys
to why it lost verb-second but other Germanic languages did not:
- It switched to verb-object word order early on.
- It allowed adverbs to be adjoined to the beginning of a
sentence.
- It allowed pronouns to be cliticized to the left of the
fronted verb.
- There were at least two regional dialects,
northern and southern, with different forms of the
verb-second rule.
So, my current project is to put together a simulation that
includes a significant amount of linguistic realism in the hopes
that it will be able to simulate the loss of verb-second in
Middle English, while simulating Icelandic and other languages
that stably maintain verb-second.
The general formula by which the diachronic linguistics
community explains a language change is the following:
|
Terminology
|
|
diachronic: Studying the same language over
two or more time periods spanning a change.
synchronic: Studying a language across a
population during a narrow time period.
|
- Describe the language before and after the change as
specifically and cleanly as possible, using some
well-studied theory of grammar. For Middle English, this
constitutes various hypothetical descriptions of the
internal structure of sentences with and without
verb-second.
- Identify a shift in usage patterns, usually away from
triggering sentences. Children must learn the new grammar
of the language by hearing speakers of the old grammar, so
something about the old grammar must allow for children to
make that mistake. The typical explanations are that
either (1) some type of sentence that forces children to
learn the old grammar declines or (2) many sentences
become ambiguous between the two grammars. The change is
driven in part by children who hear insufficient
triggering data and select the new grammar rather than the
old one.
There's also a chicken-and-egg problem here that
remains to be resolved: Do shifts in usage patterns cause
the grammar to change, or does grammar change result in
the observed shifts in usage patterns?
As an example,
sentences consisting of just subject, verb, and object are
ambiguous between the Modern English SVO and Middle
English SVO+v2 word orders.
- Identify something that caused the change to happen once
usage patterns made the change possible. The presence of
ambiguous forms doesn't guarantee that a change will
take place. A current sticking point in the theory is that
it's not clear whether random chance is sufficient to
explain why a change took place once it became possible,
or why it took the amount of time it did once it became
possible.
The northern and southern dialects of Middle English
developed slightly different word orders, and increased
contact between them seems to have brought about a decline
in sentences that can only be parsed as the northern
variant of SVO+v2, thereby allowing more children at the
contact point to learn SVO.
- Understand how the change spread. There really ought to
be some number of “seed speakers” of any
given innovation, that is, the children who learned it by
accident, followed by children who learn from the seed
speakers and so forth. So how does a word order change
spread in a population?
In the case of Middle English, it seems to have spread out
from the boundary between the northern and southern
dialects, but the manuscript data isn't conclusive.
With any luck, the simulation will provide insight into what
specific properties of Middle English were most instrumental in
driving the loss of verb-second. It should be able to answer
questions about whether random chance is enough, given the
circumstances, or indicate that we need a further explanation of
what drove the change. The simulation also includes a system
for dealing with written as well as spoken language, so
there's the possibility of comparing the results of the
simulation directly to corpus data.
The last important point is that once we know what drove a
language change and pinpoint when the change became possible,
we're left with the question of why it happened when it
did, rather than sooner or later. Imagine for example that you
have a fair coin, and you flip it until it comes up heads ten
times in a row. You try this one day and it takes 1479 flips.
Is this surprising? Would you expect it to take more or fewer
flips on average? How big a range should you expect? Since
we can't repeat language changes, simulations can give a
partial answer to this question. If repeated runs of the
simulation show that the change tends to take one or two
centuries, then there's no surprise if this is what the
manuscript record shows. If the simulation indicates that the
change should generally take place immediately, thus
contradicting the manuscript record, then either the simulation
is flawed or the manuscript record bears a second look.
is flawed or the manuscript record bears a second look.
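The coin-flip questions above can be answered directly, which shows the kind of answer repeated simulation runs give. The sketch below uses the standard result that a fair coin needs 2^(n+1) - 2 flips on average to produce n heads in a row, and then estimates the spread by simulation. The function names, the number of trials, and the 5% to 95% range are arbitrary choices made for this illustration.

```python
import random

def expected_flips(run_length: int) -> int:
    """Exact mean number of fair-coin flips until `run_length` heads in a row."""
    return 2 ** (run_length + 1) - 2

def flips_until_run(run_length: int, rng: random.Random) -> int:
    """Simulate one trial: flip until `run_length` consecutive heads appear."""
    flips = 0
    streak = 0
    while streak < run_length:
        flips += 1
        if rng.random() < 0.5:
            streak += 1   # heads extends the current streak
        else:
            streak = 0    # tails resets it
    return flips

def summarize(run_length: int = 10, trials: int = 2000, seed: int = 0):
    """Return the sample mean and the 5th and 95th percentiles of the flip count."""
    rng = random.Random(seed)
    results = sorted(flips_until_run(run_length, rng) for _ in range(trials))
    mean = sum(results) / trials
    return mean, results[int(0.05 * trials)], results[int(0.95 * trials)]

print(expected_flips(10))  # 2046
print(summarize())
```

The exact mean for ten heads in a row is 2046 flips, so 1479 is somewhat below average. The simulated range is wide, with a spread comparable in size to the mean, so a result like 1479 should not be surprising.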
Now all I have to do is write and run the simulation...
Last modified: Wed Feb 14 13:27:51 EST 2007
Revision $Id: ResearchSummary.html,v 1.8 2004/11/16 21:56:40 wgm Exp $