This page is a presentation of my theory of phonemes, targeted at an audience of phoneticians who are familiar with the issues that arise. But I'm not an academic linguist; if you feel that's a disqualification, you can stop reading here.
We want to define phonemes as the units of the mental representation of sounds. The key word is actually units: we want to define phonemes so that they're the final discrete units in the mental process of sound production - everything downstream is continuous. It's the same distinction as between digital and analog, or between integers and real numbers. A phoneme is the last point at which we can think of a sound unit, as opposed to speech organs, features, or acoustics. (Credit to Mark Liberman for this idea.) We then want to use phonemes to spell the entries in our lexicons.
The current mainstream theory is that phonemes can be identified using two tests: minimal pairs and complementary distribution. A third test is often applied: the idea that the set of phonemes of a language is the minimal set that satisfies the other criteria. As Occam advised, "Entities must not be multiplied beyond necessity". I will refer to that theory as miniphonemes: the minimal set of minimal pairs.
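As a rough illustration of how the first two tests work, here is a minimal Python sketch. The toy lexicon, its transcriptions, and the function names are all assumptions made for this illustration; a real analysis would need a much larger lexicon and a less naive notion of "environment" than the immediately neighboring segments.

```python
from itertools import combinations

# Toy lexicon: each word is a tuple of phones in narrow transcription.
# These four entries are illustrative assumptions, not real survey data.
LEXICON = {
    "bat": ("b", "æ", "t"),
    "vat": ("v", "æ", "t"),
    "top": ("tʰ", "ɑ", "p"),
    "stop": ("s", "t", "ɑ", "p"),
}


def minimal_pairs(lexicon):
    """Return (word1, word2, phone1, phone2) for words that differ in exactly one segment."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        if len(p1) != len(p2):
            continue
        diffs = [(a, b) for a, b in zip(p1, p2) if a != b]
        if len(diffs) == 1:
            pairs.append((w1, w2, diffs[0][0], diffs[0][1]))
    return pairs


def environments(lexicon, phone):
    """Return the set of (previous, next) contexts of a phone; '#' marks a word edge."""
    envs = set()
    for phones in lexicon.values():
        padded = ("#",) + phones + ("#",)
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                envs.add((padded[i - 1], padded[i + 1]))
    return envs


def complementary(lexicon, a, b):
    """True if both phones occur in the lexicon and never share an environment."""
    env_a = environments(lexicon, a)
    env_b = environments(lexicon, b)
    return bool(env_a) and bool(env_b) and env_a.isdisjoint(env_b)
```

On this toy data, bat / vat is the only minimal pair, and [tʰ] and [t] never share an environment, so the two tests would count b and v as different miniphonemes and [tʰ] and [t] as the same one.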
But these criteria don't derive directly from the definition of phonemes, and there's no particular reason to assume they're correct. There's no shortage of difficulties with this miniphoneme theory. Here's an overview of some of them that was one of the first matches when I googled "problems with phonemes": Problems with Phonemes, Coleman 2006.
The theory I'm proposing isn't so different, but it moves the frontier between discrete and continuous - between morphology and phonology - one step further downstream, to allophones, so I'm going to call it the allophoneme theory. A traditionalist says that the complementary distribution of aspirated [tʰ] as in top and unvoiced [t] as in stop means that they're the same miniphoneme; I say that the fact that stʰop sounds wrong to native speakers and it's a substitution they never make, means that they're different allophonemes: natives think of them as different sounds, even if they're in complementary distribution. After all, English /h/ and /ng/ are in complementary distribution, and we don't think of them as allophones of the same phoneme.
Can't we just ask speakers? After all, if phonemes are mental constructs, can't we rely on speaker intuition? Well, no: it seems that the most natural unit of speech in the minds of illiterate and uneducated speakers is the syllable; breaking speech up into segments must be taught, and that's much of what we're learning when we learn to read and write. For instance, traditional Chinese phonology breaks syllables into initials and finals: the final includes the medial, the vowel, a coda consonant, and the tone, and in fact it's not trivial to split that into phonemes even now; there are multiple competing miniphonemic analyses of Standard Chinese.
Clearly, speech production and perception both reside in brains, so they must use mental entities, but these processes are normally subconscious. Happily, we can use surprise as a surrogate: if you say [ˋstʰɪl] instead of [ˋstɪɫ], are listeners surprised? I would say that they are, and that their surprise proves that listeners distinguish allophones, and thus allophonemes.
Let's talk about some examples. In English, the b sound and the v sound can distinguish otherwise identical words, like bat / vat or fiber / fiver. Those are minimal pairs; the /b/ and /v/ are different miniphonemes of English. But the b's of bull and able are the same miniphoneme, even though the first is unvoiced (devoiced after a pause) and the second is voiced. What's more, the b of bull is the same phone as the p of apple, but they're not the same miniphoneme!
In Spanish, the sounds [b] and [v] (actually [β] or [β̞]) are in complementary distribution: there are no Spanish words where you could replace one with the other and get a difference in meaning. There are word pairs like baño and vaño which are spelled differently and mean different things, but they're pronounced alike. Spanish speakers could agree to spell Balencia or Córdova (instead of Valencia or Córdoba as now spelled), and the spelling would reflect the pronunciation just as well, because b and v represent the same miniphoneme. I'm describing the phenomenon in orthographic terms, but the orthography isn't the determining factor - even illiterate Spanish speakers are said to think that b and v are the "same sound". The general rule is that this miniphoneme is pronounced as [b] after a pause or a nasal consonant, as in bomba, and as [β̞] elsewhere.
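The general rule just stated can be written as a small rewrite function. This is only a sketch under stated assumptions: the utterance is given as a list of segments, the start of the list stands for a pause, and the set of nasals and the names `NASALS` and `realize_b` are invented for the illustration.

```python
# Assumed set of nasal consonants for this sketch.
NASALS = {"m", "n", "ɲ", "ŋ"}

STOP = "b"           # realization after a pause or a nasal
APPROXIMANT = "β̞"    # realization elsewhere


def realize_b(segments):
    """Map each /b/ in one utterance to [b] or to the approximant.

    `segments` is a list of segments; the start of the list is treated
    as following a pause (an assumption of this sketch).
    """
    out = []
    for i, seg in enumerate(segments):
        if seg != "b":
            out.append(seg)                  # other segments pass through unchanged
        elif i == 0 or segments[i - 1] in NASALS:
            out.append(STOP)                 # after a pause or a nasal consonant
        else:
            out.append(APPROXIMANT)          # everywhere else
    return out
```

For example, `realize_b(["b", "o", "m", "b", "a"])` keeps both stops of bomba (one after the pause, one after the nasal), while `realize_b(["l", "a", "b", "a", "k", "a"])`, for la vaca, turns the /b/ between vowels into the approximant.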
With allophonemes, my analysis is a little different. I recognize that the b's of bound and rebound have a common origin, just as do the f of life and the v of lives, but this commonality is upstream of my allophonemes: all four are different allophonemes. In my theory, the b's of bull and able are different allophonemes, while the b of bull and the p of apple are the same allophoneme. In Spanish, the initial sounds of Barcelona and Valencia are a different allophoneme from the spelled-alike sound in hablar and haber.
Here are a few more interesting cases that I hope will make you more open-minded about allophonemes.
The English words to and do distinguish aspirated t from unvoiced (devoiced) d, the words intent and indent distinguish aspirated t from voiced d, and the pair shanty and shandy distinguishes unvoiced t from voiced d. Thus those three sounds are different phonemes in English! And yet the traditional analysis claims the three t's are one miniphoneme, and the three d's are another: only two phonemes. What, you say that devoiced and unvoiced are distinct? As in disgust versus discussed?
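The counting argument above can be checked mechanically. The sketch below is only an illustration: the phone labels, the `CONTRASTS` list, and the `SAME_SOUND` mapping (which encodes the claim that devoiced d and unvoiced t are one sound) are assumptions taken from the paragraph, not measured data.

```python
from itertools import combinations

# One contrast per word pair cited above (labels are illustrative).
CONTRASTS = [
    ("tʰ", "d̥"),  # to / do: aspirated t vs devoiced d
    ("tʰ", "d"),   # intent / indent: aspirated t vs voiced d
    ("t", "d"),    # shanty / shandy: unvoiced t vs voiced d
]

# The claim being tested: devoiced d is the same sound as unvoiced t.
SAME_SOUND = {"d̥": "t"}


def canonical(phone):
    """Collapse phones that are claimed to be the same sound."""
    return SAME_SOUND.get(phone, phone)


def contrast_graph(contrasts):
    """Return (units, edges): the distinct sounds and the pairs that contrast."""
    edges = {frozenset((canonical(a), canonical(b))) for a, b in contrasts}
    units = set()
    for edge in edges:
        units |= edge
    return units, edges


def mutually_contrasting(units, edges):
    """True if every pair of units is separated by some minimal pair."""
    return all(frozenset(pair) in edges for pair in combinations(sorted(units), 2))


UNITS, EDGES = contrast_graph(CONTRASTS)
```

Under those assumptions, `UNITS` contains three sounds and `mutually_contrasting(UNITS, EDGES)` is true, so no two of the three can be merged without losing one of the cited contrasts.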
English has three nasal phonemes - m n ng - although ng only occurs in syllable codas. But English has several more nasal allophones, including ɱ as in comfort, ɲ as in senior, and ȵ as in lunch. I don't raise those allophones to the status of allophonemes; there is room in my theory for non-phonemic "performance" allophones which are downstream of the lexicon. Some theorists question whether ng is a true miniphoneme, proposing that it is - as it's spelled - a contraction of the sequence n+g or a "performance" assimilation before k. And it's true that there don't seem to be any examples of the ng+g of finger in final position. One small advantage of the allophoneme theory is that we don't care: the allophone ng is definitely an allophoneme.
Spelling the lexicon in allophonemes offers us several other advantages. In one of Geoff Lindsey's videos, he asks whether the p in speech is the same phoneme as the p in peach. Our orthography would so suggest, and most academic analyses claim that onset p is aspirated in stressed syllables except after s, classifying it as fortis p. Lindsey suggests (and I agree) that it should be spelled sbeech, but how could we determine whether it's fortis or lenis? There's no quandary in my theory: the p in speech is the same allophoneme as the b in beach.
The same question can be asked of other mergers. In my theory, vowel reduction takes place upstream of the lexicon, in morphology, and reduced vowels are allophonemes. If reduction is phonetic, as in most miniphoneme theories, how might we know what the reduced vowel phonemes are in carat caret carrot merit?
In my dialect of General American English, short vowels tend to reduce to [ᵻ] schwi and longer low vowels to [ɐ]. But because [ɐ] is itself a short vowel (usually analyzed as [ʌ] when stressed), sometimes it reduces again to schwi, for example when Lennon is pronounced like Lenin. That second reduction is not lexical, although it reduces one allophoneme to another: carrot is lexically [ˈkɛɹɐt̚] but can be reduced to [ˈkɛɹᵻt̚], while caret is never [ˈkɛɹɐt̚].
Some foreign cases: it's possible to analyze Standard Chinese as having only two miniphonemic vowels. But native speakers consistently pronounce the eight allophones of the mid vowel differently in different contexts, so they're different allophonemes: those different pronunciations are part of the lexicon, and if you choose the wrong one - if you substitute one allophone for another - native listeners will flag it as an error or misinterpret it.
Another example from Standard Chinese: the miniphoneme analysis usually claims that Chinese has three nasal consonants: /m n ŋ/. But then it notes that /m/ is never final and /ŋ/ is never initial. In several dialects, initial /n/ can be pronounced [l], but final /n/ can never be, while final /n/ becomes a nasalization of the vowel in fast speech or erhua. So isn't it clearer to say that Chinese has two initial nasal allophonemes, /m n/, and two final allophonemes, /n ŋ/, even though initial and final /n/ are in complementary distribution? Then we can reserve the word allophone for the [l] variant and the nasalization.
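One way to picture this proposal is as an inventory keyed by syllable position, with the non-lexical variants kept in a separate table. The sketch below is only that: the names `INVENTORY`, `VARIANTS`, `is_allophoneme`, and `variants` are invented for the illustration, and the variant labels simply restate the paragraph above.

```python
# Nasal allophonemes by syllable position, as proposed above.
INVENTORY = {
    "initial": {"m", "n"},
    "final": {"n", "ŋ"},
}

# Non-lexical variants, for which the word "allophone" is reserved.
VARIANTS = {
    ("initial", "n"): {"l"},                   # in several dialects
    ("final", "n"): {"vowel nasalization"},    # in fast speech or erhua
}


def is_allophoneme(position, nasal):
    """True if the nasal belongs to the inventory for that syllable position."""
    return nasal in INVENTORY[position]


def variants(position, nasal):
    """Return the non-lexical variants of a nasal allophoneme (possibly none)."""
    if not is_allophoneme(position, nasal):
        raise ValueError(f"{nasal!r} is not a {position} nasal in this sketch")
    return VARIANTS.get((position, nasal), set())
```

Here initial n and final n are separate keys, so each carries its own variants, and asking for final m or initial ŋ raises an error instead of needing a distributional footnote.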
© 2002-2025 The Musa Academy | musa@musa.bet | 05apr25 |