In spoken language, a phoneme is a basic, theoretical unit of sound that can distinguish words: changing one phoneme in a word produces a different word with a different meaning. The phoneme is a basic sound segment whose linguistic function is to distinguish one morpheme from another. Phonemes are not physical sounds but abstractions. Each phoneme is recognized as a family of phones that are regarded as a single sound and represented by a common symbol. The phones that realize a phoneme in actual speech, each with its own phonetic qualities, are known as allophones. The phoneme is thus the basic sound unit revealed by phonemic analysis. Phonemic symbols are enclosed within slashes (/ /). A succinct way to describe a phoneme is as the smallest difference between words that results in a difference in meaning. For example, the English words cat and rat each have three phonemes, represented by IPA letters as /kæt/ and /ræt/. A pair of words that are identical except for a single phoneme is known as a minimal pair.
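The minimal-pair test can be sketched in a few lines of Python. This is only an illustration: the function name is_minimal_pair is hypothetical, and the transcriptions /kæt/ and /ræt/ for cat and rat are assumed broad transcriptions, written here as lists of phoneme symbols.

```python
def is_minimal_pair(a, b):
    """Return True if two phoneme sequences differ in exactly one position.

    Each word is a list of phoneme symbols (an assumed representation),
    so that multi-character symbols count as a single phoneme.
    """
    if len(a) != len(b):
        return False
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences == 1


# Assumed broad transcriptions of the example words.
cat = ["k", "æ", "t"]
rat = ["r", "æ", "t"]
cats = ["k", "æ", "t", "s"]

print(is_minimal_pair(cat, rat))   # the words differ only in the first phoneme
print(is_minimal_pair(cat, cats))  # different lengths, so not a minimal pair
```

Representing words as lists rather than strings keeps one list element per phoneme, which matters for phonemes written with more than one character.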

A phoneme may represent, as a single category, several phonetically similar or phonologically related sounds. The relationship may not be phonetically obvious, which is one of the problems with this conceptual scheme.


Background and related ideas

Phonemics, a branch of phonology, is the study of the system of phonemes of a language.

The phoneme is a structuralist abstraction that was introduced by the Polish-Russian linguist Jan Niecisław Baudouin de Courtenay (1845-1929) and elaborated in the works of Nikolai Trubetzkoy (1890-1938). It was later adapted to and formally psychologized in generative linguistics (after Chomsky and Halle). Rather than a basic mental unit of language, however, it may well be a perceptual artifact of alphabetic literacy (see the terms Phonemic awareness and Phonological awareness). If not that, it may be an epiphenomenal aspect of listening that is removed from face-to-face encounters, that is, of text-like listening (see phone and feature).

The exact number of phonemes in English depends on the speaker and the method of determining phoneme vs. allophone, but estimates typically range from 40 to 45, which is above average across all languages. Pirahã has only 10, while !Xũ has 141.

Depending on the language and the alphabet used, a phoneme may be written consistently with one letter; however, there are many exceptions to this rule (see Writing systems below).

Some languages use pitch to distinguish words in the same way that phonemes do. In this case, the tones used are called tonemes. Some languages distinguish words made up of the same phonemes (and tonemes) by using different durations of some elements, which are called chronemes. The equivalents of phonemes in sign languages are called cheremes.


The common notation used in linguistics employs slashes (/ /) around the symbol that stands for the phoneme. For example, the phoneme for the initial consonant sound in the word "phoneme" would be written as /f/. In other words, the graphemes are <ph>, but this digraph represents one sound, /f/. Allophones, the real speech variants of a phoneme, are often denoted by adding diacritical or other marks to the phoneme symbol and placing the result in square brackets ([ ]) to differentiate it from the phoneme in slashes (/ /). The conventions of orthography are kept separate from both phonemes and allophones by using the markers < > to enclose the spelling.

The symbols of the International Phonetic Alphabet (IPA) and extended sets adapted to a particular language are often used by linguists to write phonemes, on the principle that one symbol equals one categorical sound. Because of problems displaying some symbols in the early days of the Internet, systems such as X-SAMPA and Kirshenbaum were developed to represent IPA symbols in plain text. As of 2004, any modern web browser can display IPA symbols (as long as the operating system provides the appropriate fonts), and we use this system in this article.
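As an illustration of how a plain-text scheme stands in for IPA, the following Python sketch converts a handful of X-SAMPA symbols to their IPA equivalents. The table is a deliberately small subset, the function name xsampa_to_ipa is hypothetical, and multi-character X-SAMPA symbols are not handled, so this is a sketch rather than a complete converter.

```python
# A small, incomplete subset of X-SAMPA-to-IPA correspondences.
# Only single-character X-SAMPA symbols are covered here.
XSAMPA_TO_IPA = {
    "S": "ʃ",  # as in "ship"
    "Z": "ʒ",  # as in "vision"
    "N": "ŋ",  # as in "sing"
    "T": "θ",  # as in "thin"
    "D": "ð",  # as in "this"
    "{": "æ",  # as in "cat"
    "@": "ə",  # schwa
    "I": "ɪ",  # as in "sit"
}


def xsampa_to_ipa(text):
    """Replace each known X-SAMPA character; pass anything else through."""
    return "".join(XSAMPA_TO_IPA.get(ch, ch) for ch in text)


print(xsampa_to_ipa("k{t"))  # plain-text form of "cat"
print(xsampa_to_ipa("sIN"))  # plain-text form of "sing"
```

Characters such as k, s, and t are the same in both systems, which is why unknown characters are simply passed through unchanged.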


Examples of phonemes in the English language include sounds from the set of English consonants, such as /p/ and /b/. These two are most often written consistently with one letter for each sound. However, phonemes may be less apparent in written English when they are typically represented by combined letters, called digraphs, such as <sh> (pronounced /ʃ/) or <ch> (pronounced /tʃ/).

To see a list of the phonemes in the English language, see English Phonemes.

Two sounds that may be allophones (sound variants belonging to the same phoneme) in one language may belong to separate phonemes in another language or dialect. In English, for example, /p/ has aspirated and non-aspirated allophones: aspirated [pʰ] as in pin, and non-aspirated [p] as in spin. However, in many languages (e.g. Chinese), aspirated /pʰ/ is a phoneme distinct from unaspirated /p/. As another example, there is no distinction between /r/ and /l/ in Japanese; there is only one /r/ phoneme, although it has allophones that sound more like an /l/ or a /d/ to English speakers. The sounds /s/ and /z/ are distinct phonemes in English, but allophones in Spanish. Likewise, /n/ (as in run) and /ŋ/ (as in rung) are phonemes in English, but allophones in Italian and Spanish.
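The English aspiration example can be sketched as a rule that maps phonemes to phones. The Python below assumes the phoneme in question is /p/, realized as aspirated [pʰ] in pin and plain [p] in spin, and it uses a toy rule (aspirate only at the start of the word) that is much simpler than real English aspiration, which also depends on stress and syllable position. The function name realize is hypothetical.

```python
ASPIRATION = "ʰ"  # IPA superscript h, marking aspiration


def realize(phonemes):
    """Turn a list of phonemes into a list of phones.

    Toy rule (an assumption for illustration): /p/ is aspirated
    only when it is the first segment of the word.
    """
    phones = []
    for position, phoneme in enumerate(phonemes):
        if phoneme == "p" and position == 0:
            phones.append(phoneme + ASPIRATION)
        else:
            phones.append(phoneme)
    return phones


pin = ["p", "ɪ", "n"]
spin = ["s", "p", "ɪ", "n"]

# Phones go in square brackets, phonemes in slashes.
print("/" + "".join(pin) + "/", "->", "[" + "".join(realize(pin)) + "]")
print("/" + "".join(spin) + "/", "->", "[" + "".join(realize(spin)) + "]")
```

Because the choice between the two phones is fully determined by the environment, swapping one for the other never produces a different English word, which is what makes them allophones of one phoneme rather than separate phonemes.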

Phonological extremes

Of all the sounds that a human vocal tract can create, languages vary considerably in how many they treat as distinctive phonemes. Some dialects of Abkhaz have only 2 phonemic vowels, and many Native American languages have 3, while Punjabi has over 25. Rotokas (spoken in Papua New Guinea) has only 6 consonants, while !Xũ (spoken in southern Africa, in the vicinity of the Kalahari desert) has over 100. The total number of phonemes in a language varies from as few as 10 in Pirahã and 13 in Hawaiian to as many as 141 in !Xũ. These may range from familiar sounds like [t] to very unusual ones produced in extraordinary ways (see: Click consonant, phonation, airstream mechanism). English itself uses a rather large set of 13 to 22 vowels, though its 22 to 26 consonants are close to average. This differs from the lay definition based on the Latin alphabet, in which there are 21 consonants and five vowels (although sometimes y and w are included as vowels).

The most common vowel system consists of five vowels: /i/, /e/, /a/, /o/, /u/. The most common consonants are /p/, /t/, /k/. Not all languages have these; the Hawaiian language lacks /t/, the Mohawk language lacks /p/, and Hupa lacks both /p/ and /k/. If one of the three is missing, the language will have /ʔ/ (glottal stop).

The way a sound is pronounced can vary slightly from language to language even if the same IPA symbol is used. The Spanish word sin ("without") sounds different from the English word seen even though both would be transcribed in IPA as /sin/.

Restricted phonemes

A restricted phoneme is a phoneme that can occur only in certain environments.

Restricted phonemes in English include:

  • /ŋ/, as in sing, can occur only at the end of a syllable or word and can never occur at the beginning of a word.
  • Under most interpretations, /j/ and /w/ can occur only before a vowel and can never occur at the end of a syllable or word. However, some would interpret the diphthongs as vowel-plus-glide sequences such as /aj/ and /aw/, in which case these phonemes are restricted only with respect to which vowels they may follow.
  • /h/ can occur only at the beginning of a syllable or word or at the beginning of a cluster and can never occur at the end of a syllable or word.
  • Under most interpretations, in American accents with the cot-caught merger, /ɔ/ can occur only before /r/ and can never occur elsewhere. Some would interpret the diphthong /ɔɪ/ (or /oɪ/) as also containing the phoneme /ɔ/. On the other hand, some would interpret words like cord and horse as /kord/ and /hors/ (since all accents with the cot-caught merger also have the horse-hoarse merger), in which case the accents in question simply have no phoneme /ɔ/ if /ɔɪ/ is interpreted monophonemically.
  • In non-rhotic accents, /r/ can only occur before a vowel or intervocalically and can never occur at the end of a word or before a consonant.
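Positional restrictions of this kind can be expressed as simple checks on a phoneme sequence. The Python sketch below covers only two of the restrictions, taking the restricted phoneme of sing to be /ŋ/ and the onset-only phoneme to be /h/. It checks word edges only, not syllable boundaries, and the function name restriction_violations is hypothetical.

```python
def restriction_violations(phonemes):
    """List which of two English positional restrictions a word breaks.

    Simplification (an assumption for illustration): only the first and
    last segments of the word are inspected, not syllable boundaries.
    """
    problems = []
    if phonemes and phonemes[0] == "ŋ":
        problems.append("/ŋ/ cannot begin a word")
    if phonemes and phonemes[-1] == "h":
        problems.append("/h/ cannot end a word")
    return problems


print(restriction_violations(["s", "ɪ", "ŋ"]))  # sing: no violations
print(restriction_violations(["ŋ", "ɪ", "s"]))  # invented form starting with /ŋ/
print(restriction_violations(["t", "æ", "h"]))  # invented form ending in /h/
```

A fuller treatment would first divide the word into syllables and then apply each restriction to syllable onsets and codas.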

Writing systems

Languages in which a given symbol represents only one phoneme and every phoneme is represented by only one symbol are popularly called "phonetic languages", though they might be better described as "phonemically written". English is often given as an example of an "unphonetic" language because its spelling system is highly erratic. There are numerous cases in which it is not possible to predict the pronunciation from the spelling or vice versa. English and French are probably the "worst" in this respect among languages written in the Roman alphabet.
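The one-symbol-per-phoneme criterion amounts to checking that the spelling-to-sound correspondences form a one-to-one mapping. The Python sketch below does this for a list of (symbol, phoneme) pairs. The function name is_phonemically_written and both sample lists are invented for illustration and are not complete descriptions of any real orthography.

```python
from collections import defaultdict


def is_phonemically_written(correspondences):
    """Check that symbols and phonemes are in one-to-one correspondence.

    correspondences is a list of (symbol, phoneme) pairs.
    """
    phonemes_for_symbol = defaultdict(set)
    symbols_for_phoneme = defaultdict(set)
    for symbol, phoneme in correspondences:
        phonemes_for_symbol[symbol].add(phoneme)
        symbols_for_phoneme[phoneme].add(symbol)
    one_sound_per_symbol = all(len(p) == 1 for p in phonemes_for_symbol.values())
    one_symbol_per_sound = all(len(s) == 1 for s in symbols_for_phoneme.values())
    return one_sound_per_symbol and one_symbol_per_sound


# Invented, idealized fragment: each symbol has exactly one sound.
idealized = [("p", "p"), ("b", "b"), ("a", "a")]

# Fragment of English spelling: <c> has two sounds, and /k/ and /f/
# each have two spellings.
english_fragment = [("c", "k"), ("c", "s"), ("k", "k"), ("ph", "f"), ("f", "f")]

print(is_phonemically_written(idealized))
print(is_phonemically_written(english_fragment))
```

The two directions of the check correspond to the two failures named in the text: not being able to predict the pronunciation from the spelling, and not being able to predict the spelling from the pronunciation.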

However, the split between phonemically written and non-phonemically written languages is usually exaggerated. All languages are in fact written with conventional signs that represent meaning and are inspired to some degree by pronunciation. This is true at both ends of the scale: Chinese characters are first and foremost symbols of meaning, but they also carry some minimal phonetic information. At the other extreme, there are a few orthographies that are perfect phonemic representations of the standard accent, but since they make no effort to represent the variation in pronunciation within a language, they too are partially conventional.

All other languages fall somewhere between these extremes. Although English is often given as an example of an "unphonetic" language, in reality its system is nowhere near as close to a purely conventional system as Chinese writing is. English spelling conveys etymological information, but also vast amounts of phonetic information. Spanish is often given as an example of a "phonetic" language; however, it has numerous imperfections, including silent letters. It is, at least, possible to know the correct pronunciation of any written Spanish word. Another language with phonemic spelling is Serbian. Its orthography was established by Vuk Stefanović Karadžić, the Serbian "Webster", who followed a strict phonemic principle best expressed in his own words: "Write as you speak and read as it is written."

