  1.  The investigation of the phoneme as a language unit
  2.   Phoneme as a unity of three aspects
  3.   Conceptions of the phoneme
  4.   The system of English phonemes
    1.  General characteristics of vowel phonemes
    2.  General characteristics of consonant phonemes
  5.   Main trends in the phoneme theory
    1.  Phonological schools
    2.  Methods of phonological analysis

1.5       Phonemes in sign languages

  1.  Differences in the articulation basis of English, Russian and Kazakh






Phonetics deals with speech sounds. In Greek, «phonetikos» means "pertaining to voice and sound". Phonetics studies the sound system of a language, that is, segmental phonemes, word stress, syllabic structure and intonation. It is the scientific study of the sounds used in language: how they are produced, how they are transferred from the speaker to the hearer, and how they are heard and perceived. The sounds of a language thus provide an accessible and general introduction to phonetics, here with a special emphasis on English.

However, phonetics is obliged to take the content level into consideration too, because at any stage of the analysis a considerable part of the phonetician’s concern is with the effect which the expression unit he is examining, and its different characteristics, have on meaning. Only meaningful sound sequences are regarded as speech, and the science of phonetics, in principle at least, is concerned with such sounds produced by a human vocal apparatus as are, or may be, carriers of the organized information of a language.

Human speech is the result of a highly complicated series of events. The formation of the concept takes place at the linguistic level, that is, in the brain of the speaker; this stage may be called psychological. The message formed within the brain is transmitted along the nervous system to the speech organs. Therefore we may say that the human brain controls the behavior of the articulating organs, which results in the production of a particular pattern of speech sounds.

The speech sounds of a language, which constitute all its morphemes and words, are instances, manifestations or realizations of its segmental phonemes.

The phoneme is the unity of three aspects: functional, material and abstract. The phoneme performs a distinctive function. The opposition of phonemes in the same phonetic environment differentiates the meaning of morphemes, words and even utterances. The phoneme is realized in speech in the form of speech sounds, its allophones. Allophones of the same phoneme possess similar articulatory features. The difference between the allophones is predictable and is the result of the influence of the neighboring sounds. The actually pronounced speech sounds are modified by phonostylistic, dialectal and individual factors.

Native speakers abstract themselves from the difference between the allophones of the same phoneme because it has no functional value but they have a generalized idea of a complex of distinctive features which cannot be changed without the change of meaning. This functionally relevant bundle of articulatory features is called the invariant of the phoneme.

The founder of the phoneme theory was the Russian-Polish scientist I.A. Baudouin de Courtenay. He did a lot in the study of phonemic alternations and was the first linguist to demand an accurate distinction between the synchronic and diachronic approaches to the investigation of the phoneme.

As is known from the course of general linguistics, there are different opinions on the definition of the phoneme.

The truly materialistic view of the phoneme was originated by the famous linguist L.V. Shcherba. According to L.V. Shcherba the phoneme may be viewed as a functional, material and abstract unit [1]. These three aspects of the phoneme are concentrated in the definition of the phoneme suggested by V.A. Vassilyev, who looks upon the phoneme as a dialectical unity of these three aspects because they determine one another and are thus interdependent [2].

Quite different is the opinion of another linguist, Bloch, who defined the phoneme as a class of phonetically similar sounds contrasting and mutually exclusive with all similar classes in the language [3]. According to Jakobson the phoneme is a minimal sound unit by which meaning may be discriminated [3].

As we have seen, the definition of the phoneme varies greatly. This phonetic phenomenon is still an object of linguistic interest and thus gives rise to a great number of phonetic investigations and analyses.

So, the topicality of the phoneme investigation and the different opinions on its definition and phonemic status serve as the main reason for the choice of the theme of our work: «Phoneme as a language unit».

The aim of our investigation is to analyze the complex nature of the phoneme as a phonetic phenomenon of the English language.

According to the aim of our investigation we have formulated the following objectives:

  1.  to study the complex nature of the phoneme from the point of view of its three aspects;
  2.  to investigate the characteristic features of the phoneme and its allophones;
  3.  to analyze the functions of the phoneme;
  4.  to reveal differences in the articulation basis of English, Russian and Kazakh.

Object: Phoneme as a language unit

Subject: Characteristic features of English phonemes

Hypothesis: we suppose that a detailed study of the phoneme and its different allophones, its features and functions in speech will provide learners with methods and knowledge of phonetic analysis and thus make the process of learning English pronunciation more effective.

Methods of investigation: in the course of our research we have used methods of description and comparative analysis.

Theoretical value of our investigation: significant points of our work can be developed and serve as subject matter for further project and research work in theoretical phonetics.

Practical value of the work: the results of the phonetic analysis of phonemes in the system of vowels and consonants may be used at practical lessons on the phonetic structure of the English language.

Basis of scientific investigation: our investigation has been carried out on the material of scientific works of such famous linguists as L.V. Shcherba, V.A. Vassilyev, Bloch and others whose contribution to the study of the phoneme is of special scientific importance.

The diploma paper consists of the following parts: introduction, main chapters, conclusion, bibliography and appendix.

The scientific apparatus is presented in the introduction, which includes the topicality of the research, its aim, object and subject.

The theoretical part contains the study and descriptive analysis of the phoneme, its nature and use in speech, and its functions in the phonetic system from the point of view of different phoneticians and linguists. The practical part of the work is represented by the comparative analysis of differences in the articulation basis of English, Kazakh and Russian. It also includes practical tasks on teaching pronunciation and learning the system of English vowel and consonant phonemes.

In the conclusion we outline and sum up the results of our investigation.

The bibliography includes a list of the literary sources used in the process of the research and is followed by the appendix.

1 The investigation of the phoneme as a language unit

1.1 Phoneme as a unity of three aspects

As has already been mentioned, the truly materialistic view of the phoneme was originated by the Russian linguist L.V. Shcherba. He viewed the phoneme as a functional, material and abstract unit [4]. V.A. Vassilyev defined the phoneme as a dialectical unity of these aspects because they determine one another and are thus interdependent. He considered the phoneme to be a minimal abstract linguistic unit, realized in speech in the form of speech sounds opposable to other phonemes of the same language to distinguish the meaning of morphemes and words [4].

Let us consider the phoneme from the point of view of its three aspects.

Firstly, the phoneme is a functional unit. Function is usually understood to mean discriminatory function, that is, the role of the various components of the phonetic system of the language in distinguishing one morpheme from another, one word from another or one utterance from another.

The opposition of phonemes in the same phonetic environment differentiates the meaning of morphemes and words, e.g. said - says, sleeper - sleepy, bath - path, light - like.

Sometimes the opposition of phonemes serves to distinguish the meaning of whole phrases, e.g. He was heard badly - He was hurt badly. Thus we may say that the phoneme fulfils a distinctive function.
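The distinctive function described above can be illustrated with a minimal sketch that searches a small lexicon for minimal pairs, that is, words whose transcriptions differ in exactly one segment. The transcriptions in `LEXICON` and the function names are our own illustrative assumptions, not part of any cited source.

```python
# A minimal sketch: find minimal pairs in a toy lexicon.
# The segment transcriptions below are illustrative assumptions.
LEXICON = {
    "bath": ("b", "ɑː", "θ"),
    "path": ("p", "ɑː", "θ"),
    "light": ("l", "aɪ", "t"),
    "like": ("l", "aɪ", "k"),
    "heard": ("h", "ɜː", "d"),
    "hurt": ("h", "ɜː", "t"),
}


def is_minimal_pair(a, b):
    """True if two segment sequences have equal length and differ in one position."""
    if len(a) != len(b):
        return False
    differences = sum(1 for x, y in zip(a, b) if x != y)
    return differences == 1


def minimal_pairs(lexicon):
    """Return all word pairs (in alphabetical order) that form minimal pairs."""
    words = sorted(lexicon)
    pairs = []
    for i, first in enumerate(words):
        for second in words[i + 1:]:
            if is_minimal_pair(lexicon[first], lexicon[second]):
                pairs.append((first, second))
    return pairs


print(minimal_pairs(LEXICON))
```

Each pair found in this way shows two phonemes opposed in the same phonetic environment, which is the test of the functional aspect discussed in this section.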

Secondly, the phoneme is material, real and objective. That means that it is realized in the speech of all English-speaking people in the form of speech sounds, its allophones. The sets of speech sounds, that is, the allophones belonging to the same phoneme, are not identical in their articulatory content, though there remains some phonetic similarity between them.

As a first example, let us consider the English phoneme [d], at least those of its allophones which are known to everybody who studies English pronunciation.

English phoneme [d] when not affected by the articulation of the preceding or following sounds is a plosive, fore-lingual apical, alveolar, lenis stop. This is how it sounds in isolation or in such words as door, darn, down, etc., when it retains its typical articulatory characteristics. In this case the consonant [d] is called the principal allophone. The allophones which do not undergo any distinguishable changes in the chain of speech are called principal. At the same time there are quite predictable changes in the articulation of allophones that occur under the influence of the neighboring sounds in different phonetic situations. Such allophones are called subsidiary. The following examples illustrate the articulatory modifications of the phoneme [d] in various phonetic contexts:

[d] is slightly palatalized before front vowels and the sonorant [j], e.g. deal, day, did, did you;

[d] is pronounced without any plosion before another stop, e.g. bedtime, bad pain, good dog;

[d] is pronounced with nasal plosion before the nasal sonorants [n] and [m], e.g. sudden, admit, could not, could meet;

[d] is pronounced with lateral plosion before the lateral sonorant [l], e.g. middle, badly, bad light. The alveolar position is particularly sensitive to the influence of the place of articulation of a following consonant. Thus followed by [r] the consonant [d] becomes post-alveolar, e.g. dry, dream; followed by the interdental [θ], [ð] it becomes dental, e.g. breadth, lead the way, good thing.

When [d] is followed by the labial [w] it becomes labialized, e.g. dweller. In the initial position [d] is partially devoiced, e.g. dog, dean; in the intervocalic position or when followed by a sonorant it is fully voiced, e.g. order, leader, driver; in the word-final position it is voiceless, e.g. road, raised, old.

These modifications of the phoneme [d] are quite sufficient to demonstrate the articulatory difference between its allophones, though the list could easily be extended. If you consider the production of the allophones of the phoneme above you will find that they possess three articulatory features in common: all of them are forelingual lenis stops.

Consequently, though allophones of the same phoneme possess similar articulatory features they may frequently show considerable phonetic differences.
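The predictability of the subsidiary allophones of [d] listed above can be sketched as a simple rule table keyed on the following sound. This is only an illustration of the examples given in this section: the segment inventories (for instance, which vowels count as front) and the function name `d_allophone` are assumptions, and the voicing modifications are left out.

```python
# A minimal sketch of the context rules for [d] described in the text.
# The symbol sets are illustrative assumptions, not a full inventory.
FRONT_VOWELS = {"iː", "ɪ", "e", "æ", "eɪ"}
STOPS = {"p", "b", "t", "d", "k", "g"}
NASALS = {"n", "m"}
DENTALS = {"θ", "ð"}


def d_allophone(next_sound):
    """Describe the allophone of [d] selected by the following sound."""
    if next_sound in FRONT_VOWELS or next_sound == "j":
        return "slightly palatalized"
    if next_sound in STOPS:
        return "no plosion"
    if next_sound in NASALS:
        return "nasal plosion"
    if next_sound == "l":
        return "lateral plosion"
    if next_sound == "r":
        return "post-alveolar"
    if next_sound in DENTALS:
        return "dental"
    if next_sound == "w":
        return "labialized"
    # No modifying context: the principal allophone, as in "door".
    return "principal"


# Following sound of [d] in some of the example words from the text.
EXAMPLES = {"door": "ɔː", "deal": "iː", "bedtime": "t", "sudden": "n",
            "middle": "l", "dry": "r", "breadth": "θ", "dweller": "w"}

for word, following in EXAMPLES.items():
    print(word, "->", d_allophone(following))
```

Because the choice of allophone is fully determined by the context, the function needs no information about meaning, which is exactly why such differences cannot distinguish words.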

It is perfectly obvious that in teaching English pronunciation the difference between the allophones of the same phoneme must be taken into account. The starting point is of course the articulation of the principal allophone, e.g. [d-d-d]: door, double, daughter, dark etc. Special training of the subsidiary allophones should be provided too. Not all the subsidiary allophones generally receive equal attention. In teaching the pronunciation of [d], for instance, it is hardly necessary to concentrate on an allophone such as [d] before a front vowel, as in Russian similar consonants in this position are also palatalized. Neither is it necessary to give special practice to the labialized [d] before the labial [w], because in this position [d] cannot be pronounced in any other way. Carefully made up exercises will exclude the danger of foreign accent.

Allophones are arranged into functionally similar groups, that is, groups of sounds in which the members of each group are not opposed to one another, but are opposable to members of any other group to distinguish meanings in otherwise similar sequences.

Consequently, allophones of the same phoneme never occur in similar phonetic contexts; they are entirely predictable according to the phonetic environment and thus carry no useful information, that is, they cannot differentiate meanings.

But the phones which are realized in speech do not correspond exactly to the allophone predicted by this or that phonetic environment. They are modified by phonostylistic, dialectal and individual factors. In fact, no speech sounds are absolutely alike.

Phonemes are important for distinguishing meanings, for knowing whether, for instance, the message was take it or tape it. But there is more to speaker-listener exchange than just the “message” itself. The listener may pick up a variety of information about the speaker: about the locality he lives in, regional origin, his social status, age and even emotional state (angry, tired, excited), and much other information. Most of this other social information comes not from phonemic distinctions, but from phonetic ones. Thus, while phonemic evidence is important for lexical and grammatical meaning, most other aspects of a communication are conveyed by more subtle differences of speech sounds, requiring more detailed description at the phonetic level. There is more to a speech act than just the meaning of the words.

Allophones of the same phoneme, no matter how different their articulation may be, function as the same linguistic unit. The question arises why phonetically naïve native speakers seldom observe differences in the actual articulatory qualities between the allophones of the same phoneme.

The native speaker is quite readily aware of the phonemes of his language but much less aware of the allophones: it is possible, in fact, that he will not hear the difference between two allophones like the alveolar and dental consonants [d] in the words bread and breadth even when the distinction is pointed out; a certain amount of ear-training may be needed. The reason is that the phonemes have an important function in the language: they differentiate words like tie and die from each other, and to be able to hear and produce phonemic differences is part of what it means to be a competent speaker of the language.

Allophones, on the other hand, have no such function: they usually occur in different positions in the word (i.e. in different environments) and hence cannot be opposed to each other to make meaningful distinctions. For example, the dark [ł] occurs following a vowel, as in pill, cold, but it is not found before a vowel, whereas the clear [l] only occurs before a vowel, as in lip, like. These two sounds cannot therefore contrast with each other in the way that [l] contrasts with [r] in lip - rip or lake - rake; there are no pairs of words which differ only in that one has [l] and the other [ł].
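The distribution of clear and dark [l] described above can be checked mechanically. The following sketch, under the simplifying assumption that the only relevant environment is whether a vowel follows, tests whether two sounds ever share an environment; the word list, the vowel set and the function names are illustrative assumptions.

```python
# A minimal sketch of a complementary-distribution check.
# Assumption: the only environment considered is "before a vowel" vs "elsewhere".
VOWELS = {"ɪ", "aɪ", "əʊ", "eɪ"}

WORDS = [
    ("l", "ɪ", "p"),        # lip
    ("l", "aɪ", "k"),       # like
    ("l", "eɪ", "k"),       # lake
    ("p", "ɪ", "ł"),        # pill
    ("k", "əʊ", "ł", "d"),  # cold
    ("r", "ɪ", "p"),        # rip
    ("r", "eɪ", "k"),       # rake
]


def environments(words, sound):
    """Collect the environments in which a sound occurs in the word list."""
    found = set()
    for segments in words:
        for i, segment in enumerate(segments):
            if segment == sound:
                following = segments[i + 1] if i + 1 < len(segments) else None
                found.add("before vowel" if following in VOWELS else "elsewhere")
    return found


def in_complementary_distribution(words, a, b):
    """True if both sounds occur and never share an environment."""
    env_a, env_b = environments(words, a), environments(words, b)
    return bool(env_a) and bool(env_b) and env_a.isdisjoint(env_b)


print(in_complementary_distribution(WORDS, "l", "ł"))  # clear vs dark l
print(in_complementary_distribution(WORDS, "l", "r"))  # l vs r
```

Sounds that never share an environment, like [l] and [ł] here, cannot form a minimal pair, whereas [l] and [r] occur in the same environment and can therefore be opposed.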

Thirdly, the phoneme is abstract or generalized and that is reflected in its definition as a language unit. It is an abstraction because we make it abstract from concrete realizations for classificatory purposes.

So the answer appears to be in the functioning of such sounds in the language concerned. Sounds which have similar functions in the language tend to be considered the “same” by the community using that language while those which have different functions tend to be classed as “different”. In linguistics, as it has been mentioned above, function is generally understood as the role of the various elements of the language in distinguishing the meaning.

The function of phonemes is to distinguish the meaning of morphemes and words. The native speaker does not notice the difference between the allophones of the same phoneme because this difference does not distinguish meanings. In other words, native speakers abstract themselves from the difference between the allophones of the same phoneme because it has no functional value. The actual difference between the allophones of the phoneme [d], for instance, does not affect the meaning.

That’s why members of the English speech community do not realize that in the word dog [d] is alveolar, in dry it is post-alveolar, in breadth it is dental. Another example: in the Russian word посадит the stressed vowel [a] is more front than it is in the word посадка. It is even more front in the word сядет. But Russian-speaking people do not observe this difference because the three vowel  sounds belong to the same phoneme and thus the changes in their quality do not distinguish the meaning.

So we have good grounds to state that the phoneme is an abstract linguistic unit; it is an abstraction from actual speech sounds, that is, from allophonic modifications.

As has been said before, native speakers do not observe the difference between the allophones of the same phoneme. At the same time they realize, quite subconsciously of course, that allophones of each phoneme possess a bundle of distinctive features that makes this phoneme functionally different from all other phonemes of the language concerned. This functionally relevant bundle of articulatory features is called the invariant of the phoneme. None of the articulatory features that form the invariant of the phoneme can be changed without affecting the meaning. All the allophones of the phoneme [d], for instance, are occlusive, forelingual, lenis. If the occlusive articulation is changed for a constrictive one, [d] will be replaced by [z], cf. breed - breeze, deal - zeal; [d] will be replaced by [g] if the forelingual articulation is replaced by the backlingual one, cf. dear - gear, day - gay. Nor can the lenis articulation of [d] be replaced by the fortis one, because this will also bring about changes in meaning, cf. dry - try, ladder - latter, bid - bit.

That is why it is possible to state that occlusive, forelingual and lenis characteristics of the phoneme [d] are generalized in the mind of the speaker into what is called the invariant of this phoneme.
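The idea of the invariant as a bundle of distinctive features can be made concrete with a small sketch. Each phoneme is represented by the three features named in the text, and changing any one of them yields a different phoneme. The four-phoneme table and the function name `change_feature` are illustrative assumptions limited to the examples above.

```python
# A minimal sketch: the invariant of a phoneme as a bundle of features.
# Only the four phonemes discussed in the text are included.
PHONEMES = {
    "d": ("occlusive", "forelingual", "lenis"),
    "z": ("constrictive", "forelingual", "lenis"),
    "g": ("occlusive", "backlingual", "lenis"),
    "t": ("occlusive", "forelingual", "fortis"),
}

# Reverse lookup: feature bundle -> phoneme symbol.
BY_FEATURES = {features: symbol for symbol, features in PHONEMES.items()}


def change_feature(phoneme, old, new):
    """Replace one feature of a phoneme and return the resulting phoneme, if known."""
    features = PHONEMES[phoneme]
    if old not in features:
        raise ValueError(f"[{phoneme}] does not have the feature {old!r}")
    changed = tuple(new if feature == old else feature for feature in features)
    return BY_FEATURES.get(changed)


print(change_feature("d", "occlusive", "constrictive"))  # breed - breeze
print(change_feature("d", "forelingual", "backlingual"))  # dear - gear
print(change_feature("d", "lenis", "fortis"))             # dry - try
```

In this toy table every single-feature change of [d] lands on another phoneme, which mirrors the statement that no feature of the invariant can be altered without affecting meaning.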

On the one hand, the phoneme is objectively real, because it is realized in speech in the material form of speech sounds, its allophones. On the other hand, it is an abstract language unit. That is why we can look upon the phoneme as a dialectical unity of the material and abstract aspects. Speech sounds are necessarily allophones of one of the phonemes of the language concerned. All the allophones of the same phoneme have some articulatory features in common, that is, all of them possess the same invariant.

Simultaneously each allophone possesses quite particular phonetic features, which may not be traced in the articulation of other allophones of the same phoneme. That is why while teaching pronunciation we cannot ask our pupils to pronounce this or that phoneme. We can only teach them to pronounce one of its allophones.

The articulatory features which form the invariant of the phoneme are called distinctive or relevant. If opposed sounds differ in one articulatory feature and this difference brings about changes in the meaning of the words the contrasting features are called relevant.

For example, the words port and court differ in one consonant only, that is the word port has the initial consonant [p], and the word court begins with [k]. Both sounds are occlusive and fortis, the only difference being that [p] is labial and [k] is backlingual. Therefore it is possible to say that labial and backlingual are relevant in the system of English consonants.

The articulatory features which do not serve to distinguish meaning are called non-distinctive, irrelevant or redundant; for instance, it is impossible in English to oppose an aspirated [p] to a non-aspirated one in the same phonetic context to distinguish meanings. That is why aspiration is a non-distinctive feature of English consonants.

As it has been mentioned above any change in the invariant of the phoneme affects the meaning. Naturally, anyone who studies a foreign language makes mistakes in the articulation of particular sounds. L.V. Shcherba classifies the pronunciation errors as phonological and phonetic [5].

If an allophone of some phoneme is replaced by an allophone of a different phoneme the mistake is called phonological, because the meaning of the word is inevitably affected. It happens when one or more relevant features of the phoneme are not realized, e.g.:

When the vowel [i:] in the word beat becomes slightly more open, more advanced or is no longer diphthongized, the word beat may be perceived as quite a different word, bit. It is perfectly clear that this type of mistake is inadmissible in teaching pronunciation to any type of language learner.

If an allophone of the phoneme is replaced by another allophone of the same phoneme the mistake is called phonetic. It happens when the invariant of the phoneme is not modified and consequently the meaning of the word is not affected, e.g.:

When the vowel [i:] is fully long in such a word as sheep, for instance, the quality of it remaining the same, the meaning of the word does not change. Nevertheless language learners are advised not to let phonetic mistakes into their pronunciation. If they do make them the degree of their foreign accent will certainly be an obstacle to the listener’s perception.  
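The distinction between phonological and phonetic mistakes can be summarized in a short sketch: a mistake is phonological if the produced sound belongs to a different phoneme than the target, and phonetic if it is merely a different allophone of the same phoneme. The allophone labels (such as `iː_shortened` for the expected vowel of sheep) and the mapping `PHONEME_OF` are hypothetical names introduced only for this illustration.

```python
# A minimal sketch of the phonological / phonetic classification of mistakes.
# The allophone labels are hypothetical; each one is mapped to its phoneme.
PHONEME_OF = {
    "iː_full": "iː",       # fully long allophone
    "iː_shortened": "iː",  # shortened allophone, assumed target in "sheep"
    "ɪ": "ɪ",              # the vowel of "bit"
}


def classify_mistake(target, produced):
    """Classify a pronunciation mistake by comparing the phonemes involved."""
    if target == produced:
        return "no mistake"
    if PHONEME_OF[target] == PHONEME_OF[produced]:
        return "phonetic"      # another allophone of the same phoneme
    return "phonological"      # an allophone of a different phoneme


print(classify_mistake("iː_full", "ɪ"))             # beat heard as bit
print(classify_mistake("iː_shortened", "iː_full"))  # sheep with a fully long vowel
```

The first call models the beat - bit example, where the meaning changes; the second models the sheep example, where only the degree of foreign accent is affected.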

1.2 Conceptions of the phoneme

Views of the phoneme seem to fall into four main classes. The "mentalistic" or "psychological" view regards the phoneme as an ideal "mental image" or a target at which the speaker aims. He deviates from this ideal sound partly because an identical repetition of a sound is next to impossible and partly because of the influence exerted by neighboring sounds.

According to this conception allophones of the phoneme are varying materializations of it. This view was originated by the founder of the phoneme theory, the Russian linguist I.A. Baudouin de Courtenay and something like it appears to have been adopted by E.D. Sapir, Alf. Sommerfelt,  M. Tatham [6].

The so-called "functional" view regards the phoneme as the minimal sound unit by which meanings may be differentiated without much regard to actually pronounced speech sounds. Meaning differentiation is taken to be a defining characteristic of phonemes.

Thus the palatalization of the clear [l] and the absence of palatalization in the dark [ł] in English do not differentiate meanings, and therefore [l] and [ł] cannot be assigned to different phonemes but both form allophones of the phoneme [l]. This view is shared by many foreign linguists such as N. Trubetskoy, L. Bloomfield, R. Jakobson, M. Halle [7].

The functional view of the phoneme gave rise to a branch of linguistics called "phonology" or "phonemics" which is concerned with relationships between contrasting sounds in a language.

Its special interest lies in establishing the system of distinctive features of the language concerned. Phonetics is limited in this case to the precise description of the acoustic and physiological aspects of physical sounds without any concern for their linguistic function. A stronger form of the "functional" approach is advocated in the so-called "abstract" view of the phoneme, which regards phonemes as essentially independent of the acoustic and physiological properties associated with them, that is, of speech sounds. This view of the phoneme was pioneered by L. Hjelmslev and his associates in the Copenhagen Linguistic Circle, H.J. Uldall and K. Togeby [8].

The views of the phoneme discussed above can be qualified as idealistic, since all of them regard the phoneme as an abstract conception existing in the mind but not in reality, that is, in human speech, speech sounds being only phonetic manifestations of these conceptions.

The "physical" view regards the phoneme as a "family" of related sounds satisfying certain conditions, notably:

1) various members of the "family" must show phonetic similarity to one another, in other words be related in character.

2) no member of the "family" may occur in the same phonetic context as any other.

The extreme form of the "physical" conception, as propounded by D. Jones and shared by B. Bloch and G. Trager, excludes all reference to non-articulatory criteria in the grouping of sounds into phonemes [9].

The concept of the phoneme was central to the development of phonological theory. In the early twentieth century, phonological theory was all about the phoneme: how to define it, how to recognize it, how to discover it. The American structuralist term for phonology, phonemics, indicates to what extent the field was considered to be about the phoneme.

Things have now changed. The phoneme, to all appearances, no longer holds a central place in phonological theory. Two recent and voluminous handbooks devoted to phonology, edited by Goldsmith and by de Lacy, have no chapter on the phoneme. It is barely mentioned in the indexes. This does not mean that the phoneme plays no role in modern phonology; closer inspection reveals that the phoneme is far from dead. However, it is not much talked about, and when it is, it is more often to dispute its existence than to affirm it.

Such a dramatic change in fortunes for a concept bears some looking into, and the rest of this section is devoted to trying to understand what has happened to the phoneme in its journey into the twenty-first century, and what its prospects are for the future. S. R. Anderson cites Godel and Jakobson as locating the origin of the term phoneme in the French word phonème, coined in the early 1870s by the French linguist Dufriche-Desgenettes [10]. He proposed the term as a substitute for the German Sprachlaut (“speech sound”), so it did not have the modern sense of phoneme, but rather corresponded to what we would now call “speech sound” or “phone.” The term was taken up by Saussure, who used it in yet a different sense, and from Saussure it was taken up by the Polish Kazan school linguists Jan Baudouin de Courtenay and Mikołaj Kruszewski [10].

S. R. Anderson traces how the meaning of the term evolved from Saussure’s use to the one that ultimately emerged from the Kazan school. Saussure used it in his historical work on Indo-European to refer to a hypothesized sound in a proto-language together with its reflexes in the daughter languages, what we might call a “correspondence set” [11]. For example, if a sound that is reconstructed as g in the proto-language has reflexes g, h, and k in three daughter languages, then the set [g, h, k] would constitute a “phoneme” for Saussure.

Kruszewski recast the notion in synchronic terms to refer to a set of alternating elements; for example, if the same morpheme has a final [g] before suffixes beginning with a back vowel, a palatalized [gʲ] before suffixes beginning with a front vowel, and a [k] when it is word-final, the alternation “[g] before a back vowel, [gʲ] before a front vowel, and [k] when final” would constitute a “phoneme” [11].

Subsequently, Baudouin reinterpreted the term “phonemes” as referring to the abstract, invariant psychophonetic elements that alternate; in the above example, one could posit a phoneme [g] that participates in the alternations that cause it to be realized as [g], [gʲ], or [k], depending on the context. In a final step, the term was extended also to sounds that do not alternate, thereby arriving at a conception of the phoneme as “the psychological equivalent of a speech sound”. It is in this sense that the phoneme entered phonological theory in Europe and North America [10].

The general concept of the phoneme preceded the term or its exact definition, which is a more difficult enterprise. The basic concept is that of the unity of sounds that are objectively different but in some sense functionally the same.

As Twaddell observes, this concept is not new: if a special term was not needed before the late nineteenth century, it is because in the absence of close phonetic observation, it is not necessary to distinguish between “phoneme” and “speech sound” [12]. Alphabetic writing systems tend to have separate letters only for sounds that have a distinctive function, though deviations from this principle occur. In ordinary parlance one talks of the sound “d” or “k” as if each of these represents a single sound, rather than, as is the case, a range of sounds.

Parallel to the development of the phonemic concept as part of phonological theory mentioned above, British and French phoneticians who laid the foundations for what became the International Phonetic Association (IPA) arrived at a similar notion motivated by more practical concerns. According to Jones, Henry Sweet  was the first to draw a distinction between “narrow” and “broad” transcription: narrow transcription aims (in principle) to record sounds in as much detail as possible, whereas broad transcription records only distinctive differences in sound [13]. It was recognized early on that the goal of assigning a unique symbol to every sound in every language, even if it could be realized, would lead to transcriptions for particular languages that would be impractical and virtually illegible.

Therefore, Paul Passy insisted in 1888 that only distinctive differences should be recorded, and called this principle une règle d’or (“a golden rule”) from which one should never depart [14].

Thus, while the IPA is popularly known for developing a universal phonetic alphabet that is associated with phonetic (“narrow”) transcription, its founders insisted on “broad” (i.e. phonemic) transcription for purely practical reasons. The practical strain remained influential in phonological theory, as attested by the subtitle of Pike’s “Phonemics: a technique for reducing languages to writing”.

It is hard to imagine what linguistic description would be like without a phoneme concept of some sort. To take one entirely typical example, the Australian language Pitta-Pitta (Pama-Nyungan) is said to have three vowels, i, a, and u [14].

In describing their pronunciation, Blake writes that they “are similar to the vowels of ‘been’, ‘balm’, and ‘boot’ respectively” (presumably [i], [a], and [u]). Further reading reveals that this is only true in open syllables, and when stressed, and when near certain consonants. In a closed syllable, “they are similar to the vowels of ‘bin’, ‘bun’, and ‘put’” ([ɪ], [ʌ], and [ʊ]). Further, the vowel /a/ has a distinct pronunciation in the vicinity of a palatal consonant, and unstressed /a/ has a schwa-like pronunciation [ə]. Objectively, then, Pitta-Pitta has at least eight different vowel sounds, and probably many more if we were to attend to further distinctions in different segmental and prosodic contexts, and in different situations and for different speakers.

This variation does not detract from the fact that there is an important sense in which this language has three vowels. In the distribution given above, we recognize that the variation is a consequence of the influence of context, and has no contrastive function.

Put differently, in every slot where a vowel belongs we have only three choices in this language. If we are told that a word begins with the sequence m-vowel-rr-, we know that the vowel must be one of the variants of /a/ (e.g. marra ‘open’), /i/ (e.g. mirri ‘little girl’), or /u/ (e.g. murra ‘stick’).

In the 1930s many linguists came to share the intuition that a concept like the phoneme is needed in phonological description. Pinning down the definition of this concept proved to be difficult. Like other linguistic notions, such as “sentence,” “syllable,” and “topic,” what starts out as a relatively unproblematic intuitive concept inevitably gets caught up in theory-internal considerations. In the case of the phoneme, three issues have been particularly contentious: (1) what sort of entity the phoneme is (physical, psychological, other); (2) what the content of the phoneme is; and (3) how one identifies phonemes.

Twaddell surveyed the various definitions of the phoneme that were then in circulation, and classified them as being of two main types. One type assumes that the phoneme is a physical reality, and the other assumes that it is a psychological notion [12].

One class of definitions assumes that the phoneme is a physical reality of some sort. Thus, Jones considers the phoneme to be a “family” of sounds in a particular language that “count for practical purposes as if they were one and the same.” While such a definition (“explanation” is Jones’s preferred term) is fine for practical purposes, it leaves unaddressed the essential nature of the phoneme: what it is about certain sounds that causes them to count as part of the same family [15].

A more ambitious proposal was made by Bloomfield. He characterized the phoneme as “a minimum unit of distinctive sound-feature…” [15]. The speaker has been trained to make sound-producing movements in such a way that the phoneme features will be present in the sound waves, and he has been trained to respond only to these features. Such a definition fits well with the behaviorist psychology assumed by Bloomfield, which sees behavior (including language, which is defined as verbal behavior) as being shaped by the association of stimuli with responses. If phonemes are crucial to behavior, according to this view, one might expect them to be overtly present in the signal.

Nevertheless, Twaddell observes that the acoustic constants required by such a theory had not been observed by experimental phoneticians, and he doubts that advances in laboratory technology would reveal them in the future. Twaddell’s judgment has turned out to be prescient. In the 1970s and 1980s Blumstein and Stevens tried to identify invariant acoustic correlates for the phonetic features that make up phonemes. Despite some early successes, a considerable amount of variability was found when different contexts were considered.

The emphasis of this line of research ultimately shifted to consider the role of “enhancing” gestures in helping listeners identify features when the primary acoustic cue has been weakened or obliterated. Thus it has not been demonstrated that there is some acoustic constant that characterizes every instance of a phoneme or distinctive feature.

If the phoneme cannot be identified with a physical constant, a natural alternative is that it is a mental or psychological reality. Many early writers on the phoneme thought of it in psychological terms, and Twaddell assembles some characteristic definitions: the phoneme is a constant acoustic and auditory image; a thought sound; a sound idea; a psychological equivalent of an empirical sound [12].

In modern terms all these definitions amount to the claim that the phoneme is some sort of mental representation. Twaddell criticizes these psychological accounts on two grounds. First, he points out, correctly, that such definitions are not particularly helpful in characterizing what phonemes are. His second critique is more sweeping, and arises from his empiricist view of philosophy and psychology: following Bloomfield, Twaddell argues that mentalistic notions have no place in science, because they cannot be empirically tested. While it is no doubt correct that appealing to a vague and unknown “mind” cannot serve as an adequate explanation (explanans) of any phenomenon, the cognitive revolution that began in the 1950s has shown the fruitfulness of studying mental representations and processes as things to be explained (explananda) [12].

The consequence of rejecting both physical and psychological reality for the phoneme is that Twaddell is forced to conclude that the phoneme, though an “eminently useful” term, is a fictitious unit. There exist philosophies of science in which useful, indeed indispensable, units can be fictions, but most linguists have taken a “realist” view of linguistics [12]. From this perspective, a unit that is required to give an adequate account of some phenomenon must be real at some level. Once we abandon empiricist assumptions about science and psychology, there is no obstacle to considering the phoneme to be a psychological entity.

1.3   The system of English phonemes

1.3.1 General characteristics of vowel phonemes

There are two major classes of sounds traditionally distinguished by phoneticians in any language. They are termed consonants and vowels. The distinction is based mainly on auditory effect.

Consonants are known to have voice and noise combined, while vowels are sounds consisting of voice only. From the articulatory point of view the difference is due to the work of the speech organs. In the case of vowels no obstruction is made. In the case of consonants various obstructions are made. So consonants are characterized by so-called close articulation, that is, by a complete, partial or intermittent blockage of the air-passage by an organ or organs. The closure is formed in such a way that the air-stream is blocked or hindered or otherwise gives rise to audible friction. As a result consonants are sounds which have noise as their indispensable and most defining characteristic.

Vowels, unlike consonants, are produced with no obstruction to the stream of air, so on the perception level their integral characteristic is naturally tone, not noise. The most important characteristic of the quality of these vowels is that they are acoustically stable. They are known to be entirely different from one another both articulatorily and acoustically. In the English vowel system there are 12 monophthongs and 8 or 9 diphthongs.

The quality of a vowel is known to be determined by the size, volume, and shape of the mouth resonator, which are modified by the movement of active speech organs, that is the tongue and the lips. Besides, the particular quality of a vowel can depend on a lot of other articulatory characteristics, such as the relative stability of the tongue, the position of the lips, physical duration of the segment, the force of articulation, the degree of tenseness of speech organs. So vowel quality could be thought of as a bundle of definite articulatory characteristics which are sometimes intricately interconnected and interdependent.

For example, the back position of the tongue causes the lip rounding, the front position of the tongue makes it rise higher in the mouth cavity, the lengthening of a vowel makes the organs of speech tenser at the moment of production and so on.

The analysis of the articulatory constituents of the quality of vowels allowed phoneticians to suggest the criteria which are conceived to be of great importance in classificatory description. First to be concerned here are the following criteria termed:

1. stability of articulation;

2. tongue position;

3. lip position;

4. character of the vowel end;

5. length;

6. tenseness

Stability of articulation specifies the actual position of the articulating organ in the process of the articulation of a vowel. There are two possible varieties: a) the tongue position is stable; b) it changes, that is the tongue moves from one position to another. In the first case the articulated vowel is relatively pure, in the second case a vowel consists of two clearly perceptible elements.

There exists in addition a third variety, an intermediate case, when the change in the tongue position is fairly weak. So according to this principle the English vowels are subdivided into:

1. monophthongs,

2. diphthongs,

3. diphthongoids

This interpretation is not shared by British phoneticians. A.C. Gimson, for example, distinguishes twenty vocalic phonemes which are made up of vowels and vowel glides. Seven of them are treated as short phonemes: [i], [e], [æ], [ɒ], [u], [ʌ], [ə], and thirteen as long ones: [a:], [ɔ:], [ɜ:], [i:], [u:], [ei], [ɜu], [ai], [au], [ɒi], [iə], [εə], [uə], five of which are considered relatively pure: [a:], [ɔ:], [ɜ:], [i:], [u:]; the rest are referred to as long phonemes with different glides: [ei], [ai], [ɒi] with a glide to [i]; [ɜu], [au] with a glide to [u]; and [iə], [εə], [uə] with a glide to [ə] [16].

Diphthongs are complex entities just like affricates, so essentially similar complications are known to exist with them. The question is whether they are monophonemic or biphonemic units. Scholars like V.A. Vassilyev and L.R. Zinder grant the English diphthongs monophonemic status on the basis of articulatory, morphonological and syllabic indivisibility as well as the criteria of duration and commutability [17].

As to the articulatory indivisibility of the diphthongs, it is shown by the fact that neither a morpheme boundary nor a syllable boundary can pass between the nucleus and the glide, for example: [′sei-iŋ] saying, [′krai-iŋ] crying, [′slɜu-ə] slower, [′plau-iŋ] ploughing, [′kliə-rə] clearer, [′εə-riŋ] airing, [′puə-rə] poorer. The present study of the duration of diphthongs shows that their length is the same as that of the English long monophthongs in the same phonetic context, cf. [sait – si:t], [kɜut – kɔ:t]. Finally, the application of the commutation test supports the monophonemic status of diphthongs, because any diphthong can be commuted with practically any vowel. This can be exemplified by the following oppositions:

[bait — bit] bite – bit

[bait — bʌt] bite – but

[bait — bɔ:t] bite – bought and so on.
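The commutation test behind these oppositions can be sketched in Python. The sketch assumes that words are given as tuples of segments and that a diphthong is entered as one segment, which is exactly the monophonemic analysis under discussion; the function name `commutation_slot` is hypothetical.

```python
def commutation_slot(word_a, word_b):
    """Return the index of the single segment in which two words differ.

    Returns None when the words differ in length or in more than one
    segment, i.e. when they do not form a minimal pair.
    """
    if len(word_a) != len(word_b):
        return None
    diffs = [i for i, (a, b) in enumerate(zip(word_a, word_b)) if a != b]
    return diffs[0] if len(diffs) == 1 else None


# The diphthong [ai] is treated as ONE segment (an assumption of the analysis).
bite = ("b", "ai", "t")
bit = ("b", "i", "t")
but = ("b", "ʌ", "t")
bought = ("b", "ɔ:", "t")
```

With this segmentation `commutation_slot(bite, bit)` points at the vowel slot. If [ai] were instead split into two segments, the words would differ in length and no single slot could be commuted, so the outcome of the test depends on the segmentation chosen beforehand.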

The monophonemic character of English diphthongs is also supported by the intuitions of native speakers, who perceive these sound complexes as single segments.

Another principle we should consider from the phonological point of view is the position of the tongue. For the sake of convenience the position of the tongue in the mouth cavity is characterized from two aspects, that is, its horizontal and its vertical movement.

According to the horizontal movement Russian phoneticians distinguish five classes of English vowels. They are:

1. front: [i:], [e], [ei], [ε(ə)];

2. front-retracted: [i], [i(ə)];

3. central: [ʌ], [ɜ:], [ə], [ɜ(u)], [ε(ə)];

4. back: [ɒ], [ɔ:], [u:], [a:];

5. back-advanced: [u], [u(ə)] [18]

British phoneticians do not single out the classes of front-retracted and back-advanced vowels. So both [i:] and [i] vowels are classed as front, and both [u:] and [u] vowels are classed as back.

As to the tongue position in its vertical movement British scholars distinguish three classes of vowels: high (or close), mid (or half-open), and low (or open) vowels. Russian phoneticians made the classification more detailed distinguishing two subclasses in each class, i.e. broad and narrow variations of the three vertical positions of the tongue. Thus the following six groups of vowels are distinguished:

1. close a) narrow: [i:], [u:];

b) broad: [i], [u], [i(ə)], [u(ə)];

2. mid a) narrow: [e], [ɜ:], [ə], [e(i)], [ɜ(u)];

b) broad: [ə], [ʌ];

3. open a) narrow: [ε(ə)], [ɔ:], [ɒ(i)];

b) broad: [æ], [a(i, u)], [ɒ], [a:] [15]
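The two classifications above can be combined into a small lookup table. The Python sketch below covers the monophthongs only and rests on several simplifications that are not part of the source: [ə], which is listed under both mid subclasses, is entered once as narrow; [æ] is left out because it does not appear in the horizontal classification given here; and the names `HORIZONTAL`, `VERTICAL` and `classify` are hypothetical.

```python
# Monophthongs only; a simplified transcription of the two lists above.
HORIZONTAL = {
    "front": ["i:", "e"],
    "front-retracted": ["i"],
    "central": ["ʌ", "ɜ:", "ə"],
    "back": ["ɒ", "ɔ:", "u:", "a:"],
    "back-advanced": ["u"],
}

VERTICAL = {
    "close narrow": ["i:", "u:"],
    "close broad": ["i", "u"],
    "mid narrow": ["e", "ɜ:", "ə"],  # [ə] is also listed as broad in the text
    "mid broad": ["ʌ"],
    "open narrow": ["ɔ:"],
    "open broad": ["ɒ", "a:"],
}


def classify(vowel):
    """Return (horizontal class, vertical class) for a monophthong.

    Either element is None when the vowel is missing from that table.
    """
    row = next((name for name, vs in HORIZONTAL.items() if vowel in vs), None)
    height = next((name for name, vs in VERTICAL.items() if vowel in vs), None)
    return row, height
```

For instance, `classify("i:")` yields the pair front and close narrow, while `classify("u")` yields back-advanced and close broad, which mirrors the way the two lists are read together.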

Another feature of English vowels which is sometimes included among the principles of classification is lip rounding. Traditionally three lip positions are distinguished, that is spread, neutral and rounded. For the purpose of classification it is sufficient to distinguish between two lip positions: rounded and unrounded, or neutral. The fact is that any back vowel in English is produced with rounded lips; the degree of rounding differs and depends on the height of the raised part of the tongue: the higher it is raised, the more rounded the lips are. So lip rounding is an indispensable phoneme-constitutive feature, because no back vowel can exist without it.

Another property of English vowel sounds, checkness, depends on the character of the articulatory transition from a vowel to a consonant. This kind of transition (VC) is very close in English. As a result all English short vowels are checked when stressed. The degree of checkness may vary and depends on the following consonant. Before a fortis voiceless consonant it is more perceptible than before a lenis voiced consonant or sonorant. All long vowels are free. The English monophthongs are traditionally divided into two varieties according to their length:

a) short vowels: [i], [e], [æ], [ɒ], [u], [ʌ], [ə];

b) long vowels: [i:], [a:], [ɔ:], [ɜ:], [u:].

A vowel like any sound has physical duration – time which is required for its production (articulation). When sounds are used in connected speech they cannot help being influenced by one another. Duration is one of the characteristics of a vowel which is modified by and depends on the following factors:

1. its own length;

2. the accent of the syllable in which it occurs;

3. phonetic context;

4. the position of the sound in a syllable;

5. the position in a rhythmic structure;

6. the position in a tone group;

7. the position in a phrase;

8. the position in an utterance;

9. the tempo of the whole utterance;

10. the type of pronunciation;

11. the style of pronunciation;

The problem the analysts are concerned with is whether variations in quantity or length are meaningful (relevant), that is, whether vowel length can be treated as a relevant feature of the English vowel system. Different scholars attach varying significance to vowel quantity.

The approach of D. Jones, an outstanding British phonetician, rests on the principle of the phonological relevance of vowel quantity. That means that words in such pairs as [bid] – [bi:d], [sit] – [si:t], [ful] – [fu:l], [′fɔ:wə:d] (foreword) – [′fɔ:wəd] (forward) are distinguished from one another by the opposition of different lengths, which D. Jones calls chronemes [15]. The difference in quantity is considered to be decisive and the difference in quality (the position of the active organ of speech) is considered to be subordinate to the difference in quantity. According to the point of view of V.A. Vassilyev, English is not a language in which chronemes as separate prosodic phonological units can exist [17].

One more articulatory characteristic needs our attention. That is tenseness. It characterizes the state of the organs of speech at the moment of production of a vowel. Special instrumental analysis shows that historically long vowels are tense while historically short vowels are lax.

Summarizing, we could say that the phonological analysis of the articulatory features of English vowels allows us to consider the following two characteristics functionally relevant:

a) stability of articulation;

b) tongue position;

The rest of the features mentioned above, that is lip position, character of vowel end, length and tenseness, are indispensable constituents of vowel quality. Though they have no phonological value, they are considerably important in teaching English phonetics. It is well known that a vowel in an unstressed syllable is perceived as very short, weak, and indistinct. The unstressed syllables are usually associated with vowels of central or centralized quality [ə], [i], sometimes [u] and the diphthongs [ɜu], [ai] (or a syllabic consonant), e.g. among [ə′mʌŋ], before [bi′fɔ:], useful [′ju:sful], tomato [tə′ma:tɜu], exercise [′eksəsaiz], sudden [′sʌdn].

Also vowels of full quality sometimes occur in unstressed positions, often in borrowed words of Latin and Greek origin, e.g. architect [′a:kitekt], paragraph [′pærəgra:f], canteen [kæn′ti:n]. These non-reduced vowels in unstressed syllables are typical of all styles of pronunciation. Then again partially reduced sounds are found in unstressed positions. They appear in a more formal and careful style of pronunciation instead of the neutral sound used in informal casual speech. Cf.: phonetics [fɜu′netiks – fɜ′netiks – fə′netiks] [18].

Our next point should be made in connection with the phonemic status of the neutral sound [ə]. The phonological analysis marks the opposition of the neutral sound to other unstressed vowels, the most common among them being [i]. In the minimal pairs officers [′ɒfisəz] – offices [′ɒfisiz], accept [ək′sept] – except [ik′sept], armour [′a:mə] – army [′a:mi] the neutral sound is phonologically opposed to the phoneme [i] with its own distinctive features capable of differentiating the meaning of lexical units.

So the neutral sound [ə] in officers, accept, armour is an independent phoneme opposed to the [i] phoneme of the minimal pairs given above.

On the other hand, the problem of the phonemic status of the neutral sound has a morphological aspect. In English there are numerous alternations of vowels in stressed and unstressed syllables between the derivatives of the same root or different grammatical forms of the same word. Cf.:

[æ] – [ə] man – sportsman

[ʌ] – [ə] some – wholesome

[ɒ] – [ə] combine (n) – combine (v)

[ei] – [ə] operation – operative

[ɜu] – [ə] post – postpone

The alternated sounds are allophones of one and the same phoneme as they are derivatives of the same lexical units, the same morphemes. Thus the neutral sounds in the examples above are the neutralized allophones of the non-reduced vowels of full formation; so [ə] in sportsman is an allophone of the [æ] phoneme as in man; [ə] in photography is an allophone of the [ɜu] phoneme as in photograph.

The modifications of vowels in a speech chain are traced in the following directions: they are either quantitative or qualitative or both. These changes of vowels in a speech continuum are determined by a number of factors such as the position of the vowel in the word, accentual structure, tempo of speech, rhythm, etc.

The decrease of the vowel quantity or in other words the shortening of the vowel length is known as a quantitative modification of vowels, which may be illustrated as follows:

1. The shortening of the vowel length occurs in unstressed positions, e.g. blackboard, sorrow (reduction). In these cases reduction affects both the length of the unstressed vowels and their quality.

Form words often demonstrate quantitative reduction in unstressed positions: e.g. Is he or ̖she to blame? – [hi:], but: At last he has ̖come – [hi].

2. The length of a vowel depends on its position in a word. It varies in different phonetic environments. English vowels are said to have positional length, e.g. knee – need – neat (accommodation). The vowel [i:] is the longest in the final position, it is obviously shorter before the lenis voiced consonant [d], and it is the shortest before the fortis voiceless consonant [t].

Qualitative modification of most vowels occurs in unstressed positions. Unstressed vowels lose their "color", their quality, which is illustrated by the examples below:

1. In unstressed syllables vowels of full value are usually subjected to qualitative changes, e.g. man – sportsman, 'conduct – con'duct.

In such cases the quality of the vowel is reduced to the neutral sound [ə].

Nearly one sound in five is either [ə] or the unstressed [i]. This high frequency of [ə] is the result of the rhythmic pattern: if unstressed syllables are given only a short duration, the vowel in them which might be otherwise full is reduced.

It is common knowledge that English rhythm prefers a pattern in which stressed syllables alternate with unstressed ones. The effect of this can be seen even in single words, where a shift of stress is often accompanied by a change of vowel quality; a full vowel becomes [ə], and [ə] becomes a full vowel.

Compare: analyse – analysis.

2. Slight degree of nasalization marks vowels preceded or followed by the nasal consonants [n], [m], e.g. never, no, then, men (accommodation).

The realization of reduction as well as assimilation and accommodation is connected with the style of speech. In rapid colloquial speech reduction may result in vowel elision, the complete omission of the unstressed vowel, which is also known as zero reduction.

Zero reduction is likely to occur in a sequence of unstressed syllables, e.g. history, factory, literature, territory. It often occurs in initial unstressed syllables preceding the stressed one, e.g. correct, believe, suppose, perhaps.

The example below illustrates a stage-by-stage reduction (including zero reduction) of a phrase.

Has he done it? [hæz hi· ,dʌn it] – [həz hi ,dʌn it] – [əz i ,dʌn it] – [z i ,dʌn it]

3. Sound Alternations: the sound variations in words, their derivatives and grammatical forms of words are known as sound alternations. It is perfectly obvious that sound alternations are caused by assimilation, accommodation and reduction in speech. Alternations of consonants are mainly due to contextual assimilations: the dark [ł] in spell alternates with the clear [l] in spelling. Vowel alternations are the result of the reduction in unstressed positions: combine ['kɒmbain] (n) – combine [kəm'bain] (v) where [ɒ] in the stressed syllable of the noun alternates with the neutral sound in the unstressed syllable of the verb. Some sound alternations are traced to the phonetic changes in earlier periods of the language development and are known as historical.

Sound alternations are also widespread on the synchronic level in present-day English and are known as contextual. In connection with contextual sound alternations there arises a problem of phonemic identification of alternated sounds. The functioning of sounds in different grammatical forms and derivatives of words seems very complicated and flexible. The study of the relationship between phonemes and morphemes is called morphophonemics.

The interrelation of phonology and morphology in linguistic investigations is also known as morphophonology or morphonology which is actually the phonology of morphemes. Morphonology studies the way in which sounds can alternate as different realizations of one and the same morpheme. A morpheme is a minimal unit of meaning. We would all agree that such words as windy, dusty, sunny consist of two morphemes. Similarly, demonstration, alternation have two component morphemes.

The meanings of wind, dust, sun as well as of demonstrate, alternate are obvious. But what function do the morphemes -y and -ion perform? On the basis of the examples, it appears that the function of -y is to convert a noun into an adjective. Similarly -ion converts a verb into a noun. These morphemes have a grammatical meaning whose main purpose is to convert one part of speech into another. Each set of data below exemplifies a sound alternation in one and the same morpheme of two different parts of speech: malice – malicious, active – activity, 'abstract – abs'tract etc.

We are interested now in the sound in its weak position. Vowels are said to be in their strong position when they are in stressed syllables and in the weak position when they are in the unstressed ones. Consonants may well be said to be in their strong position before vowels and in the intervocalic position; they are in weak positions when they are word final or precede other consonants.

There may be different solutions to the problem of phoneme identification in the weak position of alternated words. The question arises whether the sound [ə] in the words ac'tivity and con'trast is a neutral phoneme or an allophone of the [æ] or [ɒ] phonemes (as in 'active, 'contrast) which loses some of its distinctive features in the unstressed position.

The difference is quite essential: in the first case the neutral sound is identified as an independent neutral phoneme, in the second it is a neutralized allophone of the [æ] or [ɒ] phonemes of the corresponding alternated words.

The loss of one or more distinctive features of a phoneme in the weak position is called phonemic neutralization. In English, the voicing opposition is neutralized after the initial [s]. We are well aware of the fact that the phonemes [t] and [d], for example, contrast in most environments: initially (tick – Dick), finally (bid – bit), after nasals (bend – bent), after [l] (cold – colt). But after [s] no contrast between [t] and [d] is possible, nor is there a contrast between [p], [b] and [k], [g] in this environment. The voicing contrast is neutralized after initial [s].
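The environments in which [t] and [d] contrast can be checked mechanically against a word list. The Python sketch below uses a toy lexicon built from the examples above plus two s-initial words added for illustration; the transcriptions, the lexicon itself and the function name `contrasts_after` are assumptions, and the result holds only for this small sample.

```python
# Toy lexicon in broad transcription; every word is a tuple of segments.
LEXICON = {
    "tick": ("t", "i", "k"), "Dick": ("d", "i", "k"),
    "bit": ("b", "i", "t"), "bid": ("b", "i", "d"),
    "bent": ("b", "e", "n", "t"), "bend": ("b", "e", "n", "d"),
    "colt": ("k", "ɜu", "l", "t"), "cold": ("k", "ɜu", "l", "d"),
    "still": ("s", "t", "i", "l"), "stick": ("s", "t", "i", "k"),
}


def contrasts_after(lexicon, a, b, preceding):
    """Check whether segments a and b contrast right after `preceding`.

    `preceding` is the segment to the left, or None for word-initial
    position. A contrast exists when replacing a by b in that position
    yields another word of the lexicon.
    """
    forms = set(lexicon.values())
    for form in forms:
        for i, seg in enumerate(form):
            left = form[i - 1] if i > 0 else None
            if seg == a and left == preceding:
                swapped = form[:i] + (b,) + form[i + 1:]
                if swapped in forms:
                    return True
    return False
```

On this sample the function reports a contrast word-initially, after a vowel, after [n] and after [l], but none after [s], which is the pattern described as neutralization.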

1.3.2 General characteristics of consonant phonemes

In the English consonant system the following 24 consonant phonemes are distinguished: [p, b, t, d, k, g, f, v, θ, ð, s, z, ʃ, ʒ, h, tʃ, dʒ, m, n, ŋ, w, j, r, l]. The quality of the consonants depends on several aspects:

1) work of the vocal cords;

2) what cavity is used as a resonator;

3) the force of the articulation and some other factors.

There are a few ways of classifying English consonants. According to V.A. Vassilyev primary importance should be given to the type of obstruction and the manner of production of noise [19].

On this ground he distinguishes two large classes of consonants:

1. occlusive, in the production of which a complete obstruction is formed;

2. constrictive, in the production of which an incomplete obstruction is formed.

The phonological relevance of this feature could be exemplified in the following oppositions:

[ti:] – [si:] tea – sea (occlusive – constrictive)

[si:d] – [si:z] seed – seas (occlusive – constrictive)

[pul] – [ful] pull – full (occlusive – constrictive)

[bəut] – [vəut] boat – vote (occlusive – constrictive)

Each of the two classes is subdivided into noise consonants and sonorants. The division is based on the factor of prevailing either noise or tone component in the auditory characteristic of a sound. In their turn noise consonants are divided into plosive consonants (or stops) and affricates.

Another point of view is shared by M.A. Sokolova, K.P. Gintovt, G.S. Tikhonova, R.M. Tikhonova [19]. They suggest that the first and basic principle of classification should be the degree of noise. Such consideration leads to dividing English consonants into two general kinds: noise consonants and sonorants.

Sonorants are sounds that differ greatly from all other consonants of the language.

This is largely due to the fact that in their production the air passage between the two organs of speech is fairly wide, that is much wider than in the production of noise consonants.

As a result, the auditory effect is tone, not noise. This peculiarity of articulation makes sonorants sound more like vowels than consonants. On this ground some British phoneticians refer some of these consonants, [r], [j], [w], for example, to the class of semivowels. Acoustically sonorants are opposed to all other consonants because they are characterized by a sharply defined formant structure and the total energy of most of them is very high. However, on functional grounds, according to their position in the syllable, [r], [j], [w] are included in the consonantal category, but from the point of view of their phonetic description they are more appropriately treated as vowel glides.

The place of articulation is another characteristic of English consonants which should be considered from the phonological point of view. The place of articulation is determined by the active organ of speech against the point of articulation. According to this principle the English consonants are classed into: labial, lingual, glottal. The class of labial consonants is subdivided into:

a) bilabial;

b) labio-dental.

Among the class of lingual consonants three subclasses are distinguished. They are:

a) forelingual;

b) mediolingual;

c) backlingual;

The importance of this characteristic as phonologically relevant could be proved by means of a simple example. In the system of English consonants there could be found oppositions based on the active organ of speech and the place of obstruction:

[pæn] — [tæn] pan – tan (bilabial – forelingual)

[wai] – [lai] why – lie (bilabial – forelingual)

[weil] – [jeil] wail – Yale (bilabial – mediolingual)

[pik] – [kik] pick – kick (bilabial – backlingual)

[les] – [jes] less – yes (forelingual – mediolingual)

[dei] – [gei] day – gay (forelingual – backlingual)

[sai] – [hai] sigh – high (forelingual – glottal)

[fi:t] – [si:t] feet – seat (labio-dental – forelingual)
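The labels attached to these oppositions can be derived from a place-of-articulation table. The Python sketch below lists only the consonants that occur in the examples above; the table `PLACE` and the function `place_opposition` are hypothetical names, and words are assumed to be given as tuples of segments.

```python
# Place of articulation for the consonants used in the examples above.
PLACE = {
    "p": "bilabial", "w": "bilabial",
    "f": "labio-dental",
    "t": "forelingual", "d": "forelingual",
    "s": "forelingual", "l": "forelingual",
    "j": "mediolingual",
    "k": "backlingual", "g": "backlingual",
    "h": "glottal",
}


def place_opposition(word_a, word_b):
    """Return the place labels of the one pair of differing consonants.

    Returns None when the words are not a minimal pair or when a
    differing segment is missing from the PLACE table.
    """
    if len(word_a) != len(word_b):
        return None
    diffs = [(a, b) for a, b in zip(word_a, word_b) if a != b]
    if len(diffs) != 1:
        return None
    a, b = diffs[0]
    if a not in PLACE or b not in PLACE:
        return None
    return PLACE[a], PLACE[b]
```

Applied to pan and tan, the function returns the pair bilabial and forelingual, the same label as in the first opposition of the list.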

Our next point should be made in connection with another sound property, that is, the voiced – voiceless characteristic, which depends on the work of the vocal cords. It has long been believed that from the articulatory point of view the distinction between such pairs of consonants as [p, b], [t, d], [k, g], [s, z], [f, v] is based on the absence or presence of vibrations of the vocal cords, or on the absence or presence of the voice or tone component. However, there is also an energy difference. All voiced consonants are weak (lenis) and all voiceless consonants are strong (fortis).

According to the position of the soft palate consonants can be oral and nasal. There are relatively few consonantal types in English which require the lowered position of the soft palate. They are the nasal occlusive sonorants [m], [n] and [ŋ]. They differ from oral plosives in that the soft palate is lowered allowing the escape of air into the nasal cavity. It is a well-known fact that no differences of meaning in English can be attributed to the presence or absence of nasalization. It is for this reason that it cannot be a phonologically relevant feature of English consonants, so it is an indispensable concomitant feature of English nasal consonants.

Another problem of a phonological character in the English consonantal system is the problem of affricates that is their phonological status and their number.

Language in everyday use is not conducted in terms of isolated, separate units; it is performed in connected sequences of larger units, in words, phrases and longer utterances. Consonants are modified according to the place of articulation. Assimilation takes place when a sound changes its character in order to become more like a neighboring sound.

The characteristic which can vary in this way is nearly always the place of articulation, and the sounds concerned are commonly those which involve a complete closure at some point in the mouth that is plosives and nasals which may be illustrated as follows:

1. The dental [t], [d], followed by the interdental [θ], [ð] sounds (partial regressive assimilation, when the influence goes backwards from a "latter" sound to an "earlier" one), e.g. "eighth", "at the", "breadth", "said that";

2. The post-alveolar [t], [d] under the influence of the post-alveolar [r] (partial regressive assimilation), e.g. "tree", "true", "that right word", "dry", "dream", "the third room";

3. The post-alveolar [s], [z] before [ʃ] (complete regressive assimilation), e.g. horse-shoe, this shop, does she;

4. The affricative [t + j], [d + j] combinations (incomplete regressive assimilation), e.g. graduate, congratulate, did you, could you, what do you say.
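Regressive assimilation of place, as in item 3 of the list above, can be written as a simple rewriting rule over a sequence of segments. The Python sketch below is an illustration under stated assumptions: the transcriptions are broad, the mapping of [z] to [ʒ] is an assumed outcome (the text gives only the examples), and the names `POST_ALVEOLAR` and `assimilate_before_sh` are hypothetical.

```python
# Assumed post-alveolar counterparts of the alveolar fricatives.
POST_ALVEOLAR = {"s": "ʃ", "z": "ʒ"}


def assimilate_before_sh(segments):
    """Make [s] and [z] post-alveolar when the next segment is [ʃ].

    The influence runs backwards, from the later sound to the earlier
    one, so the rule is regressive.
    """
    out = list(segments)
    for i in range(len(out) - 1):
        if out[i] in POST_ALVEOLAR and out[i + 1] == "ʃ":
            out[i] = POST_ALVEOLAR[out[i]]
    return out
```

For this shop, entered as the segments ð, i, s, ʃ, ɒ, p, the rule changes only the [s] that stands immediately before [ʃ]; a sequence without [ʃ] is returned unchanged.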

The manner of articulation is also changed as a result of assimilation, which includes:

1. Loss of plosion: in the sequence of two plosive consonants the former loses its plosion: glad to see you, great trouble, an old clock (partial regressive assimilations).

2. Nasal plosion: in the sequence of a plosive followed by a nasal sonorant the manner of articulation of the plosive sound and the work of the soft palate are involved, which results in the nasal character of plosion release: sudden, not now, at night, let me see (partial regressive assimilations).

3. Lateral plosion: in the sequence of a plosive followed by the lateral sonorant [l] the noise production of the plosive stop is changed into that of the lateral stop: settle, table, at last (partial regressive assimilations). It is obvious that in each of the occasions one characteristic feature of the phoneme is lost.

The voicing value of a consonant may also change through assimilation. This type of assimilation affects the work of the vocal cords and the force of articulation. In particular, voiced lenis sounds become voiceless fortis when followed by a voiceless sound, e.g.:

1. The fortis voiceless/lenis voiced type of assimilation is best illustrated by regressive assimilation in such words as newspaper (news [z] + paper) and gooseberry (goose [s] + berry). Voicing assimilation is common in casual informal speech, e.g. have to do it, five past two. As the examples show, the sounds which assimilate their voicing are usually voiced lenis fricatives assimilated to the initial voiceless fortis consonant of the following word. Grammatical items are affected most: the [z] of has, is, does changes to [s], and the [v] of of, have becomes [f], e.g. She's five. Of course. She has fine eyes. You've spoiled it. Does Pete like it?

2. The weak forms of the verbs is and has are also assimilated to the final voiceless fortis consonants of the preceding word thus the assimilation is functioning in the progressive direction, e.g. Your aunt's coming. What’s your name? (partial progressive assimilation).

3. English sonorants [m, n, r, l, j, w] preceded by the fortis voiceless consonants [p, t, k, s] are partially devoiced, e.g. smart, snake, tray, quick, twins, play, pride (partial progressive assimilation).

Lip position may be affected by the accommodation, the interchange of consonant + vowel type.

Labialisation of consonants is traced under the influence of the neighboring back vowels (accommodation), e.g. pool, moon, rude, soon, who, cool, etc. It is possible to speak about the spread lip position of consonants followed or preceded by front vowels [i:], [i], e.g. tea – beat; meet – team; feat – leaf, keep – leak; sit – miss (accommodation). The position of the soft palate is also involved in the accommodation.

Slight nasalization as the result of prolonged lowering of the soft palate is sometimes traced in vowels under the influence of the neighboring sonant [m] and [n], e.g. and, morning, men, come in (accommodation).

Elision or complete loss of sounds, both vowels and consonants, is observed in the structure of English words. It is typical of rapid colloquial speech and marks the following sounds:

1. Loss of [h] in personal and possessive pronouns he, his, her, him and the forms of the auxiliary verb have, has, had is widespread, e.g. What has he done?

2. [l] tends to be lost when preceded by [ɔ:], e.g. always, already, all right.

3. Alveolar plosives are often elided in case the cluster is followed by another consonant, e.g. next day, just one, mashed potatoes. If a vowel follows, the consonant remains, e.g. first of all, passed in time. Whole syllables may be elided in rapid speech: library ['laibri], literary ['litri].
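The rule for alveolar plosives in item 3 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: words are given as lists of sound symbols, the vowel inventory `VOWEL_CHARS` is a toy set chosen for the examples, and the function name `elide_final_plosive` is hypothetical.

```python
# Toy vowel inventory (an assumption): a symbol counts as a vowel
# if its first character belongs to this set.
VOWEL_CHARS = set("aeiouæɑɒɔəɜɪʊʌ")
ALVEOLAR_PLOSIVES = {"t", "d"}


def is_vowel(symbol):
    """Return True if the sound symbol starts with a vowel character."""
    return symbol[0] in VOWEL_CHARS


def elide_final_plosive(word, next_word):
    """Drop a word-final [t]/[d] that closes a consonant cluster
    when the next word begins with a consonant; otherwise keep it."""
    ends_in_cluster = (
        len(word) >= 2
        and word[-1] in ALVEOLAR_PLOSIVES
        and not is_vowel(word[-2])
    )
    if ends_in_cluster and next_word and not is_vowel(next_word[0]):
        return list(word[:-1])
    return list(word)


# "next day": the cluster [kst] is followed by a consonant, so [t] is elided.
print(elide_final_plosive(["n", "e", "k", "s", "t"], ["d", "eɪ"]))
# "first of all": a vowel follows, so [t] remains.
print(elide_final_plosive(["f", "ɜː", "s", "t"], ["ə", "v"]))
```

The sketch treats elision as obligatory, whereas the text describes it as a tendency of rapid colloquial speech.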

Examples of historical elision are also known. They are initial consonants in write, know, knight, the medial consonant [t] in fasten, listen, whistle, castle.

While elision is a very common process in connected speech, we also occasionally find sounds being inserted. When a word which ends in a vowel is followed by another word beginning with a vowel, the so-called intrusive "r" is sometimes pronounced between the vowels, e.g. Asia and Africa, the idea of it [ði:ai'diər əvit], ma and pa ['ma:r ənd 'pa:]. The so-called linking "r" is a common example of insertion, e.g. clearer, a teacher of English. When the word-final vowel is a diphthong which glides to [i], such as [ai], [ei], the palatal sonorant [j] tends to be inserted, e.g. saying ['seijiŋ]; trying ['traijiŋ].

In case of the [U]-gliding diphthongs [əu], [au] the bilabial sonorant [w] is sometimes inserted, e.g. going ['gəuwiŋ], allowing [ə'lauwiŋ].

The process of inserting the sonorants [r], [j] or [w] may seem to contradict the tendency towards the economy of articulatory efforts. The explanation for it lies in the fact that it is apparently easier from the articulatory point of view to insert those sounds than to leave them out.

The insertion of a consonant-like sound, namely a sonorant, interrupts the sequence of two vowels (VV) and turns it into a more optimal syllable type: consonant + vowel (CV). Thus, insertion occurs in connected speech in order to facilitate the process of articulation for the speaker, and not as a way of providing extra information for the listener.
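The choice between the inserted sonorants [j] and [w] described above can be sketched as follows. This is a minimal illustration, assuming the broad transcription used in this chapter (diphthongs written as "ei", "ai", "əu", "au"); the names `linking_glide` and `join` and the toy vowel set are hypothetical, and linking or intrusive "r" is deliberately left out.

```python
# Toy set of vowel characters used to detect a vowel-initial second element.
VOWEL_CHARS = set("aeiouəæʌ")


def linking_glide(first, second):
    """Return the sonorant that tends to be inserted between a
    diphthong-final element and a vowel-initial element."""
    if not first or not second or second[0] not in VOWEL_CHARS:
        return ""          # no VV sequence, nothing to insert
    if first.endswith("i"):
        return "j"         # [i]-gliding diphthongs such as [ai], [ei]
    if first.endswith("u"):
        return "w"         # [u]-gliding diphthongs such as [əu], [au]
    return ""


def join(first, second):
    """Concatenate two transcriptions, inserting a glide where the VV rule applies."""
    return first + linking_glide(first, second) + second


print(join("'sei", "iŋ"))    # saying
print(join("'gəu", "iŋ"))    # going
print(join("ə'lau", "iŋ"))   # allowing
```

The sketch inspects only the last symbol of the first element, so it reflects the VV to CV repair described in the text rather than a full model of connected speech.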

The ability to produce English with an English-like pattern of stress and rhythm involves stress-timing (= the placement of stress only on selected syllables), which in turn requires speakers to take shortcuts in how they pronounce words.

Natural-sounding pronunciation in conversational English is achieved through blends, overlapping, reduction and omission of sounds to accommodate its stress-timed rhythmic pattern, i.e. to squeeze syllables between stressed elements and facilitate their articulation so that the regular timing can be maintained.

Such processes are called co-articulatory/adjustment phenomena and they comprise:

(1) change of consonant or vowel quality;

(2) loss of consonants or vowels, and even

(3) loss of entire syllables;

I must go = vowel change and consonant loss;

memory = vowel and syllable loss;

did you = consonant blending and vowel change;

actually = consonant blending, vowel and syllable loss.

Syllables or words which are articulated precisely are those high in information content, while those which are weakened, shortened or dropped are predictable and can be guessed from the context.

English consonants have been remarkably stable over time, and have undergone few changes in the last 1500 years. On the other hand, English vowels have been quite unstable. Not surprisingly, then, the main differences between modern dialects almost always involve vowels.

Around the late 14th century, English began to undergo the Great Vowel Shift, in which the high long vowels [i] and [u] in words like price and mouth became diphthongized, first to [əɪ] and [əʊ] (where they remain today in some environments in some accents such as Canadian English) and later to their modern values [aɪ] and [aʊ]. This is not unique to English, as this also happened in Dutch (first shift only, but in dialects and other non-standard varieties frequently both) and German (both shifts) [20].

The other long vowels became higher:

[e] became [i] (for example meet).

[a] became [e], later diphthongized to [eɪ] (for example name).

[o] became [u] (for example goose).

[ɔ] became [o], later diphthongized to [əʊ] (RP) and [oʊ] (GA) (for example bone).

Later developments complicate the picture: whereas in Geoffrey Chaucer's time food, good, and blood all had the vowel [o] and in William Shakespeare's time they all had the vowel [u], in modern pronunciation good has shortened its vowel to [ʊ] and blood has shortened and lowered its vowel to [ʌ] in most accents.

Speaking of English consonants it must be said that there are some problems of phonological character in the English consonantal system; it is the problem of affricates - their phonological status and their number. The question is: what kind of facts a phonological theory has to explain.

1) Are the English [t∫, dʒ] sounds monophonemic entities or biphonemic combinations (sequences, clusters)?

2) If they are monophonemic, how many phonemes of the same kind exist in English, or, in other words, can such clusters as [tr, dr] and [tθ, dð] be considered affricates?

This is not easy to define. One thing is clear: these sounds are complexes, because articulatorily we can distinguish two elements in them. Given this duality of affricates, it is necessary to analyze their relation to other consonant phonemes in order to define their status in the system. The problem of affricates is a point of considerable controversy among phoneticians. According to Russian specialists in English phonetics, there are two affricates in English: [t∫, dʒ]. D. Jones points out that there are six of them: [t∫, dʒ], [ts, dz] and [tr, dr]. A.C. Gimson increases their number by adding two more affricates: [tθ, dð]. Russian phoneticians look at English affricates through the eyes of a phoneme theory according to which a phoneme has three aspects: articulatory, acoustic and functional, the latter being the most significant one. As to British phoneticians, their primary concern is the articulatory-acoustic unity of these complexes [21].

Before looking at these complexes from a functional point of view, it is necessary to establish their articulatory indivisibility. According to N.S. Trubetzkoy, a sound complex may be considered monophonemic if: a) its elements belong to the same syllable; b) it is produced by one articulatory effort; c) its duration does not exceed the normal duration of either of its elements. Let us apply these criteria to the sound complexes [22].

1. Syllabic indivisibility

butcher [but∫-ə]                 lightship [lait-∫ip]

mattress [mætr-is]               footrest [fut-rest]

curtsey [kɜ:-tsi]                out-set [aut-set]

eighth [eitθ]                    whitethorn [wait-θɔ:n]

In the words in the left column the sounds [t∫], [tr], [ts], [tθ] belong to one syllable and cannot be divided into two elements by a syllable dividing line.

2. Articulatory indivisibility. Special instrumental analysis shows that all the sound complexes are homogeneous and produced by one articulatory effort.

3. Duration. With G.P. Torsuyev we could state that the length of sounds depends on their position in the phonetic context; therefore it cannot serve as a reliable basis for phonological analysis. He writes that the length of English [t∫] in the words chair and match is different; [t∫] in match is considerably longer than [t] in mat and may be even longer than [∫] in mash. This does not prove, however, that [t∫] is biphonemic [23].

According to the morphological criterion, a sound complex is considered monophonemic if a morpheme boundary cannot pass within it, because it is generally assumed that a phoneme is morphologically indivisible. If we consider [t∫], [dʒ] from this point of view, we can safely grant them monophonemic status, since they are indivisible. As to the [ts], [dz] and [tθ], [dð] complexes, their last elements are separate morphemes [s], [z], [θ], [ð], so these elements are easily singled out by the native speaker in any kind of phonetic context. These complexes do not correspond to the phonological models of the English language and cannot exist in the system of phonemes. The case of the [tr], [dr] complexes is still more difficult.
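The morphological criterion can be illustrated with a small check that asks whether a morpheme boundary ever falls between the two elements of a complex. This is a sketch under stated assumptions: the lexicon below is a hand-made toy sample, each word is a list of morphemes, each morpheme is a list of sound symbols, and the names `crosses_boundary` and `is_divisible` are hypothetical.

```python
# Toy lexicon (an assumption): each word is a list of morphemes,
# and each morpheme is a list of sound symbols.
LEXICON = {
    "cats":   [["k", "æ", "t"], ["s"]],
    "beds":   [["b", "e", "d"], ["z"]],
    "eighth": [["eɪ", "t"], ["θ"]],
    "match":  [["m", "æ", "t", "ʃ"]],
    "chair":  [["t", "ʃ", "eə"]],
}


def crosses_boundary(word, first, second):
    """True if `first` ends one morpheme and `second` begins the next."""
    for left, right in zip(word, word[1:]):
        if left and right and left[-1] == first and right[0] == second:
            return True
    return False


def is_divisible(first, second, lexicon):
    """True if a morpheme boundary passes within the complex in any word."""
    return any(crosses_boundary(w, first, second) for w in lexicon.values())


for first, second in [("t", "ʃ"), ("t", "s"), ("d", "z"), ("t", "θ")]:
    status = "biphonemic" if is_divisible(first, second, LEXICON) else "monophonemic"
    print(f"[{first}{second}]: {status}")
```

The verdict depends entirely on the sample: a lexicon that contained compounds would need an extra decision about whether a compound boundary counts as a morpheme boundary for this test.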

By way of conclusion we could say that two approaches have been adopted towards this phenomenon. The finding that there are eight affricates in English, [t∫], [dʒ], [tr], [dr], [ts], [dz], [tθ], [dð], is consistent with the articulatory and acoustic point of view, because in this respect the entities are indivisible. This is the way the British phoneticians see the situation.

On the other hand, Russian phoneticians are consistent in looking at the phenomenon from the morphological and the phonological point of view, which allows them to define [t∫], [dʒ] as monophonemic units and [tr], [dr], [ts], [dz], [tθ], [dð] as biphonemic complexes. However, this point of view runs the risk of ignoring their articulatory and acoustic indivisibility.

1.4 Main trends in the phoneme theory

1.4.1 Phonological schools

As has already been mentioned, the term “phoneme” appeared in the linguistic literature of the 19th century in the works of the Swiss linguist F. de Saussure. According to him a phoneme is the sum total of acoustic impressions and articulatory movements. The linguistic aspect is lacking in this definition. He ignores the sense-differentiating function of the phoneme (his physiologism) and draws a line between language and speech, considering language as a system of signs expressing ideas. His conceptions influenced a great number of linguists and schools.

The phoneme theory came into being in Russia. Its originator was Prof. I.A. Baudouin de Courtenay, the founder of the Kazan linguistic school. His work on the phoneme theory may be roughly subdivided into two periods.

Firstly, he considered a phoneme to be a component of a morpheme. He stated that one and the same morpheme was always represented by the same combination of sounds. He centered his attention mainly on the phenomenon of phonetic and historical alternations.

Secondly, he abandoned this conception in the 1890s and began to search for a unit not bound by the limits of a morpheme. He defined a phoneme as an idea of a sound which appears in the mind of a speaker before the sound is uttered. A speech sound is an invention of the scientists. What really exists is the perception of a sound, the complex perception of the articulatory movements, muscular sensation and acoustic impressions.

This complex perception is a phoneme. This theory was developed by Prof. Shcherba, Krushevsky and other Russian and foreign linguists. According to Shcherba, sounds must be studied not only from the acoustic point of view, but as sounds capable of distinguishing one word of a language from other words of the same language. They fulfill a communicative function in speech. According to Shcherba, a phoneme is realized in speech in concrete sound combinations, which he calls allophones. The most typical allophone, which may be pronounced in isolation, represents a speech element opposed to other sounds. It is the “tipichniy ottenok” (“typical shade”) [24].

The number of phonemes in a given language is defined by the principal members. In English there are 44 phonemes, in Russian – 36. Phonemic variants are very important, because they may develop into new phonemes or they may stop functioning. The theory of the phoneme was then further developed by Shcherba’s disciples. A phoneme is understood as a historical category. It functions in a language at a certain stage of its development. It may be characterized as a unity of different aspects: its material and objective aspects. It really exists in a language. It is a concrete sound, characterized by definite formation and definite acoustic qualities. It exists independently in the speech of all the members of the community; it does not depend on the will of an individual, it is obligatory for all, as it is a product of the historical development of a given collective body.

Thus, it is a social phenomenon. The phoneme has two main functions:

a) to serve as a material integument of words and morphemes;

b) to differentiate the meaning of words, their grammatical forms and morphemes.

The phoneme is the result of generalization. It is a dialectical unit of the general and the particular. It is realized in speech in concrete sound combinations as allophones, being at the same time something typical and general when opposed to other phonemes in speech.

The theory of the phoneme is being developed in two main directions in our country: the Moscow linguistic school and the Leningrad linguistic school. There are many different linguistic schools of the phoneme abroad: the Prague phonological school, the London phonological school, the American phonological school and the Copenhagen phonological school.

The phoneme theory was first formulated at the end of the 19th century. Its founder was Prof. I.A. Baudouin de Courtenay. Though his theory lacks consistency and has some drawbacks, it initiated the development of the phoneme theory in Russia as well as abroad. The opinions on the nature of the phoneme and the definition of the phoneme differ among scientists in our country and abroad.

The conceptions of the phoneme nowadays are numerous and varied. Nevertheless, they may be grouped and classified because some phonological conceptions have a number of features in common.

The various phonological schools chiefly differ in their solution to the two main problems of phonology:

1) the definition of the inventory of the phonemes of a given language and

2) the definition of the phonemic status of speech sounds in unstressed positions.

The phoneme theory in Russia is developing in two directions. Hence, two phonological schools are distinguished here: the Moscow School and the Leningrad School.

To the Moscow School belong R.I. Avanessov, A.A. Reformatsky, P.S. Kuznetsov, N.F. Yakovlev, V.N. Sidorov and their supporters. They have developed Baudouin’s morphological conception of the early period. They investigate the phoneme mostly on the basis of the Russian language.

To the Leningrad School belong L.V. Shcherba and his followers (L.R. Zinder, O.I. Dikushina, M.I. Matusevitch, V.A. Vassilyev, G.P. Torsuyev and others). They investigate the problem on the basis of foreign languages.

Prof. L.V. Shcherba has adopted and developed I.A. Baudouin de Courtenay’s psychological conception of the late period. Continuing the work of his teacher L.V. Shcherba has created a truly materialistic phoneme theory and was the first to advance the idea of the distinctive function of the phoneme.

The representatives of the Moscow phonological school consider that the same speech sound may belong to different phonemes. For instance, the following pairs of words are pronounced identically:

лук – луг                                сидеть – седеть

рос – роз                                серп - серб

рот – род                                пять – пядь

кос – коз                                 рок - рог

бок – бог                                 бачок - бочок

вот – вод                                 предать – придать

(The voiced consonants in final position are devoiced; the vowels in unstressed position are reduced.)

According to the Moscow School the [к] sound of the word “лук” is an allophone of the [к] phoneme, whereas the [к] sound in the word “луг” is an allophone of the [г] phoneme. Similarly, the [ʌ] sound of the word “бачок” is an allophone of the [a] phoneme, but the [ʌ] sound of the word “бочок” is an allophone of the [o] phoneme.

According to the Moscow School the neutral vowel sound in “progressive” [prə’gresiv] belongs to the English [ou] phoneme because [ou] occurs in a stressed position in “progress” [’prougres]. The neutral vowel sound in “activity” [ək’tiviti] belongs to the English [æ] phoneme because [æ] occurs in a stressed position in “act” [ækt]. The neutral vowel sound in “gooseberry” [guzbəri] belongs to the [e] phoneme, because [e] occurs in a stressed position in “berry” [beri]. Similarly, the [z] sound in the word “gooseberry” [guzbəri] belongs to the [s] phoneme, because [s] is used in a strong position in “goose” [gu:s]. But the [s] sound in the word “newspaper” [’nju:speipə] belongs to the [z] phoneme because [z] is used in a strong position in “news” [nju:z].

The representatives of the Leningrad phonological school consider that the [к] sounds of the words “лук” and “луг” are allophones of the [к] phoneme. The neutral sounds of the words “бочок” and “бачок” are allophones of the neutral vowel phoneme [ʌ].

According to the Leningrad School the neutral vowel sounds in the words “progressive” [prə’gresiv], “activity” [ək’tiviti], “gooseberry” [guzbəri], etc. belong to the neutral phoneme [ə]. Accordingly, the [s] sounds in the words “goose” [gu:s] and “newspaper” [’nju:speipə] belong to the [s] phoneme, whereas the [z] sounds in the words “gooseberry” [guzbəri] and “news” [nju:z] belong to the [z] phoneme.
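The difference between the two schools can be made concrete with a short sketch. It assumes that each occurrence of a sound is recorded together with the sound that the same morpheme shows in a strong position; the class `Occurrence` and both function names are hypothetical and serve only to restate the examples given above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Occurrence:
    word: str
    sound: str            # the sound actually pronounced in the weak position
    strong_position: str  # the sound of the same morpheme in a strong position


def moscow_phoneme(o):
    """Moscow School: the sound belongs to the phoneme found in the strong position."""
    return o.strong_position


def leningrad_phoneme(o):
    """Leningrad School: the sound belongs to the phoneme it actually realizes."""
    return o.sound


EXAMPLES = [
    Occurrence("лук", "к", "к"),
    Occurrence("луг", "к", "г"),
    Occurrence("gooseberry", "z", "s"),
    Occurrence("newspaper", "s", "z"),
]

for o in EXAMPLES:
    print(o.word, "Moscow:", moscow_phoneme(o), "Leningrad:", leningrad_phoneme(o))
```

The two analyses coincide whenever the weak and the strong position show the same sound, as in “лук”, and diverge only where neutralization has taken place.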

The Leningrad School analyses and investigates sounds as real speech units, which is of great practical value in the process of teaching a foreign language to students.

There is a third phonological school, known as the Prague Linguistic Circle. To this school belong N.S. Trubetzkoy, R. Jakobson, A. Martinet and others. Its originator was N.S. Trubetzkoy. He became acquainted with Baudouin’s phoneme theory when he was studying at Moscow University. He admits that his own theory is a development of Baudouin de Courtenay’s and Shcherba’s phoneme systems.

One of the main points of his theory is that of archiphonemes. According to N.S. Trubetzkoy the archiphoneme is a combination of distinctive features common to two phonemes. For instance, the speech sounds  [к] and  [г] (in the words “лук” and “луг” and «кот», «год») differ only by the work of the vocal cords but possess the following identical features: (1) plosive, (2) backlingual. These two common features are called relevant and they constitute the archiphoneme to which both [к] and [г] belong. It is neither voiced nor voiceless and is designated by the capital letter [К]. According to N.S. Trubetzkoy, a speech sound is a combination of all the features, both relevant and irrelevant, while the archiphoneme is a combination of only relevant features [25].
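The archiphoneme as described here amounts to the intersection of two feature sets. The following sketch assumes that each sound is represented by a set of feature labels taken from the paragraph above; the function name `archiphoneme` is hypothetical.

```python
# Feature sets for the Russian sounds discussed above
# (the labels follow the text; the representation is an assumption).
K_SOUND = frozenset({"plosive", "backlingual", "voiceless"})
G_SOUND = frozenset({"plosive", "backlingual", "voiced"})


def archiphoneme(a, b):
    """Return the features common to both sounds, i.e. the relevant features."""
    return a & b


def irrelevant_features(a, b):
    """Return the features by which the two sounds differ."""
    return a ^ b


print(sorted(archiphoneme(K_SOUND, G_SOUND)))
print(sorted(irrelevant_features(K_SOUND, G_SOUND)))
```

The result contains neither "voiced" nor "voiceless", which matches the statement that the archiphoneme is neither of the two.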

The London phonological school is represented by Prof. D. Jones of London University. In his monograph “The Phoneme: Its Nature and Use” he says that the phoneme theory was first introduced to him in 1911 by L.V. Shcherba of Leningrad. D. Jones’ own definition of the phoneme is as follows: “… a phoneme is a family of sounds in a given language which are related in character and are used in such a way that no one member ever occurs in the same phonetic context as any other member” [26].

In this and other definitions of the phoneme he does not mention the distinctive function of the phoneme but he tells about it in his later works.

In his work “The Phoneme: Its Nature and Use” D. Jones develops the so-called “atomistic” conception of the phoneme. According to it he breaks up the phonemes into atoms, which are different features of the phonemes, such as quality, length, tone, etc. Such distinctive features exist independently of each other. Jones’ atomistic theory is criticized because one distinctive feature cannot exist apart from all the others. For example, length by itself is an abstraction, while a long sound is a reality [26].

The American phonological school is headed by Leonard Bloomfield and Edward Sapir. Here also belong W.F. Twaddell, Ch. F. Hockett and others.

Bloomfield’s definition of the phoneme runs as follows: “…a minimum unit of distinctive sound-features…” [27].

W.F. Twaddell defines it as “an abstractional fiction”. The representatives of the American phonological school tend more and more to develop an abstractional view of the phoneme [27]. Ch. F. Hockett says that language may be compared to any system of codes, such as the Morse code or the waving flags code [27].

The Copenhagen trend is known as structuralism. Its representatives treat the phoneme mathematically: they consider the phoneme in mathematical ratios and compare language with a system of signs. Their approach is synchronic as well.

1.4.2 Methods of phonological analysis

The aim of the phonological analysis is, firstly, to determine which differences of sounds are phonemic (i.e. relevant for the differentiation of the phonemes) and which are non-phonemic and, secondly, to find the inventory of the phonemes of this or that language.

A number of principles have been established for ascertaining the phonemic structure of a language. For an unknown language the procedure of identifying the phonemes as the smallest language units has several stages. The first step is to determine the minimum recurrent segments (segmentation of the speech continuum) and to record them graphically by means of allophonic transcription. To do this an analyst gathers a number of sound sequences with different meanings and compares them. For example, the comparison of [stik] and [sti:k] reveals the segments (sounds) [i] and [i:], the comparison of [stik] and [spik] reveals the segments [st] and [sp], and the further comparison of these two with [tik] and [tæk], [sik] and [si:k] splits these segments into smaller segments [s], [t], [p]. If we try to divide them further, there is no comparison that allows us to divide [s] or [t] or [p] into two, and we have therefore arrived at the minimal segments. From what we have shown it follows that it is possible to single out the minimal segments by opposing them to one another in the same phonetic context or, in other words, in sequences which differ in one element only.

The next step in the procedure is the arranging of sounds into functionally similar groups. We do not know yet what sounds are contrastive in this language and what sounds are merely allophones of one and the same phoneme. There are two most widely used methods of finding it out. They are the distributional method and the semantic method. The distributional method is mainly used by phoneticians of "structuralist" persuasions [28].

These phoneticians consider it possible to group all the sounds pronounced by native speakers into phonemes according to the two laws of phonemic and allophonic distribution.

These laws were discovered long ago and are as follows:

1. Allophones of different phonemes occur in the same phonetic context.

2. Allophones of the same phoneme never occur in the same phonetic context.

The fact is that the sounds of a language combine according to a certain pattern characteristic of this language. Phonemic opposability depends on the way the phonemes are distributed in their occurrence. That means that in any language certain sounds do not occur in certain positions.

If more or less different sounds occur in the same phonetic context they should be allophones of different phonemes. In this case their distribution is contrastive.

If more or less similar speech sounds occur in different positions and never occur in the same phonetic context they are allophones of one and the same phoneme. In this case their distribution is complementary.
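The two laws of distribution can be sketched as a comparison of the environments in which two sounds occur. The example below is an illustration under stated assumptions: the corpus is a toy list of transcriptions, an environment is taken to be the pair of immediately neighbouring symbols with "#" marking a word boundary, and the function names `environments` and `distribution` are hypothetical.

```python
def environments(corpus, sound):
    """Collect every (preceding, following) context in which `sound` occurs."""
    found = set()
    for word in corpus:
        padded = ["#"] + list(word) + ["#"]   # "#" marks the word boundary
        for i in range(1, len(padded) - 1):
            if padded[i] == sound:
                found.add((padded[i - 1], padded[i + 1]))
    return found


def distribution(corpus, a, b):
    """Classify two sounds purely by the overlap of their environments."""
    env_a, env_b = environments(corpus, a), environments(corpus, b)
    if not env_a or not env_b:
        return "unattested"
    return "contrastive" if env_a & env_b else "complementary"


# Toy corpus: pin, bin, spin, with aspirated [pʰ] and unaspirated [p] kept apart.
CORPUS = [["pʰ", "i", "n"], ["b", "i", "n"], ["s", "p", "i", "n"]]

print(distribution(CORPUS, "pʰ", "b"))   # same context "#_i"
print(distribution(CORPUS, "pʰ", "p"))   # never in the same context
```

The classification is purely distributional and ignores meaning, so a larger corpus and a wider or narrower definition of "environment" could change the verdict.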

Still there are cases when two sounds are in complementary distribution but are not referred to the same phoneme.

This is the case with the English [h] and [ŋ]. The [h] occurs only initially or before a vowel, while [ŋ] occurs only medially or finally after a vowel and never occurs initially. In such cases the method of distribution is modified by the addition of the criterion of phonetic similarity/dissimilarity. The decisions are not made purely on distributional grounds. Articulatory features are taken into account as well.

So far we have considered cases when the distribution of sounds was either contrastive or complementary. There is, however, a third possibility, namely, that the sounds both occur in a language but the speakers are inconsistent in the way they use them. In such cases we must take them as free variants of a single phoneme. We could explain it on the basis of "dialect" or on the basis of sociolinguistics. It could be that one variant is a "prestige" form which the speaker uses when he is constantly "monitoring" what he says while the other variant of pronunciation is found in casual or less formal speech.

The semantic method is applied for phonological analysis of both unknown languages and languages already described. In case of the latter it is used to determine the phonemic status of sounds which are not easily identified from phonological point of view. The method is based on a phonemic rule that phonemes can distinguish words and morphemes when opposed to one another. The semantic method of identifying the phonemes of a language attaches great significance to meaning.

It consists in the systematic substitution of one sound for another in order to ascertain in which cases, where the phonetic context remains the same, such substitution leads to a change of meaning. The change of meaning is established with the help of an informant. This procedure is called the commutation test.

It consists in finding minimal pairs of words and their grammatical forms. For example, an analyst arrives at the sequence [pin]. He replaces the sound [p] with the sound [b], [s], [d] or [w]. The substitution leads to a change of meaning, cf.: pin, bin, sin, din, win. This is strong evidence that [p], [b], [s], [d], [w] can be regarded as allophones of different phonemes.
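The commutation test can be sketched as a substitution loop over one position of a sound sequence. In this illustration a small dictionary stands in for the informant who confirms a change of meaning; the dictionary contents, the function name `commutation_test` and its parameters are assumptions made for the example.

```python
# Toy lexicon playing the role of the informant (an assumption):
# it maps a transcription to the word it means.
LEXICON = {
    ("p", "i", "n"): "pin",
    ("b", "i", "n"): "bin",
    ("s", "i", "n"): "sin",
    ("d", "i", "n"): "din",
    ("w", "i", "n"): "win",
}


def commutation_test(frame, position, substitutes, lexicon):
    """Replace the sound at `position` by each substitute and report
    the substitutions that yield a different attested word."""
    original = lexicon[tuple(frame)]
    changes = {}
    for sound in substitutes:
        candidate = list(frame)
        candidate[position] = sound
        meaning = lexicon.get(tuple(candidate))
        if meaning is not None and meaning != original:
            changes[sound] = meaning
    return changes


result = commutation_test(("p", "i", "n"), 0, ["b", "s", "d", "w", "z"], LEXICON)
print(result)
```

A substitute such as [z], which yields no word known to the informant, is simply not reported, so it provides no evidence in either direction.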

To establish the phonemic structure of a language it is necessary to establish the whole system of oppositions. All the sounds should be opposed in word-initial, word-medial and word-final positions. There are three kinds of oppositions. If members of the opposition differ in one feature the opposition is said to be single, e.g. pen – Ben. Common features are: occlusive – occlusive, labial – labial. Differentiating features are: fortis – lenis. If two distinctive features are marked, the opposition is said to be double, e.g. pen – den. Common features are: occlusive – occlusive. Differentiating features are: labial – lingual, fortis voiceless – lenis voiced.

If three distinctive features are marked the opposition is said to be triple, e.g. pen – then. Differentiating features are: occlusive – constrictive, labial – dental, fortis voiceless – lenis voiced.
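The classification of oppositions as single, double or triple comes down to counting the features in which two phonemes differ. The sketch below assumes a three-feature description (manner, place, force) with the labels used in the examples above; the table `FEATURES` and the function name `opposition` are hypothetical.

```python
# Feature descriptions using the labels from the examples above
# (the three-feature layout itself is an assumption).
FEATURES = {
    "p": {"manner": "occlusive", "place": "labial", "force": "fortis voiceless"},
    "b": {"manner": "occlusive", "place": "labial", "force": "lenis voiced"},
    "d": {"manner": "occlusive", "place": "lingual", "force": "lenis voiced"},
    "ð": {"manner": "constrictive", "place": "dental", "force": "lenis voiced"},
}

KINDS = {1: "single", 2: "double", 3: "triple"}


def opposition(a, b):
    """Return the kind of opposition and the list of differentiating features."""
    differing = [f for f in FEATURES[a] if FEATURES[a][f] != FEATURES[b][f]]
    return KINDS.get(len(differing), "no opposition"), differing


print(opposition("p", "b"))   # pen – Ben
print(opposition("p", "d"))   # pen – den
print(opposition("p", "ð"))   # pen – then
```

The outcome depends on how finely the features are drawn, so a different feature inventory could assign the same pair to a different kind of opposition.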

Phonemes are the linguistically contrastive or significant sounds (or sets of sounds) of a language. Such a contrast is usually demonstrated by the existence of minimal pairs or contrast in identical environment. Minimal pairs are pairs of words which vary only by the identity of the segment (another word for a single speech sound) at a single location in the word (e.g. [mæt] and [kæt]). If two segments contrast in identical environment then they must belong to different phonemes. A paradigm of minimal phonological contrasts is a set of words differing only by one speech sound. In most languages it is rare to find a paradigm that contrasts a complete class of phonemes (e.g. all vowels, all consonants, all stops etc.). The English stop consonants could be defined by the following set of minimally contrasting words:

1) [pin] vs [bin] vs [tin] vs [din] vs [kin].

Only [ɡ] does not occur in this paradigm and at least one minimal pair must be found with each of the other 5 stops to prove conclusively that it is not a variant form of one of them.

2) [ɡæn] vs [pæn] vs [bæn] vs [tæn] vs [dæn].

Again, only five stops belong to this paradigm. A single minimal pair contrasting [ɡ] and [k] is required now to fully demonstrate the set of English stop consonants.

3) [ɡein] vs [kein].

Sometimes it is not possible to find a minimal pair which would support the contrastiveness of two phonemes, and it is necessary to resort to examples of contrast in analogous environment (C.A.E.). A C.A.E. pair is almost a minimal pair; however, the pair of words differs by more than just the pair of sounds in question. Preferably, the other points of variation in the pair of words are as remote as possible (certainly never adjacent and preferably not in the same syllable) from the environment of the pair of sounds being tested. For example, [ʃ] vs [ʒ] in English are usually supported by pairs such as "pressure" [preʃə] vs "treasure" [treʒə], where only the initial consonants differ and are sufficiently remote from the opposition being examined to be considered unlikely to have any conditioning effect on the selection of phones. The only true minimal pairs for these two sounds in English involve at least one word (often a proper noun) that has been borrowed from another language (e.g. "Confucian" [kənfjʉʃən] vs "confusion" [kənfjʉʒən], and "Aleutian" [əlʉʃən] vs "allusion" [əlʉʒən]). A syntagmatic analysis of a speech sound, on the other hand, identifies a unit's identity within a language. In other words, it indicates all of the locations or contexts within the words of a particular language where the sound can be found [29].

Allophones are the linguistically non-significant variants of each phoneme. In other words a phoneme may be realized by more than one speech sound and the selection of each variant is usually conditioned by the phonetic environment of the phoneme. Occasionally allophone selection is not conditioned but may vary from person to person and occasion to occasion (i.e. free variation). A phoneme is a set of allophones or individual non-contrastive speech segments. Allophones are sounds, whilst a phoneme is a set of such sounds.

Allophones are usually relatively similar sounds which are in mutually exclusive or complementary distribution (C.D.). The C.D. of two phones means that the two phones can never be found in the same environment (i.e. the same environment in the senses of position in the word and the identity of adjacent phonemes). If two sounds are phonetically similar and they are in C.D. then they can be assumed to be allophones of the same phoneme.
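The test for complementary distribution can be illustrated with a minimal sketch. It assumes that an environment is simply the pair of adjacent phones (with "#" marking a word edge) and uses a small toy word list with assumed broad transcriptions of cat, kit, school and skill; the function names are hypothetical and serve only this illustration.

```python
def environments(words, phone):
    """Collect the (previous, next) contexts in which `phone` occurs.

    Each word is a list of phones; '#' marks a word edge.
    """
    envs = set()
    for word in words:
        padded = ["#"] + list(word) + ["#"]
        for i in range(1, len(padded) - 1):
            if padded[i] == phone:
                envs.add((padded[i - 1], padded[i + 1]))
    return envs


def in_complementary_distribution(words, phone_a, phone_b):
    """True when both phones are attested and never share an environment."""
    env_a = environments(words, phone_a)
    env_b = environments(words, phone_b)
    return bool(env_a) and bool(env_b) and env_a.isdisjoint(env_b)


# Toy data (assumed transcriptions): cat, kit, school, skill
TOY_WORDS = [
    ["kʰ", "æ", "t"],
    ["kʰ", "ɪ", "t"],
    ["s", "k", "uː", "l"],
    ["s", "k", "ɪ", "l"],
]

print(in_complementary_distribution(TOY_WORDS, "kʰ", "k"))
```

A result of True here only shows that the two phones never share an environment in the sample; phonetic similarity still has to be judged separately before the phones are assigned to one phoneme.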

E.g.: in many languages voiced and voiceless stops with the same place of articulation do not contrast linguistically but are rather two phonetic realizations of a single phoneme (i.e. /p/=[p,b],/t/=[t,d], and /k/=[k,ɡ]). In other words, voicing is not contrastive (at least for stops) and the selection of the appropriate allophone is in some contexts fully conditioned by phonetic context (e.g. word medially and depending upon the voicing of adjacent consonants), and is in some contexts either partially conditioned or even completely unconditioned (e.g. word initially, where in some dialects of a language the voiceless allophone is preferred, in others the voiced allophone is preferred, and in others the choice of allophone is a matter of individual choice).

Some French speakers choose to use the alveolar trill [r] when in the village and the more prestigious uvular trill [ʀ] when in Paris. Such a choice is made for sociological reasons.

Allophones must be phonetically similar to each other. In analysis, this means you can assume that highly dissimilar sounds are separate phonemes (even if they are in complementary distribution). For this reason no attempt is made to find minimal pairs which contrast vowels with consonants. Exactly what can be considered phonetically similar may vary somewhat from language family to language family and so the notion of phonetic similarity can seem to be quite unclear at times.

Sounds can be phonetically similar from both articulatory and auditory points of view and for this reason one often finds a pair of sounds that vary greatly in their place of articulation but are sufficiently similar auditorily to be considered phonetically similar (e.g. [h] and [ç] are voiceless fricatives which are distant in terms of glottal and palatal places of articulation, but which nevertheless are sufficiently similar auditorily to be allophones of a single phoneme in some languages such as Japanese).

In English, [h] and [ŋ] are in complementary distribution. The sound [h] only ever occurs at the beginning of a syllable (head, heart, enhance, perhaps) whilst [ŋ] only ever occurs at the end of a syllable (sing, singer, finger). They are, however, so dissimilar that no one regards them as allophones of the one phoneme. They vary in place and manner of articulation, as well as voicing. Further the places of articulation (velar versus glottal) are quite remote from each other and [h] is oral whilst [ŋ] is nasal.

Phonetic similarity is therefore based on the notion of shared features. Such judgments of similarity will vary from language to language and there are no universal criteria of similarity. Although it is implied above that the notion of "phonetic similarity" is in some way less linguistically abstract (more phonetic) than the notion of complementary distribution, it is, nevertheless, a quite abstract concept. There are no obvious and consistent acoustic, auditory or articulatory criteria for phonetic similarity. Further, since a notion of similarity implies a continuum the following question must be asked of two phones in complementary distribution.

There are many examples of very similar phones which are perceived by native speakers to belong to separate phonemes. In English, for example, a word terminal voiceless stop may be either released and aspirated or unreleased. The homorganic voiced stop may also be released or unreleased. Often the unreleased voiced and voiceless stops may actually be identical in every way except that the preceding vowel is lengthened before the phonologically voiced stop. In terms of phonetic similarity, the two unreleased stops may actually be identical and yet be perceived by native speakers to belong to different phonemes.

1.5 Phonemes in sign languages

Phonemes are conventionally placed between slashes in transcription, whereas speech sounds (phones) are placed between square brackets. Thus /pʊʃ/ represents a sequence of three phonemes /p/, /ʊ/, /ʃ/ (the word push in standard English), while [pʰʊʃ] represents the phonetic sequence of sounds [pʰ] (aspirated "p"), [ʊ], [ʃ] (the usual pronunciation of push). (Another similar convention is the use of angle brackets to enclose the units of orthography, namely graphemes; for example, 〈f〉 represents the written letter (grapheme) f.)

The symbols used for particular phonemes are often taken from the International Phonetic Alphabet (IPA), the same set of symbols that are most commonly used for phones. (For computer typing purposes, systems such as X-SAMPA and Kirshenbaum exist to represent IPA symbols in plain text.) However descriptions of particular languages may use different conventional symbols to represent the phonemes of those languages. For languages whose writing systems employ the phonemic principle, ordinary letters may be used to denote phonemes, although this approach is often hampered by the complexity of the relationship between orthography and pronunciation.
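As an illustration of such plain-text systems, the sketch below converts a few IPA symbols to X-SAMPA. The table covers only a small subset of the symbols needed for the examples in this section, and the function name is hypothetical; symbols that are absent from the table are passed through unchanged, which is an assumption that holds for ordinary lower-case letters such as p, s or l.

```python
# Partial IPA -> X-SAMPA table (a small subset, for illustration only)
IPA_TO_XSAMPA = {
    "ʃ": "S", "ʒ": "Z", "ŋ": "N", "θ": "T", "ð": "D",
    "ɪ": "I", "ʊ": "U", "ə": "@", "ʌ": "V", "æ": "{",
    "ʰ": "_h",  # aspiration diacritic
    "ː": ":",   # length mark
}


def to_xsampa(ipa):
    """Replace each known IPA character; leave all other characters as they are."""
    return "".join(IPA_TO_XSAMPA.get(ch, ch) for ch in ipa)


print(to_xsampa("pʊʃ"))
print(to_xsampa("pʰʊʃ"))
```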

A phoneme is a sound or a group of different sounds perceived to have the same function by speakers of the language or dialect in question. An example is the English phoneme /k/, which occurs in words such as cat, kit, school, skill. Although most native speakers do not notice this, in most English dialects the "c/k" sounds in these words are not identical: in cat and kit the sound is aspirated, while in school and skill it is unaspirated. The words therefore contain different speech sounds, or phones, transcribed [kʰ] for the aspirated form, [k] for the unaspirated one. These different sounds are nonetheless considered to belong to the same phoneme, because if a speaker used one instead of the other, the meaning of the word would not change: using the aspirated form [kʰ] in skill might sound odd, but the word would still be recognized. By contrast, some other sounds would cause a change in meaning if substituted: for example, substitution of the sound [t] would produce the different word still, and that sound must therefore be considered to represent a different phoneme (the phoneme /t/).

The above shows that in English, [k] and [kʰ] are allophones of a single phoneme /k/. In some languages, however, [kʰ] and [k] are perceived by native speakers as different sounds, and substituting one for the other can change the meaning of a word; this means that in those languages, the two sounds represent different phonemes. For example, in Icelandic, [kʰ] is the first sound of kátur meaning "cheerful", while [k] is the first sound of gátur meaning "riddles". Icelandic therefore has two separate phonemes /kʰ/ and /k/.

A pair of words like kátur and gátur (above) that differ only in one phone is called a minimal pair for the two alternative phones in question (in this case, [kʰ] and [k]).

The existence of minimal pairs is a common test to decide whether two phones represent different phonemes or are allophones of the same phoneme. To take another example, the minimal pair tip and dip illustrates that in English, [t] and [d] belong to separate phonemes, /t/ and /d/; since these two words have different meanings, English speakers must be conscious of the distinction between the two sounds. In other languages, though, including Korean, even though both sounds [t] and [d] occur, no such minimal pair exists.

The lack of minimal pairs distinguishing [t] and [d] in Korean provides evidence that in this language they are allophones of a single phoneme /t/.

The word /tata/ is pronounced [tada], for example. That is, when they hear this word, Korean speakers perceive the same sound in both the beginning and middle of the word, whereas an English speaker would perceive different sounds in these two locations.
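The minimal pair test can be sketched in code. The example below assumes a tiny hypothetical lexicon in which each word is stored as a tuple of phones; two words count as a minimal pair when they have the same length and differ in exactly one position. The lexicon and the function names are illustrative assumptions, not an established tool.

```python
from itertools import combinations


def differing_positions(a, b):
    """Return the indices where two transcriptions differ, or None if lengths differ."""
    if len(a) != len(b):
        return None
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]


def minimal_pairs(lexicon):
    """List (word1, word2, phone1, phone2) for every pair differing in one segment."""
    pairs = []
    for (w1, p1), (w2, p2) in combinations(sorted(lexicon.items()), 2):
        diff = differing_positions(p1, p2)
        if diff is not None and len(diff) == 1:
            i = diff[0]
            pairs.append((w1, w2, p1[i], p2[i]))
    return pairs


# Hypothetical toy lexicon with assumed broad transcriptions
LEXICON = {
    "tip": ("t", "ɪ", "p"),
    "dip": ("d", "ɪ", "p"),
    "sum": ("s", "ʌ", "m"),
    "sun": ("s", "ʌ", "n"),
    "sung": ("s", "ʌ", "ŋ"),
}

for pair in minimal_pairs(LEXICON):
    print(pair)
```

An empty result for a given pair of phones does not by itself prove that they are allophones; it only shows that the sample contains no contrast between them.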

However, the absence of minimal pairs for a given pair of phones does not always mean that they belong to the same phoneme: they may be too dissimilar phonetically for it to be likely that speakers perceive them as the same sound. For example, English has no minimal pair for the sounds [h] (as in hat) and [ŋ] (as in bang), and the fact that they can be shown to be in complementary distribution could be used to argue for them being allophones of the same phoneme. However, they are so dissimilar phonetically that they are considered separate phonemes.

Phonologists have sometimes had recourse to "near minimal pairs" to show that speakers of the language perceive two sounds as significantly different even if no exact minimal pair exists in the lexicon. It is virtually impossible to find a minimal pair to distinguish English [ʃ] from [ʒ], yet it seems uncontroversial to claim that the two consonants are distinct phonemes. The two words 'pressure' [preʃə] and 'pleasure' [pleʒə] can serve as a near minimal pair.

While phonemes are normally conceived of as abstractions of discrete segmental speech sounds (vowels and consonants), there are other features of pronunciation – principally tone and stress – which in some languages can change the meaning of words in the way that phoneme contrasts do, and are consequently called phonemic features of those languages.

Phonemic stress is encountered in languages such as English. For example, the word invite stressed on the second syllable is a verb, but when stressed on the first syllable (without changing any of the individual sounds) it becomes a noun. The position of the stress in the word affects the meaning, and therefore a full phonemic specification (providing enough detail to enable the word to be pronounced unambiguously) would include indication of the position of the stress: /inˈvait/ for the verb, /ˈinvait/ for the noun. In other languages, such as French, word stress cannot have this function (its position is generally predictable) and is therefore not phonemic (and is not usually indicated in dictionaries).

Phonemic tones are found in languages such as Mandarin Chinese, in which a given syllable can have five different tonal pronunciations. For example, the character 妈 (pronounced mā, high level pitch) means "mom", 麻 (má, rising pitch) means "hemp", 马 (mǎ, falling then rising) means "horse", 骂 (mà, falling) means "scold", and 吗 (ma, neutral tone) is an interrogative particle. The tone "phonemes" in such languages are sometimes called tonemes [30].

Languages such as English do not have phonemic tone, although they use intonation for functions such as emphasis and attitude.

When a phoneme has more than one allophone, the one actually heard at a given occurrence of that phoneme may be dependent on the phonetic environment (surrounding sounds) – allophones which normally cannot appear in the same environment are said to be in complementary distribution. In other cases the choice of allophone may be dependent on the individual speaker or other unpredictable factors – such allophones are said to be in free variation.

A somewhat different example is found in English, with the three nasal phonemes /m, n, ŋ/. In word-final position these all contrast, as shown by the minimal triplet sum /sʌm/, sun /sʌn/, sung /sʌŋ/.

However, before a stop such as /p, t, k/ (provided there is no morpheme boundary between them), only one of the nasals is possible in any given position: [m] before [p], [n] before [t] or [d], and [ŋ] before [k], as in limp, lint, link ([lɪmp], [lɪnt], [lɪŋk]). The nasals are therefore not contrastive in these environments, and according to some theorists this makes it inappropriate to assign the nasal phones heard here to any one of the phonemes (even though, in this case, the phonetic evidence is unambiguous). Instead they may analyze these phones as belonging to a single archiphoneme, written something like |N|, and state the underlying representations of limp, lint, link to be |lɪNp|, |lɪNt|, |lɪNk|.
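The archiphoneme analysis can be illustrated with a small sketch in which the symbol N in an underlying form is realized as the nasal that matches the following stop. The mapping contains only the stops named in the text, and the function name and the one-character-per-segment string representation are assumptions made for this illustration.

```python
# Nasal chosen before each stop, as stated in the text
NASAL_BEFORE = {"p": "m", "t": "n", "d": "n", "k": "ŋ"}


def realize(underlying):
    """Replace each archiphoneme 'N' with the nasal required by the next stop."""
    phones = list(underlying)
    surface = []
    for i, seg in enumerate(phones):
        if seg == "N":
            following = phones[i + 1] if i + 1 < len(phones) else None
            if following not in NASAL_BEFORE:
                # This sketch handles only the pre-stop environment
                raise ValueError("archiphoneme N must precede p, t, d or k")
            surface.append(NASAL_BEFORE[following])
        else:
            surface.append(seg)
    return "".join(surface)


for form in ("lɪNp", "lɪNt", "lɪNk"):
    print(form, "->", realize(form))
```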

This latter type of analysis is often associated with Nikolai Trubetzkoy of the Prague school. Archiphonemes are often notated with a capital letter within pipes, as with the example |N| given above. Other ways this archiphoneme might be notated include |m-n-ŋ|, {m, n, ŋ}, or |n*|.

Another example from English, this time involving complete phonetic convergence, is the flapping of /t/ and /d/ in some American English. Here the words betting and bedding might both be pronounced [bɛɾɪŋ], and if a speaker applies such flapping consistently, it would be necessary to look for morphological evidence (the pronunciation of the related forms bet and bed, for example) in order to determine which phoneme the flap represents. As in the previous examples, some theorists would prefer not to make such a determination, and simply assign the flap in both cases to a single archiphoneme, written (for example) |D|.

A morphophoneme is a theoretical unit at a deeper level of abstraction than traditional phonemes, and is taken to be a unit from which morphemes are built up [31].

A morphophoneme within a morpheme can be expressed in different ways in different allomorphs of that morpheme (according to morphophonological rules).

For example, the English plural morpheme -s appearing in words such as cats and dogs can be considered to consist of a single morphophoneme, which might be written (for example) |z|, and which is pronounced as [s] after most voiceless consonants (as in cats) and [z] in most other cases (as in dogs).
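A minimal sketch of this morphophonological rule follows. The text speaks of "most" voiceless consonants; the sketch assumes the usual textbook refinement that stems ending in a sibilant take [ɪz], which is not stated in the text. Stems are given as lists of phones, and the phone sets and the function name are illustrative assumptions.

```python
# Assumed phone classes (illustrative, not exhaustive)
VOICELESS = {"p", "t", "k", "f", "θ"}
SIBILANTS = {"s", "z", "ʃ", "ʒ", "tʃ", "dʒ"}


def english_plural(stem):
    """Attach the surface form of the plural morphophoneme |z| to a stem."""
    last = stem[-1]
    if last in SIBILANTS:
        suffix = ["ɪ", "z"]   # assumption beyond the text: buses, judges
    elif last in VOICELESS:
        suffix = ["s"]        # cats
    else:
        suffix = ["z"]        # dogs
    return list(stem) + suffix


print(english_plural(["k", "æ", "t"]))
print(english_plural(["d", "ɒ", "ɡ"]))
```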

A given language will use only a small subset of the many possible sounds that the human speech organs can produce, and (because of allophony) the number of distinct phonemes will generally be smaller than the number of identifiably different sounds. Different languages vary considerably in the number of phonemes they have in their systems (although apparent variation may sometimes result from the different approaches taken by the linguists doing the analysis).

The English language uses a rather large set of 13 to 21 vowel phonemes, including diphthongs, although its 22 to 26 consonants are close to average.

Phonemes are considered to be the basis for alphabetic writing systems. In such systems the written symbols (graphemes) represent, in principle, the phonemes of the language being written.

However, because changes in the spoken language are often not accompanied by changes in the established orthography (as well as other reasons, including dialect differences, the effects of morphophonology on orthography, and the use of foreign spellings for some loanwords), the correspondence between spelling and pronunciation in a given language may be highly distorted; this is the case with English, for example. (Occasionally, though, such discrepancies are reduced through the establishment of spelling pronunciations.)

The correspondence between symbols and phonemes in alphabetic writing systems is not necessarily a one-to-one correspondence. A phoneme might be represented by a combination of two or more letters (digraph, trigraph, etc.), like <sh> in English or <sch> in German (both representing the phoneme /ʃ/). Also a single letter may represent two phonemes, as the Cyrillic letter я does in some positions. There may also exist spelling/pronunciation rules (such as those for the pronunciation of <c> in Italian) that further complicate the correspondence of letters to phonemes, although they need not affect the ability to predict the pronunciation from the spelling and vice versa, provided the rules are known.

In sign languages, the basic elements of gesture and location were formerly called cheremes or cheiremes but they are now generally referred to as phonemes, as with oral languages [32].

Sign language phonemes are combinations of articulation bundles in ASL. These bundles may be classified as tab (elements of location, from Latin tabula), dez (the hand shape, from designator), sig (the motion, from signation), and with some researchers, ori (orientation). Facial expression and mouthing are also considered articulation bundles. Just as with spoken languages, when these bundles are combined, they create phonemes.

Stokoe's notation is no longer used by researchers to denote the phonemes of sign languages; his research, while still considered seminal, has been found not to describe American Sign Language and cannot be used interchangeably with other signed languages. Originally developed for American Sign Language, it has also been applied to British Sign Language by Kyle and Woll, and to Australian Aboriginal sign languages by Adam Kendon [33].

Other sign notations, such as the Hamburg Notation System and SignWriting, are phonetic scripts capable of writing any sign language. Stokoe's work has been succeeded and improved upon by researcher Scott Liddell in his book Grammar, Gesture and Meaning, and both Stokoe's and Liddell's work have been included in the Linguistics of American Sign Language [33].

1.6 Differences in the articulation basis of English, Russian and Kazakh phonemes

People belonging to different races and nationalities possess an identical speech apparatus. That is why in all existing languages there are typologically identical sounds, such as consonants, vowels and sonorants. For instance, in all European languages of the Soviet Union there are such typologically identical sounds as [a, o, u, i, e, t, m, k, l, s, d] etc. And yet, not a single sound of one language is absolutely identical spectrally with a typologically identical sound of another language.

This is due to the fact that people use their speech organs differently, or, as phoneticians say, it is due to the difference in the articulation basis. The articulation basis may be defined as the general tendencies (or habits) in the way native speakers use their speech organs both during speech and at rest. The articulation basis influences the phonetic system of a language. The articulation basis of one language may differ from the articulation basis of another language.

Because the articulation bases of English, Kazakh and Russian have not yet been studied thoroughly, we can speak only about the most characteristic features of the Received Pronunciation articulation basis as compared with the Kazakh and Russian Standard articulation bases [34].

Articulatory differences between vowels, consonants and sonorants depend on three articulatory criteria:

a) the presence or absence of an articulatory obstruction to the air stream in the larynx or the supraglottal cavities;

b) the concentrated or diffused character of muscular tension;

c) the force of exhalation.

On the basis of these criteria consonants may be defined as sounds in the production of which:

1) there is an articulatory obstruction to the air stream;

2) muscular tension is concentrated in the place of obstruction;

3) the exhaling force is rather strong.

Vowels may be defined as sounds in the production of which:

a) there is no articulatory obstruction to the air stream;

b) muscular tension is diffused;

c) the exhalation force is rather weak.

Sonorants are intermediate sounds between noise consonants and vowels, because they have features common to both. There is an obstruction but not narrow enough to produce noise. Muscular tension is concentrated in the place of obstruction but the exhaling force is rather weak. English sonorants are: [m, n, l, r, w, j, ŋ].
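The three criteria can be summarized as a simple decision procedure. The sketch below is only an illustration: it reduces each criterion to a yes/no value, which is a simplifying assumption, and the function name is hypothetical.

```python
def classify_sound(obstruction, concentrated_tension, strong_exhalation):
    """Classify a sound from the three articulatory criteria (as booleans)."""
    if not obstruction and not concentrated_tension and not strong_exhalation:
        return "vowel"
    if obstruction and concentrated_tension and strong_exhalation:
        return "noise consonant"
    if obstruction and concentrated_tension and not strong_exhalation:
        return "sonorant"
    # Combinations not described in the text
    return "unclassified"


print(classify_sound(False, False, False))
print(classify_sound(True, True, True))
print(classify_sound(True, True, False))
```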

By their physical nature, speech sounds are vibrations of the air set in motion by a sounding body (the speech organs).

Speech sounds are divided into musical sounds (also called tones) and non-musical sounds (called noises).

Speech sounds differ from one another in pitch, force and duration, and also in timbre. The pitch of a sound is defined by the number of vibrations per unit of time. For the vowels о and у the pitch is equal to 400 Hz; for а it is already about 800 Hz. In speech, the pitch of the voice depends on the length and tension of the vocal cords.

The force of a sound is defined by the amplitude of the vibrations. From the point of view of perception by the ear, the force of a sound is called loudness, which is determined not only by the strength of the sound wave but also by its pitch: sounds of similar force but different pitch are perceived as sounds of different loudness. The force of a sound is of great importance for the clarity with which speech is rendered and perceived, and for defining the type of stress.

Acoustically, a speech sound is complex, because it contains not only the fundamental tone but also resonance tones (overtones).

In our research paper we will point out some peculiarities of English, Russian and Kazakh phonemes. At the same time we will try to analyze the differences and similarities in the articulatory bases of the consonants of these three languages.

Kazakh has nine vowels: а, ә, е, і, ы, о, ө, ұ, ү. The sounds [и] and [у] are called diphthongoids by some linguists. The sound у is considered a semi-consonant by others. As such it can appear between vowels, as in "ауыз" (mouth). Kazakh vowels are generally pronounced short. Vowels followed by the consonant [й] are pronounced long, e.g. үй [ui] (home, house) [35].

The vowels are divided into:   

  1.  back (hard) vowels: а, о, ұ, ы;
  2.  front (soft) vowels: ә, ө, ү, і, е;

It is important to remember this classification as the law of vowel harmony is based on it.

Consonants. Twenty-five of the 42 letters of the alphabet are consonants. They are divided into 3 groups:

  1.  voiceless: к, қ, п, с, т, ф, х, ц, ч, ш, щ;
  2.  voiced: г, ғ, б, з, д, в;
  3.  sonorants: л, м, н, р, й, у;

Some consonants came into Kazakh from the Russian language. They are: в, ф, ц, ч, щ.

The consonant х usually occurs in words borrowed from the Arabic, Russian, and other languages. Very often х is replaced by the Kazakh қ, e.g.: хош-қош, хал-қал, рахмет-рақмет.

The law of vowel harmony (synharmonism) is a characteristic feature of all Turkic languages. According to this law the first vowel of a word determines the character of the remaining vowels. If the first vowel is back, the remaining vowels are back too, as in бала (child), ағылшын (English), қайталау (repeat), жұмыс (work). All the syllables in these words are hard. If the first vowel is front, the remaining vowels are front, as in әке (father), түсіну (understand). It follows that a native Kazakh word contains either only back vowels or only front vowels. If a word has both back and front vowels, like мұғалім (teacher), кітап (book), рахмет (thanks), it is of foreign origin [35].
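The law of vowel harmony can be checked mechanically. The following sketch uses the back and front vowel sets given above and treats all other letters, including у, as irrelevant to harmony, which is a simplifying assumption; the function name is hypothetical.

```python
BACK_VOWELS = set("аоұы")
FRONT_VOWELS = set("әөүіе")


def harmony_class(word):
    """Return 'back', 'front', 'mixed', or 'none' for a Kazakh word."""
    letters = word.lower()
    has_back = any(ch in BACK_VOWELS for ch in letters)
    has_front = any(ch in FRONT_VOWELS for ch in letters)
    if has_back and has_front:
        return "mixed"   # per the text, a sign of foreign origin
    if has_back:
        return "back"
    if has_front:
        return "front"
    return "none"


for w in ("бала", "әке", "түсіну", "кітап", "рахмет"):
    print(w, harmony_class(w))
```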

Consonant assimilation consists of one sound being either totally or partially made similar to another. The main types of change are the following. When suffixes with the initial consonants л-, б-, м-, н-, д- are added to stems with a final consonant, the initial consonant of the suffix is assimilated to the stem final consonant. For example:

1. after voiceless consonants ( -п, -т, -с, -к, -қ) the plural suffix +лар / +лер changes to +тар / +тер:

  1.  ат+лар    "horses"   > аттар
  2.  кітап+лар    "books"   > кітаптар

2. after voiced consonants ( -з, -ж, -л, -м, -н, -ң ) the plural suffix is changed to +дар / +дер:

  1.  жыл+лар    "years"  > жылдар
  2.  қыз+лар    "girls"   > қыздар

3. similar rules of consonant assimilation exist for all other suffixes with the above mentioned initial consonants.

There is a general voicing of к/қ to г/ғ:

  1.  тарақ   "comb"   - тарағым "my comb"
  2.  шық   "go out"   - шығу "to go out"

4. The consonant п voices to б between vowels:

  1.  кітап   "book"   - кітабым "my book"
  2.  көп   "many, all"   - көбіміз "most of us"
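The plural rules above can be combined with vowel harmony in a short sketch. Two points are assumptions that go beyond the text: the back or front variant of the suffix is chosen from the last harmony vowel of the stem (so that a loanword such as кітап takes the back variant, as in the example above), and stems ending in any sound not listed in rules 1 and 2 keep the basic form лар / лер. The function name is hypothetical.

```python
BACK_VOWELS = set("аоұы")
FRONT_VOWELS = set("әөүіе")
VOICELESS_FINALS = set("птскқ")    # rule 1 in the text
VOICED_FINALS = set("зжлмнң")      # rule 2 in the text


def kazakh_plural(stem):
    """Attach the plural suffix with consonant assimilation and vowel harmony."""
    harmony_vowels = [ch for ch in stem if ch in BACK_VOWELS or ch in FRONT_VOWELS]
    # Assumption: the last harmony vowel decides; stems without one default to back
    front = bool(harmony_vowels) and harmony_vowels[-1] in FRONT_VOWELS

    last = stem[-1]
    if last in VOICELESS_FINALS:
        initial = "т"
    elif last in VOICED_FINALS:
        initial = "д"
    else:
        initial = "л"   # assumption: basic form elsewhere (e.g. after vowels)

    return stem + initial + ("ер" if front else "ар")


for s in ("ат", "кітап", "жыл", "қыз"):
    print(s, "->", kazakh_plural(s))
```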

Differences in the articulation bases of English, Kazakh and Russian, reflected in the system of consonants, are as follows: the English have a tendency to hold the tip of the tongue in neutral position at the level of the alveoli (or teeth-ridge), whereas the Russians and the Kazakhs keep it much lower, at tooth level. That is why about 50% of all the consonants in R.P. are articulated with the tip of the tongue against the alveoli, as in [t, d, n, l, s, z, ʒ, tʃ, dʒ, ʃ, r].

They are alveolar, palato-alveolar and post-alveolar in accordance with the place of obstruction. The tip of the tongue in the articulation of Russian and Kazakh forelingual consonants occupies a dental position.

The English and the Kazakhs have a general habit of holding the bulk of the tongue in neutral position a little further back, lower and flatter than the Russians. This may be observed in the articulation of the consonants [h, ŋ] in British R.P. and [h, ң, қ, ғ] in Kazakh.

In the production of the English and Kazakh [h] the root of the tongue moves in the direction of the pharyngeal cavity. In the articulation of the Kazakh [ң, қ, ғ] the back part of the tongue is raised in the direction of the soft palate.

In the production of English and Kazakh [ŋ] the soft palate makes up a complete obstruction with the back part of the tongue. Russian students are apt to substitute the fore lingual [n] for the back lingual [ŋ].

The flatter and lower position of the bulk of the tongue limits the system of English "soft" consonants of which there are only five [ ʒ, tʃ, dʒ, ʃ, l ] whereas in Russian almost all the consonants may be "soft" (or palatalized). Compare the palatalized and velarized consonants in Russian:

рад-ряд          жар-жарь                борозда-бороздя

мот-мед          вес- весь                 казна- казня

рвы-рви         рожь-рощь              угол- уголь

The English have a specific way of articulating final consonants. Voiced consonants in final position are always weak in English (even partially devoiced). They are called lenis. Voiceless consonants in final position, on the contrary, are strong. They are called fortis.

In Russian voiced consonants are impossible in final positions (except sonorants), and voiceless consonants in final position are always weak.

In Kazakh sonorants and [з] are possible in final position, e.g. көз, сабаз, азыксыз.

There is a specific way of articulating voiceless plosive consonants in English. When they are followed by a stressed vowel they are aspirated, as in "teacher", "paper", "comrade". In Kazakh and Russian they are non-aspirated.

There is a tendency to lengthen English word-final sonorants before a pause, especially when they are preceded by a short vowel, as in "doll", "long", "sin". The similar Russian and Kazakh sonorants are short in the same position.

Differences in the articulation bases of English, Russian and Kazakh reflected in the system of vowels are as follows. The positions and movements of the lips are very peculiar. On the one hand, when an English speaker is silent, the lips occupy the so-called flat-type position: they are more or less tense and the corners are raised as in a smile. Russians and Kazakhs keep the lips rather lax with the corners of the lips lowered.

Spreading of the lips for front vowels is rather typical of English. In Russian and Kazakh the lip position for unrounded vowels is neutral.

On the other hand, in the production of the Russian vowels [о, у] and the Kazakh [о, ө, у, ү, ұ] the lips are rounded and considerably protruded. In English such protrusion does not take place, as in [ɒ, ɔ:, u, u:].

In the production of English vowels the bulk of the tongue is more often at the back of the mouth; in the production of Russian and Kazakh vowels the tongue is mostly in the front part of the mouth. Besides, the tongue may occupy more positions when articulating English vowels than in Russian or Kazakh vowel production.

English and Kazakh vowels are more tense than Russian. This is especially felt in unstressed syllables. In English and Kazakh an unstressed vowel does not always differ greatly from a stressed one. In Russian it is always short, lax and reduced. There are in English short and long vowels which are different both in quality and quantity. There are no such phonemic oppositions in the Russian and Kazakh languages.

Kazakh exhibits tongue-root vowel harmony, with some words of recent foreign origin (usually of Russian or Arabic origin) as exceptions [36]. There is also a system of rounding harmony which resembles that of Kyrgyz, but which does not apply as strongly and is not reflected in the orthography.

When English pronunciation is taught in a Kazakh school, the teacher should consider each group of sounds and intonation patterns separately, depending on how difficult they are to perceive and articulate and on how similar they are to the sound phenomena of the Kazakh language. This makes it possible to determine which difficulties the teaching of English pronunciation in Kazakh schools should take into account. For example, the English sound [a:] is difficult for Kazakh students. They are apt to replace the English long back vowel [a:] (as in garden, star) with the Kazakh vowel (а) (as in bala, "child"), which differs from it both qualitatively and quantitatively. Consequently, the teacher has to work longer and harder on the sound [a:] to prevent interference from the corresponding sound of the learners' native language.

We have characterized the English vowel sounds that have some degree of similarity with the vowels of the Kazakh language. As can be seen from the description, almost all English vowels have more or less similar counterparts in Kazakh.

Some Kazakh vowels, namely (ұ) and (ү), have no counterparts in English, and so they usually do not interfere with the acquisition of English vowels. These vowels are specific to the Kazakh language. The sounds (ұ) and (ү) are short, labial, narrow vowels of incomplete formation, pronounced with the tongue raised high. In the formation of (ұ) the tongue takes the same position as in the formation of (ы), but the lips are rounded and protruded; the mouth opening, however, is not as narrow as in the formation of (ү).

The vowels (ұ) and (ү) differ from each other mainly in hardness and softness: (ұ) is hard, i.e. a back vowel, and (ү) is soft, i.e. a front vowel. That these sounds are distinctive with respect to each other is confirmed by the following comparison: ұн (flour) - үн (voice), тұр (stand) - түр (sort, kind), ұш (fly) - үш (three). These sounds are used mainly in the first syllable of the word.

Special attention from the teacher is required when introducing students to concepts that are new to them and that reflect the phonetic system of English. One of these concepts is the complex vowel (diphthong). Each diphthong in English is a separate phoneme and belongs to the vowels: [ai], [ei], [ɔi], [au], [ou], [iә], [εә], [uә]. Some English diphthongs can be likened to certain vowel combinations in Kazakh: ай (moon), қой (sheep), ау (net). But such English diphthongs as [iә], [εә], [uә], [ou] have no similar combinations in Kazakh. The Kazakh combinations shown above differ from diphthongs: they sound like two separate sounds, whereas the nucleus of an English diphthong is pronounced quite clearly and is followed by a glide in the direction of the second sound.

The main difference between English diphthongs and these similar Kazakh vowel combinations is that the latter fall easily into two syllables and can be separated by a morphological boundary (e.g. тай, та-ый; бой, бо-ый; бау, ба-уыр).

In English, such a phenomenon is excluded. English diphthongs cannot be split into two syllables. They are always pronounced together, i.e. with one articulatory effort and with the emphasis on the nucleus.

Each diphthong has lax, fading end. That is, the second element of the diphthong is a weak, sliding, extremely brief faint sound. His voice may not be identical to the sound of corresponding isolated vowel, as it is in the Kazakh language.

Although the transcription of the second element is transferred by sign of the vowel complete formation, it should be noted that this sign indicates only the movement of the speech organs to this vowel.

1) 3 diphthongs with a glide to [i]: [ei-ai-oi]

2) 2 diphthongs with a glide to [u]: [ou-au]

3) 3 diphthongs with a glide to [ә]: [iә-εә-uә].
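The three-way grouping by glide can be reproduced with a short Python sketch. It is offered only as an illustration: the transcriptions are simplified two-symbol strings for the eight diphthongs [ei], [ai], [oi], [ou], [au], [iə], [εə], [uə], and the name `group_by_glide` is hypothetical.

```python
from collections import defaultdict

# Simplified transcriptions: nucleus symbol followed by glide symbol.
DIPHTHONGS = ["ei", "ai", "oi", "ou", "au", "iə", "εə", "uə"]


def group_by_glide(diphthongs):
    """Group diphthongs by their second element (the glide)."""
    groups = defaultdict(list)
    for diphthong in diphthongs:
        groups[diphthong[-1]].append(diphthong)
    return dict(groups)


groups = group_by_glide(DIPHTHONGS)
for glide, members in groups.items():
    print(f"glide to [{glide}]: {len(members)} diphthongs {members}")
```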

In teaching the pronunciation of the diphthongs [ai], [ei], [oi], [au] it is necessary to consider the patterns they share and to contrast them with the Kazakh combinations (aй), (ей), (ой), (ay). In final position before a pause English diphthongs are pronounced at full length, before a voiced consonant they are somewhat shorter, and before a voiceless consonant they are very short [37].

Let us compare the following Kazakh examples:

(ай): ай (moon), май (oil), бай (rich)

(ей): кейде (sometimes), бейне (image), мейле (let)

(ой): қой (sheep), той (holiday), бой (growth)

(ау): тау (mountain), ау (net), бау (ligament)

The above description of the articulation of vowel sounds allows us to identify the differences between the articulatory bases of Kazakh and English in the area of vowels that matter most for teaching pronunciation [37].

One of the main features of English vowel pronunciation is its greater strength (tenseness) compared with Kazakh vowels. English labial vowels are characterized by flat rounding of the lips, whereas Kazakh labial vowels are pronounced with protruded lips. When the Kazakh vowels (и), (ы), (e) are pronounced, the lips are relaxed and neutral (they take no special shape) and the lower jaw is in its natural position.

English vowels [i:], [i], [e], [ei] are pronounced with the lips spread: the lips are slightly stretched, exposing the upper and lower teeth, and the lower jaw is lowered so that the lower incisors are directly under the upper incisors.

English has mixed vowels [ә:], [ә], as well as vowels pronounced with a retracted or advanced tongue position ([i], [u], [Λ], [ou]). Kazakh has no such tongue positions.

English clearly contrasts the prolonged articulation of some vowels with the brief articulation of others (the difference averaging about 60%). Vowel length is not such a distinctive feature in Kazakh.

Unlike Kazakh, English makes wide use of gliding articulation of vowel sounds (diphthongs).

In Kazakh the organizing centre of the word is the vowel, which gives rise to the system of vowel harmony.

According to the law of vowel harmony, a single word can combine only vowels of the same kind, either front (soft) or back (hard). Therefore all Kazakh words are divided into hard and soft: көл (lake), ән (song) are soft; қoл (hand), жан (soul) are hard.

Accordingly, soft words take affixes with front vowels only, for example сен-дер-ден (from you), and hard words take affixes with back vowels only: ба-ла-лар-ды (children, accusative case).
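The hard/soft division described above can be sketched in Python. This is an illustrative sketch rather than a full model of Kazakh phonology. The two vowel sets are an assumption covering only the native vowel letters а, о, ұ, ы (back) and ә, ө, ү, і, е (front). Letters such as и and у, which occur in both kinds of words, are ignored, and the function name `harmony_class` is hypothetical.

```python
# Assumed vowel classes (native Kazakh vowel letters only).
BACK_VOWELS = set("аоұы")    # hard
FRONT_VOWELS = set("әөүіе")  # soft


def harmony_class(word: str) -> str:
    """Classify a word as 'hard', 'soft', 'mixed' or 'unknown'."""
    letters = word.lower()
    has_back = any(ch in BACK_VOWELS for ch in letters)
    has_front = any(ch in FRONT_VOWELS for ch in letters)
    if has_back and has_front:
        return "mixed"  # violates vowel harmony
    if has_back:
        return "hard"
    if has_front:
        return "soft"
    return "unknown"    # no vowel from either set was found


for word in ["көл", "қол", "жан", "сендерден", "балаларды"]:
    print(word, harmony_class(word))
```

A root and its affixes pass the check together only when all their vowels come from the same set, which is the constraint the affix examples illustrate.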

In English, by contrast, the vowels of affixes are completely independent of the final vowel of the root, and front vowels alternate with back vowels within the same word (army, answer, public, language).

Thus, the English vowel system is marked by a larger number of contrasts than the Kazakh one. Kazakh has no counterpart to the following features of English vowels:

1) mixed vowels, intermediate between the front and back rows, and

2) the oppositions long - short and monophthong - diphthong.

Nowadays English is taught in many schools and higher education institutions in the Republic of Kazakhstan. Since students learn English on the basis of their mother tongue, there is a need for research tools based on a comparison of the phonetics, vocabulary and grammar of English and Kazakh. The present research considers the issues of teaching English pronunciation with the help of such comparisons. In language teaching practice, two ways of teaching pronunciation are common. The first is based on imitation, i.e. on the unconscious assimilation of phonetic phenomena. The second is based on conscious learning.

This means that teaching pronunciation skills cannot be mechanical. Students need to be made aware of the linguistic features of foreign speech before the skills are developed, rather than developing the skills first or receiving no such information at all. This method of teaching pronunciation is called analytical-imitative. For example, the English sound [a:] is difficult for Kazakh students. They tend to alter the English long vowel [a:] (as in garden, star) both qualitatively and quantitatively, because it differs from the Kazakh vowel (a) (as in бала, child).

Consequently, the teacher has to work on the sound [a:] longer and more painstakingly in order to prevent interference from the corresponding sound of the students' native language. On the other hand, the English sound [һ] is acquired very easily by Kazakh students, since the same sound also exists in Kazakh (қaһaрмaн - hero, aһ!).

Now we would like to suggest a number of effective phonetic exercises that we have used in teaching English pronunciation during our school practice. We have adopted them from the following textbooks: “Practical phonetics of English language” by Dubovsky A.S. [38], “Practical phonetics of English language” by Sokolova M.A. [39], “Language files” by Crabtree M. and Powers I. [40], and “Practical phonetics of English language” by Arakin V.D. [41].

The following exercises will help learners develop correct pronunciation of English vowel and consonant phonemes. Practising with phonetic materials allows pupils to activate their knowledge. Some of the exercises are suitable only for beginners, while others are intended for learners of intermediate level.


Give the conventional spelling for the following phonetically transcribed words:

  1.  siŋ   hæŋk   gud mɔ:niŋ

lɔŋ   tæŋk   gud a:ftə,nu:n

i:tiŋ   liŋkiŋ   gud ,i:vniŋ

skeitiŋ   θiŋkiŋ   gud bai

  2.  ri:d   beri   ri:d ðə,raimz

rait   nærɜu   bi:t ðə,riðm

rɜud   fɔrin   ðæts ,rait

reidiɜu   veri,sɔri   greit britn

  3.  tri:   fri:   twinz

trai   frend   twelv

træm   praud   kwik

drai   prɜunaun   kwait

dri:m   θrɜu   kwesʧn

  4.  litl   sʌdn   pai – spai

teibl   ritn   pein – spein

pi:pl   teikn   keit – skeit

trʌbl   bi:tn   ku:l – sku:l

  5.  wil   ju   ri:d

wil   ju   kʌm

wil   ju   ʤu

wil   ju   həv   braun   bred   fəbrekfəst

Exercise 3

      Pronounce the following sounds [t, d, n, s, z] correctly:

did  sit  siti  sid-sit

dig  nit  kiti  dik-dig

kid  sik  tikit  ti-tig

sin  kis  gidi

Exercise 4

Pronounce the following consonants [p, t, k] with aspiration:

ten  get  en  det-ded  gets

pen  pet  eg  bet-bed  pets

men  net  et  set-sed  bedz

sit – set    bizi – beni

bit – bet   piti – beti

big – beg    mini – meni

did – ded

Exercise 5

Pronounce these words with loss of plosion:

ka: - ka:m  – ka:t

fa: - fa:m – pa:t

ba: - ba:d – ba:k

a:m   əfa:,sta:

a:t            a:sk  fa:ðə

a:sk  ðəda:k , ga:dn

Exercise 6

Give the conventional spelling for the following phonetically transcribed words:

  1.  sin – siŋ – siŋk   siŋ – siŋiŋ

θin – θiŋ – θiŋk   riŋ – riŋiŋ

win – wiŋ – wiŋk   briŋ – briŋiŋ

ræn – ræŋ – ræŋk   bæŋ – bæŋiŋ

  2.  wɜ:d - wɜ:dz   wɜ:k - wɔ:k

wɜ:k - wɜ:kt   wɜ:d - wɔ:d

wɜ:s - wɜ:m   wɜ:m - wɔ:m

  3.  mʌðə   sʌm - θʌm   def

fa:ðə   tin – θin   deθ

ə’nʌðə   tik – θik   pa:s

ðiʌðə   fin – θin   pa:θ

Exercise 7

  1.  hi – hiz   hə’lɜu

ha:m – ha:t   hɜu’tel

hiz ,hed   hɔspitl

hiz ,ha:t   gɜu hɜum

  2.  tɔ:t - θɔ:t - sɔ:t - fɔ:t

wʌn ba:θ – θri: ba:ðz

wʌn mauθ – θri: mauðz

wʌn pa:θ – θri: pa:ðz

wʌn ju:θ – θri: ju:ðz

  3.  nɔt ət,ɔ:l ; iz ðiz  ‘bɔ:l , big  ɔ  ,smɔ:l

Exercise 8

Make [p, t, k] aspirated:

pen  ten  came

pack  tart  court

ben  dean  give

back  dot  goal

Exercise 9

Read the word-contrasts. Concentrate on the difference between an initial voiceless stop and its voiced counterpart:

[ p – b ]   [ t – d ]

pig – big   tin – dean

port – bought   ton – done

[ k – g ]   [ t – d ]

curl – girl    hearten – harden

card – guard   putting – pudding

Exercise 10

Make a clear distinction between the Russian and English counterparts in the following sets of words:

пик – peak   такт – tact

порт – port   табло – table

бить – beat   дата – date

боб – Bob   диск – disk

кипа – keeper  грипп – grippe

колония – colony  галантный – gallant

Exercise 11

Practice reading the following word-contrasts:

[ f – θ ]           [ v – ð ]

finn – thin         vote – though

fought – thought              vain – they

[ v – w ]   [ θ – s ]

vest – west   thick – sick

verse – worse             thing – sing

[ s – θ ]   [ s - ∫ ]

sin – thin   see – she

sick – thick   sips – ships

Exercise 12

Practice reading the families of words at normal conversational speed. Concentrate on the clusters of two plosives:

[ p ] + another plosive: kept, slept, dropped, snapped, stop trying, ripe tomato, a deep pool

[b] + another plosive: bobbed, robbed, sub-title

[ t] +  another plosive: football, foot path, hot toast, act two

bad beer

[ k] + another plosive: blackboard, desk-chair, picked, tricked, black coffee, black dog, look carefully

Exercise 13

Pronounce the following consonant clusters correctly:

[θs]:  depths, lengths

[sθ]:  sixth, this thermometer

[sð]:  takes this, it’s that

[θr]:  three, thrash

[fð]:  if those, enough though

[ðz]:  truths, wreaths

[zð]:  was that, raise them

[zθ]:  these thieves, those thoughts

[fθ]:  fifth, diphthong

Exercise 14

Practice reading the following with [h] and no [h] initially:

Helen is arty

Helen eats up the pie

Helen looks after her hair

Ellen is hearty

Ellen heats up the pie

Ellen looks after her heir


Exercise 15

Pronounce the following words, observing the close articulation of plosive and fricative consonants:

streets            takes   snobs   Betsy

wants   thanks  crabs   outside

gets   eggs   bulbs   midsummer

Exercise 16

Pronounce the following word combinations and phrases observing fricative plosion at the junction of words:

the Black sea    I hope so

the Baltic sea    I think so

a dark valley    I need some milk

a good cigarette   look sharp

We picked some flowers

I’d like some tea

He didn’t finish it

It cost a hundred shillings

Exercise 17

Pronounce the following words and phrases. Use the dental variants of the alveolar consonants before [θ, ð]:

sixth     thirteenth

seventh    fourteenth

seventeenth  at the  party.

eighteenth   in  the  morning.

nineteenth   in  the  afternoon.

  1.  Did  the  bell  ring?

Turn  round  the  corner

That’s  the  right  thing

Exercise 18

Pronounce the following words and word combinations. Observe assimilation in consonant clusters with [w]:

  1.  twice   quick

twist   quarrel

twelve   quinsy

twelve seas

a quarter of an hour

  2.  sweet

out of the question

language laboratory

to master the language

  3.  dwell   sweep the floor

dwindle   twice a week

dwarf   switch off the light


Reading matter

Exercise 19

Read the phrases below:

[ f ]  Fine fellows met at five on the first of February. “Philip,” said Ferdinand, “I fear we must fight.” Then Philip and Ferdinand fought fairly for fifty-five minutes, after which they fell down in a faint, for the fight had been fearfully furious. When Philip came out of the faint, Ferdinand offered his hand. “Fair’s fair,” said Philip, “and I think this affair shows neither of us fears to fight.”

[ v ]  Every evening Victor and Vivian visit Eve. Victor and Vivian are rivals. Both vow to love Eve forever. But Eve is very vain. Eventually, Victor gives Eve up and goes over to Vivienne, leaving Eve to Vivian.

[ θ ]   Arthur Smith, a thick-set, healthy athlete sees three thieves throw a thong round Thea’s throat and threaten to throttle her. He throws one thug to earth with a thud that shakes his teeth. Both the other thieves run off with a filthy oath.

[ ð ]   These are three brothers. This is their other brother. These are their father and mother. Their other brother is teething.

[ s ]  Sue and Cecily are sisters. Sue is sixteen this summer. Cecily was seven last Sunday. Sue is sowing grass seed. She sees Cecily asleep with a glass of cider and a nice sixpenny ice by her side. Sue slips across, sips the glass of cider and eats the ice. Cecily gets such a surprise when she wakes.

[ z ]   Zoe is visiting the Zoo. A lazy zebra called Desmond is dozing at the Zoo. He feels flies buzzing round his eyes, ears and nose. He rouses, opens his eyes, rises and goes to Zoe. Zoe is wearing a rose on her blouse. Zoe gives Desmond the buns.

[ ∫ ]    She showed me some machine-made horse shoes. I wish to be shown the latest fashion in short shirts. Mr. Mash sells fish and shell-fish fresh from the ocean. She was still shaking from the shock of being crushed in the rush.

[ ʒ ]   I can’t measure the pleasure I have in viewing this treasure at leisure. The decision was that on that occasion the collision was due to faulty vision.

[ h ]  Humble hairy Herbert has his hand on his heart because he sees how his brother Henry’s horse has hurt his hoof in a hole while hunting. Henry helps him to hobble home. Henry is very humorous.

[ ʧ ]   Charles is a cheerful chicken-farmer. A poacher is watching Charles’ chickens, choosing which to snatch. He chuckles at the chance of a choice chicken to chew for his lunch. They cheered the cheerful chap who chose to venture to match his skill with the champion’s.

[ ʤ ]   The aged judge urges the jury to be just but generous. In June and July we usually enjoy a few jaunts to that region. He injured his thumb on the jagged edge of a broken jar.

All the activities suggested will provide the practical basis for the effective learning of English phonetic structure in general and the system of English phonemes in particular.



In the course of our investigation of the complex nature of the phoneme we have revealed its significant role as a basic unit of speech. We have tried to show that there are a great number of definitions of the phoneme offered by different phonological schools and outstanding scholars.

Our contention is that all the definitions are valid within the frame of the theories in which they were postulated but they should not be meant to be universally valid. Nearly every phonological school offers its own way to describe various speech phenomena and the basic formative unit they choose to operate with at the level of phonology is usually called the phoneme but it should not be concluded that the concept that is called the phoneme is always the same thing. It is hardly so. In fact, some of the various concepts of the phoneme are not compatible with others. Some of the concepts of the phoneme may yet be found compatible or may at least supplement each other.

We have found out that the phoneme is material, real and objective. That means that it is realized in speech in the form of speech sounds, its allophones. The sets of speech sounds that are the allophones belonging to the same phoneme are not identical in their articulatory content though there remains some phonetic similarity between them. In this respect we have studied and analyzed all the distinctive features of the consonant and vowel phonemes on the basis of several languages: English, Russian and Kazakh. We have also studied functions of the phoneme and have come to the conclusion that the most important among them is the distinctive one as it differentiates not only the meaning of words but also the meaning of utterances.

Having studied the theoretical aspects of the research, we would like to state that phoneticians traditionally distinguish two major classes of sounds in any language. According to the specific character of the work of the speech organs, sounds in practically all languages are subdivided into two major types: vowels (V) and consonants (C). Consonant articulations are relatively easy to feel, and as a result are most conveniently described in terms of place and manner of articulation.

We have shown that vowels have no place of obstruction: the whole speech apparatus takes part in their formation. The articulation of consonants, by contrast, can be localized: an obstruction or narrowing for each consonant is made in a definite place of the speech apparatus.

The particular quality of vowels depends on the volume and shape of the mouth resonator, as well as on the shape and the size of the resonator opening. The mouth resonator is changed by the movements of the tongue and the lips. The particular quality of consonants depends on the kind of noise that results when the tongue or the lips obstruct the air passage. The kind of noise produced depends in its turn on the type of obstruction and on the shape and the type of the narrowing. The vocal cords also determine the quality of consonants. From the acoustic point of view, vowels are sounds of voice with high acoustic energy, while consonants are sounds of noise with low acoustic energy. Functional differences between vowels and consonants are defined by their role in syllable formation: vowels are syllable-forming elements, and consonants are units which function at the margins of syllables, either singly or in clusters.

These differences make it logical to consider each class of sounds independently. That is why we have worked out different practical assignments for teaching English pronunciation at school. We consider that, since students learn English on the basis of their mother tongue, there is a need for research tools based on a comparison of the phonetics, vocabulary and grammar of English and the native language. Students need to be aware of the linguistic features of foreign speech in order to develop their skills.


There are approximately 44 phonemes in English

 Vowel phonemes

Phoneme examples             


a   cat            

e   peg bread         

i   pig wanted      

o   log want         

u   plug love         

ai   pain day gate station   

ee    sweet heat thief these   

ie    tried light my shine mind

oe    road blow bone cold   

ue   moon blue grew tune   

oo  look would put      

ar   cart fast (regional)         

ur  burn first term heard work

or  torn door warn (regional)      

au  haul law call      

er  wooden circus sister      

ow down shout         

oi  coin  boy         

air stairs bear hare      

ear fear beer here   


Consonant Phonemes


Phoneme examples  



b  baby            

d  dog            

f  field photo         

g  game            

h  hat            

j  judge giant barge      

k cook   quick mix Chris   

l lamb            

m monkey comb         

n nut knife gnat      

p paper            

r rabbit  wrong         

s sun mouse city science   

t   tap            

v  van            

w  was            

wh  where (regional)            

y     yes            

z  zebra please is      

th   then            

th   thin            

ch  chip  watch         

sh  ship  mission chef      

zh   treasure            

ng   ring sink

The following table shows typical examples of the occurrence of the above consonant phonemes in words:

/p/ pit /b/ bit

/t/ tin /d/ din

/k/ cut /ɡ/ gut

/tʃ/ cheap /dʒ/ jeep

/f/ fat /v/ vat

/θ/ thin /ð/ then

/s/ sap /z/ zap

/ʃ/ she /ʒ/ measure

/x/ loch

/w/ we /m/ map

/l/ left /n/ nap

/r/ run /ŋ/ bang

/j/ yes /h/ ham

Practical task

Study articulatory features of RP consonants:

RP Consonant Phonemes (C ph): 24

[p] a labial, bilabial, occlusive, plosive, voiceless, fortis consonant phoneme (= C ph)

[b] a labial, bilabial, occlusive, plosive, voiced, lenis C ph

[t] a lingual, forelingual, alveolar, occlusive, plosive, voiceless, fortis C ph

[d] a lingual, forelingual, alveolar, occlusive, plosive, voiced, lenis C ph

[k] a lingual, backlingual, occlusive, plosive, voiceless, fortis C ph

[g] a lingual, backlingual, occlusive, plosive, voiced, lenis C ph

[f] a labial, labio-dental, constrictive, fricative, voiceless, fortis C ph

[v] a labial, labio-dental, constrictive, fricative, voiced, lenis C ph

[θ] a forelingual, interdental, constrictive, fricative, voiceless, fortis C ph

[ð] a forelingual, interdental, constrictive, fricative, voiced, lenis C ph

[s] a forelingual, alveolar, constrictive, fricative, voiceless, fortis C ph

[z] a forelingual, alveolar, constrictive, fricative, voiced, lenis C ph

[ʃ] a forelingual, palato-alveolar, constrictive, fricative, voiceless, fortis C ph

[ʒ] a forelingual, palato-alveolar, constrictive, fricative, voiced, lenis C ph

[h] a glottal, constrictive, fricative, voiceless, fortis C ph

[ʧ] a voiceless affricate

[ʤ] a voiced affricate

[m] a bilabial, occlusive, plosive nasal sonant (S)

[n] an alveolar-apical, occlusive, plosive nasal S

[ŋ] a backlingual, velar, occlusive, plosive nasal S

[l] an alveolar-apical, constrictive, fricative, lateral S

[w] a bilabial, constrictive, fricative, medial S

[r] a post-alveolar, constrictive, fricative, medial S

[j] a medio-lingual, palatal, constrictive, fricative S
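The feature descriptions above lend themselves to a simple lookup table. The Python sketch below is only an illustration under stated assumptions: it encodes the features of ten obstruents exactly as they are listed above, and the names `FEATURES` and `voicing_pairs` are hypothetical. It finds the pairs of phonemes that share every feature except voicing and force of articulation.

```python
# Feature tuples copied from the list above; the last two items are
# always (voicing, force of articulation).
FEATURES = {
    "p": ("labial", "bilabial", "occlusive", "plosive", "voiceless", "fortis"),
    "b": ("labial", "bilabial", "occlusive", "plosive", "voiced", "lenis"),
    "t": ("lingual", "forelingual", "alveolar", "occlusive", "plosive", "voiceless", "fortis"),
    "d": ("lingual", "forelingual", "alveolar", "occlusive", "plosive", "voiced", "lenis"),
    "k": ("lingual", "backlingual", "occlusive", "plosive", "voiceless", "fortis"),
    "g": ("lingual", "backlingual", "occlusive", "plosive", "voiced", "lenis"),
    "f": ("labial", "labio-dental", "constrictive", "fricative", "voiceless", "fortis"),
    "v": ("labial", "labio-dental", "constrictive", "fricative", "voiced", "lenis"),
    "s": ("forelingual", "alveolar", "constrictive", "fricative", "voiceless", "fortis"),
    "z": ("forelingual", "alveolar", "constrictive", "fricative", "voiced", "lenis"),
}


def voicing_pairs(features):
    """Return (voiceless, voiced) pairs that differ only in voicing and force."""
    pairs = []
    items = list(features.items())
    for index, (first, first_feats) in enumerate(items):
        for second, second_feats in items[index + 1:]:
            same_place_and_manner = first_feats[:-2] == second_feats[:-2]
            if (same_place_and_manner
                    and first_feats[-2:] == ("voiceless", "fortis")
                    and second_feats[-2:] == ("voiced", "lenis")):
                pairs.append((first, second))
    return pairs


print(voicing_pairs(FEATURES))
```

The resulting pairs are the ones contrasted in word pairs such as pit - bit and fat - vat.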

Study articulatory features of RP vowels

RP Vowel Phonemes (V ph): 20

RP Monophthongs (M): 12

[i:] a monophthong (= M), long, tense, unrounded, front, high / close vowel phoneme (= V ph) of the narrow variety (= v.)

[i] a M, short, lax, unrounded, front retracted, high / close V ph of the wide v.

[e] a M, short, lax, unrounded, front, mid / half-open V ph of the narrow v.

[ʌ] a M, short, lax, unrounded, central / mixed, mid V ph of the wide v.

[a:] a M, long, tense, unrounded, back, low / open V ph of the wide v.

[ɒ] a M, short, lax, rounded, back, low / open V ph of the wide v.

[u] a M, short, lax, rounded, back advanced, high / close V ph of the wide v.

[u:] a M, long, tense, rounded, back, high / close V ph of the narrow v.

[ɜ:] a M, long, tense, unrounded, central / mixed, mid V ph of the narrow v.

[ə] a M, short, lax, unrounded, central / mixed, mid V ph of the wide v.

RP Diphthongs (D): 8

[ei] a closing diphthong (= D) with the i-glide

[ai] a closing D with the i-glide

[oi] a closing D with the i-glide

[ɜu] a closing D with the u-glide

[au] a closing D with the u-glide

[iə] a centering D with the ə-glide

[eə] a centering D with the ə-glide

[uə] a centering D with the ə-glide


