
Linguistic Knowledge Bases

Overview of transmission models

The encode-decode model of communication proposes that:

  1. a person formulates a concept and encodes this linguistically into strings of sounds, syllables and words;
  2. transmits this encoded thought as a sound wave, whereupon…
  3. another person (the receiver) decodes the sound wave back into the original concept

There are many problems with the encode-decode model, but these are not our concern here. Despite its weaknesses, the model is a useful starting point for considering what processes are involved in human communication.
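For readers who find it helpful to think of the model computationally, the three stages can be pictured as a very small pipeline. The sketch below is purely illustrative: the codebook, the word-for-word lookup and the function names are invented for this article, and real linguistic encoding is vastly richer than any such lookup.

```python
# Purely illustrative sketch of the three-stage encode-decode pipeline.
# The "codebook" and all names are hypothetical; real encoding is not
# a word-for-word lookup.

CODEBOOK = {"hello": "/heloU/", "world": "/w3:ld/"}   # concept word -> sound string
REVERSE = {v: k for k, v in CODEBOOK.items()}         # sound string -> concept word

def encode(concept_words):
    """Speaker: map each concept word onto a string of speech sounds."""
    return [CODEBOOK[w] for w in concept_words]

def transmit(encoded):
    """Channel: in reality a sound wave; here the list is simply passed on."""
    return list(encoded)

def decode(signal):
    """Listener: map the received sound strings back onto concept words."""
    return [REVERSE[s] for s in signal]

message = ["hello", "world"]
assert decode(transmit(encode(message))) == message
```

The point of the sketch is simply that whatever the speaker encodes must be recoverable by the listener's decoding step.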

According to the model, in order to encode thoughts, transmit them and decode thoughts transmitted linguistically by others, we require knowledge of the rules that are accessed by the relevant encoding and decoding algorithms. Much of this knowledge is unconscious. We are not at all aware of how we construct certain utterances – it all seems quite natural. Some aspects of our overall knowledge base, however, can be brought to conscious attention. For example, when I make a deliberate attempt to impersonate someone, I consciously alter the articulation of certain speech sounds that I perceive to be characteristic of the target person.

So, how is this knowledge organized?

Organization of linguistic knowledge

The overall knowledge base may be considered as subdivided into three components:

  1. semantic-syntactic
  2. phonological
  3. phonetic

Each component consists of sets of elements and sets of rules for manipulating the elements (Figure 1).


Figure 1. Representation of three knowledge bases in transmission models (adapted from Tatham, 1989)
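As a purely hypothetical illustration of this "sets of elements plus sets of rules" organization, the three components might be sketched as simple data structures. The names and types below are invented for this article and make no claim about how such knowledge is actually stored.

```python
# Hypothetical sketch of the "elements + rules" organization shown in Figure 1.
from dataclasses import dataclass, field
from typing import Callable, List, Set

@dataclass
class KnowledgeBase:
    name: str
    elements: Set[str] = field(default_factory=set)       # e.g. words or sound features
    rules: List[Callable] = field(default_factory=list)   # operations over the elements

components = [
    KnowledgeBase("semantic-syntactic"),
    KnowledgeBase("phonological"),
    KnowledgeBase("phonetic"),
]
```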

Semantic-syntactic knowledge base

Within the semantic-syntactic knowledge base, the set of elements is the mental lexicon. The lexicon is essentially a dictionary that stores the entire stock of words known to the speaker-listener. Words within the lexicon are stored, amongst other ways, according to the conceptual meaning they convey, together with information about their grammatical role (e.g. whether they can function as a noun or a verb; whether they can operate as the Subject or Object of an utterance). Rules within the semantic-syntactic knowledge base constrain the ordering of, and relationship between, the words and other structural elements in phrases and clauses.
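A lexicon entry of this sort can be imagined, very loosely, as a record that bundles a word's form with its meaning and its grammatical possibilities. The field names and the two example entries below are assumptions made purely for illustration.

```python
# Hypothetical sketch of mental-lexicon entries held in a dictionary-like store.
from dataclasses import dataclass

@dataclass
class LexicalEntry:
    form: str             # the word itself
    meaning: str          # rough gloss of the conceptual meaning
    categories: tuple     # grammatical roles the word can play, e.g. noun/verb
    can_be_subject: bool  # whether it can act as the Subject of an utterance

lexicon = {
    "lock": LexicalEntry("lock", "fastening device; to fasten", ("noun", "verb"), True),
    "elm":  LexicalEntry("elm",  "a type of tree",              ("noun",),        True),
}

# Looking an entry up gives both its meaning and its grammatical possibilities.
print(lexicon["lock"].categories)   # ('noun', 'verb')
```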

Phonological knowledge base

The phonological knowledge base contains a so-called feature set of phonological elements and sets of rules for manipulating these. Examples of the sorts of constraints these rules impose include:

  • which particular speech sounds may combine with which other speech sounds, e.g. the combination /lm/ is allowable at the ends of words in English (cf. elm) but not at the beginning (see the sketch following this list)
  • how speech sounds may be transformed when they occur in particular contexts with other speech sounds
  • how an appropriate prosodic contour (pattern of rhythm, stress and intonation) may be applied to words and phrases, and so on
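To make the first constraint above concrete, here is a toy sketch of a phonotactic rule that permits /lm/ word-finally but not word-initially. The rule table and function are invented for illustration; a real phonology involves far more than a lookup of restricted clusters.

```python
# Toy sketch of one phonotactic rule: /lm/ is permissible word-finally
# in English (as in "elm") but not word-initially.
FINAL_ONLY_CLUSTERS = {"lm"}   # clusters allowed only at the end of a word

def cluster_allowed(cluster: str, position: str) -> bool:
    """Return True if the consonant cluster may occur in the given position."""
    if cluster in FINAL_ONLY_CLUSTERS:
        return position == "final"
    return True   # everything else is permitted in this toy rule set

print(cluster_allowed("lm", "final"))    # True  – as in "elm"
print(cluster_allowed("lm", "initial"))  # False – no English word begins /lm/
```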

Analogous to the mental lexicon, the phonological knowledge base contains information about which speech sounds are operative in the particular language being spoken. For instance, in the variety of English that I speak, I use a so-called plosive sound /k/ at the end of the word loch (meaning lake or tarn). So, for me, the two words loch and lock sound the same when I speak them, i.e. they are homophones. For some speakers of English, however, the final sound in loch is not a short, hard plosive but a longer, softer so-called fricative sound, as may occur in some Scottish accents. My phonological knowledge base (my phonology), therefore, excludes this particular speech sound from my repertoire – it is not a choice that is typically available to me. In sum, phonological knowledge operates at the level of sound systems – it is sensitive to the context in which speech sounds occur, e.g. whether a sound is preceded by a vowel or a consonant, whether a preceding sound is a plosive, and so on.
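The loch/lock example can also be pictured schematically: if the fricative is simply not in a speaker's inventory, the two words collapse onto the same pronunciation. The inventories and the fallback rule below are deliberately over-simplified assumptions, not a description of any real accent.

```python
# Illustrative sketch of how a speaker's phoneme inventory determines whether
# "loch" and "lock" come out as homophones.
def pronounce_loch(inventory: set) -> str:
    """Use the fricative /x/ if the speaker's phonology has it, else fall back to /k/."""
    return "lɒx" if "x" in inventory else "lɒk"

my_inventory = {"l", "ɒ", "k"}            # no /x/ available
scots_inventory = {"l", "ɒ", "k", "x"}    # /x/ available

print(pronounce_loch(my_inventory))       # lɒk – homophonous with "lock"
print(pronounce_loch(scots_inventory))    # lɒx – distinct from "lock"
```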

Phonetic knowledge base

The phonetic knowledge base, like the phonological knowledge base, contains a feature set of phonetic elements and rules for manipulating these elements (see phonetics). Now, whereas the phonological knowledge base operates at the level of sound systems, phonetic knowledge arguably operates only at the level of the sounds themselves. Phones (speech sounds considered as physical events without regard to their place in the sound system) are executed through neural firing sequences that produce coordinated movements of the muscles responsible for speech production. This includes the muscles governing breathing, movement of the vocal folds within the larynx, the tongue, the jaw, the lips, and so on. It is evident that speakers are able to consciously manipulate the various physical constraints when producing connected speech. For example, impersonators entertain us by capturing many different phonetic features of a target speaker: some individual phonetic features appearing to be more important for a successful voice imitation than others (Zetterholm, 2002). Consequently, “if cognitive control of physical constraints is possible, the nature of those constraints must be known to the system” (Tatham, 1989).

Interconnected knowledge bases

The three knowledge bases are linked in Figure 1 so as to indicate that one component is logically prior to another. That is to say, logically, when producing a meaningful utterance, the semantic-syntactic knowledge base is accessed in order to select the appropriate word or combination of words from the mental lexicon. The phonological knowledge base is also accessed in order to map the necessary phonological features onto the selected word or word string that will transmit the message. In addition, the phonetic knowledge base is accessed in order to program the necessary speech movements that create the speech sound wave. This is the encoding algorithm. A parallel logical sequence is implied when invoking the decoding algorithm, i.e. when decoding the linguistic sound to reconstruct the linguistic meaning.
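Purely as an illustration of this logical ordering (and not of how the brain actually proceeds, as the next paragraph stresses), the encoding algorithm can be caricatured as three invented stand-in functions applied in sequence:

```python
# Minimal sketch of the logical ordering of the encoding algorithm described above.
# The three functions are invented stand-ins for accessing each knowledge base;
# they show order of access, not real linguistic processing.
def select_words(concept):
    """Semantic-syntactic knowledge base: choose and order words from the lexicon."""
    return concept.split()                       # toy stand-in

def map_phonology(words):
    """Phonological knowledge base: attach sound-system features to each word."""
    return [f"/{w}/" for w in words]             # toy stand-in

def program_articulation(phon_forms):
    """Phonetic knowledge base: plan the speech movements that create the sound wave."""
    return " ".join(phon_forms)                  # toy stand-in

sound_wave = program_articulation(map_phonology(select_words("the cat sat")))
print(sound_wave)   # /the/ /cat/ /sat/
```

The decoding algorithm would imply the logically reverse sequence, from sound wave back to meaning.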

However, just because particular knowledge bases are logically prior to others, one must be careful not to assume that they are, therefore, temporally or procedurally prior. The brain is not structured into discrete components that function in an invariable step-by-step, A-B-C, 1-2-3, fashion. Rather, the representation of knowledge is distributed throughout the brain and many interconnections hook up the knowledge bases (see The Modular Mind). The brain does not perform serial, one-by-one, computations like most of our modern computers. Instead, it computes in parallel, performing many millions of calculations all at the same time.

References

Tatham, M.A.A. (1989) ‘Intelligent speech synthesis as part of an integrated speech synthesis / automatic speech recognition system’ in Taylor, M.M., Néel, F. and Bouwhuis, D.G. (eds) The Structure of Multimodal Dialogue. Elsevier Science Publishers B.V. (North-Holland), pp. 301-312. Available at http://www.morton-tatham.co.uk/publications/to_1994/intelligent%20speech%20synthesis.pdf (accessed 17.01.2011).

Zetterholm, E. (2002) ‘A comparative survey of phonetic features of two impersonators’ Speech, Music and Hearing – Quarterly Progress and Status Report (TMH-QPSR), Vol. 44 – Fonetik 2002, pp. 129-132.