Arabic speech and language technology

doi:10.4324/9781315147062-16

ABSTRACT

Imagine a set of boxes, each of which accepts some type of input, X, and generates some type of output, Y. Inside each box, there is a programmed mathematical model of the joint probability distribution, p(X,Y), that defines the relationship between input and output. Each of these probability models can usefully be labeled a model of “Computational…,” where the “…” is replaced by the name of a discipline from the science of Linguistics. Computational Phonetics, for example, represents the relationship between acoustic signals and phoneme labels. Computational Phonology represents the relationship between the canonical and actual pronunciations of each word. Morphology studies word formation processes, while Computational Morphology studies the decomposition of each word into its component morphemes. Models of Computational Syntax accept plain text as input, and generate text tagged with grammatical function and/or phrase structure. Models of Computational Semantics generate text tagged with categorical indicators of meaning and/or logical structure, while models of Computational Pragmatics label the discourse and/or sociolinguistic functions of the words and phrases in a sentence. The input-output mappings defined by these boxes are the subject of Arabic Speech and Language Technology (SLT), and will be the subject of this chapter.
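The "box" picture above can be made concrete with a toy example. The following is a minimal sketch, not a method taken from this chapter: it assumes a tiny hand-written joint table p(X,Y) in which X is an unvocalized, transliterated Arabic word form and Y is a coarse part-of-speech tag. The probability values and the names `JOINT`, `marginal_x`, `conditional`, and `predict` are all hypothetical and chosen only for illustration. The box produces its output by conditioning the joint distribution on the observed input and returning the most probable label.

```python
# Hypothetical toy joint distribution p(X, Y).
# X: unvocalized transliterated word form; Y: coarse part-of-speech tag.
# The probabilities are invented for illustration and sum to 1.
JOINT = {
    ("ktb", "VERB"): 0.35,  # e.g. kataba, "he wrote"
    ("ktb", "NOUN"): 0.25,  # e.g. kutub, "books"
    ("drs", "VERB"): 0.10,  # e.g. darasa, "he studied"
    ("drs", "NOUN"): 0.30,  # e.g. dars, "lesson"
}


def marginal_x(joint, x):
    """Return p(X = x) by summing the joint over every output y."""
    return sum(p for (xi, _), p in joint.items() if xi == x)


def conditional(joint, x):
    """Return p(Y | X = x) as a dict mapping each y to its probability."""
    px = marginal_x(joint, x)
    if px == 0:
        raise ValueError(f"input {x!r} has zero probability under the model")
    return {y: p / px for (xi, y), p in joint.items() if xi == x}


def predict(joint, x):
    """The 'box': map input x to the output y that maximizes p(y | x)."""
    cond = conditional(joint, x)
    return max(cond, key=cond.get)


if __name__ == "__main__":
    for word in ("ktb", "drs"):
        print(word, predict(JOINT, word), conditional(JOINT, word))
```

Because p(y | x) is p(x, y) divided by a quantity that does not depend on y, choosing the y that maximizes the conditional is the same as choosing the y that maximizes the joint for the observed x. Real systems replace the lookup table with a learned model, but the input-output contract of the box stays the same.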