Mapping between Disjoining and Conjoining Writing Systems in Bantu Languages: Implementation on Kwanyama
Published 2001-12-31
Keywords
- writing system
- language technology
- Kwanyama
- Bantu language
Abstract
Several Bantu languages have adopted a disjoining writing system, which poses a special challenge for the automatic analysis of written text. In such systems, some bound morphemes are written as independent words, whereas other languages write the equivalent morphemes as affixes of a head morpheme. The concept of a word is blurred in disjoining writing systems because the writing conventions do not follow a systematic set of rules. Not only are bound morphemes written as separate words, but independent words are also sometimes written together as one string. The paper claims that text in a disjoining writing system requires special treatment before it can be analyzed successfully. One option is to pre-process the text into a conjoining format, so that the analysis program receives its input in a form that conforms to the linguistically better motivated writing system. Another is to construct the morphological analyzer so that it identifies bound morphemes even when they are written as separate words. This paper takes the first approach and describes a system that maps between a disjoining and a conjoining writing system. The implementation was made for Kwanyama, a Bantu language spoken in southern Angola and northern Namibia. The performance of the system is evaluated.