Speech technology for under-resourced languages

Speech technology is becoming an integral part of our daily lives, especially if you happen to speak one of the dozen major languages for which speech technology applications are available. The initial investment needed to build the technology for a new language is formidable, requiring massive data collection and expensive analysis and annotation. Thousands of languages are not targeted because of this initial investment. The lack of technology for these under-resourced languages threatens not only the languages themselves but also the welfare and culture of their speakers.

One of these languages is Northern Sami, spoken by only 20,000 people in Swedish, Norwegian and Finnish Lapland, in the northernmost corner of Europe. We have developed both a speech recogniser and a speech synthesiser for Northern Sami to showcase a range of techniques that can be used to reduce the initial investment for bringing language technology to a new language.

The defining feature of under-resourced languages is the lack of data from which to generate models of written and spoken language. One solution is to use existing data from the internet, such as Wikipedia, radio broadcasts and audiobooks. Another way to combat the lack of data is to extract useful information from a larger, similar language and adapt it with machine learning. One such example, applied here, was comparing letter-to-sound mappings between Sami and Finnish to generate new acoustic models and dictionaries.

Unsupervised learning can be used as a replacement for language experts. For example, the Simple4all Ossian system makes it possible for naive users to build their own speech synthesisers independently. Aalto’s Morfessor tool can segment words into smaller units to reduce the size of the vocabularies while maintaining full coverage of conversational speech.
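
To make the vocabulary argument concrete, here is a minimal sketch of why splitting words into smaller units shrinks the vocabulary without losing coverage. It is not Morfessor's algorithm: Morfessor learns its unit inventory from data without supervision, whereas this toy uses a hand-written inventory and a simple greedy longest-match split. The `segment` function, the `MORPHS` set and the Finnish word list are all assumptions made up for illustration.

```python
def segment(word, morphs):
    """Greedy longest-match split of `word` into units from `morphs`.

    Unknown stretches fall back to single characters, so every word
    can still be written as a sequence of units (full coverage).
    """
    pieces = []
    i = 0
    while i < len(word):
        # Try the longest candidate starting at position i first.
        for j in range(len(word), i, -1):
            if word[i:j] in morphs:
                pieces.append(word[i:j])
                i = j
                break
        else:
            # No known unit starts here: emit one character and move on.
            pieces.append(word[i])
            i += 1
    return pieces


# Hypothetical hand-written unit inventory (a real tool would learn this).
MORPHS = {"talo", "auto", "ssa", "sta", "t"}

# Illustrative Finnish word forms: two stems, each with four inflections.
WORDS = ["talo", "talossa", "talosta", "talot",
         "auto", "autossa", "autosta", "autot"]

segmented = {w: segment(w, MORPHS) for w in WORDS}
unit_vocab = {unit for pieces in segmented.values() for unit in pieces}

for word, pieces in segmented.items():
    print(word, "->", " + ".join(pieces))
print(len(set(WORDS)), "word forms,", len(unit_vocab), "subword units")
```

In this toy, eight word forms are covered by five units. The gap widens as more stems and endings are added, which is the effect that matters for heavily inflected languages.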
