Linguistic Automata in Modern Humanities Technologies. Belyaeva L.N. - p. 101



THE TEXT ANNOTATION (ABSTRACTING) FUNCTION
America) and the speakers of majority languages, and the
potential preservation of endangered languages – over half
of the 6,000 presently existing languages worldwide. The
primary scientific challenge is the creation of MT systems
for languages of little economic importance at very low cost
per language, including the acquisition of linguistic
information with minimal preexisting bilingual corpora and
little or no previous linguistic analysis of the minority
language.
In order to address these needs, we are investigating
omnivorous MT systems, including statistical and example-based
MT when some parallel training corpora can be
acquired, and machine learning of transfer-based MT rules
when access to a native non-linguist informant permits
partial elicitation of linguistic information, such as
translations of model sentences and lexical-level bilingual
alignments. This paper focuses on this last objective of our
project: supervised learning of transfer rules with the aid of
an elicitation interface to a bilingual native speaker without
any assumptions regarding his or her linguistic
sophistication. While our technology is eventually aimed at
lowdensity languages, it is intended to be target language
independent. Hence, we are developing the system using
examples from various languages, such as Chinese, German,
Mapudungun [1], and Swahili. For illustration purposes, we
present examples in these languages throughout the paper.
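
To make the elicited data described above more concrete, the following minimal sketch shows one possible way to represent a model sentence, the informant's translation, and their lexical-level alignment. It is not the authors' system: the class `ElicitedExample`, the functions `lexical_pairs` and `is_monotone`, and the English-German sentence pairs are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ElicitedExample:
    """One elicited sentence pair with word-level alignments (hypothetical structure)."""
    source: Tuple[str, ...]                  # tokens of the model sentence
    target: Tuple[str, ...]                  # tokens of the informant's translation
    alignment: Tuple[Tuple[int, int], ...]   # (source index, target index) pairs


def lexical_pairs(example: ElicitedExample) -> List[Tuple[str, str]]:
    """Return the (source word, target word) pairs given by the alignment."""
    return [(example.source[i], example.target[j]) for i, j in example.alignment]


def is_monotone(example: ElicitedExample) -> bool:
    """True if aligned target positions never decrease as source positions increase."""
    targets = [j for _, j in sorted(example.alignment)]
    return all(a <= b for a, b in zip(targets, targets[1:]))


# Illustrative data only: two English model sentences with German translations.
same_order = ElicitedExample(
    source=("the", "big", "house"),
    target=("das", "große", "Haus"),
    alignment=((0, 0), (1, 1), (2, 2)),
)
reordered = ElicitedExample(
    source=("I", "have", "seen", "the", "house"),
    target=("ich", "habe", "das", "Haus", "gesehen"),
    alignment=((0, 0), (1, 1), (2, 4), (3, 2), (4, 3)),
)

print(lexical_pairs(same_order))
print(is_monotone(same_order), is_monotone(reordered))
```

In this sketch a non-monotone alignment, as in the second pair, signals that any transfer rule generalized from the example would have to reorder constituents, whereas the first pair could be covered by word-for-word lexical transfer.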

[1] Mapudungun, spoken in Chile, is one of the minority languages we focus on.