Publications

Interlingua-based broad-coverage Korean-to-English translation in CCLINC

Published in:
Proc. First Int. Conf. on Human Language Technology, 18-21 March 2001.

Summary

At MIT Lincoln Laboratory, we have been developing a Korean-to-English machine translation system CCLINC (Common Coalition Language System at Lincoln Laboratory). The CCLINC Korean-to-English translation system consists of two core modules, language understanding and generation modules mediated by a language neutral meaning representation called a semantic frame. The key features of the system include: (i) Robust efficient parsing of Korean (a verb final language with overt case markers, relatively free word order, and frequent omissions of arguments). (ii) High quality translation via word sense disambiguation and accurate word order generation of the target language. (iii) Rapid system development and porting to new domains via knowledge-based automated acquisition of grammars. Having been trained on Korean newspaper articles on "missiles" and "chemical biological warfare," the system produces the translation output sufficient for content understanding of the original document.

The use of dynamic segment scoring for language-independent question answering

Published in:
Proc. 1st Int. Conf. on Human Language Technology Research, HLT, 18-21 March 2001.

Summary

This paper presents a novel language-independent question/answering (Q/A) system based on natural language processing techniques, shallow query understanding, dynamic sliding window techniques, and statistical proximity distribution matching techniques. The performance of the proposed system using the latest Text REtrieval Conference (TREC-8) data was comparable to results reported by the top TREC-8 contenders.

Understanding-based translingual information retrieval

Published in:
4th Int. Conf. on Applications of Natural Language to Information Systems, 17-19 June 1999, pp. 187-195.

Summary

This paper describes our preliminary research on an understanding-based translingual information retrieval system for which the input to the system is a query sentence in English, and the output of the system is a set of documents either in English or in Korean. The understanding module produces a meaning representation, called a semantic frame, of the input sentence where the predicate-argument structure and the question-type of the input are identified, and each keyword is assigned its concept category. The translingual search module performs search on an English and Korean bilingual corpus tagged with concept categories. The results of our preliminary experiment, performed on a document set consisting of slides and notes from English and Korean briefings in a military domain, indicate that an understanding-based approach to information retrieval combined with concept-based search technique improves both precision and recall compared with a keyword match technique without understanding for both monolingual and translingual retrieval. Current work is directed at further development of the system, and in preparation for tests on larger corpora.

Machine-assisted language translation for U.S./RoK Combined Forces Command

Published in:
Army RD&A Mag., November-December 1999, pp. 38-41.

Summary

The U.S. military must operate worldwide in a variety of international environments where many different languages are used. There is a critical need for translation, and there is a shortage of translators who can interpret military terminology specifically. One coalition environment where the need is particularly strong is in the Republic of Korea (RoK) where, although U.S. and RoK military personnel have been working together for many years, the language barrier still significantly reduces the speed and effectiveness of coalition command and control. This article describes the Massachusetts Institute of Technology (MIT) Lincoln Laboratory's work on automated, two-way, English/Korean translation for enhanced coalition communications. Our ultimate goal is to enhance multilingual communications by producing accurate translations across a number of languages. Therefore, we have chosen an interlingua-based approach to machine translation that is readily adaptable to multiple languages. In this approach, a natural language understanding system transforms the input into an intermediate meaning representation called Semantic Frame, which serves as a basis for generating output in multiple languages. To produce useful and effective translation systems in the short term, we have focused on limited military task domains and have configured our system as a machine-assisted translation system. This allows the human translator to confirm or edit the machine translation.

Ambiguity resolution for machine translation of telegraphic messages

Published in:
Proc. 35th Annual Meeting of the Assoc. for Computational Linguistics, 7-12 July 1997, pp. 120-7.

Summary

Telegraphic messages with numerous instances of omission pose a new challenge to parsing in that a sentence with omission causes a higher degree of ambiguity than a sentence without omission. Misparsing induced by omissions has a far-reaching consequence in machine translation. Namely, a misparse of the input often leads to a translation into the target language which has incoherent meaning in the given context. This is more frequently the case if the structures of the source and target languages are quite different, as in English and Korean. Thus, the question of how we parse telegraphic messages accurately and efficiently becomes a critical issue in machine translation. In this paper we describe a technical solution for the issue, and present the performance evaluation of a machine translation system on telegraphic messages before and after adopting the proposed solution. The solution lies in a grammar design in which lexicalized grammar rules defined in terms of semantic categories and syntactic rules defined in terms of part-of-speech are utilized together. The proposed grammar achieves a higher parsing coverage without increasing the amount of ambiguity/misparsing when compared with a purely lexicalized semantic grammar, and achieves a lower degree of ambiguity/misparses without decreasing the parsing coverage when compared with a purely syntactic grammar.

Automated English-Korean translation for enhanced coalition communications

Summary

This article describes our progress on automated, two-way English-Korean translation of text and speech for enhanced military coalition communications. Our goal is to improve multilingual communications by producing accurate translations across a number of languages. Therefore, we have chosen an interlingua-based approach to machine translation that readily extends to multiple languages. In this approach, a natural-language-understanding system transforms the input into an intermediate-meaning representation called a semantic frame, which serves as the basis for generating output in multiple languages. To produce useful, accurate, and effective translation systems in the short term, we have focused on limited military-task domains, and have configured our system as a translator's aid so that the human translator can confirm or edit the machine translation. We have obtained promising results in translation of telegraphic military messages in a naval domain, and have successfully extended the system to additional military domains. The system has been demonstrated in a coalition exercise and at Combined Forces Command in the Republic of Korea. From these demonstrations we learned that the system must be robust enough to handle new inputs, which is why we have developed a multistage robust translation strategy, including a part-of-speech tagging technique to handle new words, and a fragmentation strategy for handling complex sentences. Our current work emphasizes ongoing development of these robust translation techniques and extending the translation system to application domains of interest to users in the military coalition environment in the Republic of Korea.

Automatic English-to-Korean text translation of telegraphic messages in a limited domain

Published in:
Proc. Int. Conf. on Computational Linguistics, 5-9 August 1996, pp. 705-710.

Summary

This paper describes our work-in-progress in automatic English-to-Korean text translation. This work is an initial step toward the ultimate goal of text and speech translation for enhanced multilingual and multinational operations. For this purpose, we have adopted an interlingual approach with natural language understanding (TINA) and generation (GENESIS) modules at the core. We tackle the ambiguity problem by incorporating syntactic and semantic categories in the analysis grammar. Our system is capable of producing accurate translation of complex sentences (38 words) and sentence fragments as well as average length (12 words) grammatical sentences. Two types of system evaluation have been carried out: one for grammar coverage and the other for overall performance. For system robustness, integration of two subsystems is under way: (i) a rule-based part-of-speech tagger to handle unknown words/constructions, and (ii) a word-for-word translator to handle other system failures.