Machine translation
At present, machine translation still involves some human intervention, because it requires both a pre-editing and a post-editing phase. Note that in machine translation, the translator supports the machine and not the other way around.
Nowadays most machine translation systems produce what is called a "gisting translation" — a rough translation that gives the "gist" of the source text, but is not otherwise usable.
However, in fields with highly limited ranges of vocabulary and simple sentence structure, for example weather reports, machine translation can deliver useful results.
- Machine translation (MT) is the application of computers to the task of translating texts from one natural language to another. One of the very earliest pursuits in computer science, MT has proved to be an elusive goal, but today a number of systems are available which produce output which, if not perfect, is of sufficient quality to be useful in a number of specific domains.
- Source: www.eamt.org (http://www.eamt.org/mt.html), European Association for Machine Translation, EAMT, 1997.
Machine translation vs. Computer-assisted translation
Although the two concepts are similar, machine translation (MT) should not be confused with computer-assisted translation (CAT), also known as machine-assisted translation (MAT).
In machine translation, the translator supports the machine: the computer program translates the text, which the translator then edits. In computer-assisted translation, the program supports the translator, who translates the text and makes all the essential decisions involved.
Introduction
The translation process, whether for translation per se or for interpreting, can be stated simply as:
- Decoding the meaning of the source text, and
- Re-encoding this meaning in the target language.
Behind this simple procedure there lies a complex cognitive operation. For example, to decode the meaning of the source text in its entirety, the translator must interpret and analyse all the features of the text, a process which requires in-depth knowledge of the grammar, semantics, syntax, idioms and the like of the source language, as well as of the culture of its speakers. The translator needs the same in-depth knowledge to re-encode the meaning in the target language.
Therein lies the challenge in machine translation: how to program a computer to "understand" a text as a human being does and also to "create" a new text in the target language that "sounds" as if it has been written by a human.
This problem can be tackled in a number of ways.
Linguistic approaches
It is often argued that the success of machine translation requires the problem of natural language understanding to be solved first. However, a number of heuristic methods of machine translation are also used, including:
- Lexical lookup methods
- Grammar based methods
- Semantics based methods (knowledge-based machine translation)
- Statistical methods
- Example-based methods
- Dictionary-entry based methods
- Linguistic rule based methods
Generally, rule-based methods (the first three) parse a text, usually creating an intermediary, symbolic representation, from which the text in the target language is generated. These methods require extensive lexicons with morphological, syntactic, and semantic information, and large sets of rules.
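The analyse, transfer and generate stages described above can be illustrated with a deliberately tiny sketch. The lexicon, the part-of-speech tags and the single reordering rule are all invented for illustration (Spanish to English); a real rule-based system would need far larger lexicons and rule sets, and nothing here is taken from any system named in this article.

```python
# Toy lexicon: source word -> (target word, part of speech).
# All entries are illustrative assumptions, not data from a real system.
LEXICON = {
    "el": ("the", "DET"),
    "gato": ("cat", "NOUN"),
    "negro": ("black", "ADJ"),
    "come": ("eats", "VERB"),
    "pescado": ("fish", "NOUN"),
    "fresco": ("fresh", "ADJ"),
}


def analyse(sentence):
    """Lexical lookup: build an intermediate list of (target word, POS) pairs."""
    tokens = []
    for word in sentence.lower().split():
        # Unknown words pass through untranslated and are tagged UNK.
        tokens.append(LEXICON.get(word, (word, "UNK")))
    return tokens


def transfer(tokens):
    """Structural rule: a Spanish NOUN ADJ pair becomes English ADJ NOUN."""
    out = list(tokens)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "NOUN" and out[i + 1][1] == "ADJ":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just reordered
        else:
            i += 1
    return out


def generate(tokens):
    """Generation: emit the target words in their final order."""
    return " ".join(word for word, _ in tokens)


def translate(sentence):
    return generate(transfer(analyse(sentence)))


print(translate("el gato negro come pescado fresco"))
```

Even this toy shows why such methods are labour-intensive: every word needs a hand-written lexicon entry and every structural difference between the two languages needs its own rule.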
Statistical and example-based methods eschew manual lexicon building and rule-writing and instead try to generate translations based on bilingual text corpora, such as the Canadian Hansard corpus, the English-French record of the Canadian parliament. Where such corpora are available, impressive results can be achieved translating texts of a similar kind, but such corpora are still very rare.
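As a rough illustration of how word translations can be estimated from a bilingual corpus without a hand-built lexicon, the sketch below runs a few rounds of expectation-maximisation in the style of IBM Model 1, a classic statistical word-alignment model. The three-sentence German-English corpus is invented, the NULL source word of the full model is omitted for brevity, and this is not claimed to be the method used by any system mentioned in this article.

```python
from collections import defaultdict

# Tiny invented parallel corpus (German, English); real corpora such as
# parliamentary records contain millions of sentence pairs.
CORPUS = [
    ("das haus", "the house"),
    ("das buch", "the book"),
    ("ein buch", "a book"),
]


def train_model1(corpus, iterations=10):
    """Estimate t[(e, f)], an approximation of P(target word e | source word f)."""
    pairs = [(src.split(), tgt.split()) for src, tgt in corpus]
    # Start with equal weight for every word pair that co-occurs in a sentence pair.
    t = {(e, f): 1.0 for src, tgt in pairs for e in tgt for f in src}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: share each target word among the source words of its sentence.
        for src, tgt in pairs:
            for e in tgt:
                norm = sum(t[(e, f)] for f in src)
                for f in src:
                    frac = t[(e, f)] / norm
                    count[(e, f)] += frac
                    total[f] += frac
        # M-step: renormalise the expected counts per source word.
        for (e, f), c in count.items():
            t[(e, f)] = c / total[f]
    return t


def best_translation(t, source_word):
    """Return the target word with the highest estimated probability."""
    candidates = {e: p for (e, f), p in t.items() if f == source_word}
    return max(candidates, key=candidates.get)


t = train_model1(CORPUS)
for f in ["das", "haus", "buch", "ein"]:
    print(f, "->", best_translation(t, f))
```

On this toy corpus the estimates sort themselves out because "das" co-occurs with "the" twice while "haus" and "buch" each appear in a different context; with real data the same idea only works when the corpus is large and of the right kind.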
Given enough data, most machine translation programs work well enough for a native speaker of one language to get the approximate meaning of what a native speaker of the other has written (that is, they produce a "gisting translation"). The difficulty is getting enough data of the right kind to support the particular method. For example, the large multilingual corpus needed for statistical methods is not necessary for grammar-based methods, but grammar-based methods need a skilled linguist to carefully design the grammar that they use.
Users
Despite their inherent limitations, MT programs are currently used by various organizations around the world. Probably the largest institutional user is the European Commission, which uses a highly customized version of the commercial MT system SYSTRAN to handle the automatic translation of a large volume of preliminary drafts of documents for internal use.
It was recently revealed that in April 2003 Microsoft began using a hybrid MT system for the translation of a database of technical support documents from English to Spanish. The system was developed internally by Microsoft's Natural Language Research group. The group is currently testing an English–Japanese system as well as bringing English–French and English–German systems online. The latter two systems use a learned language generation component, whereas the first two have manually developed generation components. The systems were developed and trained using translation memory databases with over a million sentences each.
History of machine translation
The first attempts at machine translation were made after World War II. It was assumed at the time that the newly invented computers would have no trouble translating texts. The reasoning was that computers could quickly perform complex mathematics, which humans find difficult, whereas even young children learn to understand human language; computers should therefore be able to do the same. This belief was soon shown to be incorrect.
On 7 January 1954, the first public demonstration of an MT system was held in New York at the head office of IBM. The demonstration was widely reported in the newspapers and received much public interest. The system itself, however, was no more than what today would be called a "toy" system, having just 250 words and translating just 49 carefully selected Russian sentences into English, mainly in the field of chemistry. Nevertheless it encouraged the view that MT was imminent, and in particular stimulated the financing of MT research, not just in the US but worldwide.
The first serious MT systems were used during the Cold War to parse texts in Russian scientific journals. The rough translations produced were sufficient to understand the "gist" of the articles. If an article discussed a subject deemed to be of security interest, it was sent to a human translator for a complete translation; if not, it was discarded.
The advent of low-cost and more powerful computers towards the end of the 20th century brought MT to the masses, as did the availability of sites on the Internet.
Much of the effort previously spent on MT research, however, has shifted to the development of computer-assisted translation (CAT) systems, such as translation memories, which are seen to be more successful and profitable.
Examples
The opening paragraph of this article reads:
- Machine translation (MT) is a form of translation where a computer program analyses the text in one language - the "source text" - and then attempts to produce another, equivalent text in another language - the target text - without human intervention. (March 17, 2005)
Here are some machine translation examples (translated by SYSTRAN):
Arabic
- [مشن ترنسلأيشن] ([مت]) شكل الترجمة حيث [كمبوتر بروغرم] يحلّل النص في واحدة لغة - ال "[سورس تإكست]" - وبعد ذلك يحاول أن ينتج آخر ، نص معادلة في آخر لغة - ال [ترجت تإكست] - دون تدخل إنسانيّة.
- [[mshn] [trnsl'ayshn]] ([dies]) form of the translation where [[kmbwtr] [brwGrm]] the text in one language analyzes - the “[[swrs] [t'ikst]]” - and afterwards that tries to other results, equivalent text in other of language - the [[trjt] [t'ikst]] - humanitarian intervention echoes.
Chinese (traditional)
- 機器翻譯(MT) 是電腦程式分析文本用一種語言翻譯- "源文本" - 並且然後試圖生產另, 等效文本用其它語言- 目標文本的形式- 沒有人的干預。
- The machine translation (MT) is the computer formula analysis text - and then attempts with one language translation - "source text" to produce in addition, the equivalent text uses other language - goal texts the form - nobody intervention.
Analysis
- "電腦程式" → "computer formula"
- The Chinese translator seemed unable to translate a sentence with clauses well.
- "and then" → "並且" + "然後"; in good Chinese "and" should be omitted.
- Bad guess:
- Correct: 沒有 人的 干預 → without human's intervention
- Mistake: 沒有人 的 干預 → nobody's intervention (this is actually a Chinese joke)
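The "nobody" error above is a word-segmentation ambiguity: written Chinese has no spaces, so the same characters can be split into words in more than one way. The sketch below enumerates every split that a small, hypothetical lexicon allows; the lexicon and its English glosses are assumptions made for illustration, and 人 and 的 are listed as separate entries rather than as the single unit 人的 shown above.

```python
# Hypothetical mini-lexicon with rough English glosses (illustrative only).
LEXICON = {
    "沒有": "without",
    "沒有人": "nobody",
    "人": "human",
    "的": "'s",
    "干預": "intervention",
}


def segmentations(text):
    """Return every way to split text into words that appear in LEXICON."""
    if not text:
        return [[]]  # one way to segment the empty string: no words
    results = []
    for end in range(1, len(text) + 1):
        word = text[:end]
        if word in LEXICON:
            # Keep this word and segment the remainder recursively.
            for rest in segmentations(text[end:]):
                results.append([word] + rest)
    return results


for seg in segmentations("沒有人的干預"):
    print(" / ".join(seg), "->", " ".join(LEXICON[w] for w in seg))
```

Both splits are valid according to the lexicon, so a system needs some further criterion (context, word frequency, or grammar) to prefer the reading "without human intervention" over "nobody's intervention".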
Chinese (simplified)
- 机器翻译(MT) 是计算机程序分析文本用一种语言翻译- "源文本" - 并且然后试图生产另, 等效文本用其它语言- 目标文本的形式- 没有人的干预。
- The machine translation (MT) is the computer program analysis text - and then attempts with one language translation - "source text" to produce in addition, the equivalent text uses other language - goal texts the form - nobody intervention.
Dutch
- De automatische vertaling (MT) is een vorm van vertaling waar een computerprogramma de tekst in één taal - de "brontekst" - analyseert en dan probeert om een andere, gelijkwaardige tekst in een andere taal - de doeltekst - zonder menselijke interventie te produceren.
- The automatic translation (MT) is a form of translation where a computer programme analyses the text in one language - the "source text" - and then tries for another, equivalent text in another language - the aim text - without producing human intervention.
French
- La traduction automatique (la TA) est une forme de la traduction où un programme machine analyse le texte en une langue - le "texte source" - et puis essaye de produire des autres, texte équivalent en une autre langue - le texte cible - sans intervention humaine.
- Machine translation (MT) is a form of the translation where a program machine analyzes the text in a language - the "source text" - and then tries to produce others, equivalent text in another language - the target text - without human intervention.
German
- Maschinelle Übersetzung (M.Ü.) ist eine Form der Übersetzung, in der ein Computerprogramm den Text in einer Sprache analysiert - den "Ausgangstext" - und dann versucht, andere zu produzieren, gleichwertiger Text in einer anderen Sprache - der Zieltext - ohne menschliche Intervention.
- Machine translation (M.Ue.) a form of the translation, in which a computer program tries the text in a language analyzed - the "output text" - and then to produce others is equivalent text in another language - the target language text - without human intervention.
Italian
- La traduzione automatica (la TA) è una forma della traduzione dove un programma destinato all'elaboratore analizza il testo in una lingua - "il testo originale" - ed allora tenta di produrre un altro, il testo equivalente in altra lingua - il testo di arrivo - senza intervento umano.
- Automatic translation (the TA) is a shape of the translation where a destined program to the computer analyzes the text in a language - "the text originates them" - and then tries to produce an other, the equivalent text in other language - the arrival text - without human participation.
Japanese
- 機械翻訳(MT) は人間の介在のない…計算機プログラムが1 つの言語のテキスト- "原文" - を分析し、別のものを作り出すように試みる翻訳ターゲットテキスト- 別の言語の同等のテキストの形態である。
- As for machine translation (MT) there is no inclusion of the human, in order... for the computer program to analyze, the text of one language - "the original" - to create another ones the translation target text which is tried - it is form of the equal text of another language.
Korean
- 표적 원본 - 기계 번역 (MT)은 인간 내정간섭없이 - 곳에 컴퓨터 프로그램이 1개의 언어안에 원본 - "근원 원본" -을 분석하고 그때 또 다른 한개를 생성한것을 시도하는 번역, 다른 언어안에 동등한 원본의 모양 이다.
- The target original - computer translation (MT) human being domestic intervention without - in the place the computer program the original - "the origin original" - analyzes inside 1 language and it is a shape of the original which that time also the translation the fact that it creates one thing which is different, inside different language equal is.
Portuguese
- A tradução de máquina (TA) é um formulário da tradução onde um programa de computador analisa o texto em uma língua - de "o texto fonte" - e tenta então produzir outra, texto equivalente em uma outra língua - o texto de alvo - sem intervenção humana.
- The translation of machine (You) is a form of the translation where one computer program analyzes the text in a language - of "the text source" - and tries then to produce another one, text equivalent in one another language - the target text - without intervention human being.
Analysis
- Very bad guess:
- The acronym of "tradução de máquina" becomes "(TA)" and then becomes "You", the second-person plural pronoun, when it is translated back from Portuguese.
- Bad guess:
Russian
- Машинным переводом (M T) будет форма перевода где компьутерная программа анализирует текст в одном языке - "исходныйа текст" - и после этого пытает произвести другие, соответствующего текста в другом языке - текста цели - без людской интервенции.
- By machine transfer (M T) there wakes the form of transfer where the komp'uternaya program it is analyzed text in one language - "iskhodnyya text" - and after this it asks it produced others, the corresponding text in other language - text it aimed - without the human intervention.
Spanish
- La traducción automática (TA) es una forma de traducción donde un programa de computadora analiza el texto en una lengua - el "texto original" - y después procura producir otro texto equivalente en otra lengua - el texto de destino - sin intervención humana.
- The automatic translation (TA) is a translation form where a computer program analyzes the text in a language - the "original text" - and later tries to produce another one, equivalent text in another language - the target text - without the human intervention.
Swedish
- Maskinöversättningen (MT) är en bilda av översättningen, var ett dataprogram analyserar texten i ett språk - ”källtexten” - och därefter försök till jordbruksprodukter another, likvärdigt text i ett annat språk - måltexten - utan mänskligt ingripande.
- The mechanical translation (MT) is a to form of the translation, is a computer program analyzes the text in a language - “the source text” - and then experiments to agricultural products another, equivalent text in another language - the objective text - without human intervening.
General analysis
For all languages not written in the Latin script, the computer program did not generate a native acronym for "machine translation", and it also failed to include the original English term in full. For example, a better Japanese translation might be:
- 機械翻訳(machine translation; MT)は ... (Machine translation is ...)
Lost in translation
In each step below, a word (italicized in the original article) is grossly mistranslated in the following sentence:
- English to German:
Maschinelle Übersetzung (M.Ü.) ist eine Form der Übersetzung, in der ein Computerprogramm den Text in einer Sprache analysiert - den "Ausgangstext" - und dann versucht, andere zu produzieren, gleichwertiger Text in einer anderen Sprache - der Zieltext - ohne menschliche Intervention.
- German to French:
La traduction automatique (M.Ue.) est une forme de traduction, qui programme informatique cela texte langue analysé - cela "texte initial" - et alors essaye, autre produire, texte équivalent vers une autre langue - le texte cible - sans l'intervention humaine.
- French to Italian:
La traduzione automatica (M.Ue.) è una forma di traduzione, che programma informatica ciò testo lingua analizzato - ciò "testo iniziale" - ed allora prova, altro produrre, testo equivalente verso un'altra lingua - il testo determina - senza l'intervento umano.
- Italian to French:
La traduction automatique (M.Ue.) est une forme de traduction, que programme informatique ce teste langue analysée - ce "je teste initial" - et alors épreuve, autre produire, je teste équivalent vers une autre langue - je teste détermine - sans l'intervention humaine.
- French to Dutch:
De automatische vertaling (M.Ue.) is een vorm van vertaling, dat het informaticaprogramma dit geanalyseerde taal - dit "ik test eerste" test - en dan proef, ander produceren, ik test equivalent naar een andere taal - ik test bepaal - zonder de menselijke tussenkomst.
- Dutch to French:
La traduction automatique (M.Ue.) est une forme de traduction qui le programme d'informatique cette langue analysée - ceci "me teste premier" teste - produit et alors l'essai, autre, me teste équivalent vers une autre langue - je détermine le test - sans l'intervention humaine.
- French to Spanish:
La traducción automática (M.Ue.) es una forma de traducción que la programa de informática esta lengua analizada - esto "me prueba primero" prueba - produce y entonces la prueba, otro, me prueba equivalente hacia otra lengua - determino la prueba - sin la intervención humana.
- Spanish to French:
La traduction automatique (M.Ue.) est une forme de traduction qui le programme d'informatique cette langue analysée - ceci "se prouve d'abord" prouve - produit et alors l'essai, un autre, se prouve équivalent vers une autre langue - je détermine l'essai - sans l'intervention humaine.
- French to German:
Die maschinelle Übersetzung (M.Ue.), ist eine Art der Übersetzung, die es von Informatik programmiert diese analysierte Sprache - dies "beweist sich zuerst" beweist - produziert und dann der Versuch, ein anderer beweist sich in Richtung einer anderen Sprache - ich bestimme den Versuch - ohne die menschliche Intervention entsprechend.
- German to English:
The machine translation (M.Ue.), is a kind of the translation, it from computer science programs this analyzed language - this "proves itself first" proves - produced and then the attempt, another proves itself toward another language - I determine the attempt - without the human intervention accordingly.
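The chain above feeds each output into the next translation step, so a single wrong word choice is never corrected and keeps propagating. A minimal sketch of that procedure follows. The `translate` function and the `TOY_DICTS` word tables are hypothetical stand-ins for a real MT engine, and the Italian-to-French table deliberately picks the wrong sense of "testo", loosely mirroring the "texte" to "testo" to "je teste" drift seen in the chain.

```python
# Hypothetical word-for-word tables standing in for a real MT engine.
# The ("it", "fr") entry for "testo" is wrong on purpose.
TOY_DICTS = {
    ("en", "it"): {"the": "il", "source": "originale", "text": "testo"},
    ("it", "fr"): {"il": "le", "originale": "initial", "testo": "teste"},
    ("fr", "en"): {"le": "the", "initial": "initial", "teste": "test"},
}


def translate(text, src, tgt):
    """Toy translation: replace each word using the (src, tgt) table."""
    table = TOY_DICTS[(src, tgt)]
    return " ".join(table.get(word, word) for word in text.split())


def round_trip(text, chain):
    """Translate text along chain (a list of language codes), recording each step."""
    history = [(chain[0], text)]
    for src, tgt in zip(chain, chain[1:]):
        text = translate(text, src, tgt)
        history.append((tgt, text))
    return history


for lang, sentence in round_trip("the source text", ["en", "it", "fr", "en"]):
    print(lang, ":", sentence)
```

Comparing the first and last entries of the returned history gives a quick, informal view of how much meaning was lost along the way.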
See also
- Translation
- Linguistics
- Universal Networking Language
- Artificial Intelligence
- Eurotra
- Distributed Language Translation
- Parallel text alignment
- Computer-assisted translation
Free (open source) software
External links
- Wikimedia Machine Translation Project (http://meta.wikipedia.org/wiki/Wikipedia_Machine_Translation_Project)
- Machine Translation (http://www.essex.ac.uk/linguistics/clmt/MTbook/), an introductory guide to MT by D.J.Arnold et al. (1994)
- European Association for Machine Translation: EAMT, non-profit org
- Talking to Strangers (http://www.wired.com/wired/archive/8.05/translation.html), an article about MT from "Wired"; MT Past and Future (timeline) (http://www.wired.com/wired/archive/8.05/timeline.html)
- Free-to-use machine translation on the web
- http://www.translatorsbase.com/ (Free human translation service)
- http://www.google.com/language_tools (uses Systran software)
- http://www.freetranslation.com/
- http://www.tranexp.com:2000/InterTran?from=fre
- http://www.systransoft.com/
- http://www.systranet.com (the Systran site)
- http://ez2find.com/channel/translate.php (uses Systran software)
- http://babelfish.altavista.com/ (uses Systran software)
- http://www.babylon.com/
- http://www.reverso.net/textonly/default_ie.asp
- http://www.worldlingo.com/en/microsoft/computer_translation.html
- Translation Service (http://www.alphaworks.ibm.com/aw.nsf/html/mt) online machine translation from AlphaWorks at IBM
- http://www.translatum.gr/dics/mt.htm
- http://translate.lycos.com
- http://www.foreignword.com (provides access to various computer-assisted translation tools)
- http://www.translationwave.com (Translation Wave software from Globe Tech Ventures Corporation)
- http://www.word2word.com/free.html (gives access to various machine translation engines)
- http://anglahindi.iitk.ac.in/ (An English to Hindi Machine Aided Translation System: an ongoing project at IIT Kanpur, India)
- http://www2.tranexp.com:2000/ (supports a lot of languages)
- http://webtrance.skycode.com/ (machine translation from English to Bulgarian)
- http://www.sprawk.com/ (machine translation and searchable dictionary for many languages)
- http://www.ParsTranslator.Net/ (English to Persian (Farsi)Machine Translation Software) Pars translator
- Sounds like Faulkner (http://reverent.org/sounds_like_faulkner.html)