Parallel text alignment
A parallel text is a text in one language together with its translation in another language. Parallel text alignment is the identification of the corresponding sentences in the two halves of a parallel text.
Large collections of parallel texts are called parallel corpora (see corpus). Sentence-level alignment of parallel corpora is a prerequisite for many areas of linguistic research. During translation, sentences can be split, merged, deleted, inserted or reordered, so the sentences of the two texts do not simply correspond one to one in order. This makes alignment a non-trivial task.
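To illustrate why splits, merges, deletions and insertions complicate alignment, the following is a minimal sketch of a length-based aligner using dynamic programming. It is not a published algorithm: it assumes that a sentence and its translation have roughly similar character lengths, and the names `BEADS`, `bead_cost` and `align`, as well as the penalty values, are hypothetical choices made for illustration. The sketch only considers alignments that preserve sentence order, so it does not handle reordering.

```python
# Allowed alignment units ("beads") as (source sentences, target sentences),
# each with a fixed penalty. The penalty values are illustrative assumptions.
BEADS = {
    (1, 1): 0.0,   # one sentence translated as one sentence
    (1, 0): 10.0,  # source sentence deleted in translation
    (0, 1): 10.0,  # target sentence inserted in translation
    (2, 1): 2.0,   # two source sentences merged into one
    (1, 2): 2.0,   # one source sentence split into two
}


def bead_cost(src_chunk, tgt_chunk, penalty):
    """Cost of pairing two groups of sentences: length mismatch plus penalty."""
    src_len = sum(len(s) for s in src_chunk)
    tgt_len = sum(len(t) for t in tgt_chunk)
    return abs(src_len - tgt_len) + penalty


def align(src, tgt):
    """Return the cheapest order-preserving alignment as a list of beads."""
    n, m = len(src), len(tgt)
    inf = float("inf")
    # cost[i][j] is the cheapest way to align src[:i] with tgt[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    back = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if cost[i][j] == inf:
                continue
            for (di, dj), penalty in BEADS.items():
                ni, nj = i + di, j + dj
                if ni > n or nj > m:
                    continue
                c = cost[i][j] + bead_cost(src[i:ni], tgt[j:nj], penalty)
                if c < cost[ni][nj]:
                    cost[ni][nj] = c
                    back[ni][nj] = (i, j)
    # Follow the back-pointers from the end to recover the chosen beads.
    beads = []
    i, j = n, m
    while (i, j) != (0, 0):
        pi, pj = back[i][j]
        beads.append((src[pi:i], tgt[pj:j]))
        i, j = pi, pj
    beads.reverse()
    return beads
```

Calling `align(source_sentences, target_sentences)` with two lists of strings returns a list of pairs, each holding the source sentences and the target sentences judged to correspond. Practical aligners typically replace the raw length difference with a statistical or lexical similarity measure.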
External links
- Parallel text processing bibliography by J. Veronis and M.-D. Mahimon (http://www.up.univ-mrs.fr/~veronis/biblios/ptp.htm)
- The Opus project, which collects freely available parallel corpora (http://logos.uio.no/opus/)