Bi-directional text
Some writing systems, such as those used for Persian (Farsi), Hebrew, and Arabic, are written from right to left (RTL). When left-to-right (LTR) text, such as text in the Latin script, is mixed with them in the same sentence, each run of text should be written in its own direction. This is known as bi-directional text. It can become quite complex when multiple levels of quotation are nested.
Many computer programs fail to display bi-directional text correctly. For example, the Hebrew name Sarah (שרה) should be spelled shin (ש) resh (ר) heh (ה) from right to left. Some Web browsers may display the Hebrew text in this article in the opposite direction.
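As an illustration of the spelling described above, the following minimal Python sketch lists the code points of the Hebrew name in the order they are stored, which is the order in which they are read. The helper name `describe` is hypothetical and chosen only for this example; it relies on the standard `unicodedata` module.

```python
import unicodedata

def describe(text):
    """Return one 'U+XXXX NAME' string per character, in stored order."""
    return [f"U+{ord(ch):04X} {unicodedata.name(ch)}" for ch in text]

# The Hebrew name Sarah: shin, resh, he (written with escapes so the
# source code itself is not affected by how an editor renders RTL text).
sarah = "\u05e9\u05e8\u05d4"

for line in describe(sarah):
    print(line)
```

The first stored character is shin even though a correct display places it at the right-hand end of the word.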
A few writing systems could be written in either direction. Egyptian hieroglyphs are one example: the signs had a distinct "head" that faced the beginning of a line and a "tail" that faced the end.
Some ancient Greek inscriptions, as well as Tuareg and Hungarian runes, were written in opposite directions on alternate lines, a style called boustrophedon.
Bidirectional script support is the capability of a computer system to correctly display bi-directional text. The term is often shortened to the jargon term BiDi.
Early computer systems were designed to support only a single writing system, typically a left-to-right script based on the Latin alphabet. Adding new character sets and character encodings allowed a number of other left-to-right scripts to be supported, but did not easily accommodate right-to-left scripts such as Persian (Farsi), Arabic, or Hebrew, and mixing the two directions was not practical. It is possible to simply flip the display order from left-to-right to right-to-left, but doing so sacrifices the ability to display left-to-right scripts correctly. With bidirectional script support, scripts with different writing directions can be mixed on the same page.
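The cost of simply flipping the display order can be seen in a small Python sketch. The function `naive_flip` is a hypothetical stand-in for a display routine that reverses every line; it is not how any particular system worked.

```python
def naive_flip(line):
    """Reverse the whole line, as a display with a flipped order would."""
    return line[::-1]

# A Latin word followed by the Hebrew name Sarah (shin, resh, he).
mixed = "abc \u05e9\u05e8\u05d4"
flipped = naive_flip(mixed)

# The Latin run now reads "cba": flipping everything breaks LTR text.
print(flipped.endswith("cba"))
```

Reversing the whole line treats every character as right-to-left, so the Latin word comes out backwards. Handling each run in its own direction is what bidirectional support adds.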
In particular, Unicode provides complete BiDi support, with detailed rules for how mixtures of left-to-right and right-to-left text are to be encoded and displayed. In Unicode, all characters are stored in writing (logical) order, and software works out in which direction the script should be displayed on the page or screen.
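Software can work out direction from the text because each Unicode character carries a bidirectional class. The following Python sketch reads those classes with the standard `unicodedata.bidirectional` function and guesses a paragraph's base direction from its first strongly directional character. This is a simplified assumption modelled on the paragraph-level rules of the Unicode Bidirectional Algorithm: it ignores isolates and explicit embedding controls and is not a full implementation. The function names `bidi_classes` and `base_direction` are hypothetical.

```python
import unicodedata

def bidi_classes(text):
    """Return the bidirectional class of each character, e.g. 'L', 'R', 'WS'."""
    return [unicodedata.bidirectional(ch) for ch in text]

def base_direction(text):
    """Guess the paragraph direction from the first strong character.

    'L' is strong left-to-right; 'R' (e.g. Hebrew) and 'AL' (Arabic letters)
    are strong right-to-left. Falls back to 'ltr' if no strong character
    is found.
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return "ltr"
        if cls in ("R", "AL"):
            return "rtl"
    return "ltr"

# Hebrew name followed by a Latin word, stored in logical order.
sample = "\u05e9\u05e8\u05d4 abc"
print(bidi_classes(sample))
print(base_direction(sample))
```

A full implementation, such as the libraries listed under External links, goes on to resolve neutral characters such as spaces and punctuation and to reorder each run for display.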
See also: Internationalization and localization
External links
- Unicode Standards Annex #9 (http://www.unicode.org/reports/tr9/) The Bidirectional Algorithm
- GNU FriBiDi (http://fribidi.org/) An implementation of the bidirectional algorithm
- ICU (http://icu.sourceforge.net/) International Components for Unicode, which contains an implementation of the bidirectional algorithm along with other internationalization services
- UCData: "Pretty Good Bidi Algorithm Library" (http://crl.nmsu.edu/~mleisher/ucdata.html)
- BiDi for the Mozilla Web Browser (http://www.langbox.com/bidimozilla/)
- Opera Web Browser and Bidirectional Languages (http://nontroppo.org/wiki/OperaAndBiDiLanguages)
- Another Wiki about BiDi (http://mac.plonter.co.il/plonwiki/BidiWiki)
- Right to Flash Petition for Macromedia's Flash MX technology (http://www.the-right-to-flash.com/about.php)