Mojibake
Mojibake (文字化け, from moji "character" + bake "change"; literally "changed characters" or "ghost characters") is Japanese for broken characters: the garbled output produced when software tries to display text in a character encoding it is not configured to handle. This often happens because the script is "foreign" to the makers of the software, but the problem can also arise between different encodings of the same language, such as EUC-JP and Shift-JIS, both of which encode Japanese characters.
In the mid-1990s, as this problem became common, several websites featured mojibake not as a problem to be tackled but as a computer joke. Words and even whole sentences were "deciphered", with meanings made up to deliver funny messages. It was even joked that this must be the work of extraterrestrials or ghosts trying to deliver secret messages.
In Chinese, this phenomenon is called "luanma" (simplified: 乱码; traditional: 亂碼; pinyin: luànmǎ).
Example: "文字化け" might be displayed as "•¶Žš‰»‚¯", which is what its Shift-JIS bytes look like when they are read as Windows-1252 (depending on the software used to view this article, the example itself may not show up correctly).
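The example can be reproduced with a short Python sketch. The helper name `misdecode` is hypothetical, and the choice of Windows-1252 (`cp1252`) as the wrong encoding is an assumption made because it yields the garbled string shown in the example; other mismatched pairs give different garbage.

```python
def misdecode(text: str, written_as: str, read_as: str) -> str:
    """Encode text with one codec, then decode the same bytes with another.

    Hypothetical helper for illustration only.
    """
    raw = text.encode(written_as)
    # errors="replace" substitutes U+FFFD for bytes the wrong codec cannot map
    return raw.decode(read_as, errors="replace")


original = "文字化け"

# Shift-JIS bytes (95 B6 8E 9A 89 BB 82 AF) read as Windows-1252
garbled = misdecode(original, "shift_jis", "cp1252")
print(garbled)  # •¶Žš‰»‚¯

# In this case no byte was lost, so reversing the two steps restores the text
restored = garbled.encode("cp1252").decode("shift_jis")
print(restored == original)  # True
```

The round trip works here only because every byte happens to map to a Windows-1252 character. When the wrong codec has no mapping for a byte, the replacement character takes its place and the original text cannot be recovered from the display alone.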
Problems in other languages
This problem is not unique to Asian users; Central and Eastern European computer users tend to be affected as well. Because most computers were not connected to any network during the mid- to late 1980s, different character encodings came into use for every language with diacritical characters.
During the 1990s, Russian computer users had to endure several competing encodings for the Cyrillic alphabet (KOI8-R on Unix, CP-1251 on Windows, CP866 on DOS, the standard ISO 8859-5, and several others). Badly configured servers and a lack of compatibility made garbled text a common and frustrating experience. Russian users, put off by the strange and unusual characters appearing in place of familiar Cyrillic letters, called them кракозябры (krakozyabry). Many e-mail servers stripped the 8th bit from each character, as earlier standards permitted, which makes a complete hash of UTF-8 as well as of all the encodings above. For this reason many Cyrillic users used to resort to Roman transliteration. An even more frustrating problem emerged in the early 2000s, when the popular e-mail client Outlook started to replace all entered Cyrillic characters with question marks when replying to or forwarding a message created in another code page.
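Both failure modes described above can be sketched in a few lines of Python. The sample word and the helper name `strip_eighth_bit` are assumptions chosen for illustration; the codec names are the ones Python's standard library uses for these encodings.

```python
word = "Привет"  # sample text, chosen only for illustration

# 1. Competing encodings: text written as KOI8-R but read as Windows CP-1251
koi8_bytes = word.encode("koi8_r")
misread = koi8_bytes.decode("cp1251")
print(misread)  # рТЙЧЕФ


def strip_eighth_bit(data: bytes) -> bytes:
    """Clear the high bit of every byte, as a 7-bit mail relay would."""
    return bytes(b & 0x7F for b in data)


# 2. A 7-bit relay: CP-1251 bytes collapse into unrelated ASCII letters
stripped_cp1251 = strip_eighth_bit(word.encode("cp1251"))
print(stripped_cp1251.decode("ascii"))  # Ophber

# UTF-8 fares no better: lead and continuation bytes all lose their high bit,
# so the result is 7-bit data that includes control characters
stripped_utf8 = strip_eighth_bit(word.encode("utf-8"))
print(stripped_utf8)
```

The first case is reversible if the reader knows which two encodings were involved. The second is not in general, because clearing the high bit maps two different byte values onto the same 7-bit value.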
In Poland, every company selling early DOS computers "invented" its own encoding and reprogrammed the EPROMs of CGA, EGA and Hercules video cards with character shapes for that encoding. In addition, users of the then-popular home computers (such as the Amiga and Atari ST) "invented" their own encodings, incompatible with the international standard (ISO 8859-2), with vendor standards (IBM CP852, Windows CP1250) and with the locally agreed PC/MS-DOS standard (Mazovia). The situation began to improve when, thanks to pressure from academic and user groups, ISO 8859-2 succeeded as the "internet standard", with limited support from the dominant vendor's software. Because of the numerous problems with all those encodings, even today some users tend to refer to Polish diacritical characters as krzaki (bushes).
External links
- What Is Mojibake? (http://www.debian.or.jp/~kubota/mojibake/) - from a Debian internationalization developer.