漢字 in Traditional Chinese and other languages.

Chinese characters or Han characters (汉字/漢字) are logograms used in the written forms of the Chinese language, and to varying degrees in the Japanese and Korean languages (though the latter only in South Korea). Use of Chinese characters has disappeared from the Vietnamese language — in which they were used until the 20th century — and from North Korea, where they have been completely replaced by Hangul.

Chinese characters are called hànzì in Mandarin Chinese, kanji in Japanese, hanja or hanmun in Korean, and hán tự (also used in the chữ nôm script) in Vietnamese. However, the last is considered an extremely sinicized form, and Chinese characters are normally called chữ nho (字儒). (Note that the morphemes are reversed, as is common in Vietnamese borrowings from Chinese.)

In Chinese, a word or phrase (词/詞 cí), a unit of meaning, is composed of one or more characters (字 zì); for instance, the phrase 汉字/漢字 hànzì is composed of two characters. Each Chinese character is read as a single syllable in all varieties of spoken Chinese existing today; in Japanese, however, a kanji can be multisyllabic when it is read in the kun'yomi. Notably, unlike any of the modern Chinese dialects, Archaic Chinese had consonant clusters and lacked tones; for example, 角 (jiǎo) is reconstructed as klak in Archaic Chinese.

Japanese, Korean, and Vietnamese are not linguistically related to Chinese, and in order to make Chinese characters work in those languages with radically different grammar, many adaptations had to be made. In many cases in these languages, characters different from those used in Chinese are used for words or ideas of the same meaning. Also, many similar characters with identical meanings are written with slight differences. One example is black, which is written as 黒 (kuro and koku) in Japanese, but as 黑 (hēi) in Chinese. In the twentieth century, thousands of simplified characters were created or adopted in mainland China, creating a distinction between, for example, 汉 in simplified characters used in mainland China and Singapore, and 漢 in traditional characters used in Taiwan and Hong Kong.

For these reasons, particularly in China and Japan, where Chinese characters are used most often, it is frequently necessary to distinguish between Chinese Han characters and Japanese Han characters. In English, the distinction can often be made well enough by using the respective words hanzi and kanji.

Just as Roman letters have a characteristic shape (lower-case letters occupying a roundish area, with ascenders or descenders on some letters), Chinese characters tend to occupy a more-or-less square area. Characters made up of multiple parts squash these parts together in order to maintain a uniform size and shape. Because of this, beginners often practise on squared or graph paper, and the Chinese sometimes call Han characters 方块字 (fāngkuàizì), "square characters".



The oldest Chinese inscriptions that are clearly writing are the poorly understood Oracle Script (甲骨文 jiǎgǔwén) of the late Shang Dynasty (or Yin (殷) Dynasty), attested from about 1200 BC. There have been suggestions that this was not designed for the Chinese language, or even for a Sino-Tibetan language, because it does not seem to reflect Chinese morphology accurately. An analogy would be if English were written with a script that had a single character for die and kill, but two separate characters for warm in "it's a warm day" and "please warm the bath".

Although the succeeding Zhou Dynasty was Han Chinese, it is not clear which ethnic group the Shang belonged to. One possibility is Miao (苗 Miáo). The first recorded Miao kingdom was Jiuli. The ancestors of the Jiuli are thought to be the Liangzhu people, and it is these who are credited with creating the Oracle Script. According to Chinese legend, Jiuli was defeated by the combined forces of Huang Di (黃帝 Huángdì) and Yandi, leaders of the Huaxia (華夏 Huáxià) tribe (the ancestors of the Han Chinese), as they struggled for supremacy over the Huang He valley. After their defeat, the Jiuli people who were not absorbed into the new Zhou state moved south, splitting into the Miao and the Li (黎 Lí) peoples.

The Yi script is quite old and is superficially similar to Chinese, but does not seem to be derived from it. It was perhaps inspired by the example of Chinese, but the possibility cannot be discounted that it and the Chinese script both descend from a common source.



The earliest Chinese characters are the so-called Oracle Script of the late Shang Dynasty, followed by the Bronzeware Script (金文 jīnwén) of the Zhou Dynasty. These scripts no longer serve as anything but a source for scholars.

The first script that is still in (restricted) use today is the "Seal Script" or 篆書[篆书] zhuànshū. It is the result of the efforts of the first emperor of China, Qin Shi Huang, in the standardization of the Chinese script. The Seal Script, as the name suggests, is now only used in artistic seals. Few people are still able to read the seal script, although the art of carving a traditional seal in the seal script remains alive in China today.

Scripts that are still used regularly for print are the "Clerk Script" or 隸書[隶书] lìshū, the "Wei Monumental" or 魏碑 wèibēi, the "Regular Script" or 楷書[楷书] kǎishū, the "Song Style" or 宋體[宋体] sòngtǐ (mainly used in printing and computer fonts), and the "Running Script" or 行書[行书] xíngshū. Modern Chinese handwriting is usually modeled on the Running Script.

Finally, there is the "Draft Script" (also called "Grass Script"), or 草書[草书] cǎoshū. The draft script is an idealized calligraphic style, where characters are suggested rather than realized. Despite being cursive to the point where individual strokes are no longer differentiable, the draft script is highly revered for the beauty and freedom that it embodies. Many simplified Chinese characters are based on this style.


漢字 in bronzeware script, ca. 800 BC

Main article: radical

Each character has a fundamental component, or radical (部首 Chinese: bùshǒu, Japanese: bushu, literally "initial portion"), and this design principle is used in Chinese dictionaries to logically order characters in sets.

Full characters are ordered according to their radical; the radicals fall into roughly 200 types. Within each radical, characters are then subcategorised by their total number of strokes.


See also: Chinese character classification

Chinese scholars classify Han characters in several groups. The first type, and the type most often associated with Chinese writing, are pictograms, which are pictorial representations of the morpheme represented. There are also ideograms that attempt to represent abstract concepts graphically, such as "up" (上) and "down" (下). However, pictograms and ideograms make up only a small proportion of Chinese logograms.

Excerpt from a 1436 primer on Chinese characters

Most Chinese characters, however, are radical-radical compounds, in which each element (radical) of the character hints at the meaning, and radical-phonetic compounds, in which one component (the radical) indicates the kind of concept the character describes, and the other hints at the pronunciation. This last type accounts for the majority of Chinese logograms. Note that despite being called "compounds", these logograms are single entities in themselves; they are written so that they take up the same amount of space as any other logogram.

Note that due to the long period of language evolution, such component "hints" within characters are often useless and sometimes quite misleading in modern usage. This is particularly true in non-Chinese languages.

Classification has its own problems, as the origins of characters are often obscure. For example, the character for "East" (東; Chinese: dōng, Japanese: higashi and tō), which combines the "tree" radical (木) and the "sun" radical (日), is usually considered a radical-radical compound. Though it appears to represent a sun rising through trees, and this is both an evocative image and a useful mnemonic, the origin and classification of the character are disputed among scholars. While some agree with the radical-radical classification, others see it as a unique character in and of itself; some claim that it is derived from an early pictograph of bundled sticks.

As another example, the character for "mother" (媽 mā in Chinese) consists of one component meaning "female" (女) and another one meaning "horse" (馬 mǎ). The first component denotes a female entity, whereas the second suggests the pronunciation by referring to the word for "horse". The reason that "horse" was chosen to represent mother may be that horses, in a historical context, were often used to represent "steadfastness". The majority of Chinese characters, like this example, have one component that suggests the meaning and another that suggests pronunciation. In many cases, even the component intended to suggest pronunciation has an abstract semantic relation to the idea expressed by the character. This is possible because the phonetic system of Chinese allows for many words to have the same pronunciation (homonymy), and because the consideration of phonetic similarity used in a character generally ignores its tone and the manner of articulation of its initial consonant (but not the place of articulation).


Chinese characters all take up the same amount of space. One of the easiest ways for beginners to ensure this is to use a grid as guidance. In addition to this strictness about the amount of space a character takes up, Chinese characters are written according to very precise rules. The three most important concern the strokes employed, the placement of the strokes, and the order in which they are written (see Stroke order). Most characters can be written with just one stroke order, though some characters also have variant stroke orders, which may result in different stroke counts. On a larger scale, Chinese text is traditionally written from top to bottom and then right to left, but it is more common today to see the same orientation as Western languages: going from left to right and then top to bottom. Most punctuation was adopted from Western usage, but there are a few exceptions: for example, names of books are marked with a wavy line drawn to their right in vertical text, or enclosed in a special double pointed bracket in horizontal text.

Common errors in writing Chinese characters include incorrect stroke direction, incorrect stroke order, incorrect stroke length relative to other strokes, and incorrect placement of strokes relative to other strokes. Each mistake is highly visible to the literate eye, owing to the imperfections of the human hand and to the weight given to the different parts of a stroke. Mistakes are frowned upon, as they are taken as marks of illiteracy or incompetence. In a culture that values scholarship as its highest virtue, such attributions are highly undesirable. Because of this strictness regarding not only the image of the character but also how the image is produced, the Chinese script is considered by many to be the most difficult to learn properly.

Due to the long history of China, the many stylistic variations that have developed, and the many attempts by past rulers to standardize writing, some characters have multiple forms. These forms can be considered separate characters, but often they are merely derivatives of one another, sharing the same root composition. They are often not considered simplifications, as their stroke count is sometimes the same and often reduced by only a slight amount. The most famous example today is probably the character for sword (劍), where the radical (on the right) is knife (刀). The same word can be written with different forms of the radical, including 刃 or 刀 itself.

The usage of traditional characters versus simplified characters varies greatly, and can depend on both local customs and the medium. Often, simplified characters would be used in everyday writing or quick scribblings, while traditional characters would be used in printed works. However, the PRC's adoption of simplified characters has almost completely displaced their traditional counterparts within its territory, except in Hong Kong and Macau. There is no absolute rule for using either system, and the choice is often determined by what the target audience understands, as well as by the upbringing of the writer. In addition, there is a special system of characters used for writing numerals in financial contexts; these characters are deliberately chosen to be complicated, to prevent forgeries or alterations.


The design and use of a dictionary of Chinese characters presents interesting problems. Dozens of indexing schemes have been created for the Chinese characters. The great majority of these schemes — beloved by their inventors but nobody else — have appeared in only a single dictionary; only one such system has achieved truly widespread use. This is the system of radicals.

Chinese character dictionaries often allow users to locate entries in several different ways. Many Chinese, Japanese, and Korean dictionaries of Chinese characters list characters in radical order: characters are grouped together by radical, and radicals containing fewer strokes come before radicals containing more strokes. Under each radical, characters are listed by their total number of strokes. In Japanese and Korean dictionaries, it is usually possible to search for characters by sound, using Kana and Hangul. Most dictionaries also allow searches by total number of strokes, and individual dictionaries often allow other search methods as well.

For instance, to look up the character 松 (pine tree) in a typical dictionary, the user first determines which part of the character is the radical, then counts the number of strokes in the radical (in this case four), and turns to the radical index (usually located on the inside front or back cover of the dictionary). Under the number 4, the user locates the radical 木, then turns to the page number listed, which is the start of the listing of all the characters containing this radical. This page will have a sub-index giving stroke numbers and page numbers. The right half of the character also contains four strokes, so the user locates the number 4, and turns to the page number given. From there, the user must scan the entries to locate the character he or she is seeking. Some dictionaries have a sub-index which lists every character containing each radical, so that if the user knows the number of strokes in the non-radical portion of the character, he or she can locate the correct page number directly.
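
The radical-and-stroke lookup described above can be sketched in Python. This is only a minimal illustration, not how any particular dictionary is compiled: the `RADICAL_STROKES` and `ENTRIES` tables and the function names are hypothetical, and only a handful of characters are included.

```python
# Hypothetical miniature radical index: radical -> stroke count of the radical.
RADICAL_STROKES = {"木": 4, "女": 3}

# Hypothetical entries: character -> (radical, strokes in the non-radical part).
ENTRIES = {
    "松": ("木", 4),
    "林": ("木", 4),
    "村": ("木", 3),
    "媽": ("女", 10),
    "好": ("女", 3),
}


def sort_key(char):
    """Order by radical stroke count, then radical, then residual strokes."""
    radical, residual = ENTRIES[char]
    return (RADICAL_STROKES[radical], radical, residual, char)


def candidates(radical, residual_strokes):
    """Return the characters a reader must scan for a given radical and count."""
    return sorted(
        c for c, (r, n) in ENTRIES.items()
        if r == radical and n == residual_strokes
    )


# The order in which a radical-ordered dictionary would list these entries.
ordered = sorted(ENTRIES, key=sort_key)

# Looking up 松: radical 木 (four strokes), four strokes in the remainder.
print(candidates("木", 4))
```

As in a printed dictionary, the lookup narrows the search to a short list sharing the same radical and residual stroke count; the reader still has to scan that list for the character itself.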

In Korean, character dictionaries are usually called Okpyeon (옥편; 玉篇), which literally means "Jewel Book", rather like the Latin word thesaurus ("treasure"). 玉篇 is also the name of a sixth-century Chinese dictionary from the Liang Dynasty.

Another popular dictionary system is the four corner method.

Most Chinese-English dictionaries and Chinese dictionaries sold to English speakers use the radical lookup method combined with an alphabetical listing of characters based on their pinyin romanization system. To use one of these dictionaries, the reader finds the radical and stroke number of the character, as before, and locates the character in the radical index. The character's entry will have the character's pronunciation in pinyin written down; the reader then turns to the main dictionary section and looks up the pinyin spelling alphabetically, just as if it were an English dictionary.

Derivatives of Han characters

Besides Korean and Japanese, a number of Asian languages have historically been written with Han characters, or with characters modified from Han characters. They include:

In addition, the Yi script is similar to Han, but is not known to be directly related to it.

The Jurchen language used an ideographic script consisting of original characters with a few Han borrowings.

Number of Chinese characters

The question of how many characters there are is still the subject of debate. In the 18th century, European scholars claimed the total tally to be about 80,000. This number, however, is thought to be exaggerated, as the character count varies with each dictionary and its comprehensiveness. For example, the Kangxi Dictionary lists about 47,000 characters, while the modern Zhonghua Zihai lists in excess of 80,000. One reason for the overwhelming number of characters is the existence of rarely occurring variant and obscure characters (many of which are unused, even in Classical Chinese). Note, however, that no two characters are ever contextually identical.

The large number of Chinese characters is due to their logographic nature: for every morpheme there must be a symbol, and sometimes variant characters have developed for the same morpheme. It has also been claimed that the sheer number of characters serves as a way to separate scholars from ordinary people, and perhaps even to keep certain texts from being read by any but the most scholarly.


It is usually said that about 3,000 characters are needed for basic literacy in Chinese (for example, to read a Chinese newspaper), and a well-educated person will know well in excess of 4,000 to 5,000 characters. Note that it is not necessary to know a character for every known word of Chinese, as the majority of modern Chinese words are compounds made of two or more morphemes, and are thus written not with a single unique character, but with multiple, usually common, characters. There are 6,763 Chinese characters in GB 2312, an early version of the national standard used in the People's Republic of China. GB 18030 has a much higher number. The Hanyu Shuiping Kaoshi proficiency test covers approximately 5,000 hanzi.
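
The figure for the early PRC national standard can be checked with a short Python sketch. It assumes that Python's built-in `gb2312` codec follows the published table and that the Chinese characters occupy rows 16 to 87 of the standard's 94 by 94 grid; the function name is hypothetical. If those assumptions hold, the printed count should match the 6,763 quoted above.

```python
def count_gb2312_hanzi():
    """Count two-byte codes in rows 16-87 that decode to a character."""
    count = 0
    for row in range(16, 88):        # rows assumed to hold Chinese characters
        for cell in range(1, 95):    # 94 cells per row
            raw = bytes([0xA0 + row, 0xA0 + cell])  # EUC-CN byte form
            try:
                raw.decode("gb2312")
            except UnicodeDecodeError:
                continue             # unassigned position in the table
            count += 1
    return count


print(count_gb2312_hanzi())
```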

There are 4,808 characters in the Taiwanese Ministry of Education's list of regularly used Chinese characters (常用國字標準字體表). The Chinese Standard Interchange Code (CNS 11643), the official national standard, supports 48,027 characters, while the most widely used encoding scheme, Big5, supports only 13,053.

In addition, there are a large number of dialect characters which are not used in formal Chinese written language, but are used to represent colloquial terms in non-Mandarin Chinese spoken forms.


In Japanese there are 1945 "daily use kanji" (常用漢字 jōyō kanji) designated by the Japanese Ministry of Education. These are taught during primary and secondary school. Publications which include characters which fall outside this list should print furigana or rubi over the characters as a phonetic guide.

Before September 27, 2004, there were also 2232 government-designated "name kanji" (jinmeiyō kanji 人名用漢字) used in personal and geographical names, with plans to increase this list by 578 kanji in the near future. This would be the largest increase since World War II. The plan has not been without controversy, however. For example, the Chinese characters for "cancer", "hemorrhoids", "corpse" and "excrement", as well as parts of compound words (words created from two or more Chinese characters) meaning "curse", "prostitute", and "rape", are among the proposed additions to the list. This is because no measures were taken to determine the appropriateness of the kanji proposed, with the committee deciding that parents could make such decisions themselves. However, the government will seek input from the public before approving the list. For further information, see the Names section of the main Kanji article. (There is also some speculation that the "odd" kanji are being added to the names list in an attempt to bring about a de facto expansion of the Jōyō Kanji list, rather than with the serious idea that anyone will use them in names. The idea of reducing the number of kanji in use has been a politically contentious issue, with many conservatives believing that kanji are culturally Japanese and that people should use them frequently.)

A well-educated Japanese person may know upwards of 3500 kanji. The Kanji kentei (日本漢字能力検定試験 Nihon kanji nōryoku kentei shiken or Test of Japanese Kanji Aptitude) tests the ability to read and write kanji. The highest level of the Kanji kentei tests the ability to read and write 6000 kanji, though in practice few people attain this level as Japanese generally uses fewer Chinese characters than Chinese does, and literacy in Japanese requires knowledge of fewer Chinese characters than literacy in Chinese.


In South Korea, middle and high school students learn 1,800 to 2,000 basic characters (Hanja), but most people use Hangul exclusively in their day-to-day lives. Chinese characters are still used to some extent, particularly in newspapers, weddings, place names and calligraphy.


Although Chinese characters are now nearly extinct in Vietnamese, various scripts based on them were once used to write the language; from the 19th century their use became limited to ceremonial purposes. As in Japan and Korea, Chinese was used by the ruling classes, and the characters were eventually adopted to write Vietnamese. To express native Vietnamese words whose pronunciations differed from the Chinese, the Vietnamese developed the chữ nôm script, which added diacritical marks to distinguish native (Vietnamese) words from Chinese.

Rare and complex characters

Zhé, "verbose"
Nàng, "unclear pronouncing due to snuffle"

Often a character which is not commonly used (a "rare" or "variant" character) will appear in a Chinese, Japanese, or Korean personal or place name (see Chinese name, Japanese name, and Korean name respectively). This has caused problems, as many computer encoding systems include only the 5,000 or so most common characters and exclude those used less often. This is especially a problem for personal names, which often contain rare or classical characters.

People who have run into this problem include Taiwanese politicians Wang Chien-shien (王建煊) and Yu Shyi-kun (游錫堃) and Taiwanese singer David Tao (陶喆). Newspapers have dealt with this problem in varying ways, including trying to create a character from two characters, including a picture, or, especially as is the case with Yu Shyi-kun, simply omitting the rare character with the hope that the reader will be able to infer who it refers to. Japanese newspapers may render such names and words in katakana instead of kanji, and it is common practice for people to write names for which they are unsure of the correct kanji in katakana instead.
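
Whether a given character is available in a particular encoding can be tested directly. The sketch below is an illustration only: it relies on the codecs bundled with Python (`gb2312`, `big5`, `gb18030`, `utf-8`), whose repertoires may differ in detail from the tables used by other software, and the helper name `can_encode` is hypothetical.

```python
def can_encode(text, encoding):
    """Return True if every character in text exists in the given encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


# Compare a traditional and a simplified form across several encodings.
for name in ("gb2312", "big5", "gb18030", "utf-8"):
    print(name, can_encode("漢", name), can_encode("汉", name))
```

A program that receives a name containing a character outside its encoding has to fall back on some substitute, which corresponds to the workarounds adopted by newspapers described above.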

There are also some extremely complex characters which have understandably become rather rare. According to Bellassen (1989), the most complex Chinese character is 𪚥 zhé, meaning "verbose" and boasting sixty-four strokes, although it fell from use around the fifth century AD.

The honour of being the most complex character in use now goes to 齉 nàng, meaning "unclear pronouncing due to snuffle", with "just" thirty-six strokes.

In contrast, the simplest character is 一 yī, "one", with just one stroke. The commonest character is 的 de, "of", with eight strokes. The average number of strokes in a character is 9.8.

