Canonical decomposition
In Unicode, canonical decomposition is the process of converting composite (precomposed) characters into canonically ordered sequences of simpler characters, using the canonical mappings defined by the Unicode Standard. Decomposition is applied recursively: a character in a decomposition string may itself be decomposable, and the process repeats until no character can be decomposed further. The canonical decomposition of a character is always a string that is canonically equivalent to the original character.
If a character is not decomposable, its canonical decomposition is the character itself.
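As an illustration, the behaviour described above can be sketched in Python with the standard unicodedata module. The helper name canonical_decomposition and the sample characters are choices made for this example only. unicodedata.normalize("NFD", ...) is used here as a stand-in for full canonical decomposition, and results depend on the Unicode version bundled with the interpreter.

```python
import unicodedata


def canonical_decomposition(text: str) -> str:
    """Illustrative helper: fully decompose text (recursively) and put
    combining marks into canonical order, i.e. Normalization Form D."""
    return unicodedata.normalize("NFD", text)


# Recursive decomposition: U+01D6 (u with diaeresis and macron) maps to
# U+00FC U+0304 in one step, and U+00FC in turn maps to U+0075 U+0308.
one_step = unicodedata.decomposition("\u01D6")
full = canonical_decomposition("\u01D6")

# Canonical ordering: the dot below (U+0323, combining class 220) is placed
# before the dot above (U+0307, combining class 230).
reordered = canonical_decomposition("q\u0307\u0323")

# A character with no decomposition mapping is returned unchanged.
unchanged = canonical_decomposition("a")

# Canonically equivalent strings have the same canonical decomposition.
equivalent = canonical_decomposition("\u00E9") == canonical_decomposition("e\u0301")

print(one_step)                            # single-step mapping as hex code points
print([f"U+{ord(c):04X}" for c in full])   # fully decomposed code points
```

Note that unicodedata.decomposition returns only the single-step mapping as a string of hexadecimal code points, and it also reports compatibility mappings (marked with a tag such as <compat>), so on its own it is not a canonical decomposition routine.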
See also: Unicode, compatibility decomposition