What is Unicode?

Fundamentally, computers represent text by assigning a number to each character, such as a letter or a punctuation mark.  There are many encoding systems, covering many different written languages, and in some cases a single encoding cannot accommodate all the characters that make up a language.
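As a minimal illustration (Python is used here purely for convenience), the sketch below shows the mapping in both directions: a character to its assigned number, and a number back to its character.

    # A character encoding is simply a mapping between characters and numbers.
    print(ord("A"))   # 65, the number assigned to the letter 'A' in ASCII (and Unicode)
    print(chr(65))    # 'A', the character assigned to the number 65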

Encoding systems also conflict with one another: the same character may be represented by two different numbers, or the same number may stand for two different characters.  As a result, whenever a computer must handle text in multiple languages, the risk of data being corrupted as it passes between platforms is greatly increased.  This risk is becoming more prevalent as the global marketplace grows and more languages are intermingled.
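A short Python sketch illustrates the conflict: the same single byte, decoded under two legacy encodings, yields two different characters, and which result is "correct" depends entirely on which encoding the reader assumes.

    # One byte value, two legacy encodings, two different characters.
    raw = bytes([0xE9])
    print(raw.decode("latin-1"))   # 'é' (Western European, ISO-8859-1)
    print(raw.decode("cp1251"))    # 'й' (Cyrillic, Windows-1251)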

Unicode provides a solution to these problems.  Unicode assigns a unique number, called a code point, to every character, regardless of the platform, the program, or the language.  The long-dominant ASCII standard assigns 7 bits to each character, which is adequate for a language such as English but cannot accommodate all the characters of a language such as Chinese.  Unicode, on the other hand, was originally designed around 16 bits per character, enough for more than 65,000 characters, and has since grown to over a million possible code points.  Thus Unicode enables a single website to serve multiple platforms, languages, and countries without re-engineering and without the risk of data corruption.  With the ability to accommodate so many languages while eliminating the risk of corrupted data, Unicode is well placed to surpass ASCII and become the leading standard.
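A final Python sketch makes the contrast concrete: every character, including a Chinese one, has a single Unicode code point, while the 7-bit ASCII encoding simply cannot represent it.

    # Every character has one Unicode code point, whatever the language.
    for ch in "Aé中":
        print(ch, hex(ord(ch)))    # A 0x41, é 0xe9, 中 0x4e2d

    print("中".encode("utf-8"))    # b'\xe4\xb8\xad', a Unicode encoding handles it
    # "中".encode("ascii")         # would raise UnicodeEncodeError: 7-bit ASCII cannot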