Date: 8/15/2002
China is entering the WTO, and there will be many business opportunities between Mainland China, Hong Kong, Singapore and Taiwan. However, the different versions of Chinese characters are still a serious barrier for many Web sites in the region: most people in Mainland China and Singapore use simplified Chinese (GB), while most people in Hong Kong and Taiwan use traditional Chinese (Big5). A site needs both versions to reach the whole market, ideally without the hassle of maintaining two copies.
hkinteractive.com, the technical consulting arm of Chi Tao Studio, has recently implemented customized solutions for corporations and SMEs in the region. The application converts between traditional (Big5) and simplified (GB) Chinese in real time at low cost and eases maintenance of the Web site. It has been seamlessly integrated with many of our clients' backend content management systems. One example is lesliecheung.com, where hkinteractive.com took less than a week to deploy the application across 50,000 static and dynamically generated pages.
Main purpose and benefits:
- Easier maintenance of regional Web sites, as a company only needs to keep one primary Chinese version, in either Big5 or GB.
- The application converts between traditional and simplified Chinese in real time, using artificial intelligence to resolve one-to-many character-mapping problems.
- Lightning-fast conversion speed of 5,000 characters/sec.
- 99% accuracy.
- ASP component-based technology.
- Easy to integrate into the client's existing infrastructure (Windows and Linux platforms supported).
- Supports SSL-encrypted Web pages.
Technical background information about Chinese Big5 and GB characters:
Computers don't speak any language; they only know numbers. For computers to work with human languages such as Chinese and English, special mappings between numbers and letters or characters are made into standards that various computers and programs understand. These agreed-upon ways of representing Chinese are called character sets or code sets. GB (short for "Guojia Biaozhun" or "National Standard") is the standard used in the People's Republic of China and Singapore, and it has a set of about 7,000 simplified Chinese characters. Big5 is used in Taiwan and Hong Kong and has about 13,000 traditional Chinese characters. Unicode is an emerging standard that attempts to encode all the major languages, including Chinese. Unicode includes all the characters from GB and Big5, and more.
A character set is different from a font that supports that character set. You may have a document written using GB, but to view it you need a font that includes all the GB characters. Viewing a GB-encoded document as if it were in Big5 will produce garbage on the screen. Viewing a Chinese document in a program that thinks it is in English will also produce an unintelligible document with lots of accented letters and symbols.
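The effect of reading a document with the wrong character set can be reproduced in a few lines. The following is a minimal sketch, not part of the application described above; it assumes Python's built-in "gb2312" and "big5" codecs as stand-ins for GB and Big5, and the helper name transcode is made up for this illustration.

```python
# Illustrative only: Python's built-in "gb2312" and "big5" codecs stand in
# for the GB and Big5 character sets. The helper name is hypothetical.

def transcode(data: bytes, source: str, target: str) -> bytes:
    """Convert bytes from one character set to another by way of Unicode."""
    return data.decode(source).encode(target)

# "中文" ("Chinese language") uses two characters that exist in both sets.
text = "中文"

# The same characters are stored as different numbers in each character set.
gb_bytes = text.encode("gb2312")
big5_bytes = text.encode("big5")

# Reading GB bytes as if they were Big5 yields unrelated characters.
garbled_as_big5 = gb_bytes.decode("big5", errors="replace")

# Reading GB bytes as Western (Latin-1) text yields accented letters.
garbled_as_latin1 = gb_bytes.decode("latin-1")  # 'ÖÐÎÄ'

if __name__ == "__main__":
    print("GB bytes:  ", gb_bytes.hex())
    print("Big5 bytes:", big5_bytes.hex())
    print("GB read as Big5:   ", garbled_as_big5)
    print("GB read as Latin-1:", garbled_as_latin1)
    # For characters present in both sets, going through Unicode is lossless.
    print("Transcoded matches:", transcode(gb_bytes, "gb2312", "big5") == big5_bytes)
```

Because both characters in this sample exist in GB and Big5, transcoding through Unicode is enough here. It does not change simplified characters into traditional ones; that is a separate mapping problem.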
The characters in Unicode are a superset of the characters in GB and Big5, so it is easy to convert directly from GB or Big5 into Unicode. However, while there is some overlap between GB and Big5, there are also many simplified characters in GB that are not in Big5, and many traditional characters in Big5 that are not in GB. Consequently, conversion from GB to Big5 is not trivial, since many simplified characters each map to more than one traditional equivalent in Big5. Going from Big5 to GB is easier, since the conversion from traditional to simplified is much less ambiguous.
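A small example makes the asymmetry concrete. The simplified character 发 corresponds to two traditional characters: 發 (as in 發展, "develop") and 髮 (as in 頭髮, "hair"). The sketch below is one common way to handle such cases, a phrase table checked before a character table. It is not the method used by the application described above, and the tables and function names are hypothetical, hand-picked entries for illustration only.

```python
# Hypothetical, hand-picked tables for illustration; a real converter
# would need thousands of entries.

# Default single-character mapping, simplified -> traditional.
CHAR_MAP = {"头": "頭", "发": "發"}

# Phrases whose correct conversion differs from the default mapping.
PHRASE_MAP = {"头发": "頭髮", "理发": "理髮"}

# Traditional -> simplified is many-to-one, so a plain table is enough.
TRAD_TO_SIMP = {"頭": "头", "發": "发", "髮": "发"}

MAX_PHRASE_LEN = max(len(p) for p in PHRASE_MAP)


def to_traditional(text: str) -> str:
    """Convert simplified text, preferring the longest matching phrase."""
    out = []
    i = 0
    while i < len(text):
        # Try phrases from longest to shortest (minimum length 2).
        for n in range(MAX_PHRASE_LEN, 1, -1):
            chunk = text[i:i + n]
            if chunk in PHRASE_MAP:
                out.append(PHRASE_MAP[chunk])
                i += n
                break
        else:
            # No phrase matched: fall back to the per-character default.
            ch = text[i]
            out.append(CHAR_MAP.get(ch, ch))
            i += 1
    return "".join(out)


def to_simplified(text: str) -> str:
    """Convert traditional text character by character."""
    return "".join(TRAD_TO_SIMP.get(ch, ch) for ch in text)


if __name__ == "__main__":
    print(to_traditional("头发"))  # phrase table picks 髮
    print(to_traditional("发展"))  # per-character default picks 發
    print(to_simplified("頭髮發展"))
```

The same simplified character comes out as 髮 in one word and 發 in the other, which a one-to-one character table cannot achieve. The reverse direction needs no context in this sample because both traditional characters collapse to the same simplified one.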