Progress
Internationalization Guide


Valid and Invalid Code-page Conversions

Some code-page conversions are valid and others are not. "Understanding Code Pages" discusses a general rule, based on elementary set theory, that explains the difference. In the following restatement of the rule, source code page means the code page you are converting from, and target code page means the code page you are converting to.

Determining Valid Code-page Conversions (For Non-Unicode Databases)

You can convert data from one code page to another if one of the following conditions is true:

If Progress is converting from UTF-8 and encounters a character that does not exist in the target code page, Progress substitutes the question mark (?).
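The substitution behavior described above can be illustrated outside Progress. The following is a minimal sketch that uses Python's built-in codecs as a stand-in, not Progress's own conversion routines. The function name convert_from_utf8 and the sample string are hypothetical, and ISO 8859-1 is chosen as the target only for illustration.

```python
def convert_from_utf8(data: bytes, target_codec: str) -> bytes:
    """Decode UTF-8 bytes and re-encode them in the target code page.

    Characters that do not exist in the target code page are replaced
    with a question mark, mirroring the behavior described in the text.
    """
    text = data.decode("utf-8")
    # errors="replace" makes Python's encoder emit b"?" for each
    # character that the target code page cannot represent.
    return text.encode(target_codec, errors="replace")


# "ï" exists in ISO 8859-1; the snowman (U+2603) does not.
utf8_bytes = "naïve ☃".encode("utf-8")
converted = convert_from_utf8(utf8_bytes, "iso-8859-1")
print(converted)  # b'na\xefve ?'
```

The substitution is lossy: once a character has been replaced with a question mark, the original character cannot be recovered from the converted data.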

Converting Between Double Byte and Single Byte

If you apply the preceding rule to conversion between a double-byte code page and a single-byte code page, you conclude that all such conversions are invalid, in either direction. This is because double-byte code pages contain many more symbols than single-byte code pages.
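As a rough illustration of why neither direction works in general, the following sketch again uses Python's codecs as a stand-in for Progress code pages. The helper name can_convert is hypothetical, and Python's shift_jis and iso-8859-1 codecs are assumed to approximate a double-byte and a single-byte code page respectively. The sketch checks individual characters only; it does not test whole code pages.

```python
def can_convert(text: str, target_codec: str) -> bool:
    """Return True if every character in text exists in the target codec."""
    try:
        text.encode(target_codec)
    except UnicodeEncodeError:
        return False
    return True


japanese = "日本語"
sjis_bytes = japanese.encode("shift_jis")

# Each of the three characters occupies two bytes in the double-byte page.
print(len(sjis_bytes))                       # 6

# Double-byte to single-byte: the characters have no single-byte equivalent.
print(can_convert(japanese, "iso-8859-1"))   # False

# Single-byte to double-byte: "é" exists in ISO 8859-1 but not in Shift-JIS.
print(can_convert("é", "shift_jis"))         # False
```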

Converting Between Double Byte and Double Byte

If you apply the preceding rule to conversion from one double-byte code page to another, you conclude that a conversion is valid if the source code page is a subset of the target code page. The double-byte to double-byte conversions that appear in Table 8–8 are valid.

Table 8–8: Valid Double-byte To Double-byte Code-page Conversions 
Conversion             Comment
SHIFT–JIS to EUCJIS,   SHIFT–JIS and EUCJIS contain the same symbols.
EUCJIS to SHIFT–JIS
KSC5601 to CP949       KSC5601 is a subset of CP949.
BIG–5 to CP950         BIG–5 is a subset of CP950.

NOTE: You cannot convert from CP949 to KSC5601, both code pages for Korean, because CP949 contains many characters that KSC5601 does not.
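The subset rule for double-byte conversions can be modeled with ordinary sets. The following is a minimal sketch, assuming each code page is represented by the set of characters it contains. The repertoires KSC5601_TOY and CP949_TOY are hypothetical, tiny stand-ins for illustration only and are not the real character inventories; the function names are also hypothetical.

```python
# Toy repertoires (assumption): a few characters standing in for the
# thousands that the real code pages contain.
KSC5601_TOY = frozenset("가각간")
CP949_TOY = KSC5601_TOY | frozenset("똠")  # everything above plus one extra


def is_valid_conversion(source: frozenset, target: frozenset) -> bool:
    """A conversion is valid if the source is a subset of the target."""
    return source <= target


def unconvertible(source: frozenset, target: frozenset) -> frozenset:
    """Return the source characters that are missing from the target."""
    return source - target


# Subset to superset: valid, as in the table.
print(is_valid_conversion(KSC5601_TOY, CP949_TOY))  # True

# Superset to subset: invalid, as in the note.
print(is_valid_conversion(CP949_TOY, KSC5601_TOY))  # False
print(unconvertible(CP949_TOY, KSC5601_TOY))        # frozenset({'똠'})
```

Two code pages that contain the same symbols are subsets of each other, so under this model the conversion is valid in both directions.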


Copyright © 2004 Progress Software Corporation