Sybase 12.4.2 manual Code pages in Windows and Windows NT, 319

CHAPTER 9 International Languages and Character Sets

Operating system: The client operating system displays text on its interface, and may also process text.

For a satisfactory working environment, all these sources of text must work together. Loosely speaking, they must all be working in the user’s language and/or character set.

Code pages in Windows and Windows NT

Upper and lower pages

Example

Many languages have few enough characters to be represented in a single-byte character set. In such a character set, each character is represented by a single byte, which can be written as a two-digit hexadecimal number.

At most, 256 characters can be represented in a single byte. No single-byte character set can hold all of the characters used internationally, including accented characters. This problem was addressed by the development of a set of code pages, each of which describes a set of characters appropriate for one or more national languages. For example, code page 869 contains the Greek character set, and code page 850 contains an international character set suitable for representing many characters in a variety of languages.
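As a rough illustration of how each code page covers only part of the international repertoire, the following Python sketch checks whether a character can be encoded in a given code page. It assumes that Python's built-in `cp850` and `cp869` codecs are acceptable stand-ins for the code pages named above. The helper name `can_encode` is hypothetical and is not part of any Sybase interface.

```python
def can_encode(text: str, code_page: str) -> bool:
    """Return True if every character of text exists in the given code page."""
    try:
        text.encode(code_page)
        return True
    except UnicodeEncodeError:
        return False


# In a single-byte code page, each encodable character occupies one byte.
encoded = "À".encode("cp850")
print(len(encoded), encoded.hex())

# A code page covers only its own languages: compare a Latin and a Greek letter.
for ch in ("À", "Ω"):
    for cp in ("cp850", "cp869"):
        print(ch, cp, can_encode(ch, cp))
```

Under these assumptions, the accented Latin letter is available only in the international code page and the Greek letter only in the Greek one, which is why no single code page suits every language.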

With few exceptions, characters 0 to 127 are the same for all the single-byte code pages. The mapping for this range of characters is called the ASCII character set. It includes the English language alphabet in upper and lower case, as well as common punctuation symbols and the digits. This range is often called the seven-bit range (because only seven bits are needed to represent the numbers up to 127) or the lower page. The characters from 128 to 255 are called extended characters, or upper code-page characters, and vary from code page to code page.
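The split between the lower and upper pages can be checked with a short Python sketch. It compares how two code pages decode the same byte values, assuming Python's built-in `cp437` and `cp850` codecs as representative single-byte code pages. The function name `differing_positions` is hypothetical.

```python
LOWER = bytes(range(0, 128))    # the seven-bit range, or lower page
UPPER = bytes(range(128, 256))  # extended characters, byte values 128 to 255


def differing_positions(data: bytes, page_a: str, page_b: str) -> list:
    """Return the byte values in data that decode differently in the two code pages."""
    text_a = data.decode(page_a, errors="replace")
    text_b = data.decode(page_b, errors="replace")
    return [byte for byte, a, b in zip(data, text_a, text_b) if a != b]


lower_diff = differing_positions(LOWER, "cp437", "cp850")
upper_diff = differing_positions(UPPER, "cp437", "cp850")
print("differences in lower page:", len(lower_diff))
print("differences in upper page:", len(upper_diff))
```

For this particular pair of codecs the lower page decodes identically, and every difference falls in the upper page.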

Problems with code page compatibility are rare if the only characters used are from the English alphabet, as these are represented in the ASCII portion of each code page (0 to 127). However, if other characters are used, as is generally the case in any non-English environment, there can be problems if the database and the application use different code pages.
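One simple way to spot text that may be affected is to flag every character outside the ASCII range. The sketch below is only a screening aid under that assumption: a flagged character is not necessarily broken, because some extended characters happen to share a position in several code pages. The function name `non_ascii_characters` is hypothetical.

```python
def non_ascii_characters(text: str) -> list:
    """Return the characters of text that fall outside the ASCII range (0 to 127)."""
    return [ch for ch in text if ord(ch) > 127]


for sample in ("Hello, world", "À la carte"):
    flagged = non_ascii_characters(sample)
    print(repr(sample), "->", flagged if flagged else "ASCII only")
```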

Suppose a database holding French language strings uses code page 850, and the client operating system uses code page 437. The character À (upper case A grave) is held in the database as character \xB7 (decimal value 183). In code page 437, character \xB7 is a graphical character. When the client application receives this byte and the operating system displays it on the screen, the user sees a graphical character instead of an A grave.
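The mismatch in this example can be reproduced with a minimal Python sketch. It assumes Python's `cp850` and `cp437` codecs behave like the database and client code pages in the example. The constant names are illustrative only.

```python
import unicodedata

DATABASE_CODE_PAGE = "cp850"  # code page assumed for the stored strings
CLIENT_CODE_PAGE = "cp437"    # code page assumed for the client operating system

# The byte held in the database for upper case A grave.
stored = "À".encode(DATABASE_CODE_PAGE)

# The same byte as interpreted by the client.
shown = stored.decode(CLIENT_CODE_PAGE)

print("stored byte:", stored.hex(), "decimal", stored[0])
print("client displays:", shown, "(" + unicodedata.name(shown) + ")")

# Decoding with the database's own code page recovers the intended character.
print("intended:", stored.decode(DATABASE_CODE_PAGE))
```

The byte itself never changes in transit. Only its interpretation differs, so the usual remedy is to make the two sides agree on a code page or to convert explicitly between them.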
