Character Sets

Unlike some legacy encodings, UTF-8 is easy to parse. Lead and trail bytes are easily distinguished: a trail byte always has the bit pattern 10xxxxxx, which no lead byte or ASCII octet has. Moving forwards or backwards in a text string is therefore easier in UTF-8 than in many other multi-byte encodings.
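The following is a minimal sketch, not part of the G24-L firmware or its AT command set, of how a host application might step through a UTF-8 octet string one character at a time. The function names (is_trail_byte, next_char_start, prev_char_start) are hypothetical, and the sketch assumes the input is well-formed UTF-8.

```python
def is_trail_byte(octet: int) -> bool:
    """Trail bytes always have the bit pattern 10xxxxxx."""
    return (octet & 0xC0) == 0x80


def next_char_start(data: bytes, i: int) -> int:
    """Index of the first octet of the character following the one at i."""
    i += 1
    while i < len(data) and is_trail_byte(data[i]):
        i += 1
    return i


def prev_char_start(data: bytes, i: int) -> int:
    """Index of the first octet of the character preceding index i (i >= 1)."""
    i -= 1
    while i > 0 and is_trail_byte(data[i]):
        i -= 1
    return i


# "a" is 1 octet, "é" is 2 octets, "€" is 3 octets.
text = "a\u00e9\u20ac".encode("utf-8")
```

Stepping backwards needs no knowledge of the preceding characters: the loop simply skips octets until it reaches one that is not a trail byte.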

The codes in the first half of the first row in Character Set Table CS2 (UTF-8 <-> ASCII) are replaced in this transformation format by their ASCII codes, which are octets in the range 00h to 7Fh. The other UCS-2 codes are transformed to sequences of two or three octets, each in the range 80h to FFh. (UTF-8 as originally defined allows sequences of up to six octets, but only for codes beyond the UCS-2 range.) Text containing only characters in Character Set Table CS3 (UTF-8 <-> UCS-2) is transformed to the same octet sequence, irrespective of whether it was coded with UCS-2 or UCS-4.
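As an illustration of the transformation described above, the following sketch converts a single UCS-2 code to its UTF-8 octets. The function name ucs2_to_utf8 is hypothetical and is not taken from the manual. The sketch does not treat the surrogate range D800h to DFFFh specially.

```python
def ucs2_to_utf8(code: int) -> bytes:
    """Transform one 16-bit UCS-2 code into one, two, or three UTF-8 octets."""
    if not 0 <= code <= 0xFFFF:
        raise ValueError("not a UCS-2 code")
    if code <= 0x7F:
        # ASCII range: the code is emitted unchanged as a single octet.
        return bytes([code])
    if code <= 0x7FF:
        # Two octets: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code >> 6), 0x80 | (code & 0x3F)])
    # Three octets: 1110xxxx 10xxxxxx 10xxxxxx
    return bytes([
        0xE0 | (code >> 12),
        0x80 | ((code >> 6) & 0x3F),
        0x80 | (code & 0x3F),
    ])
```

For example, ucs2_to_utf8(0x41) yields the single octet 41h, ucs2_to_utf8(0xE9) yields C3h A9h, and ucs2_to_utf8(0x20AC) yields E2h 82h ACh.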

8859-1 Character Set Management

ISO-8859-1 is an 8-bit character set, a major improvement over plain 7-bit US-ASCII.

Positions 0 to 127 are always identical with US-ASCII, positions 128 to 159 hold some less-used control characters, and positions 160 to 255 hold language-specific characters.
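The three ranges can be sketched as a small classifier. The function name latin1_range and the returned labels are hypothetical, chosen for illustration only. The example also uses Python's built-in "latin-1" codec, which maps each ISO-8859-1 octet to the Unicode code point with the same value, to show how such text could be converted to UTF-8 on the host side.

```python
def latin1_range(position: int) -> str:
    """Classify an ISO-8859-1 position into one of the three ranges."""
    if not 0 <= position <= 255:
        raise ValueError("ISO-8859-1 positions are 0 to 255")
    if position <= 127:
        return "US-ASCII"
    if position <= 159:
        return "control"
    return "language-specific"


# "Grüße" coded in ISO-8859-1: ü is FCh and ß is DFh, one octet each.
latin1_octets = b"Gr\xfc\xdfe"
as_text = latin1_octets.decode("latin-1")
as_utf8 = as_text.encode("utf-8")  # ü and ß now take two octets each
```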

ISO-8859-1 covers most West European languages, such as French (fr), Spanish (es), Catalan (ca), Basque (eu), Portuguese (pt), Italian (it), Albanian (sq), Rhaeto-Romanic (rm), Dutch (nl), German (de), Danish (da), Swedish (sv), Norwegian (no), Finnish (fi), Faroese (fo), Icelandic (is), Irish (ga), Scottish Gaelic (gd) and English (en). Afrikaans (af) and Swahili (sw) are also included, extending coverage to much of Africa.

G24-L AT Commands Reference Manual

April 15, 2008
