AMD Geode™ LX Processors Data Book 243
Graphics Processor 33234H
6.3.2.4 Palettized Color Support
If the Preserve LUT Data bit is set in the
GP_CH3_MODE_STR register (GP Memory Offset
64h[20]), then 1K of the 2K buffer space is allocated as a
LUT. As long as this bit remains set, the LUT data is
preserved as written. Setting this bit slightly lowers
performance because it limits the GP's ability to prefetch
data and to accept large amounts of host source data. This
is unlikely to be a significant issue, but if the LUT is not
needed for future BLTs, then clearing this bit is
recommended. The bit must be cleared during rotations
because the entire 2K buffer space is needed.
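
The bit handling described above can be sketched in C. This is a
minimal illustration, not driver code: the macro and function names
are hypothetical, and only the register offset (64h) and bit position
(20) come from the text. The functions operate on a register value
that the caller has already read; how the register is mapped and
accessed is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Offset and bit position are from the text; the names are illustrative. */
#define GP_CH3_MODE_STR_OFFSET 0x64u
#define GP_CH3_PRESERVE_LUT    (UINT32_C(1) << 20)

/* Return mode_str with the Preserve LUT Data bit set or cleared. */
uint32_t gp_ch3_with_preserve_lut(uint32_t mode_str, bool preserve)
{
    return preserve ? (mode_str | GP_CH3_PRESERVE_LUT)
                    : (mode_str & ~GP_CH3_PRESERVE_LUT);
}

/* Value to program before a rotation: the bit must be clear because
 * rotations need the entire 2K buffer. */
uint32_t gp_ch3_mode_for_rotation(uint32_t mode_str)
{
    uint32_t value = gp_ch3_with_preserve_lut(mode_str, false);
    assert((value & GP_CH3_PRESERVE_LUT) == 0);
    return value;
}
```

All other bits of the register value pass through unchanged, so the
helpers can be used in a read-modify-write sequence.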
If the BPP/FMT bits in the GP_CH3_MODE_STR register
(GP Memory Offset 64h[27:24]) indicate that the incoming
data is in 4-bpp or 8-bpp indexed mode, then the LUT is
used to convert the data into 16-bpp or 32-bpp mode as
specified in the GP_RASTER_MODE register’s BPP/FMT
field (GP Memory Offset 38h[31:28]). The LUT should be
loaded prior to initiating such a BLT by writing an address
to the GP_LUT_INDEX register (GP Memory Offset 70h)
followed by one or more DWORD writes to the
GP_LUT_DATA register (GP Memory Offset 74h) that will
be loaded into the LUT starting at that address. The
address automatically increments with every write.
Addresses 00h-FFh are used for 8-bpp indexed pixels and
addresses 00h-0Fh are used for 4-bpp indexed pixels. The
result of a lookup is always a DWORD. If the output format
is only 16-bpp, then only the data in the two least
significant bytes is used.
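
The load sequence above (one index write, then consecutive data
writes) might look like the following sketch. It assumes the GP
register block is reachable through a pointer to 32-bit words; the
function names and the gp_regs parameter are hypothetical, and only
the offsets 70h and 74h come from the text. The address
auto-increment is done by the hardware and is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Register offsets from the text; the names are illustrative. */
#define GP_LUT_INDEX_OFFSET 0x70u
#define GP_LUT_DATA_OFFSET  0x74u
#define GP_LUT_ENTRIES      256u   /* addresses 00h-FFh */

/* Load count DWORD entries into the LUT starting at address start.
 * gp_regs is assumed to point at GP Memory Offset 0. */
void gp_load_lut(volatile uint32_t *gp_regs, uint32_t start,
                 const uint32_t *entries, size_t count)
{
    assert(start < GP_LUT_ENTRIES && count <= GP_LUT_ENTRIES - start);

    /* One write sets the starting LUT address. */
    gp_regs[GP_LUT_INDEX_OFFSET / 4] = start;

    /* Each data write lands at the current address; the hardware
     * then advances the address by itself. */
    for (size_t i = 0; i < count; i++)
        gp_regs[GP_LUT_DATA_OFFSET / 4] = entries[i];
}

/* Portion of a looked-up DWORD that is used for a given output depth. */
uint32_t gp_lut_result(uint32_t entry, unsigned out_bpp)
{
    return (out_bpp == 16u) ? (entry & 0xFFFFu) : entry;
}
```

For 4-bpp sources only addresses 00h-0Fh are looked up, so a
16-entry load starting at address 0 is sufficient in that mode.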
For 4-bpp incoming data, two pixels are packed within a
byte such that bits[7:4] contain the leftmost pixel and
bits[3:0] contain the rightmost pixel. The pixel ordering for
4-bit pixels is shown in Table 6-13.
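
The packing rule above can be expressed as a small C helper that
extracts pixel n (0 is the leftmost) from one DWORD of 4-bpp data.
This is a software sketch for illustration only; the function name is
hypothetical, and the DWORD is assumed to be little-endian, so that
the first byte in memory occupies bits[7:0].

```c
#include <assert.h>
#include <stdint.h>

/* Return the 4-bit index of pixel n (0..7) within a DWORD.
 * Even-numbered pixels sit in the high nibble of their byte,
 * odd-numbered pixels in the low nibble. */
uint8_t pixel4_from_dword(uint32_t dword, unsigned n)
{
    assert(n < 8u);
    unsigned shift = (n / 2u) * 8u + ((n & 1u) ? 0u : 4u);
    return (uint8_t)((dword >> shift) & 0xFu);
}
```

For example, with the DWORD 67452301h the helper returns n for every
pixel n from 0 to 7, because byte 0 (01h) holds pixels 0 and 1,
byte 1 (23h) holds pixels 2 and 3, and so on.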
For host source data, the starting offset into the first
DWORD is taken from GP_CH3_OFFSET[1:0] (and
GP_CH3_OFFSET[28] if the data is 4-bpp). For data being
fetched from memory, GP_CH3_OFFSET[23:0] specifies
the starting byte and GP_CH3_OFFSET[28] specifies the
nibble within the byte for 4-bpp mode.
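
The two offset layouts above can be composed as shown in this
sketch. The function names are hypothetical, and only the field
positions ([23:0], [1:0], and [28]) come from the text. The text does
not say which value of bit 28 selects which nibble, so the helpers
pass the selector through as a raw bit rather than interpret it.

```c
#include <assert.h>
#include <stdint.h>

#define GP_CH3_OFFSET_NIBBLE_SHIFT 28u

/* GP_CH3_OFFSET value for data fetched from memory:
 * [23:0] is the starting byte, [28] the nibble selector (4-bpp only). */
uint32_t gp_ch3_offset_mem(uint32_t byte_offset, unsigned nibble_sel)
{
    assert(byte_offset <= 0x00FFFFFFu && nibble_sel <= 1u);
    return byte_offset | ((uint32_t)nibble_sel << GP_CH3_OFFSET_NIBBLE_SHIFT);
}

/* GP_CH3_OFFSET value for host source data:
 * [1:0] is the starting byte within the first DWORD,
 * [28] the nibble selector (4-bpp only). */
uint32_t gp_ch3_offset_host(unsigned byte_in_dword, unsigned nibble_sel)
{
    assert(byte_in_dword <= 3u && nibble_sel <= 1u);
    return (uint32_t)byte_in_dword |
           ((uint32_t)nibble_sel << GP_CH3_OFFSET_NIBBLE_SHIFT);
}
```

For 8-bpp data the nibble selector is not used and would be passed
as 0.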
Note that, regardless of the output pixel depth, palettized
color has a throughput of at most one pixel per clock.
The LUT shares memory with the incoming data FIFO, so
the datapath first pops the incoming indexed pixels out of
the FIFO (8 or 16 at a time), then performs the LUT lookups,
one pixel per clock, for the next 8 or 16 clocks, and only
then pops more data out of the FIFO.
Table 6-13. Pixel Ordering for 4-Bit Pixels
Bits    31:28    27:24    23:20    19:16    15:12    11:8     7:4      3:0
Pixel   Pixel 6  Pixel 7  Pixel 4  Pixel 5  Pixel 2  Pixel 3  Pixel 0  Pixel 1