25112 Rev. 3.06 September 2005

Software Optimization Guide for AMD64 Processors

8.7Efficient Binary-to-ASCII Decimal Conversion

Fast binary-to-ASCII decimal conversion can be important to the performance of software working with text oriented protocols like HTML, such as web servers. The following examples show two optimized functions for fast conversion of unsigned integers-to-ASCII decimal strings on

AMD Athlon 64 and AMD Opteron processors. The code is written for the Microsoft Visual C compiler.

The function uint_to_ascii_lz converts like sprintf(sptr, "%010u", x). That is, leading zeros are retained, whereas uint_to_ascii_nlz converts like sprintf(sptr, "%u", x); that is, leading zeros are suppressed.

This code can easily be extended to convert signed integers by isolating the sign information and computing the absolute value as shown in Listing on page 130 before starting the conversion process. For restricted argument ranges, construct more efficient conversion routines using the same algorithm as used for the general case presented here.

The algorithm first splits the input argument into suitably sized blocks by dividing the input by an appropriate power of ten and working separately on the quotient and remainder of that division. The DIV instruction is avoided as described in “Replacing Division with Multiplication” on page 160. Each block is then converted into a fixed-point format that consists of one (decimal) integer digit and a binary fraction. This allows the generation of additional decimal digits by repeated multiplication of the fraction by 10. For efficiency reasons the algorithm implements this multiplication by multiplying by five and moving the binary point to the right by one bit for each step of the algorithm. To avoid loop overhead and branch mispredictions, the digit generation loop is completely unrolled. In order to maximize parallelism, the code in uint_to_ascii_lz splits the input into two equally sized blocks each of which yields five decimal digits for the result.

Binary-to-ASCII Decimal Conversion Retaining Leading Zeros

__declspec(naked) void __stdcall uint_to_ascii_lz(char *sptr, unsigned int x)

{

__asm {

 

 

 

 

push

edi

; Save

as per calling conventions.

push

esi

; Save

as per calling conventions.

push

ebx

; Save

as per calling conventions.

mov

eax, [esp+20]

; x

 

 

mov

edi, [esp+16]

; sptr

 

mov

esi, eax

; x

 

 

mov

edx, 0xA7C5AC47

; Divide x by

mul

edx

;

10,000 using

add

eax, 0xA7C5AC47

;

multiplication

adc

edx, 0

;

with reciprocal.

shr

edx, 16

; y1 =

x / 1e5

mov

ecx, edx

; y1

 

imul

edx, 100000

; (x /

1e5) * 1e5

sub

esi, edx

; y2 =

x % 1e5

mov

eax, 0xD1B71759

; 2^15

/ 1e4 * 2^30

mul

ecx

; Divide y1 by 1e4,

Chapter 8

Integer Optimizations

181

Page 197
Image 197
AMD 250 Efficient Binary-to-ASCII Decimal Conversion, Binary-to-ASCII Decimal Conversion Retaining Leading Zeros, 181