AMD Athlon™ Processor x86 Code Optimization

22007E/0 — November 1999

    case combine( 1, 1 ):
        for( i ... ) {
            DoWork1( i );
            DoWork3( i );
        }
        break;
    default:
        break;
}

The trade-off here is that generating all the combinations for the switch constant involves some up-front work, and the total amount of code has doubled. However, the inner loops are now “if()-free”. In ideal cases where the “DoWork*()” functions are inlined, successive functions overlap more, which allows greater parallelism than would be possible in the presence of intervening “if()” statements.

The same idea can be applied to constant “switch()” statements, or to combinations of “switch()” statements and “if()” statements inside of “for()” loops. The method for combining the input constants gets more complicated, but the performance benefit is usually worth it.
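As a minimal sketch of combining a loop-invariant “switch()” selector with a loop-invariant “if()” flag, consider the following. The names COMBINE, MODE_COPY, MODE_DOUBLE, and process() are hypothetical and chosen only for illustration; the bit layout of the key is one possible choice and assumes a small, non-negative mode value.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key: mode occupies the upper bits, the flag the lowest bit. */
#define COMBINE(mode, flag) (((mode) << 1) | ((flag) ? 1 : 0))

enum { MODE_COPY = 0, MODE_DOUBLE = 1 };

/* mode and negate do not change inside the loop, so they are tested
   once, up front, and each case gets its own branch-free inner loop. */
void process(int *dst, const int *src, size_t n, int mode, int negate)
{
    size_t i;

    assert(dst != NULL && src != NULL);

    switch (COMBINE(mode, negate)) {
    case COMBINE(MODE_COPY, 0):
        for (i = 0; i < n; i++) dst[i] = src[i];
        break;
    case COMBINE(MODE_COPY, 1):
        for (i = 0; i < n; i++) dst[i] = -src[i];
        break;
    case COMBINE(MODE_DOUBLE, 0):
        for (i = 0; i < n; i++) dst[i] = 2 * src[i];
        break;
    case COMBINE(MODE_DOUBLE, 1):
        for (i = 0; i < n; i++) dst[i] = -2 * src[i];
        break;
    default:
        /* Unknown mode: leave dst untouched. */
        break;
    }
}
```

Because COMBINE expands to an integer constant expression when given constants, the same macro can build both the runtime key and the case labels, which keeps the two in step.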

However, the number of inner loops can also increase substantially. If the number of inner loops is prohibitively high, then only the most common cases need to be dealt with directly, and the remaining cases can fall back to the old code in a “default:” clause for the “switch()” statement.
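The fallback arrangement can be sketched as follows. This is an illustration under stated assumptions, not code from this guide: COMBINE3, accumulate(), and its three flags are hypothetical, and the choice of which two combinations count as “most common” is assumed rather than measured.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key built from three loop-invariant flags. */
#define COMBINE3(a, b, c) (((a) ? 4 : 0) | ((b) ? 2 : 0) | ((c) ? 1 : 0))

/* Returns the sum of the terms selected by the three flags. */
long accumulate(const int *v, int n,
                int add_value, int add_square, int add_index)
{
    long sum = 0;
    int i;

    assert(v != NULL && n >= 0);

    switch (COMBINE3(add_value, add_square, add_index)) {
    case COMBINE3(1, 0, 0):    /* assumed common case: if()-free loop */
        for (i = 0; i < n; i++) sum += v[i];
        break;
    case COMBINE3(1, 1, 0):    /* assumed common case: if()-free loop */
        for (i = 0; i < n; i++) {
            sum += v[i];
            sum += (long)v[i] * v[i];
        }
        break;
    default:                   /* rare cases: the original loop, unchanged */
        for (i = 0; i < n; i++) {
            if (add_value)  sum += v[i];
            if (add_square) sum += (long)v[i] * v[i];
            if (add_index)  sum += i;
        }
        break;
    }
    return sum;
}
```

Only two of the eight flag combinations get a specialized loop here; the other six share the general loop, so code size grows with the number of cases chosen for specialization rather than with the number of possible combinations.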

This technique typically comes up when the programmer is considering runtime generated code. While runtime generated code can lead to similar levels of performance improvement, it is much harder to maintain, and the developer must optimize the generated code by hand, without the help of a compiler.

Declare Local Functions as Static

Functions that are not used outside the file in which they are defined should always be declared static, which forces internal linkage. Otherwise, such functions default to external linkage,
