Options to Improve TLB Hit Rates

To improve Translation Lookaside Buffer (TLB) hit rates in an application running on an Itanium-based or a PA 8000-based system, use the following linker or chatr virtual memory page setting options:

+pd size - requests a specified data page size of 4K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, or 256M bytes, or L. Use L to specify the largest page size available. The actual page size may differ if the requested size cannot be fulfilled.

+pi size - requests a specified instruction page size. (See +pd size for size values.)

The default data and instruction page size is 4K bytes on Itanium and PA-RISC systems. The Itanium architecture supports multiple page sizes from 4K to 4G (4K, 8K, 16K, 64K, 256K, 1M, 4M, 16M, 64M, 256M, and 4G). The PA-RISC 2.0 architecture supports page sizes from 4K bytes to 64M bytes, each four times the size of the previous one. Larger pages enable a large contiguous region to be mapped by a single TLB entry. For example, if a contiguous 4 MB of memory is actively used, 1024 TLB entries are needed if the page size is 4K bytes, but only 64 TLB entries are needed if the page size is 64K bytes.

Because the working-set sizes of applications and benchmarks keep growing, the linker and chatr page setting options can help boost performance by improving TLB hit rates. Some scientific applications benefit from large data pages, whereas some commercial applications benefit from large instruction pages.
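The arithmetic behind the 4 MB example can be checked with a short shell sketch. The function name tlb_entries is hypothetical and is not part of ld or chatr. It assumes one TLB entry per mapped page and simply divides the region size by the page size, rounding up. With 4K pages the result is 1024 entries (roughly a thousand), and with 64K pages it is 64.

```shell
#!/bin/sh
# Hypothetical helper (not part of ld or chatr): number of pages, and
# therefore TLB entries, needed to map a contiguous region.
# Assumption: one TLB entry per page. Both arguments are in bytes;
# the result is rounded up to a whole number of pages.
tlb_entries() {
    region=$1
    page=$2
    echo $(( ($region + $page - 1) / $page ))
}

KB=1024
MB=$(( 1024 * 1024 ))

tlb_entries $(( 4 * MB )) $(( 4 * KB ))     # 4 MB region, 4K pages
tlb_entries $(( 4 * MB )) $(( 64 * KB ))    # 4 MB region, 64K pages
tlb_entries $(( 4 * MB )) $(( 4 * MB ))     # 4 MB region, a single 4M page
```

The same calculation gives a rough way to judge which +pd or +pi size is worth requesting for a known working-set size. The page size actually granted may still differ from the request.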

Example 19 Examples

To set the virtual memory page size by using the linker:

$ ld +pd 64K +pi 16K /opt/langtools/lib/crt0.o myprog.o -lc

To set the page size from HP C and HP Fortran:

$ cc -Wl,+pd,64K,+pi,16K myprog.c

$ f90 -Wl,+pd,64K,+pi,16K myprog.f

To set the page size by using chatr:

$ chatr +pd 64K +pi 16K a.out

Profile-Based Optimization (Itanium)

For information on Profile-Based Optimization on Itanium systems, see +Oprofile=collect and +Oprofile=use in the C/C++ help document.

Incremental Linking

“Using Incremental Linking Options” (page 219)

“Archive Library Processing” (page 219)

“Shared Library Processing” (page 219)

“Performance” (page 219)

In the edit-compile-link-debug development cycle, link time is a significant component. The incremental linker can reduce the link time by taking advantage of the fact that you can reuse most of the previous version of the program and that the unchanged object files need not be processed. The incremental linker allows you to insert object code into an output file (executable or shared library) that you created earlier, without relinking the unmodified object files. The time required to relink after the initial incremental link depends on the number of modules you modify. You can debug the resulting executable or shared library produced by the incremental linker using the gdb
