in a particular directory, a lock file called flow.lock is used. Instrumented programs that need to update the flow.data file and linker processes that need to read it must first obtain access to the lock file. Only one process can hold the lock at any time. As long as the flow.data file is being actively read and written, a process will wait for the lock to become available.

A program that terminates abnormally can leave the flow.data file inactive but locked. A process that tries to access an inactive but locked flow.data file gives up after a short period of time. In such cases, you may need to remove the flow.lock file.

If an instrumented program fails to obtain the database lock, it writes the profile data to a temporary file and displays a warning message containing the name of the file. You could then use the +df option along with the +P option while optimizing, to specify the name of the temporary file instead of the flow.data file.

If the linker fails to obtain the lock, it displays an error message and terminates. In such cases, wait until all active processes that are reading or writing a profile database file in that directory have completed. If no such processes exist, remove the flow.lock file.
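The stale-lock situation described above can be illustrated with a generic lock-file idiom. The following is only a minimal sketch under stated assumptions: the text does not say how the profiling runtime or the linker actually implements flow.lock, so the exclusive-create approach, the fixed timeout, and the names demo_acquire_lock and demo_release_lock are all hypothetical.

```c
#include <errno.h>
#include <fcntl.h>
#include <time.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of any HP-UX library.
 * Tries to create lock_path exclusively, retrying once per second
 * until timeout_sec seconds have elapsed.
 * Returns 0 if the lock was obtained, -1 if the caller gave up.
 */
int demo_acquire_lock(const char *lock_path, int timeout_sec)
{
    time_t deadline = time(NULL) + timeout_sec;

    for (;;) {
        /* O_CREAT | O_EXCL fails with EEXIST if the file already exists. */
        int fd = open(lock_path, O_CREAT | O_EXCL | O_WRONLY, 0644);
        if (fd >= 0) {
            close(fd);
            return 0;           /* we now hold the lock */
        }
        if (errno != EEXIST)
            return -1;          /* unexpected error, do not retry */
        if (time(NULL) >= deadline)
            return -1;          /* lock still held (possibly stale): give up */
        sleep(1);
    }
}

/* Hypothetical helper: release the lock by removing the lock file. */
int demo_release_lock(const char *lock_path)
{
    return unlink(lock_path);
}
```

In this sketch, a process that exits without calling demo_release_lock leaves the lock file behind, so every later demo_acquire_lock call times out until someone removes the file by hand. This mirrors the case in which you must remove flow.lock yourself.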

Forking an Instrumented Application

When instrumenting an application that creates a copy of itself with the fork system call, you must ensure that the child process calls a special function named _clear_counters(), which clears all internal profile data. If you don't do this, the child process inherits the parent's profile data, updating the data as it executes, resulting in inaccurate (exaggerated) profile data when the child terminates. The following code segment shows a valid way to call _clear_counters:

if ((pid = fork()) == 0) /* this is the child process */
{
    _clear_counters();   /* reset profile data for child */
    . . .                /* other code for the child */
}

The function _clear_counters is defined in icrt0.o. It is also defined as a stub (an empty function that does nothing) in crt0.o. This allows you to use the same source code without modification in the instrumented and un-instrumented versions of the program.
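To see why the child's data is exaggerated without the reset, the following self-contained sketch models the situation with an ordinary counter. It does not use the real profiling runtime: demo_counter, demo_clear_counters, demo_work, and demo_child_count are hypothetical stand-ins, and the pipe is just a convenient way to report what the child would have recorded.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the runtime's in-memory profile data. */
static unsigned long demo_counter = 0;

/* Hypothetical analogue of _clear_counters(): zero the in-memory data. */
static void demo_clear_counters(void)
{
    demo_counter = 0;
}

/* Simulates n executions of an instrumented block. */
static void demo_work(int n)
{
    int i;
    for (i = 0; i < n; i++)
        demo_counter++;
}

/*
 * The parent does parent_work units, then forks. The child optionally
 * clears the counter, does child_work units, and reports its final count
 * through a pipe. Returns the child's count, or -1 on error.
 */
long demo_child_count(int parent_work, int child_work, int clear_in_child)
{
    int fds[2];
    pid_t pid;
    unsigned long result = 0;
    ssize_t got;

    demo_counter = 0;
    demo_work(parent_work);

    if (pipe(fds) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {                     /* this is the child process */
        close(fds[0]);
        if (clear_in_child)
            demo_clear_counters();      /* reset inherited data */
        demo_work(child_work);
        if (write(fds[1], &demo_counter, sizeof demo_counter)
                != (ssize_t)sizeof demo_counter)
            _exit(1);
        _exit(0);
    }

    close(fds[1]);
    got = read(fds[0], &result, sizeof result);
    close(fds[0]);
    waitpid(pid, NULL, 0);
    return got == (ssize_t)sizeof result ? (long)result : -1;
}
```

With these stand-ins, demo_child_count(100, 5, 0) returns 105 because the child starts from the parent's 100 counts, while demo_child_count(100, 5, 1) returns 5, which is the work the child actually did.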

Optimizing Based on Profile Data (+P/-P)

The final step in PBO is optimizing a program using profile data created in the profiling phase. To do this, rebuild the program with the +P compiler option. As with the +I option, the +P option causes the compiler to generate an I-SOM .o file, rather than the usual object code, for each source file. Note that it is not really necessary to recompile the source files; you could, instead, specify the I-SOM .o files that were created during the instrumentation phase. For instance, suppose you have already created an I-SOM file named foo.o from foo.c using the +I compiler option; then the following commands are equivalent in effect:

$ cc +P foo.c

$ cc +P foo.o

Both commands invoke the linker, but the second command doesn't compile before invoking the linker.

The -P Linker Option

After creating an I-SOM file for each source file, the compiler driver invokes the linker with the -P option, causing the linker to optimize all the .o files. As with the +I option, the driver uses /opt/langtools/lbin/ucomp to generate code and perform various optimizations. To see how the compiler invokes the linker, specify the -v option when compiling. For instance, suppose you have instrumented prog.c and gathered profile data into flow.data. The following example shows how the compiler driver invokes the linker when +P is specified:

$ cc -o prog -v +P prog.o

/usr/ccs/bin/ld /usr/ccs/lib/crt0.o -u main -o prog \
    prog.o -P -lc
