aligned octaword boundaries whenever language rules allow. In some implementations, a series of writes that completely fill a cache block may be a factor of 10 faster than a series of writes that partially fill a cache block, when that cache block would give a read miss. This is true of
For such implementations, long strings of sequential writes will be faster if they start on a
Items within aggregates that are forced to be unaligned (records, common blocks) should generate
Compiled code for parameters should assume that the parameters are aligned. Unaligned actuals will cause
Frequently used scalars should reside in registers. Each scalar datum allocated in memory should normally be allocated an aligned quadword to itself, even if the datum is only a byte wide. This allows aligned quadword loads and stores and avoids
Implementors should give first priority to fast reads of aligned octawords and second priority to fast writes of full cache blocks.
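The allocation guidelines above can be illustrated with a minimal C11 sketch. It assumes a quadword is 8 bytes and an octaword is 16 bytes; the names padded_byte, write_buffer, is_aligned, and fill_sequential are hypothetical and chosen only for illustration. It shows a byte-wide datum given an aligned quadword to itself, and a buffer for sequential writes that starts on an aligned octaword boundary.

```c
#include <assert.h>
#include <stdalign.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed sizes: quadword = 8 bytes, octaword = 16 bytes. */
enum { QUADWORD = 8, OCTAWORD = 16 };

/* A byte-wide datum that still occupies an aligned quadword of its own,
 * so it can be loaded and stored with aligned quadword accesses. */
typedef struct {
    alignas(QUADWORD) uint8_t value;
} padded_byte;

static_assert(sizeof(padded_byte) == QUADWORD, "one quadword per scalar");
static_assert(alignof(padded_byte) == QUADWORD, "quadword aligned");

/* Hypothetical buffer for a long run of sequential writes; it starts on an
 * aligned octaword boundary and spans a whole number of octawords. */
static alignas(OCTAWORD) uint8_t write_buffer[4 * OCTAWORD];

/* Returns nonzero when p lies on the given byte boundary. */
static int is_aligned(const void *p, size_t boundary) {
    return ((uintptr_t)p % boundary) == 0;
}

/* Writes every byte of the buffer in ascending address order. */
static void fill_sequential(uint8_t byte) {
    for (size_t i = 0; i < sizeof write_buffer; i++) {
        write_buffer[i] = byte;
    }
}
```

Whether a compiler turns the loop into full-width stores, and how large a cache block is, depends on the implementation; the sketch only fixes the starting alignment and the padding of the scalar.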
A.3.2 Shared Data in Multiple Processors — Factor of 3
Software locks are aligned quadwords and should be allocated to large cache blocks that either contain no other data or
Whenever there is high contention for a lock, one processor will have the lock and be using the guarded data, while other processors will be in a
Whenever there is almost no contention for a lock, one processor will have the lock and be using the guarded data. Under these circumstances, it might be desirable to keep the guarded
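As a rough illustration of giving a lock a cache block of its own, the following C11 sketch pads an aligned quadword lock out to an assumed block size. The 64-byte CACHE_BLOCK value is an assumption, not a figure from this text, and padded_lock, try_lock, and unlock are hypothetical names; the C11 atomics stand in for whatever locking primitive an implementation actually provides.

```c
#include <assert.h>
#include <stdalign.h>
#include <stdatomic.h>
#include <stdint.h>

/* Assumed cache block size in bytes; real implementations differ. */
enum { CACHE_BLOCK = 64 };

/* A quadword lock word aligned to, and padded out to, one cache block,
 * so that no unrelated data shares the block with the lock. */
typedef struct {
    alignas(CACHE_BLOCK) atomic_uint_fast64_t word; /* 0 = free, 1 = held */
} padded_lock;

static_assert(sizeof(padded_lock) == CACHE_BLOCK, "one lock per cache block");

/* Attempts to take the lock once; returns 1 on success, 0 if already held. */
static int try_lock(padded_lock *l) {
    uint_fast64_t expected = 0;
    return atomic_compare_exchange_strong_explicit(
        &l->word, &expected, 1,
        memory_order_acquire, memory_order_relaxed);
}

/* Releases the lock so that another processor can acquire it. */
static void unlock(padded_lock *l) {
    atomic_store_explicit(&l->word, 0, memory_order_release);
}
```

In an array of padded_lock, adjacent locks land in different cache blocks, so activity on one lock does not disturb the block that holds its neighbor.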
Software Considerations