Vol. 3 8-19
MULTIPLE-PROCESSOR MANAGEMENT
2. Stores from separate string operations (for example, stores from consecutive
string operations) do not execute out of order. All the stores from an earlier string
operation will complete before any store from a later string operation.
3. String operations are not reordered with other store operations.
Fast string operations (e.g., string operations initiated with the MOVS/STOS
instructions and the REP prefix) may be interrupted by exceptions or interrupts. The
interrupts are precise but may be delayed; for example, the interruptions may be taken
at cache line boundaries, after every few iterations of the loop, or after operating on
every few bytes. Different implementations may choose different options, or may
even choose not to delay interrupt handling, so software should not rely on the delay.
When the interrupt/trap handler is reached, the source/destination registers point to
the next string element to be operated on, while the EIP stored in the stack points to
the string instruction, and the ECX register has the value it held following the last
successful iteration. The return from that trap/interrupt handler should cause the
string instruction to be resumed from the point where it was interrupted.
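The register state described above can be illustrated with a small software model. The following is a minimal sketch in Python, not a description of any hardware implementation: the names Regs, rep_stosd, and interrupt_after are hypothetical, memory is modeled as a byte array indexed by the EDI offset, and the direction flag is assumed to be clear (ascending addresses).

```python
from dataclasses import dataclass


@dataclass
class Regs:
    eax: int  # doubleword value to store
    ecx: int  # iterations still to be executed
    edi: int  # offset of the next element in the model memory


def rep_stosd(regs, mem, interrupt_after=None):
    """Model REP STOSD with DF = 0 (hypothetical model, not real hardware).

    Returns True when ECX reaches 0, or False if the modeled interrupt is
    taken after `interrupt_after` completed iterations. In the False case the
    saved EIP would still point at the string instruction.
    """
    done = 0
    while regs.ecx != 0:
        if interrupt_after is not None and done == interrupt_after:
            return False
        mem[regs.edi:regs.edi + 4] = regs.eax.to_bytes(4, "little")
        regs.edi += 4   # points to the next element to be operated on
        regs.ecx -= 1   # value following the last successful iteration
        done += 1
    return True


mem = bytearray(512)
regs = Regs(eax=1, ecx=128, edi=0)

# Interrupt taken after 5 iterations (an arbitrary choice for illustration).
finished = rep_stosd(regs, mem, interrupt_after=5)
at_handler = (finished, regs.ecx, regs.edi)

# Returning from the handler re-executes the instruction with the saved state.
finished = rep_stosd(regs, mem)
after_resume = (finished, regs.ecx, regs.edi)
```

In this model the handler observes ECX = 123 and EDI = 20 after five iterations, and re-executing the instruction with that state completes the remaining 123 stores without repeating or skipping any element.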
The string operation memory-ordering principles (items 2 and 3 above) should be
interpreted by taking the interruptibility of fast string operations into account. For
example, if a fast string operation gets interrupted after k iterations, then stores
performed by the interrupt handler will become visible after the fast string stores
from iteration 0 to k, and before the fast string stores from the (k+1)th iteration
onward.
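The constraint above can be sketched as a check on a visibility order. The following Python model is illustrative only: the function names are hypothetical, and the sketch numbers the iterations completed before the interrupt 0 through k-1, which is an indexing assumption of the model. Stores within each part of the string operation are shuffled to reflect item 1, while the handler stores stay between the two parts.

```python
import random


def visibility_order(total_iters, k, handler_stores, rng):
    """Build one possible visibility order (hypothetical model).

    String stores completed before the interrupt come first, then the
    handler stores, then the remaining string stores.
    """
    before = [("string", i) for i in range(k)]
    after = [("string", i) for i in range(k, total_iters)]
    rng.shuffle(before)  # item 1: stores within one string operation may reorder
    rng.shuffle(after)
    handler = [("handler", name) for name in handler_stores]
    return before + handler + after


def handler_is_fenced(order, k):
    """True if every handler store lies between the two groups of string stores."""
    handler = [n for n, (kind, _) in enumerate(order) if kind == "handler"]
    early = [n for n, (kind, i) in enumerate(order) if kind == "string" and i < k]
    late = [n for n, (kind, i) in enumerate(order) if kind == "string" and i >= k]
    return (max(early, default=-1) < min(handler, default=len(order))
            and max(handler, default=-1) < min(late, default=len(order)))


rng = random.Random(0)
order = visibility_order(8, 3, ["h0", "h1"], rng)
ok = handler_is_fenced(order, 3)

# An order that lets a later string store pass a handler store is rejected.
bad = [("string", 0), ("string", 4), ("handler", "h0"), ("string", 1)]
bad_ok = handler_is_fenced(bad, 3)
```

The check accepts any order in which the handler stores separate the two groups of string stores, and rejects the second order because the store from iteration 4 becomes visible before the handler store.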
Stores within a single string operation may execute out of order (item 1 above) only
if fast string operations are enabled. Fast string operations are enabled/disabled
through the IA32_MISC_ENABLE model-specific register (MSR).
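As an illustration, the enable state can be decoded from a value read from IA32_MISC_ENABLE. The sketch below is in Python and assumes the register is at MSR address 1A0H with Fast-Strings Enable in bit 0; these values come from the MSR listing rather than from this section and should be verified for the processor in use. The read_msr helper is hypothetical and assumes a Linux system with the msr driver loaded and sufficient privilege; it is defined but not called here.

```python
import os
import struct

IA32_MISC_ENABLE = 0x1A0       # assumed MSR address; verify in the MSR listing
FAST_STRINGS_ENABLE = 1 << 0   # assumed bit position of Fast-Strings Enable


def fast_strings_enabled(msr_value):
    """Return True if the Fast-Strings Enable bit is set in `msr_value`."""
    return bool(msr_value & FAST_STRINGS_ENABLE)


def read_msr(cpu, msr):
    """Read a 64-bit MSR through /dev/cpu/<cpu>/msr (Linux, needs privilege)."""
    fd = os.open(f"/dev/cpu/{cpu}/msr", os.O_RDONLY)
    try:
        return struct.unpack("<Q", os.pread(fd, 8, msr))[0]
    finally:
        os.close(fd)


# Decoding two made-up register values for illustration.
enabled_case = fast_strings_enabled(0x1)
disabled_case = fast_strings_enabled(0x0)
```

On a suitable system, fast_strings_enabled(read_msr(0, IA32_MISC_ENABLE)) would report the setting for logical processor 0.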
8.2.4.2 Examples Illustrating Memory-Ordering Principles for String Operations
The following examples use the same notation and conventions as described in
Section 8.2.3.1.
In Example 8-11, processor 0 performs one round (128 iterations) of a doubleword string
store operation via rep:stosd, writing the value 1 (the value in EAX) into a block of 512
bytes starting at location _x (kept in ES:EDI), in ascending order. Since each iteration
stores a doubleword (4 bytes), the operation is repeated 128 times (the value in ECX).
The block of memory initially contains 0. Processor 1 reads two memory locations
that are part of the memory block being updated by processor 0, i.e., locations in
the range _x to (_x+511).
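The sizes in this setup follow from simple arithmetic, sketched below in Python. The constant and function names are hypothetical, offsets are measured in bytes from _x, and the two offsets chosen for _y and _z are arbitrary values picked for illustration; the example itself only requires that both lie inside the block.

```python
DWORD = 4                          # bytes stored per iteration
ITERATIONS = 128                   # initial value of ECX
BLOCK_BYTES = ITERATIONS * DWORD   # 512 bytes, _x to (_x+511)


def iteration_for(offset):
    """Return the iteration (0-based) whose store covers byte _x+offset."""
    if not 0 <= offset < BLOCK_BYTES:
        raise ValueError("offset lies outside the block written by processor 0")
    return offset // DWORD


# Hypothetical offsets for the two locations read by processor 1.
y_offset = 16
z_offset = 256
y_iter = iteration_for(y_offset)
z_iter = iteration_for(z_offset)
```

With these offsets, _y is written by iteration 4 and _z by iteration 64, so the two loads on processor 1 observe stores from different iterations of the same string operation.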
Example 8-11. Stores Within a String Operation May be Reordered
Processor 0            Processor 1
rep:stosd [ _x]        mov r1, [ _z]
                       mov r2, [ _y]