The WMB instruction is the preferred method for providing high-bandwidth write streams where order must be preserved between writes in that stream.

Notes:

WMB is useful for ordering streams of writes to a non-memory-like region, such as to memory-mapped control registers or to a graphics frame buffer. While both MB and WMB can ensure that writes to a non-memory-like region occur in order, without being aggregated or reordered, the WMB is usually faster and is never slower than MB.
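As an illustration of the note above, the following C sketch programs a device through three control registers and places a write barrier between the writes. The names write_barrier, dev_regs, and dev_start are hypothetical and are not part of the architecture. The sketch assumes a GCC-style compiler: on an Alpha target the helper emits WMB through inline assembly; on any other target a C11 release fence stands in so that the code still compiles, which is not a device-ordering guarantee there.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical helper. On Alpha it emits WMB; elsewhere a C11 release
   fence is used only as a stand-in so the sketch builds and runs. */
static inline void write_barrier(void)
{
#if defined(__alpha__)
    __asm__ __volatile__("wmb" : : : "memory");
#else
    atomic_thread_fence(memory_order_release);
#endif
}

/* Hypothetical device with three memory-mapped control registers. */
struct dev_regs {
    volatile uint32_t addr;
    volatile uint32_t len;
    volatile uint32_t go;
};

/* Program the device. The barriers keep the three writes in program
   order: addr first, then len, then go. */
static void dev_start(struct dev_regs *r, uint32_t addr, uint32_t len)
{
    r->addr = addr;
    write_barrier();
    r->len = len;
    write_barrier();
    r->go = 1;
}
```

In a real driver the dev_regs pointer would refer to a mapped device region; here any struct dev_regs object in ordinary memory is enough to exercise the code.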

WMB can correctly order streams of writes in programs that operate on shared sections of data if the data in those sections are protected by a classic semaphore protocol. The following example illustrates such a protocol:

Processor i                 Processor j

<Acquire lock>
MB
<Read and write data
 in shared section>
WMB
<Release lock>              <Acquire lock>
                            MB
                            <Read and write data in shared section>
                            WMB

The example above is similar to that in Section 5.5.4, except that a WMB is substituted for the second MB in the lock-update-release sequence. It is correct to substitute WMB for the second MB only if:

1. All data locations that are read or written in the critical section are accessed only after acquiring a software lock by using lock_variable (and before releasing the software lock).

2. For each read u of shared data in the critical section, there is a write v such that:

a. v is BEFORE the WMB

b. v follows u in processor issue sequence (see Section 5.6.1.1)

c. v either depends on u (see Section 5.6.1.7) or overlaps u (see Section 5.6.1), or both.

3. Both lock_variable and all the shared data are in memory-like regions (or lock_variable and all the shared data are in non-memory-like regions). If the lock_variable is in a non-memory-like region, the atomic lock protocol must use some implementation-specific hardware support.

The substitution of a WMB for the second MB is usually faster and never slower.
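The lock-update-release sequence and the conditions above can be sketched in C. This is a minimal illustration under stated assumptions, not the handbook's own code: mb, wmb, acquire_lock, release_lock, increment_shared, and shared_counter are hypothetical names, and C11 fences stand in for the Alpha instructions. A C11 release fence is stronger than WMB because it also orders earlier reads; on Alpha, condition 2 is what makes the weaker WMB sufficient. Here the read u of shared_counter is followed by a write v to the same location that both depends on and overlaps u.

```c
#include <stdatomic.h>

/* Stand-ins for the Alpha barriers (assumption: C11 fences, not the
   actual MB and WMB instructions). */
static inline void mb(void)  { atomic_thread_fence(memory_order_seq_cst); }
static inline void wmb(void) { atomic_thread_fence(memory_order_release); }

static atomic_int lock_variable;  /* 0 = free, 1 = held */
static int shared_counter;        /* accessed only while the lock is held */

static void acquire_lock(void)
{
    /* Spin until the exchange observes the lock as free. */
    while (atomic_exchange_explicit(&lock_variable, 1,
                                    memory_order_relaxed) != 0)
        ;
}

static void release_lock(void)
{
    atomic_store_explicit(&lock_variable, 0, memory_order_relaxed);
}

/* One pass through the critical section; returns the new counter value. */
static int increment_shared(void)
{
    acquire_lock();
    mb();                    /* first barrier: a full MB after the acquire */
    int u = shared_counter;  /* read u of shared data */
    shared_counter = u + 1;  /* write v: follows u, depends on u, overlaps u */
    wmb();                   /* second barrier: WMB in place of MB */
    release_lock();
    return u + 1;
}
```

Both lock_variable and shared_counter are ordinary variables in memory, which corresponds to the memory-like case of condition 3.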

