MultiProcessor Specification

3.3 External Cache Subsystem

Intel-compatible processors support multiprocessing both on the processor bus and on a memory bus, both with and without secondary cache units. Due to the high bandwidth demands of multiprocessor systems, external caches are often employed to improve performance. The existence and implementation details of external caches are not a part of this specification. However, when external caches are used, they must conform to certain requirements with regard to the following design issues:

Maintaining cache coherency: When one processor accesses data cached in another processor’s cache, it must not receive incorrect data. If it modifies data, all other processors that access that data also must not receive stale data. External caches must maintain coherency among themselves, and with the main memory, internal caches, and other bus master DMA devices.

Cache flushing: The processor can generate special flush and write-back bus cycles that must be used by external caches in a manner that maintains cache coherency. The actual responses are implementation-specific and may vary from design to design. A program can initiate hardware cache flushing by executing a WBINVD instruction. This instruction is only guaranteed to flush the caches of the local processor. See Appendix B for system-wide flushing mechanisms. Given that cache coherency is maintained by hardware, there is no need for software to issue cache flush instructions under normal circumstances.

Reliable communication: All processors must be able to communicate with each other in a way that eliminates interference when more than one processor accesses the same area in memory simultaneously. The processor uses the LOCK# signal for this purpose. External caches must ensure that all locked operations are visible to other processors.

Write ordering: In some circumstances, it is important that memory writes be observed externally in precisely the same order as programmed. External write buffers must maintain the write ordering of the processor.
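The write-ordering requirement can be illustrated from the software side. The following is a minimal sketch, not part of this specification, of a one-shot mailbox in which a data write must be observed before the flag write that announces it. It assumes a C11 compiler with <stdatomic.h>; the names payload, ready, mailbox_publish, and mailbox_try_consume are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical one-shot mailbox shared between two processors. */
static int payload;          /* ordinary data, written first       */
static atomic_bool ready;    /* announcement flag, written second  */

/* Producer side: write the data, then publish the flag.
 * The release store keeps the payload write ordered before the
 * flag write as seen by any processor that acquires the flag. */
void mailbox_publish(int value)
{
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Consumer side: returns true and fills *out only once the flag
 * has been observed. The acquire load pairs with the release
 * store in mailbox_publish(). */
bool mailbox_try_consume(int *out)
{
    assert(out != NULL);
    if (!atomic_load_explicit(&ready, memory_order_acquire))
        return false;
    *out = payload;
    return true;
}
```

In intended use, mailbox_publish() runs on one processor and mailbox_try_consume() is polled on another. If an external write buffer were allowed to reorder the two writes, the consumer could observe the flag before the payload, which is the failure the requirement above rules out.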

3.4 Locking

To protect the integrity of certain critical memory operations, Intel-compatible processors provide an output signal called LOCK#. For any given memory access, LOCK# is asserted once, but may remain asserted for as many memory bus cycles as required to complete the memory operation. It is the responsibility of the system hardware designers to use this signal to control memory accesses among processors.
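Software normally reaches the locking mechanism through atomic read-modify-write instructions rather than by driving LOCK# directly. The following is a minimal sketch, not part of this specification, of a test-and-set spinlock written with C11 atomics. On x86 targets, compilers typically implement atomic_flag_test_and_set with an XCHG instruction, which is implicitly locked when it has a memory operand, but that mapping is an assumption about the toolchain. The mp_spinlock type and the mp_spin_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spinlock type; initialize with { ATOMIC_FLAG_INIT }. */
typedef struct {
    atomic_flag held;
} mp_spinlock;

/* Try once to take the lock. The test-and-set is a single atomic
 * read-modify-write, so two processors cannot both see "clear".
 * Returns true if the caller now owns the lock. */
bool mp_spin_try_lock(mp_spinlock *l)
{
    assert(l != NULL);
    return !atomic_flag_test_and_set_explicit(&l->held,
                                              memory_order_acquire);
}

/* Spin until the lock is acquired. */
void mp_spin_lock(mp_spinlock *l)
{
    while (!mp_spin_try_lock(l)) {
        /* busy-wait; a real implementation would add a pause hint */
    }
}

/* Release the lock so another processor's test-and-set can succeed. */
void mp_spin_unlock(mp_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A caller brackets its critical section with mp_spin_lock() and mp_spin_unlock(). Correctness rests on the hardware making the test-and-set indivisible across processors, which is the responsibility this section assigns to the system designer.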

A compliant system in multiprocessor mode must guarantee the atomicity of locked, aligned memory operations; how it does so is outside the scope of this specification. A compliant system must lock at least the area of memory defined by the destination operand. A specific implementation may lock a broader area, up to and including the entire bus. Software must therefore not depend on the lock covering only the destination operand.
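Because the atomicity guarantee is stated for aligned operands, software may want to check alignment before relying on it. The following is a minimal sketch, not part of this specification, of two such checks. The function names are hypothetical, and the boundary size passed to crosses_boundary() (for example, a cache-line or bus width) is an assumption the caller must supply for the platform in question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True if an operand of `size` bytes at `addr` is naturally aligned,
 * that is, its address is a multiple of its size.
 * `size` must be a nonzero power of two. */
bool is_naturally_aligned(uintptr_t addr, size_t size)
{
    assert(size != 0 && (size & (size - 1)) == 0);
    return (addr & (size - 1)) == 0;
}

/* True if the operand's first and last bytes fall on opposite sides
 * of a `boundary`-byte boundary. A naturally aligned power-of-two
 * operand never crosses a power-of-two boundary at least as large
 * as the operand itself. */
bool crosses_boundary(uintptr_t addr, size_t size, size_t boundary)
{
    assert(size != 0 && boundary != 0);
    return (addr / boundary) != ((addr + size - 1) / boundary);
}
```

For example, a 4-byte operand at address 0x101E is misaligned and spans a 32-byte boundary, whereas the same operand at 0x101C is aligned and does not.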

To guarantee AT compatibility, locking of misaligned memory operations over other AT-compatible buses in the compliant system must be strictly implemented in accordance with the bus specifications. A compliant system may not be required to support the misaligned memory

Version 1.4
