Assume that processor 1 loads the contents of memory address 0x0123, which
happens to be the character A, into its cache. Then processor 2 writes B to
address 0x0123. If processor 1 loads address 0x0123 again, what will
happen? In a naive implementation, processor 1 finds the value A in its
cache and uses that value, because it does not know that processor 2 has
already changed the contents of the same memory address in its own cache.
This is called the cache coherency problem.
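The scenario above can be sketched in a few lines of Python. This is only an illustrative model, not how any real processor is implemented: the class name NaiveCache and its load and store methods are hypothetical, and the sketch assumes that a write stays in the writer's private cache.

```python
class NaiveCache:
    """Hypothetical private cache with no coherency logic."""

    def __init__(self, memory):
        self.memory = memory  # shared main memory, modeled as a dict
        self.lines = {}       # address -> locally cached value

    def load(self, addr):
        # On a miss, fetch from main memory; on a hit, trust the local copy.
        if addr not in self.lines:
            self.lines[addr] = self.memory[addr]
        return self.lines[addr]

    def store(self, addr, value):
        # Assumption: the write is kept in the local cache only.
        self.lines[addr] = value


memory = {0x0123: "A"}
cpu1 = NaiveCache(memory)
cpu2 = NaiveCache(memory)

first = cpu1.load(0x0123)   # processor 1 caches A
cpu2.store(0x0123, "B")     # processor 2 writes B into its own cache
second = cpu1.load(0x0123)  # processor 1 still sees the stale A
```

In this model, second holds the stale value A even though processor 2 has already written B, because nothing tells processor 1 that its copy is out of date.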
2.5.1.6 Snooping
One solution to the cache coherency problem is snooping. Snooping is
hardware logic that is added to the processor and is associated with normal
memory reads. While a memory operation is in progress, the other caches in
the system are interrogated (snooped) to see whether the data currently
resides there. If one processor needs to write into a cache, a message is
broadcast that causes that entry to be invalidated in all other caches. This
is called a cross invalidate. A cross invalidate tells the other processors
that the value in their caches is no longer valid. The next access to that
entry is therefore a cache miss, and the processor must then look for the
correct value in another cache or in main memory.
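A minimal sketch of this behavior, under simplifying assumptions, is shown here. The Bus and SnoopingCache classes, their method names, and the misses and messages counters are hypothetical and exist only for illustration. Real hardware tracks cache lines and states rather than single values.

```python
class Bus:
    """Hypothetical shared bus connecting the caches and main memory."""

    def __init__(self, memory):
        self.memory = memory  # main memory, modeled as a dict
        self.caches = []
        self.messages = 0     # number of bus transactions

    def attach(self, cache):
        self.caches.append(cache)

    def cross_invalidate(self, sender, addr):
        # Broadcast: every other cache drops its copy of this entry.
        self.messages += 1
        for cache in self.caches:
            if cache is not sender:
                cache.lines.pop(addr, None)

    def snoop_read(self, requester, addr):
        # Interrogate the other caches first, then fall back to memory.
        self.messages += 1
        for cache in self.caches:
            if cache is not requester and addr in cache.lines:
                return cache.lines[addr]
        return self.memory[addr]


class SnoopingCache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}
        self.misses = 0
        bus.attach(self)

    def load(self, addr):
        if addr not in self.lines:
            self.misses += 1
            self.lines[addr] = self.bus.snoop_read(self, addr)
        return self.lines[addr]

    def store(self, addr, value):
        self.bus.cross_invalidate(self, addr)
        self.lines[addr] = value


bus = Bus({0x0123: "A"})
cpu1 = SnoopingCache(bus)
cpu2 = SnoopingCache(bus)

first = cpu1.load(0x0123)   # miss, value A comes from main memory
cpu2.store(0x0123, "B")     # cross invalidate removes the copy in cpu1
second = cpu1.load(0x0123)  # miss again, value B comes from cpu2's cache
```

With snooping, the second load returns B instead of the stale A. The price is visible in the counters: processor 1 takes two misses instead of one, and three messages cross the bus.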
Because cross invalidates increase cache misses and the snooping protocol
adds to the bus traffic, solving the cache coherency problem reduces the
performance and scalability of all SMP systems. In other words, because the
latency of a request can vary widely depending on the location of the data
and on snooping activity, adding more processors does not always improve the
response time for a given request.
The POWER3 processor has this extra logic. In fact, all PowerPC processors
except the 603, POWER and POWER2 have this function.
Bus snooping is used to drive a four-state MESI protocol, as described in
the following section.
2.5.1.7 MESI Protocol
The unit of storage in the cache is the cache line. The size of the cache line is
implementation dependent. The PowerPC has a 64-byte cache line, which is
divided into two 32-byte sectors. The POWER3 has a 128-byte cache line with
a single sector.
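As a worked example of these sizes, the following sketch computes which cache line and sector a byte address falls into. The function name locate and its parameters are hypothetical, and the sketch assumes that cache lines are aligned on multiples of the line size.

```python
def locate(addr, line_size, sectors_per_line):
    """Return (line base address, sector index) for a byte address."""
    sector_size = line_size // sectors_per_line
    offset = addr % line_size        # position of the byte within its line
    line_base = addr - offset        # first address of the containing line
    sector = offset // sector_size   # which sector of the line holds the byte
    return line_base, sector


# 64-byte line with two 32-byte sectors
powerpc = locate(0x0123, line_size=64, sectors_per_line=2)

# 128-byte line with a single sector
power3 = locate(0x0123, line_size=128, sectors_per_line=1)
```

For address 0x0123, both layouts place the byte in the line that starts at 0x0100. With 64-byte lines it lies in the second sector (index 1), and with a single-sector 128-byte line the sector index is always 0.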
The PowerPC maintains cache coherency on a cache sector basis by using
the four-state MESI protocol. Each sector has two state bits. The four states
are: