#### Shared Memory SMP and Cache Coherence (cont)

## Review: Snoopy Cache Protocol

#### Write Invalidate Protocol:

- Multiple readers, single writer
- Write to shared data: an invalidate is sent to all caches which snoop and *invalidate* any copies
- Read Miss:
  - Write-through: memory is always up-to-date
  - Write-back: snoop in caches to find most recent copy
- Write Broadcast Protocol (typically write through):
- Write serialization: bus serializes requests!
  - Bus is single point of arbitration
- Good for a small number of processors; how about 16 or more?
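
The write-invalidate rules above can be sketched as a toy simulation. This is only an illustration under simplifying assumptions: the class name `SnoopyBus`, the one-word block size, and the write-back policy are choices made here, not part of the slides.

```python
class SnoopyBus:
    """Toy write-invalidate bus with write-back caches (illustrative only)."""

    def __init__(self, num_caches):
        self.memory = {}                                   # address -> value
        self.caches = [dict() for _ in range(num_caches)]  # address -> [value, dirty]

    def read(self, cpu, addr):
        cache = self.caches[cpu]
        if addr not in cache:                              # read miss goes on the bus
            for other in self.caches:                      # snoop for the most recent copy
                if addr in other and other[addr][1]:
                    self.memory[addr] = other[addr][0]     # dirty owner writes back
                    other[addr][1] = False                 # owner keeps a clean copy
            cache[addr] = [self.memory.get(addr, 0), False]
        return cache[addr][0]

    def write(self, cpu, addr, value):
        # The bus serializes writes: one invalidate at a time reaches all snoopers.
        for i, other in enumerate(self.caches):
            if i != cpu:
                other.pop(addr, None)                      # snoopers invalidate their copies
        # Blocks are one word in this sketch, so the writer need not fetch old data.
        self.caches[cpu][addr] = [value, True]             # single writer holds a dirty copy
```

Every write visits every cache, which is the broadcast cost that motivates the question about 16 or more processors.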

#### Larger MPs

Separate Memory per Processor

from UCB CS252 S01. Copyright 2001 U

- Local or remote access via memory controller
- One cache coherency solution: non-cached pages
- Alternative: directory per cache that tracks state of every block in every cache
  - Which caches have copies of the block, dirty vs. clean, ...
- Info per memory block vs. per cache block?
  - PLUS: In memory => simpler protocol (centralized/one location)
  - MINUS: In memory => directory is f(memory size) vs. f(cache size)
- Prevent directory as bottleneck?
  - Distribute directory entries with memory, each keeping track of which processors have copies of their blocks
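
To make the f(memory size) vs. f(cache size) trade-off concrete, here is a rough sizing sketch. The function names and the default of 2 state bits are assumptions for illustration. It assumes a full bit vector with one presence bit per processor in every entry.

```python
def directory_entries(capacity_bytes, block_bytes):
    """Number of directory entries when one entry tracks one block."""
    return capacity_bytes // block_bytes


def full_map_overhead(num_procs, block_bytes, state_bits=2):
    """Directory bits per block divided by data bits per block.

    Assumes one presence bit per processor plus `state_bits` of state.
    """
    return (num_procs + state_bits) / (block_bytes * 8)


# Hypothetical machine: 1 GiB of memory per node, 1 MiB caches, 64-byte blocks.
per_memory = directory_entries(2**30, 64)   # entries if the directory scales with memory
per_cache = directory_entries(2**20, 64)    # entries if the directory scales with cache
overhead = full_map_overhead(num_procs=64, block_bytes=64)
```

With these assumed sizes the memory-based directory needs 1024 times as many entries as a cache-sized one, and each entry adds about 13% to the storage for its block.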

#### **Directory Protocol**

#### Similar to Snoopy Protocol: Three states

- Shared: ≥ 1 processors have data, memory up-to-date
- Uncached (no processor has it; not valid in any cache)
- Exclusive: 1 processor (owner) has data; memory out-of-date
- In addition to cache state, must track which processors have data when in the shared state (usually bit vector, 1 if processor has copy)
- Keep it simple(r):
  - Writes to non-exclusive data => write miss
  - Processor blocks until access completes
  - Assume messages received and acted upon in order sent
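
A directory entry with the three states and the sharer bit vector might look like the following minimal sketch. The class and method names are hypothetical, and the bit vector is stored in a plain Python integer.

```python
from enum import Enum


class DirState(Enum):
    UNCACHED = 0    # no processor has the block
    SHARED = 1      # one or more processors have it; memory is up-to-date
    EXCLUSIVE = 2   # exactly one owner has it; memory is out-of-date


class DirectoryEntry:
    """Per-block directory state plus a presence bit vector (illustrative)."""

    def __init__(self):
        self.state = DirState.UNCACHED
        self.sharers = 0                      # bit p is 1 if processor p has a copy

    def add_sharer(self, p):
        self.sharers |= 1 << p
        self.state = DirState.SHARED

    def set_owner(self, p):
        self.sharers = 1 << p                 # the owner is the only holder
        self.state = DirState.EXCLUSIVE

    def sharer_list(self, num_procs):
        return [p for p in range(num_procs) if (self.sharers >> p) & 1]
```

In the exclusive state the same bit vector identifies the owner, so no separate owner field is needed in this sketch.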

### Distributed Directory MPs





No bus and don't want to broadcast:

- interconnect no longer single arbitration point
- all messages have explicit responses
- Terms: typically 3 processors involved
  - Local node where a request originates
  - Home node where the memory location of an address resides
  - Remote node has a copy of a cache block, whether exclusive or shared
- Example messages in the table below: P = processor number, A = address
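
The three roles can be illustrated with a small sketch. The slides do not say how addresses map to home nodes, so the block-interleaved mapping, the 64-byte block size, and the function names below are assumptions.

```python
def home_node(addr, block_bytes=64, num_nodes=4):
    """Hypothetical mapping: consecutive blocks are interleaved across nodes."""
    return (addr // block_bytes) % num_nodes


def roles(requester, addr, holders, block_bytes=64, num_nodes=4):
    """Name the nodes involved in one request for the block containing `addr`."""
    return {
        "local": requester,                              # where the request originates
        "home": home_node(addr, block_bytes, num_nodes), # where the memory and directory entry live
        "remote": sorted(set(holders) - {requester}),    # other nodes caching the block
    }
```

For example, `roles(2, 0x40, [3])` names node 2 as local, node 1 as home, and node 3 as remote. One node can play more than one role, for instance when the requester is also the home node.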

#### Directory Protocol Messages

| Message type     | Source         | Destination    | Msg Content | Function |
|------------------|----------------|----------------|-------------|----------|
| Read miss        | Local cache    | Home directory | P, A        | Processor P reads data at address A; make P a read sharer and arrange to send data back |
| Write miss       | Local cache    | Home directory | P, A        | Processor P writes data at address A; make P the exclusive owner and arrange to send data back |
| Invalidate       | Home directory | Remote caches  | A           | Invalidate a shared copy at address A |
| Fetch            | Home directory | Remote cache   | A           | Fetch the block at address A and send it to its home directory |
| Fetch/Invalidate | Home directory | Remote cache   | A           | Fetch the block at address A and send it to its home directory; invalidate the block in the cache |
| Data value reply | Home directory | Local cache    | Data        | Return a data value from the home memory (read miss response) |
| Data write-back  | Remote cache   | Home directory | A, Data     | Write-back a data value for address A (invalidate response) |
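
As a rough sketch of how a home directory could react to these messages, the function below updates one directory entry and returns the messages it would send. It assumes the usual textbook behavior and that messages arrive in order. The function name, the dict layout, and the tuple format are hypothetical.

```python
def directory_handle(entry, msg, p, addr):
    """Update `entry` ({'state': str, 'sharers': set}) for one incoming message.

    Returns the outgoing messages as (type, destination processor, address).
    """
    out = []
    state, sharers = entry["state"], entry["sharers"]
    if msg == "read_miss":
        if state == "exclusive":                      # memory is stale: ask the owner
            out.append(("fetch", next(iter(sharers)), addr))
        out.append(("data_value_reply", p, addr))
        entry["sharers"] = sharers | {p}              # P becomes a read sharer
        entry["state"] = "shared"
    elif msg == "write_miss":
        if state == "exclusive":                      # take the block from the old owner
            out.append(("fetch_invalidate", next(iter(sharers)), addr))
        elif state == "shared":                       # explicit invalidates, no broadcast
            out += [("invalidate", q, addr) for q in sorted(sharers - {p})]
        out.append(("data_value_reply", p, addr))
        entry["sharers"] = {p}                        # P becomes the exclusive owner
        entry["state"] = "exclusive"
    elif msg == "data_write_back":                    # owner gave the block back
        entry["sharers"] = set()
        entry["state"] = "uncached"
    else:
        raise ValueError(f"unknown message type: {msg}")
    return out
```

The sketch sends the data value reply immediately. A real directory would wait for the owner's data write-back before replying after a fetch.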

# State Transition Diagram for an Individual Cache Block in a Directory Based System

- States identical to snoopy case; transactions very similar
- Transitions caused by read misses, write misses, invalidates, data fetch requests
- Generates read miss & write miss msg to home directory
- Write misses that were broadcast on the bus for snooping => explicit invalidate & data fetch requests
- Note: on a write, the cache block is bigger than the data being written, so need to read the full cache block
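
The cache-side state machine can be sketched as a transition table. This is a simplified illustration: the state and event names are hypothetical, replacement misses to a different address are omitted, and a write to a shared block is treated as a write miss.

```python
# (current state, event) -> (next state, message sent to the home directory or None)
TRANSITIONS = {
    ("invalid", "cpu_read"): ("shared", "read_miss"),
    ("invalid", "cpu_write"): ("exclusive", "write_miss"),
    ("shared", "cpu_read"): ("shared", None),                    # read hit
    ("shared", "cpu_write"): ("exclusive", "write_miss"),        # write to non-exclusive data
    ("shared", "invalidate"): ("invalid", None),
    ("exclusive", "cpu_read"): ("exclusive", None),              # read hit
    ("exclusive", "cpu_write"): ("exclusive", None),             # write hit
    ("exclusive", "fetch"): ("shared", "data_write_back"),
    ("exclusive", "fetch_invalidate"): ("invalid", "data_write_back"),
}


def step(state, event):
    """Return (next_state, outgoing_message) for one cache block."""
    return TRANSITIONS[(state, event)]
```

The events `invalidate`, `fetch`, and `fetch_invalidate` arrive as explicit messages from the home directory, where a snoopy cache would instead observe them on the bus.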






