Coherence Ordering for Ring-based Chip Multiprocessors
Presented by Bob Koutsoyannis
Coherence Ordering for Ring-based Chip Multiprocessors
Michael R. Marty and Mark D. Hill
Presented by Bob Koutsoyannis

Outline of Key Points
• Why have a Ring-based Chip Design?
• Tradeoffs between: ORDERING-POINT, GREEDY-ORDER, RING-ORDER
• Coherence Ordering Evaluations/Comparisons
• Strengths of Paper
• Weaknesses of Paper

Ring-Based CMPs
Physical Characteristics:
• Unidirectional Ring
• Processing Elements
• Shared Higher Level Cache Inside
Advantages? Disadvantages? In Comparison to Previous Bus/Directory-based Chip Designs

Pros and Cons of Ring-Based
• Centralized Arbiters?
• Buses (Wire Area)?
• Switch Networks?
• Directories?
• Ordering?

The Protocols in Detail
ORDERING-POINT
- Established ordering point costs in latency and bandwidth
+ Requests are pipelined and not blocked, so good "order" is established
GREEDY-ORDER
+ No extra steps to reach an ordering point
- Issue of starvation and unbounded retries
RING-ORDER
++ Best of both worlds using Tokens

ORDERING-POINT
Request is activated after passing the ordering point

GREEDY-ORDER
Two Piggies issue requests
Try Again
DENIED!!!
• More Thoughts?
• How to Handle Retries…

RING-ORDER
• Importance of Priority Token
• Minimized Data Transfer
• Stability
• No Fixed Synchrony
• Acquiring all Tokens
• Furthest Destination Helps Limit Requests

Simulation Details
• 8 CPUs with L1 & L2 Private Caches
• 2 Shared L3 Cache Banks in center of Ring, each with a memory controller
• L2 interleaved 16 ways

GREEDY-ORDER-IDEAL

Coherence Ordering Evaluations/Comparisons
• RING-ORDER 6-52% Faster than ORDERING-POINT
• RING-ORDER 8-12% Faster than GREEDY-ORDER
• Out-of-Order vs. In-Order Cores

Strengths of Paper
• Ring-based Evaluations and Results

Weaknesses of Paper
• Didn't offer a comparison with non-ring-based architectures.
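
Supplementary sketch: the latency tradeoff between ORDERING-POINT and GREEDY-ORDER noted in "The Protocols in Detail" can be illustrated with a toy hop count on a unidirectional ring. This is only a rough model under stated assumptions, not the simulation used in the paper: it assumes one hop per link, that an ORDERING-POINT request first travels to the ordering point and then makes one full ring traversal, and that each GREEDY-ORDER attempt (including every retry) costs one full traversal. The function names and parameters are hypothetical, and RING-ORDER is not modeled.

```python
def hops_to_ordering_point(requester: int, ordering_point: int, n_nodes: int) -> int:
    """Hops from the requester to the ordering point on a unidirectional ring."""
    return (ordering_point - requester) % n_nodes


def ordering_point_hops(requester: int, ordering_point: int, n_nodes: int) -> int:
    """Toy cost for ORDERING-POINT (assumption): reach the ordering point,
    become active there, then make one full traversal of the ring."""
    return hops_to_ordering_point(requester, ordering_point, n_nodes) + n_nodes


def greedy_order_hops(n_nodes: int, retries: int = 0) -> int:
    """Toy cost for GREEDY-ORDER (assumption): the request is active at once,
    but every attempt, including each retry, costs a full ring traversal."""
    return n_nodes * (1 + retries)


if __name__ == "__main__":
    n = 8  # matches the 8-CPU configuration on the Simulation Details slide
    print("ORDERING-POINT, requester 2, ordering point 0:", ordering_point_hops(2, 0, n))
    print("GREEDY-ORDER, no retries:", greedy_order_hops(n))
    print("GREEDY-ORDER, 2 retries:", greedy_order_hops(n, retries=2))
```

In this model GREEDY-ORDER wins when there is no contention, but its cost grows without bound as retries accumulate, which is the starvation concern raised on the slides.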