CMP - Computer Science


CMP
Created by:
Erik Chesnut, Harrison Jordan, Begad Shaheen, and Jeremy Taylor
Presented on:
April 23, 2015
What is a CMP?
● CMP stands for Chip Multi-Processor.
● A core is another name for a processing unit.
● The job of this unit is to read and execute program
instructions.
● A CMP is a single physical computing component that is
made up of at least two cores.
Why use a CMP?
● Makes use of the large number of transistors available on a single die.
● Increased performance through pipelining and parallelism.
● CMPs are best suited to highly parallel, throughput-sensitive
applications, but can also serve less parallel,
latency-sensitive applications.
Advantages of CMP over Single Core
● Efficient use of energy and silicon area.
● Support for Thread-Level Speculation (TLS).
● Support for Transactional Memory.
● Increased throughput through parallelism.
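The throughput advantage above can be sketched in software: a parallel workload is split into independent chunks that a CMP can run on separate cores at once. The chunking scheme and worker count below are illustrative assumptions, not taken from the slides.

```python
# Minimal sketch: splitting a summation across worker threads, the way
# a CMP would run the pieces truly in parallel on separate cores.
from concurrent.futures import ThreadPoolExecutor

def partial_sum(chunk):
    # Work done independently by one worker (one "core").
    return sum(chunk)

def parallel_sum(data, workers=4):
    # Divide the input into one chunk per worker.
    size = (len(data) + workers - 1) // workers
    chunks = [data[i:i + size] for i in range(0, len(data), size)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return sum(pool.map(partial_sum, chunks))

print(parallel_sum(list(range(1000))))  # prints 499500, same as sum(range(1000))
```

The result is identical to the sequential sum; only the throughput changes when the chunks genuinely run in parallel.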
Early CMPs
An example: Intel Core Duo (2006)
● 2 cores.
● 2-way Simultaneous Multi-Threading (SMT).
● Power consumption between 9W and 30W.
● Up to 2.33 GHz.
● 2MB shared L2 cache.
● P6 microarchitecture (Pentium M).
● 151 million transistors in 65nm technology.
Another Design
Example: Intel Polaris (2007)
● 80 cores.
● Statically scheduled, single-issue cores.
● No shared L2 cache.
● 3.2 GHz.
● Power consumption 62W.
CMPs of Today
Example: Intel Core i7 (2012)
● 2, 4, 6, or 8 cores.
● 2-way Simultaneous Multi-Threading (SMT).
● Up to 3.5 GHz.
● Sandy Bridge microarchitecture.
● Up to 20 MB shared L3 cache.
● Power consumption between 45W and 150W.
● 2 billion transistors in 22nm technology.
Architecture
● Two general types of CMP:
o Homogeneous.
 Mostly used in PCs.
o Heterogeneous.
 Sold as a multiprocessor system-on-a-chip (MPSoC).
Issues with CMPs
● Moore’s Law.
● Shared Memory.
● Consistency.
Moore’s Law
More of an observation than a law.
● Observation made by Gordon Moore, co-founder of Intel, in 1965.
o The number of transistors per square inch doubles roughly
every 18 months.
● Limitations:
o Circuit complexity bounds.
o Atomic-level scale.
o Dielectric constant between metals.
Figure: transistor counts versus Moore’s Law (source: http://www.aspistrategist.org.au/wp-content/uploads/2013/04/Transistor_Count_and_Moores_Law_-_2011.svg-copy.jpg)
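As a quick arithmetic check of the 18-month doubling rate, the projection can be computed directly. The starting point below reuses the Core Duo's 151 million transistors from an earlier slide; treating it as the 2006 baseline is an illustrative assumption.

```python
def transistors(start, years, doubling_period_months=18):
    """Project a transistor count forward under Moore's observation."""
    doublings = (years * 12) / doubling_period_months
    return start * 2 ** doublings

# Starting from the Core Duo's 151 million transistors in 2006, six years
# is four doubling periods: 151e6 * 2**4, close to the ~2 billion
# transistors quoted for the 2012 Core i7.
print(transistors(151e6, 6) / 1e9)  # prints 2.416
```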
Shared Memory
● Each core on the chip has its own cache.
o Each cache holds its own instructions and data.
● Cache coherence is a major issue in parallel
programming.
o The MSI protocol was devised to solve this issue:
 Modified (Dirty).
 Shared (Valid).
 Invalid.
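The three MSI states and their main transitions can be sketched as a small state machine. This is a hypothetical Python sketch from one core's point of view; real coherence controllers also exchange bus messages and write back dirty data, which are elided here.

```python
# Hypothetical sketch of MSI state transitions for one cache line,
# as seen by a single core's cache.
MODIFIED, SHARED, INVALID = "M", "S", "I"

# (current state, event) -> next state
TRANSITIONS = {
    (INVALID,  "local_read"):   SHARED,    # fetch line; others may share it
    (INVALID,  "local_write"):  MODIFIED,  # fetch line with exclusive ownership
    (SHARED,   "local_write"):  MODIFIED,  # upgrade: other copies are invalidated
    (SHARED,   "remote_write"): INVALID,   # another core wants to write the line
    (MODIFIED, "remote_read"):  SHARED,    # write back dirty data, then share
    (MODIFIED, "remote_write"): INVALID,   # write back, then give up the line
}

def next_state(state, event):
    # Events not listed leave the state unchanged (e.g. reading a Shared line).
    return TRANSITIONS.get((state, event), state)

state = INVALID
for event in ["local_read", "local_write", "remote_read"]:
    state = next_state(state, event)
print(state)  # prints S: the line ends up Shared after another core reads it
```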
Consistency
● Instructions can execute out of order.
o The address space could end up holding incorrect values.
o A write buffer stores the value into the address space,
then moves on to the next instruction.
● Erroneous loads.
o Cores loading instructions/data from another core.
● Locks are used to ensure threads execute in
order.
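The role of locks described above can be illustrated with a shared counter. Without mutual exclusion, two threads' read-modify-write sequences can interleave and lose updates; a lock serializes them. This is a minimal sketch, not a model of any particular CMP's hardware.

```python
# Sketch: two threads increment a shared counter. The lock makes each
# read-modify-write atomic, so no updates are lost.
import threading

counter = 0
lock = threading.Lock()

def increment(times):
    global counter
    for _ in range(times):
        with lock:  # only one thread updates the counter at a time
            counter += 1

threads = [threading.Thread(target=increment, args=(100_000,)) for _ in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # prints 200000 with the lock; without it, updates can be lost
```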
Sources
• http://www.inf.ed.ac.uk/teaching/courses/pa/Notes/lecture10-multicores.pdf
• http://academic.udayton.edu/scottschneider/courses/ECT466/Course%20Notes/LSN%204%20%20Instruction-level%20Parallelism/Chip%20Multiprocessor%20Architecture%20%20Tips%20to%20Improve%20Throughput%20and%20Latency.pdf
• http://www.eng.auburn.edu/~agrawvd/COURSE/E6200_Fall08/CLASS_TALKS/Single%20Chip%20multi%20processor.ppt
• http://webee.technion.ac.il/bolotin/Presentations/VLSI%20seminar%20session%20-%20CMP.final.ppt
• http://www.pcstats.com/articleview.cfm?articleid=2000&page=7
• http://studentnet.cs.manchester.ac.uk/ugt/2013/COMP35112/
• http://www.eetimes.com/document.asp?doc_id=1323507