FUTURE
• Timing seemed good
• However, the only student to give feedback marked it confusing (2 of 5 on clarity) and too fast (5 of 5 on pace).
• VLIW, MIMD do not mean anything to EE undergrads.
• In general, the taxonomy at the end is not the big revelation it was for UCB CS grad students

Penn ESE680-002 Spring2007 -- DeHon

ESE680-002 (ESE534): Computer Organization
Day 8: February 5, 2007
Computing Requirements and Instruction Space

Previously
• Fixed and Programmable Computation
• Area-Time-Energy Tradeoffs
• VLSI Scaling

Today
• Computing Requirements
• Instructions
– Requirements
– Taxonomy

Computing Requirements (review)

Requirements
• In order to build a general-purpose (programmable) computing device, we absolutely must have?
– _
– _
– _
– _
– _

Primitive compute elements enough?

Compute and Interconnect

Sharing Interconnect Resources

Sharing Interconnect and Compute Resources
What role are the memories playing here?

Memory block or Register File
Interconnect: moves data from input to storage cell; or from storage cell to output.

What do I need to be able to use this circuit properly? (reuse it on different data?)

Requirements
• In order to build a general-purpose (programmable) computing device, we absolutely must have?
– Compute elements
– Interconnect: space
– Interconnect: time (retiming)
– Interconnect: external (IO)
– Instructions

Instruction Taxonomy

• Distinguishing feature of programmable architectures?
– Instructions -- bits which tell the device how to behave
[Figure: compute element with instruction bit fields labeled "0000 00 net0", "010 add", "11 0110 mem slot#6"]

Focus on Instructions
• Instruction organization has a large effect on:
– size or compactness of an architecture
– realm of efficient utilization for an architecture

Terminology
• Primitive Instruction (pinst)
– Collection of bits which tell a single bit-processing element what to do
– Includes:
  • select compute operation
  • input sources in space
    – (interconnect)
  • input sources in time
    – (retiming)
[Figure: compute element with instruction bit fields labeled "0000 00 net0", "010 add", "11 0110 mem slot#6"]

Computational Array Model
• Collection of computing elements
– compute operator
– local storage/retiming
• Interconnect
• Instruction

“Ideal” Instruction Control
• Issue a new instruction to every computational bit operator on every cycle

“Ideal” Instruction Distribution
• Why don’t we do this?
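
As a concrete reading of the pinst defined on the Terminology slide, here is a minimal Python sketch that packs the three pieces (compute operation, input source in space, input source in time) into one bit word. The field widths (6, 3, 6 bits) and the field order are assumptions read off the figure labels "0000 00 net0", "010 add", "11 0110 mem slot#6"; the slides do not define an actual encoding, and the names `Pinst`, `pack`, and `unpack` are hypothetical.

```python
from dataclasses import dataclass

# Assumed field widths, taken from the bit strings shown in the figure.
SRC_BITS = 6      # input source in space (interconnect select)
OP_BITS = 3       # compute operation select
RETIME_BITS = 6   # input source in time (retiming / memory slot select)


@dataclass(frozen=True)
class Pinst:
    """Hypothetical primitive instruction for one bit-processing element."""
    src_select: int     # which net feeds the operator
    op: int             # which operation to perform
    retime_select: int  # which stored value (in time) to read

    def __post_init__(self):
        # Reject values that do not fit in the assumed field widths.
        for value, bits in ((self.src_select, SRC_BITS),
                            (self.op, OP_BITS),
                            (self.retime_select, RETIME_BITS)):
            if not 0 <= value < (1 << bits):
                raise ValueError(f"{value} does not fit in {bits} bits")

    def pack(self) -> int:
        """Concatenate the fields as [src | op | retime]."""
        return ((self.src_select << (OP_BITS + RETIME_BITS))
                | (self.op << RETIME_BITS)
                | self.retime_select)

    @classmethod
    def unpack(cls, word: int) -> "Pinst":
        """Split a packed word back into its three fields."""
        retime = word & ((1 << RETIME_BITS) - 1)
        op = (word >> RETIME_BITS) & ((1 << OP_BITS) - 1)
        src = (word >> (OP_BITS + RETIME_BITS)) & ((1 << SRC_BITS) - 1)
        return cls(src, op, retime)


# The figure's example: net0, add, mem slot#6 (bit patterns copied from the labels).
example = Pinst(src_select=0b000000, op=0b010, retime_select=0b110110)
```

Under these assumed widths the example occupies 15 bits; the slides later budget roughly 64 bits per instruction, so a real pinst presumably carries more fields than this sketch.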
“Ideal” Instruction Distribution
• Problem: Instruction bandwidth (and storage area) quickly dominates everything else
– Compute Block ~ 1Mλ² (1Kλ × 1Kλ)
– Instruction ~ 64 bits
– Wire Pitch ~ 8λ
– Memory bit ~ 1.2Kλ²

Instruction Distribution
64 × 8λ = 512λ
Two instructions in 1024λ

Instruction Distribution
Distribute from both sides = 2×

Instruction Distribution
Distribute X and Y = 2×

Instruction Distribution
• Room to distribute 2 instructions across PE per metal layer (1024 = 2 × 8 × 64)
• Feed top and bottom (left and right) = 2×
• Two complete metal layers = 2×
• 8 instructions / PE Side

Instruction Distribution
• Maximum of 8 instructions per PE side
• Saturate wire channels at 8√N = N
• at 64 PE
– beyond this:
  • instruction distribution dominates area
• Instruction consumption goes with area
• Instruction bandwidth goes with perimeter

Instruction Distribution
• Beyond 64 PE, instruction bandwidth dictates PE size
  √PEarea × 4√N = N × (64 × 8λ)
  PEarea = 16Kλ² × N
• As we build larger arrays processing elements become less dense

Avoid Instruction BW Saturation?
• How might we avoid this?

Instruction Memory Requirements
• Idea: put instruction memory in array
• Problem: Instruction memory can quickly dominate area, too
– Memory Area = 64 × 1.2Kλ²/instruction
– PEarea = 1Mλ² + (Instructions) × 80Kλ²

Instruction Pragmatics
• Instruction requirements could dominate array size.
• Standard architecture trick:
– Look for structure to exploit in “typical computations”

Typical Structure?
• What structure do we usually expect?
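
The arithmetic on the Instruction Distribution and Instruction Memory Requirements slides can be checked with a short Python sketch. The constants come from the slides (64-bit instruction, 8λ wire pitch, 1Kλ × 1Kλ compute block, 1.2Kλ² per memory bit); taking 1K as 1024 and the function names are assumptions made here so the numbers line up with the slides' "1024λ" and "16Kλ²".

```python
# All lengths in λ, all areas in λ².
WIRE_PITCH = 8              # λ per wire
INSTR_BITS = 64             # bits per instruction
PE_SIDE = 1024              # side of one compute block (assumes 1K = 1024)
COMPUTE_AREA = PE_SIDE ** 2 # ~1Mλ²
MEM_BIT_AREA = 1200         # 1.2Kλ² per memory bit

# Width of one instruction's worth of wires: 64 × 8λ = 512λ.
INSTR_WIDTH = INSTR_BITS * WIRE_PITCH


def instructions_per_pe_side() -> int:
    """Instructions that can cross one PE side: 2 per layer × 2 sides × 2 layers."""
    per_layer = PE_SIDE // INSTR_WIDTH   # 2 instructions fit in 1024λ
    return per_layer * 2 * 2


def saturation_size() -> int:
    """Array size N where supply 8·sqrt(N) equals demand N, i.e. sqrt(N) = 8."""
    return instructions_per_pe_side() ** 2


def bandwidth_limited_pe_area(n: int) -> float:
    """Solve sqrt(PEarea) × 4·sqrt(N) = N × 512λ for PEarea (valid beyond saturation)."""
    return (INSTR_WIDTH / 4) ** 2 * n    # = 16384 × N, the slide's 16Kλ² × N


def pe_area_with_memory(instructions: int) -> int:
    """Compute block plus local instruction memory (64 × 1.2Kλ² per instruction)."""
    return COMPUTE_AREA + instructions * INSTR_BITS * MEM_BIT_AREA


if __name__ == "__main__":
    for n in (64, 256, 1024):
        print(n, bandwidth_limited_pe_area(n) / COMPUTE_AREA)
```

Two observations under these assumptions: at N = 64 the bandwidth-limited PE area equals the 1Mλ² compute block, which is why 64 PEs is the crossover point; and one instruction of local memory costs 76.8Kλ², which the slide rounds to 80Kλ², so about 13 locally stored instructions already match the area of the compute block itself.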
Two Extremes
• SIMD Array (microprocessors)
– Instruction/cycle
– share instruction across array of PEs
– uniform operation in space
– operation variance in time
• FPGA
– Instruction/PE
– assume temporal locality of instructions (same)
– operation variance in space
– uniform operations in time

Placing Architectures
• What programmable architectures (organizations) are you familiar with?

• What differentiates a VLIW from a multicore?
– E.g.
  • 4-issue VLIW vs.
  • 4 single-issue processors

Gross Parameters
• Instruction sharing width
– SIMD width
– granularity
• Instruction depth
– Instructions stored locally per compute element
• pinsts per control thread
– E.g. VLIW width

Architecture Instruction Taxonomy

Instruction Message
• Architectures fall out of:
– general model too expensive
– structure exists in common problems
– exploit structure to reduce resource requirements
• Architectures can be viewed in a unified design space

Admin
• Instruction assignment due Wednesday
• Reading for today and Wed. on web
• Try GRW262 for André Office Hours this week

Big Ideas [MSB Ideas]
• Basic elements of a programmable computation
– Compute
– Interconnect
  • (space and time, outside system [IO])
– Instructions
• Instruction resources can be significant
– dominant/limiting resource

Big Ideas [MSB-1 Ideas]
• Two key functions of memory
– retiming
– instructions
  • description of computation
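
To make the Two Extremes slide concrete, the following Python sketch compares the instruction resources of the "ideal" model with the SIMD and FPGA extremes for an array of N PEs. It is a simplified accounting, not something taken from the slides: it assumes the 64-bit instruction size quoted earlier, counts only instruction bits delivered to the array per cycle and instruction bits stored inside the array, and ignores the one-time cost of loading an FPGA configuration. The names `InstructionCost`, `ideal`, `simd`, and `fpga` are hypothetical.

```python
from dataclasses import dataclass

INSTR_BITS = 64  # bits per instruction, as assumed on the earlier slides


@dataclass(frozen=True)
class InstructionCost:
    bits_per_cycle: int  # instruction bits delivered to the array every cycle
    bits_stored: int     # instruction bits held inside the array


def ideal(n_pes: int) -> InstructionCost:
    """A new instruction to every PE on every cycle."""
    return InstructionCost(bits_per_cycle=n_pes * INSTR_BITS, bits_stored=0)


def simd(n_pes: int) -> InstructionCost:
    """One instruction per cycle, shared by all PEs (uniform in space, varies in time)."""
    return InstructionCost(bits_per_cycle=INSTR_BITS, bits_stored=0)


def fpga(n_pes: int) -> InstructionCost:
    """One resident instruction per PE (varies in space, uniform in time)."""
    return InstructionCost(bits_per_cycle=0, bits_stored=n_pes * INSTR_BITS)


if __name__ == "__main__":
    for n in (64, 1024):
        print(n, ideal(n), simd(n), fpga(n))
```

In this accounting the SIMD extreme holds per-cycle bandwidth constant as the array grows, and the FPGA extreme trades per-cycle bandwidth for one stored instruction per PE; both avoid the bandwidth that grows with N in the ideal model, at the price of restricting what can vary in space or in time.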