Transcript Document
WP4 Status
Crolles, Jun. 22-23, 2009
WP4 Presentation
WP4 Meeting 17/07/2015 1

Task T4.1: Variability-aware Design
• Partners: LETI, UPC
  – Task leader: Edith BEIGNE (LETI)
• Definition and development of (self-)adaptive compensation and optimization techniques to cope with increasing PV variations
• Development of new adaptive voltage and frequency scaling (AVFS) techniques which can be exploited either after testing or at run-time

Task T4.1: Variability-aware Design
• LETI will develop a network of distributed on-chip controllers to optimize the SoC global performance by exploiting the resource monitors designed in WP3
  – Distributed and interconnected controllers will be used to adjust locally and dynamically the power supply level, threshold voltages, and operating frequency, while satisfying the global system constraints
• UPC will define and design specific functional blocks for AVFS such as monitors and level shifters to tune the operating frequency and power supply levels and to connect different voltage islands
  – A test vehicle will be taped out with the designed sensor and level shifter circuits

Task T4.2: Variation-tolerant, Robust, Low-noise and Low-EMI Architectures/Micro-architectures
• Partners: CSEM, TMPO, LETI, ELX, POLI, ST I, TEKL
  – Task leader: Jordi CORTADELLA (ELX)
• Development and design of advanced macro-blocks for robust and reliable systems
• Development and design of adaptive architectures based on asynchronous and de-synchronization techniques for:
  – Computational units and memories
  – On-chip communication schemes based on GALS paradigm
• Synthesis of PV-tolerant asynchronous/de-synchronized functional blocks and architectures for low-EMI design
• Design of functional blocks for ultra low-power applications

Task T4.2: Variation-tolerant, Robust, Low-noise and Low-EMI Architectures/Micro-architectures
• TMPO will design and characterize PV-tolerant, low-noise and
low-EMI asynchronous macro-blocks
• LETI will study asynchronous and de-synchronized communication schemes in GALS-type architectures
  – Quasi-delay-insensitive (QDI) and de-synchronized approaches will be evaluated in a NoC
  – Adaptive communication architectures based on globally asynchronous or de-synchronized communication will be optimized for PV variations
• ELX will develop a complete automatic design flow for the synthesis of asynchronous circuits either from RTL specifications or from post-placement gate-level netlists

Task T4.2: Variation-tolerant, Robust, Low-noise and Low-EMI Architectures/Micro-architectures
• POLI will develop a new asynchronous synthesis prototype tool starting from untimed (or partially-timed) SystemC descriptions
  – New synthesis techniques to improve performance and (by using dynamic voltage scaling) reduce power consumption
• ST I will provide the industrial test cases and design flows to validate these novel asynchronous design methodologies
• TEKL will develop the methodology and support for integrating its novel power shaping optimization technology for EMI reduction into existing synchronous mainstream design flows

Task T4.3: Design of Reliable Systems
• Partners: THL, NMX, ST F, ISD
  – Task leader: Dimitris MITROVGENIS (ISD)
• Design of highly reliable analog, mixed-mode, digital, and NVM systems based on unreliable foundations subject to large PV variations and degradation
  – New mechanisms to recognize faulty devices and structures before the overall system collapses will be developed, along with procedures to reconfigure the system so that it continues operating, although at a lower frequency, thus allowing a graceful degradation

Task T4.3: Design of Reliable Systems
• THL and ST F will study and develop a parallel architecture for safety-critical applications compatible with PV variability for robust and time-predictable design
  – Multi-core
architecture will be based on fault detection, isolation, and communication dynamic reconfiguration
  – Design of a processing core for a parallel architecture with real-time and time-predictable capabilities
  – In particular ST F will extend Spidergon STNoC to cope with the requirements of the multi-core architecture
• ISD will design highly-reliable analog, mixed-mode, and digital blocks implemented in a moderately reliable CMOS process. Moreover, ISD plans to investigate fault-tolerant routing, as well as fault diagnosis and dynamic reconfiguration schemes
• NMX will design highly-reliable NVM systems

Task T4.4: Design of Regular Architectures and Circuits for High Manufacturability and Yield
• Partners: TMPO, UPC, ST I, UNBO
  – Task leader: Roberto CANEGALLO (ST I)
• Design of customizable circuits, macroblocks, and architectures based on regular structures to improve manufacturability and predictability

Task T4.4: Design of Regular Architectures and Circuits for High Manufacturability and Yield
• TMPO will design variability-tolerant asynchronous functional blocks using regular structures to evaluate the yield improvement, while satisfying low-noise/low-EMI requirements
• UPC will develop a via-configurable regular transistor array to improve parametric yield and manufacturability
• ST I will design customizable via-programmable macroblocks and mask-programmable IPs suitable for a fast and efficient SoC design and mapping on regular transistor arrays
• UNBO will design a customizable architecture for homogeneous multi-threading based on modular elementary computational blocks
  – The silicon structure for architectural mapping will be the regular transistor array developed in cooperation with ST I

Task T4.5: Distributed reconfigurable PV-robust architectures
• Partners: THL, LIRM
  – Task leader: Philippe Bonnot (THL)
• Programming methods and tools for predictable and PV-robust MPSoC computing
architectures will be developed to consider PV variations at the software/system level, since specific formalisms for execution-time management for critical applications embedded into multi-core architectures are required as early as possible in the design cycle

Task T4.5: Distributed reconfigurable PV-robust architectures
• THL will develop programming methods and tools for predictable processing architectures to take into account PV variations
• LIRM will study self-adaptive mechanisms to allow application task run-time remapping onto a distributed reconfigurable multi-core architecture, maintaining a given functionality with the same level of performance under PV variability
  – The remapping policy will be based on the information obtained from on-die monitors

First Deliverables
• D4.1.1 – LETI, UPC - M24
  • Reports on PV-aware (self-)adaptive compensation and optimization techniques, including on-chip monitors
• D4.2.1 – TMPO, CSEM, ELX, POLI, ST I - M12
  • Reports on PV-tolerant asynchronous blocks and on ultra low-power circuits/architectures
  • Prototype asynchronous/de-synchronization flow
• D4.3.1 – THL, ST F, ISD - M12
  • Robust architecture design specification, and SystemC model for a multi-core SoC virtual platform

First Deliverables
• D4.4.1 – UPC, TMPO - M24
  • Report on yield prediction tool and regular structures for PV-tolerant asynchronous blocks
• D4.4.2 – ST I, UNBO - M24
  • Report on customizable and regular architectures for homogeneous multi-threading and signal processing, and on programming and customization model based on C language extensions
  • Delivery of a design flow for mapping on mask-programmable computational blocks, regular transistor arrays, and via-/metal-programmable datapaths
• D4.5.1 – THL, LIRM - M24
  • Report on programming methods and tools for PV-tolerant, reliable, and predictable MPSoC architectures

LETI Contribution to WP4
• Contribution to WP4.1: Variability-aware design – Task leader
  • LAVFS (Local Adaptive Voltage and Frequency Scaling Architecture) based on VDD-hopping technique specified
    – To be designed in 32nm and qualified as a specific IP
    – To be designed and adapted to an IP block (LDPC or/and µP)
    – Detailed in the next slides
  • Discussion with UPC for a T4.1 cooperation
  • ST-F withdrawal?
• Contribution to WP4.2
  • Asynchronous NoC to be designed in 32nm
  • Timing measurements will be included

Local Adaptive Voltage Scaling (LAVS)
• Design shrink: deep impact on both statistical and dynamic variations
  – Process, voltage, temperature, aging, …
  – Yield is continuously decreasing in digital circuits (Fmax decrease)
• LAVS objective
  – Use design techniques to dynamically alleviate variability constraints
    • Instrument a synchronous digital circuit using adaptive control to enhance yield
  – This approach should make it possible to:
    • Know the real silicon corner (instead of the worst-case corner)
    • Follow the silicon variations (T, V, A) over the circuit lifetime and act accordingly
    • Find an optimal set point to reduce timing margins taken at design phase

Local Adaptive Voltage Scaling (LAVS)
• Local Adaptive Voltage Scaling (LAVS) technique based on VDD-hopping
  – Contrary to standard adaptive voltage scaling techniques, do not adapt the voltage, but keep the two Vhigh / Vlow voltages constant and adapt online the maximum frequency and the VDD-hopping dithering ratio (see next)
• VDD-hopping serves in two roles
  1. Reduce the dynamic power by means of DVFS
  2. Serve as a regulator using an adaptive technique to trade off timing margins against power budget

LAVS Architecture Principles
• Compared to the initial VDD-hopping elements, the following components must be added
  – Diagnostic system (monitors or probes) to robustly observe local timing violations
  – Adaptation controller to make decisions according to silicon measurements regarding maximum achievable frequency (translation table, …)
  – Local power manager which controls the closed-loop system, according to application targets
• The VDD-hopping serves as an actuator to act on environmental parameters (frequency / supply)
[Figure: LAVS architecture block diagram. Labels: Existing VDD-hopping elements, Vhigh, Vlow, Supply Selector, Clock H, Clock L, Clock Selector, Sequencer, Adaptation Controller, LPM (performance control), perf index, Ftarget, Fhigh, Flow, Vcore, Fcore, Core, Probe 1, Probe 2, Probe 3]

UPC Contribution to T4.1 Variability-aware design
• Level shifter design
  – Evaluation of “True single voltage level shifter” in 90nm technology
    • Schematic level
    • Tolerance to variations, P-V-T
  – ST 45nm DK installed and ready in UPC
  – Next step (Sep-Oct): migration of 90nm design to 45nm (32nm?)
• Coordination with LETI
  – What other blocks are necessary?
  – How do we split work, now that ST F is out?
  – Work plan due during next few weeks

Elastix Contribution to T4.2
• Goal
  – Use elasticity to spread the clock
• How
  – De-synchronization of synchronous circuits
  – EDA flow for the automatic transformation of synchronous circuits into asynchronous versions
  – Enforce an elastic clock jitter dynamically

De-synchronized Elastic Blocks
• Operating conditions may individually change at each time instant
  – Frequencies will change accordingly
• Delays can be adjusted dynamically to spread the clock
[Figure: de-synchronized elastic blocks. Labels: V1, V2, V3; T1, T2, T3; M1, M2, M3; req, ack; delay1, delay2, delay3]

Current Status
• A preliminary prototype for de-synchronization has been designed
• The EDA flow is being validated with open-source benchmarks
• Support for clock gating and scan chains in progress

TIEMPO Contribution to T4.2
• Variation-tolerant, robust, low-noise and low-EMI architectures/micro-architectures
• Enable the design of variability-tolerant low-EMI circuits
• Evaluate/predict at design time the EMC behavior
• On-going work
  – Design and characterization of PVT-tolerant asynchronous macroblocks
  – Cell libraries, RAM, ROM
  – Current profile estimation
  – Noise reduction methodology

CSEM Contribution to T4.2
• Intra-die Process Variations (PV)
  – Random dopant fluctuations
  – Line edge roughness
  – Polysilicon gate granularity
Gaussian distribution of threshold voltage

Sensitivity of Transistor ON-current to Intra-die PV
• μ is the mean, σ is the standard deviation
• 90nm technology node, δVT=35mV
• Sensitivity strongly depends on VDD

Supply Voltage Selection to Reduce PV
• For specific operating frequency and power budget, it is better to work at higher VDD voltage to reduce PV effects

Architecture Selection to Reduce PV
• For specific operating frequency and power budget, we should look for architectures which
satisfy timing and power constraints at higher VDD values
• Example: for a full adder, which is the best architecture (ripple carry, carry look-ahead) and VDD for reducing the effect of PV?

An Example
                     Carry Look-ahead Adder   Ripple Carry Adder   Ripple Carry Adder
                     @ (400mV, VT)            @ (400mV, VT)        @ (500mV, VT)
Performance          Fast                     Slow                 Fast
Dynamic Power        Normal                   Low                  Normal
Leakage Power        Normal                   Low                  Normal
Sensitivity to PV    Sensitive                Sensitive            ~2X less sensitive

CSEM Summary
• A ripple carry adder at 500 mV provides the same speed and the same power as a carry look-ahead adder at 400 mV, with about 2X less sensitivity to PV
• Using low-power slow circuits at higher VDD voltage is better than using high-power fast circuits at lower VDD!

TEKL Contribution to T4.2
• Deliverables of Teklatech
  – To develop the Power Shaping design methodology
  – To integrate its FloorDirector tool into existing mainstream design flows

Accomplished Goals
• During the last few months Teklatech and ST-I have been working together
• The Power Shaping design methodology and the FloorDirector tool have been proven with
  – Cadence Encounter physical backend
  – Synopsys PrimeTime timing sign-off
  – Apache RedHawk noise sign-off
• Results have been extremely satisfactory
• Smooth integration with existing tools
• Projected benefits maintained through physical backend

Target Goals
• Teklatech is currently working with ST I with a focus on noise, EMI, and power integrity
• Goal: over the year, integrate the methodology and tool into flows based on:
  – Synopsys backend
  – Magma backend
  – Cadence backend

ST I Contribution to T4.2
• Provide industrial test-cases and design flows to validate novel design methodologies for extended silicon reliability
• Background
  – ST I and Central Cad & Design solutions have a strong background in the evaluation and deployment
of design solutions and methodologies for power integrity and EMC
  – There is strong pressure across ST for design solutions in the field of power rail noise and EMI control at design and architecture level
• Added value
  – ST I can provide industrial test-cases and design experience derived from years of investigations on the issue

Electromagnetic Compatibility
• Various application fields are showing increased sensitivity to EMI
  – Automotive (due to severe security/reliability requirements in a very noisy working environment)
  – Wireless, networking (due to increasing working frequencies in ever-increasing transistor densities and complex designs)
• EMI is a very challenging field and metrics are difficult to define
  – Difficult to evaluate impact of architectural choices at design time
  – Difficult to interpret measurement results
• But compliance with a given EMI class may determine the survival/success of a product line!

ST I Activity in T4.2
• ST I can leverage
  – Strong experience on EMI issues
  – Experience in PDN analysis and dynamic power behavior analysis
  – Experience in system-level PDN evaluation (including package and board modeling)
  – Established cooperation with some T4.2 partners on related issues
  – Experience in advanced methodologies for EMI reduction
• ST I can provide
  – Test-cases of industrial relevance for the evaluation of the design methodologies developed in T4.2
  – Active contribution in the definition of the metrics for the design methodologies developed in T4.2
  – Active contribution in the evaluation of the T4.2 results

ISD Contribution to T4.3
ISD and THL virtual platform
• A clock-accurate transaction-level SystemC multicore SoC virtual platform (VP) that targets multimedia processing is being developed jointly by ISD and Thales (executable specifications in D4.3.1)
• ISD is currently developing a SystemC model of a flexible packet-switched network-on-chip (NoC) topology
connecting together storage elements (SEs) to clusters of processing elements (CPEs)
• ISD will soon define and develop the SystemC model of a sequentially consistent virtual shared memory subsystem based on interleaved SEs which perform read/write and synchronization operations
• SystemC models of CPEs and test bench will be developed by Thales
[Figure: virtual platform. Labels: Cluster of PEs (Thales): CPE1, CPE2, …, CPEN; NoC (ISD): Interconnect; Shared Memory (ISD): SE1, SE2, …, SEN]

D4.3.2: NoC Fault Tolerance
• ISD has started to examine NoC reliability and fault tolerance issues for mission-critical systems
• More specifically, ISD currently focuses on analyzing available techniques for fault diagnosis, fault-tolerant routing and dynamic reconfiguration for both permanent and transient NoC component (node/link) faults on all multicore SoC layers
  – Data-link layer (e.g. encoding)
  – Network layer (e.g. packet retransmission protocols and fault-tolerant routing)
  – Transport layer (e.g. offline reconfiguration)
• A fault-tolerant design methodology based on the above principles will be implemented and evaluated on the multicore SoC VP developed in D4.3.1 (executable specifications in D4.3.2)

D4.3.3: NoC Dynamic Power Estimation
• ISD will propose and evaluate a simple and generic system-level dynamic power estimation methodology which will focus on switching activity, while also utilizing algorithmic, topological, architectural, and possibly technological characteristics
• More specifically, power instrumentation functions will be assigned the task of collecting switching activity traces for real or synthetic application traffic
  – A suitable graphical interface will help analyze & optimize the design based on the traces obtained
• This methodology will be implemented and evaluated on the multicore SoC VP developed in D4.3.1, considering the effects of multiple clock domains (e.g. GALS) and dynamic power management policies on PMCs, e.g. CPEs and SEs.
(executable specifications in D4.3.3)
• Since power estimation is built on top of fault tolerance mechanisms, ISD will be able to explore the design of future low cost, reliable, and power-efficient multicore systems

THL Contribution to T4.3
• The choice of the platform has been clarified and agreed with ISD (and finally with ST-F) (see next slide showing this platform)
  – The platform is made of computing nodes connected through a NoC
  – The computing nodes all have the same architecture (but they can include different accelerators: SIMD accelerator, reconfigurable unit, etc.)
  – Existing models of the platform can be reused
• The target applications of the platform are:
  – Image processing (with regular and non-regular parts), radar processing, etc.
  – Applications with reliability constraints (aerospace applications, transportation, etc.)
• The objective is to make this platform reliable in spite of an increasingly unreliable underlying technology
  – Reliability objectives to be clarified, fault models to be clarified (WP1)
  – NoC to be made reliable (work with ISD and ST-F)
  – Tile to be made reliable and able to manage reliability of the platform

Platform Architecture (Simplified View)
[Figure: platform architecture. Labels: Node Network (OCP TL2), Compute NODE (CTR, ACC, DMA, LMEM, NIM), SHMEM, Network on Chip]
CTR: node controller
ACC: accelerator
DMA: direct memory access controller
LMEM: local memory
NIM: network interface module
SHMEM: shared memory

ST F Contribution to T4.3
• Spidergon STNoC is a configurable interconnect technology to meet the needs of consumer SoCs, based on 3 main pillars: Architecture, Software, Tools & Configurability

Spidergon STNoC: On-chip Network
• The on-chip network is based on 4 well-defined configurable components called
  – Network Interface
  – Network Plug Switch
  – Router
  – Physical link
[Figure: on-chip network diagram. Block labels: uP, DDR, IP]

MODERN Contribution: Highly Reliable Interconnect
• Extend Spidergon STNoC with
  – Software re-configurability
  – Dependable features
  – …to be discussed with Thales
• Integrate Spidergon STNoC within the Thales multi-core platform (see Thales slides)
  – Deliverable: SystemC or FPGA implementation
• Cooperation with Thales has been well defined, while cooperation with other partners is still on-going

Numonyx Contribution to T4.3
• Aim
  – Design of highly-reliable Non-Volatile Memory (NVM) systems
• Approach
  – Model the memory cell as a communication channel
  – Exploit advanced coding techniques for robust NVM design
• Deliverables
  – D4.3.2
  – D4.3.4

Memory Cell As A Communication Channel
• Simple model
  – Cell = Pulse Amplitude Modulation (PAM) channel + Gaussian Noise
• Is the noise really Gaussian?
  – The results of the other WPs will give the elements for defining an accurate model of the noise
• Many coding techniques are very sensitive to Gaussian hypothesis violation
[Figure: 2D Gaussian density]

Advanced Coding Techniques for Robust NVM Design
• Premise
  – Shrinking of technology nodes +
  – Increasing number of bits per cell =
  – Memories more prone to errors
• Approach
  – Exploit signal processing and coding techniques from the communication field
  – Optimize these techniques to satisfy the tight area and time constraints of memory devices
Example: turbo codes

Summary
• “Communication-centric approach” has been reported in literature for high-speed on-chip signaling
• The application of this approach to memories is just starting (but it seems promising due to scaling and process variability)
• Any knowledge sharing on this topic and any suggestion for cooperation is really welcome

ST I Contribution to T4.4
• Via-Programmable Datapath (MT-PiCoGA)
  – Derive via programming from PiCoGA bitstream
    • Added a step for manual floorplanning
(accelerator position in MT-PiCoGA)
    • Evaluated the feasibility of Cadence Virtuoso Scripting Language (SKILL) as reference for VIA programming
    • Design of basic block (RTL-level) done
• Mask-Programmable Accelerators for regular transistor array
  – Evaluated the feasibility of Griffy-C to VHDL translation
    • Based on a library of high-level computational blocks (add, sub, mux, …) which map C-like operators
    • Some basic blocks already implemented
    • Need to generate an interface to allow the processor to trigger the required accelerator

UNBO Contribution to T4.4
• Evaluation of the CUDA programming environment and methodology for exploitation in MODERN
  – Evaluated the NVidia GPGPU CUDA environment and MCUDA C-to-C translator (from the University of Illinois at Urbana-Champaign)
  – Verified the possibility to automatically split the code into sw-only (host code) and accelerated kernels (device code) using the native CUDA environment
  – Simple trials verified on Linux platform, substituting accelerators with their software emulation function (automatically generated from Griffy-C)

UPC Contribution to T4.4
• Design of regular architectures and circuits for high manufacturability and yield
• Regular layout (VCTA)
  – Several adders designed (layout level) and simulated
  – Next step: design a small processing element with adder-SRAM datapath
  – Technology: 90nm
• Variability evaluation (was: yield prediction tool)
  – Contacts with TIEMPO (Marc Renaudin) to establish a workplan and possible methodology

Variability Evaluation Methodology Proposal
[Figure: methodology flow. Labels: Litho simulation, MOS model, HSPICE simulation, SSTA tools; Geometry variability, MOS param variability, Gate level variability, Block variability]
Still not clear who can do this task: UPC/TMPO

TIEMPO Contribution to T4.4
• Design of regular architectures and circuits for high manufacturability and yield
• Study regular structures of variability-tolerant
asynchronous circuits
• Evaluate their benefits on manufacturability and yield
• Work is starting
  – Regular structures and their configurability/programmability
  – Characterization
  – Yield improvement

THL Contribution to T4.5
• No real progress on this task, which has later deadlines

LIRM Contribution to T4.5
• Multi-processor SoC (MPSoC) design and distributed and reconfigurable PV-tolerant architectures
• Programming methods and tools for predictable and PV-robust computing
• Self-adaptive mechanisms to allow application task run-time
• Runtime (Re)mapping policy

T4.5 Summary
• Done
  – First MPSoC prototyping platform validated on FPGA
  – First distributed OS and distributed task allocation (dynamic)
  – Video processing application
  – Presented at DATE University Booth
• Current cooperations:
  – STMicroelectronics (Crolles): MPSoC reliability techniques considering PV-robust computing
  – CEA/LETI: MPSoC architecture, reconfigurable PV-tolerant architecture, distributed computing (“game theory approaches”)
  – Thales: to be defined, meeting should be scheduled

T4.5 Summary
• In progress
  – Distributed computing optimization, e.g. game theory applied to MPSoC
  – Cooperation CEA/LETI
  – PVT monitoring and dynamic feedback on the task allocation and energy consumption
  – Thales meeting to schedule
• Dissemination:
  – DATE paper + U-Booth and IEEE SoC conference (September)
• Human resources
  – Currently master students (6 months)
  – In September a PhD student starts on the MODERN project

WP4 Highlights/Lowlights
• WP4 kick-off meeting at ST I Agrate on Apr. 3rd
  – All partners participated: strong interest and motivations from all participants in the technical activities of WP4
  – In-depth technical discussion on WP4 activities, plans, and deliverables
  – Partners are confident that M12 deliverables can be achieved
  – Positive cooperative environment
• Technical activities in WP4 tasks are progressing and on track (see partners’ presentations)
• New partner: ST F AST Grenoble expected to join the project, replacing ST F TR&D
  – Proposed activity: NoC (contact will be Marcello Coppola)
  – WP4 task: T4.3
• Lowlights
  – POLI’s contribution to WP4 missing due to the current lack of funding from Italian PAs (no contract signed as yet)
  – Expectations for closing on the contract are for Q3/Q4 2009
  – ST F TR&D Crolles withdrawal
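
Illustrative note on the LAVS slides: the slides state that Vhigh and Vlow are kept constant while the maximum frequency and the VDD-hopping dithering ratio are adapted online. The sketch below shows one way this could be expressed. It rests on assumptions that are not in the slides: the dithering ratio is taken to be the fraction of time spent at the (Vhigh, Fhigh) point, the average frequency is modelled as the time-weighted mean of Fhigh and Flow, and the adaptation rule (lower Fhigh by a fixed step on a probe violation) is purely hypothetical. All function names and numeric values are made up for illustration.

```python
def dithering_ratio(f_target, f_low, f_high):
    """Fraction of time at (Vhigh, Fhigh) so that the time-averaged
    frequency equals f_target (assumed linear time-weighted model)."""
    if not f_low < f_high:
        raise ValueError("f_low must be below f_high")
    # Targets outside [f_low, f_high] cannot be met by hopping alone: clamp.
    f_target = min(max(f_target, f_low), f_high)
    return (f_target - f_low) / (f_high - f_low)


def average_frequency(ratio, f_low, f_high):
    """Time-averaged frequency for a given dithering ratio."""
    return ratio * f_high + (1.0 - ratio) * f_low


def adapt_f_high(f_high, f_low, violation, step=10.0):
    """Hypothetical adaptation rule: when a probe reports a timing
    violation, lower the maximum frequency by `step`, but keep it
    at least `step` above f_low."""
    if violation:
        return max(f_high - step, f_low + step)
    return f_high


if __name__ == "__main__":
    f_low, f_high, f_target = 200.0, 400.0, 300.0  # MHz, made-up values
    ratio = dithering_ratio(f_target, f_low, f_high)
    print("ratio:", ratio, "average:", average_frequency(ratio, f_low, f_high))

    # A probe flags a timing violation: derate Fhigh, then re-dither.
    f_high = adapt_f_high(f_high, f_low, violation=True)
    print("new Fhigh:", f_high, "new ratio:", dithering_ratio(f_target, f_low, f_high))
```

Under this model, derating Fhigh after a violation raises the dithering ratio needed to hold the same Ftarget, which is the trade-off between timing margin and power budget that the slides attribute to the regulator role of VDD-hopping.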