Transcript Slides

Fabrizio Lombardi ITC Endowed Chair Professor Dept of ECE Northeastern University, Boston

• CMOS: currently at 28/22 nm, soon to move further down in scaling (ITRS)
• New commercial markets: GPU, tablet, massive external storage (mostly portable)
• Emerging paradigms: multi-value operation, non-volatile RAM, processing-in-memory
• Challenges: new designs abound, but there is not yet a clear winner

• CMOS is not going away any time soon
[Figure: roadmap versus year showing More Moore, More-Than-Moore, and Beyond CMOS elements]

Extending MOSFETs to the End of the Roadmap
• CNTFETs
• Graphene nanoribbons
• III-V Channel MOSFETs
• Ge Channel MOSFETs
• Nanowire FETs
• Tunnel FET
• Non-conventional Geometry Devices

Unconventional FETs, Charge-based Extended CMOS Devices
• Spin FET & Spin MOSFET
• Negative Cg MOSFET
• NEMS switch
• Excitonic FET, Mott FET
• Tunnel FET
• I-MOS
• SET

Non-FET, Non Charge-based 'Beyond CMOS' Devices
• Spin Transfer Torque Logic
• Moving domain wall devices
• Pseudo-spintronic Devices
• Nanomagnetic (M:QCA)
• Negative Cg MOSFET
• All Spin Logic
• Molecular Switch
• Atomic Switch
• BiSFET

Resistive Memories

• Spin Transfer Torque MRAM
• Nanoelectromechanical
• Nanowire PCM
• Macromolecular (Polymer)
• Electronic Effects Memory
  − Charge trapping
  − Metal-Insulator Transition
  − FE barrier effects
• Redox Memory
  − Nanoionic memory
  − Electrochemical memory
  − Fuse/Antifuse memory
• Molecular Memory

Capacitive Memory

• FeFET Memory

NVM cost/gigabyte ~ $1 (Intel)

• PVT variations
• Stability (SNM) concern
• Power dissipation
• Charge diffusion and collection in the layout
• Basic binary operation (supply voltage requirements)
• Inability to meet large storage needs
• Likely soft errors

• Avoid large capital investment; selectively use new/compatible technologies
• Preferably, hybrid circuits
• Multi-level (multi-bit) operation
• Processing in memory (PIM)
• Problematic endurance

 1.

2.

Move to higher radix bases than binary: ternary, quad or eventually octal Bases: Ternary: used for CAM processing mostly in routers, but also in GPUs (cache) Quaternary/Octal: increase capacity for massive storage (to replace flash memories) Not efficiently done in CMOS (additional voltage rails and high area/power penalty)

Use radically new technologies
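The capacity argument for higher radix can be made concrete: a cell with r distinguishable levels stores log2(r) bits. A minimal Python sketch, assuming ideal, perfectly distinguishable levels (the function name `bits_per_cell` is illustrative and not from the slides):

```python
import math

def bits_per_cell(radix: int) -> float:
    """Information stored by one cell with `radix` distinguishable levels."""
    if radix < 2:
        raise ValueError("a storage cell needs at least two levels")
    return math.log2(radix)

for name, radix in [("binary", 2), ("ternary", 3), ("quaternary", 4), ("octal", 8)]:
    print(f"{name:10s} {bits_per_cell(radix):.3f} bits/cell")
```

Under this idealization, moving from binary to octal triples the capacity per cell; noise and level overlap reduce the usable figure in practice.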

   

• ITRS: memory has always met stated objectives in the past
• Late 2014 as a crucial initial milestone with respect to performance (power dissipation and density) and design fundamentals

Discuss new (emerging) directions:

• Unorthodox technologies (briefly)
• Material-based technologies
• Focus on non-volatile memories

 1.

2.

Innovative operational paradigms for memory using new physics storage phenomena: QCA (memory in motion); challenge is room temperature operation and CMOS compatibility for manufacturing SET (controlled transfer of electrons for memory operation purposes)

Long term opportunities abound, but grand challenges

too

Currently applicable mostly to an academic investigation

• Exploit new materials and fabrication methods (CMOS compatible) to meet challenges
• Additional criteria:
  1. Hybrid operation is usually sought
  2. Robustness to PVT variations/endurance
  3. New design realms: multi-level (resistance) for increased capacity; ambipolar operation for control
• APPLICATION: non-volatile storage

2011 Memory Application (ITRS)

Emerging Research Memory Technology Stand-Alone Ferroelectric-gate FET X Nanoelectromechanical RAM Spin Transfer Torque MRAM X Nanoionic or Redox Memory Nanowire Phase Change Memory (PCM) Molecular memory

Electronic Effects (Charge trapping, Mott) Macromolecular memory X X X X Embedded X X X X X X X

Also known as Resistive RAMs: add (programmable) resistive element(s) to active device(s) (usually 1T1R for the simplest non-volatile cell design)

Issues:
1. Resistance range (Rmax-Rmin)
2. Power dissipation and leakage
3. Programmability and universal memory feature
4. Error/defect models (soft and drift)
5. Endurance (related to read/write operation)
6. Testing

FEATURE              NOR      NAND         PCM     MRAM     FRAM
Capacity             256MB    16GB         32MB    2MB      1MB
Random Read          Yes      No           Yes     Yes      Yes
Random Write         No       No           Yes     Yes      Yes
Endurance            10^5     10^5-10^3    10^6    10^15    10^14
Management           High     High         Mod     No       No
Error Correction     No       1-72 bits    *       No       No
Retention (years)    10       1-10         15      20       5-20
Read Access (ns)     60       60           10      35       60
Prog Access (us)     200      200          20      35       60
Erase Access (ms)    1-100    1-100        50      35       60
Power                Mid      Mid          Mid     Low      Low
Cell size (F^2)      10       4            4       6-20     4-15
Universal Memory     No       No           Yes     Yes      Yes

• Flash memory is seen as a mature technology, unable to capitalize on scaling and not meeting high-density storage needs for mobile applications
• Low lifetime due to the high-voltage based process
• Apple and Anobit (2012)
• Additional players: Samsung, Micron, IBM

• Does not require many transistors or other access devices
• Remove silicon requirements:
  − Improve density
  − Reduce power consumption
  − Integrate with processors
  − Reduce total area
• Crossbar Inc (August 2013): 3D stacking, 1 TByte on-chip prototype (using FeRRAM)
• Feature size = litho node F; cell size = 4F^2

The Memristor: Prediction

Fourth Fundamental, Two-Terminal Circuit Element (Leon Chua, U.C. Berkeley)

Circuit variables: flux $\varphi$, voltage $v$, charge $q$, current $i$, with $d\varphi/dt = v$ and $dq/dt = i$.

• RESISTOR (Ohm, 1827): $dv = R\,di$
• CAPACITOR (Von Kleist, 1745): $dq = C\,dv$
• INDUCTOR (Faraday, 1831): $d\varphi = L\,di$
• MEMRISTOR (Chua, 1971): $d\varphi = M\,dq$
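Combining the memristor relation $d\varphi = M\,dq$ with $d\varphi/dt = v$ and $dq/dt = i$ shows why M behaves as a charge-dependent resistance. The step below is a worked consequence of the definitions on this slide, written for a charge-controlled memristance $M(q)$:

```latex
v \;=\; \frac{d\varphi}{dt} \;=\; M(q)\,\frac{dq}{dt} \;=\; M(q)\,i
```

M therefore carries units of ohms, and its value at any instant depends on the charge that has flowed through the device.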

• Resistance depends on the direction of the voltage or current across it ($d\varphi = M\,dq$)
• Titanium dioxide film sandwiched between two platinum electrodes; doped operation (HP Labs), 5-10 nm in length

Resistance Range
• Between Ron and Roff
• Roff: highest resistance
• Ron: lowest resistance
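The direction-dependent resistance bounded by Ron and Roff can be sketched with the linear dopant-drift description commonly associated with the HP Labs TiO2 device. That model and every parameter value below (Ron, Roff, film thickness, mobility) are assumptions for illustration; the slides do not give them.

```python
from dataclasses import dataclass

@dataclass
class LinearDriftMemristor:
    """Toy linear dopant-drift memristor; all defaults are illustrative."""
    r_on: float = 100.0    # ohms, fully doped film (lowest resistance)
    r_off: float = 16e3    # ohms, undoped film (highest resistance)
    d: float = 10e-9       # film thickness in metres
    mu: float = 1e-14      # dopant mobility in m^2/(V*s)
    x: float = 0.5         # normalized doped width w/D, kept in [0, 1]

    def resistance(self) -> float:
        # Doped and undoped regions act as two resistors in series.
        return self.r_on * self.x + self.r_off * (1.0 - self.x)

    def apply_current(self, current: float, dt: float) -> None:
        # dx/dt = mu * Ron / D^2 * i; the sign of the current sets the direction.
        dx = self.mu * self.r_on / self.d ** 2 * current * dt
        self.x = min(1.0, max(0.0, self.x + dx))

m = LinearDriftMemristor()
print(f"initial:        {m.resistance():.0f} ohm")
m.apply_current(+1e-4, 1e-2)   # positive current lowers the resistance
print(f"after +current: {m.resistance():.0f} ohm")
m.apply_current(-1e-4, 2e-2)   # reversed current raises it again
print(f"after -current: {m.resistance():.0f} ohm")
```

The clamp on `x` is what keeps the resistance between Ron and Roff in this sketch.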

• Excellent linearity in switching
• Resistive range is good
• I-V characteristics are also very good
• Nanometric dimension (10 nm in 2011, 5 nm in 2013): very high density potential at extremely low power consumption
• Manufacturing compatibility with CMOS
• Problem: endurance and leakage (on read)

• Ambipolar control of a single memristor
• No standby power: no direct path from VDD to GND, only dynamic power dissipation
• Fewer transistors than RAM (6T)

• Memristor changes its value when reading the Roff state
• Refresh operation is required
• Write time is significantly higher than read time

Technology   VDD (V)   Write time (ns)   Read time (ns)
32 nm        0.9       160               0.8
32 nm        1         150               0.75
45 nm        0.9       195               0.975
45 nm        1         180               0.9
65 nm        0.9       235               1.175
65 nm        1         200               1
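How much slower a write is than a read can be checked directly from the listed times. The short Python sketch below uses only the values given above; the dictionary layout and names are illustrative.

```python
import math

# (technology node, VDD in volts) -> time in ns, taken from the values above.
WRITE_NS = {("32 nm", 0.9): 160, ("32 nm", 1.0): 150,
            ("45 nm", 0.9): 195, ("45 nm", 1.0): 180,
            ("65 nm", 0.9): 235, ("65 nm", 1.0): 200}
READ_NS = {("32 nm", 0.9): 0.8, ("32 nm", 1.0): 0.75,
           ("45 nm", 0.9): 0.975, ("45 nm", 1.0): 0.9,
           ("65 nm", 0.9): 1.175, ("65 nm", 1.0): 1.0}

ratios = {key: WRITE_NS[key] / READ_NS[key] for key in WRITE_NS}
for (node, vdd), ratio in ratios.items():
    print(f"{node} @ {vdd} V: write/read = {ratio:.0f}")
```

Every entry gives a write time about 200 times the read time, regardless of node or supply voltage.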

[Figure: endurance plot for a Ti 1nm/Pt 100nm/TiOx 29nm/Ti4O7 100nm device; x-axis: switching cycles, 10^0 to 10^6; y-axis ticks: 10^2 to 10^4]

• Uses phases of GST (chalcogenide alloy)
• High-current-based process switches between two phases: amorphous (high R) and crystalline (low R)
• No erase-write cycle as for NAND flash (at most 100,000 cycles for enterprise product)

• Ron, programming (write) region: the intersection of the Ron curve with the voltage axis is Vh (holding voltage)
• Roff, read region: this can be changed by an I or V pulse; $R_{off} = R_{on}\exp(t_{off}/t)$, where t = effective recombination time (constant) and toff = non-programming time
• Vx is the intersection point of the Ron curve and the Rset curve: $V_x = V_h \cdot R_{set}/(R_{set}-R_{on})$
• Typical values: Rset = 7 kΩ, Rreset = 200 kΩ, Ron = 1 kΩ, Vh = 0.45 V, Rset
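The Vx expression follows if the Ron curve is read as the line V = Vh + Ron·I and the Rset curve as V = Rset·I. Equating them gives I = Vh/(Rset − Ron) and hence Vx = Vh·Rset/(Rset − Ron). That reading of the curves is an assumption; the slide states only the result. A short Python check with the typical values listed (function names are illustrative):

```python
import math

def crossover_voltage(v_h: float, r_set: float, r_on: float) -> float:
    """Voltage where V = v_h + r_on*I meets V = r_set*I."""
    if r_set <= r_on:
        raise ValueError("r_set must exceed r_on for the curves to intersect")
    return v_h * r_set / (r_set - r_on)

def r_off(r_on: float, t_off: float, tau: float) -> float:
    """Roff = Ron * exp(toff / t), as stated on the slide (tau plays the role of t)."""
    return r_on * math.exp(t_off / tau)

vx = crossover_voltage(v_h=0.45, r_set=7e3, r_on=1e3)
print(f"Vx = {vx:.3f} V")  # prints Vx = 0.525 V
```

With Rset = 7 kΩ and Ron = 1 kΩ, Vx sits 75 mV above the holding voltage.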

• Mobile devices (Samsung)
• PCM is likely to be a repository (for less frequently accessed data) next to DRAM for processor design (IBM)
• Networking/communication systems: CAM/TCAM designs
• Massive storage for data acquisition systems

• ISSCC11: Samsung (1-Gbit, 58-nm manufacturing process, low-power double data-rate nonvolatile memory interface)
• ISSCC12: Samsung (8-Gbit, 20-nm device)
• IEDM11: Macronix/IBM (39-nm device with 30-microamp reset current and 10^9 cycling endurance, 128-Mbit)
• July 2012: Micron/Numonyx (45 nm PCM for mobile devices in 1 Gb and 512 Mb multichip packages); commercially available

    

• Low voltage and moderate current as operational characteristics
• Multiple-bit operation (at least 2): higher resistance range (MΩ) than other RRAMs
• Read time: 12 ns; write time: 85 ns (@45 nm)
• Soft errors are highly unlikely to occur for GST
• Good endurance (IBM: 1 million cycles) and density

• Use a 1T1P core for both CAM and TCAM
• Functionality is in the support circuitry
• Voltage-based sensing for the comparison outcome in search
• Use of a circuit with ambipolar properties for comparison and control

• IBM (1/2 PCMs per core), current-based operation
• New cell (1 PCM per core), voltage-based operation

Stored       Search            I_ML (A)
0 (200kΩ)    0 (V_SL = 0)      -1.38*10^-9
0 (200kΩ)    1 (V_SL = 0.4)    -1.97*10^-6
1 (7kΩ)      0 (V_SL = 0)      -1.38*10^-9
1 (7kΩ)      1 (V_SL = 0.4)    -4.15*10^-5

Circuit          Write Time (ns)   Search Time (ns)   Transistors/Core   PCMs/Core   PDP of Search Operation (fJ)
CAM [20]         199.34            1.326              1                  1           46.6886
CAM Proposed     199.34            1.092              1                  1           36.4296
TCAM [20]        209.53            1.346              2                  2           48.41
TCAM Proposed    199.34            2.447              1                  1           43.4518
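The CAM figures above can be turned into relative improvements of the proposed cell over [20]. The sketch uses only the two CAM entries (search times 1.326 ns and 1.092 ns, PDP 46.6886 fJ and 36.4296 fJ); the helper name is illustrative.

```python
def relative_reduction(baseline: float, proposed: float) -> float:
    """Fractional reduction of `proposed` relative to `baseline`."""
    return (baseline - proposed) / baseline

search_gain = relative_reduction(1.326, 1.092)    # search time, ns
pdp_gain = relative_reduction(46.6886, 36.4296)   # power-delay product, fJ

print(f"CAM search time reduction: {search_gain:.1%}")
print(f"CAM search PDP reduction:  {pdp_gain:.1%}")
```

For the CAM, this is a search time roughly 18% shorter and a search PDP roughly 22% lower, with the same write time and core device count.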

• Practical problem: drift of resistance and threshold voltage (when not read or programmed)
• Related to the crystalline fraction (Cx) in GST
• $R_{pcm} = (1 - C_x)\,R_a + C_x\,R_c$, with $R_a \gg R_c$
• Ra = Rreset
• Rc = Rset
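The mixing rule Rpcm = (1 − Cx)·Ra + Cx·Rc can be evaluated directly. The sketch below assumes Ra = Rreset = 200 kΩ and Rc = Rset = 7 kΩ, the typical values quoted earlier in the slides; the function name is illustrative.

```python
def pcm_resistance(cx: float, r_a: float = 200e3, r_c: float = 7e3) -> float:
    """Cell resistance for crystalline fraction cx in [0, 1]."""
    if not 0.0 <= cx <= 1.0:
        raise ValueError("crystalline fraction must lie between 0 and 1")
    return (1.0 - cx) * r_a + cx * r_c

for cx in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"Cx = {cx:.2f} -> Rpcm = {pcm_resistance(cx) / 1e3:.2f} kOhm")
```

Because Ra is much larger than Rc, a small change in Cx near the amorphous end moves the resistance by many kΩ. This fits the slides' remark that high-resistance states are the most drift-sensitive.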

• Level drift is more pronounced for high-resistance states and is non-linear with respect to time
• Problematic for MVL storage (i.e. more than one bit per cell)
• The order of resistivity for the states remains the same (short term), so avoid overlap in the long term
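To see why drift hurts multi-level storage while the ordering of states survives, one can use the empirical power law R(t) = R0·(t/t0)^ν often used for PCM drift. The power law, the four level values, and the exponents below are assumptions for illustration, not data from the slides.

```python
import math

# Illustrative four-level cell: (initial resistance in ohms, drift exponent nu).
# Higher-resistance states are given larger exponents, so they drift more.
LEVELS = [(7e3, 0.005), (20e3, 0.02), (60e3, 0.05), (200e3, 0.10)]

# Fixed read thresholds at the geometric midpoints of the initial levels.
THRESHOLDS = [math.sqrt(a[0] * b[0]) for a, b in zip(LEVELS, LEVELS[1:])]

def drifted(r0: float, nu: float, t: float, t0: float = 1.0) -> float:
    """Resistance after time t (seconds) under power-law drift."""
    return r0 * (t / t0) ** nu

def read_level(resistance: float) -> int:
    """Decode a resistance against the fixed thresholds."""
    return sum(resistance > th for th in THRESHOLDS)

def read_all(t: float) -> list:
    return [read_level(drifted(r0, nu, t)) for r0, nu in LEVELS]

for t in (1.0, 1e3, 1e6):
    print(f"t = {t:g} s -> decoded levels {read_all(t)}")
```

With these numbers, the third level crosses its upper threshold somewhere between 10^3 s and 10^6 s and is misread against fixed thresholds, even though the four resistances never change their relative order.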

• Use an advanced modulation coding technique to solve short-term drift (analogous to NAND flash, where electrons leak through the thin walls of cells and create data read errors)
• Apply a voltage pulse based on the deviation from the desired level and measure the resistance. If the desired level of resistance is not achieved, apply another voltage pulse and measure again, until the exact level is achieved
• Only suitable for binary cell storage
• It may reduce endurance (multiple writes)
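The pulse-measure-repeat procedure described above is an iterative program-and-verify loop. The Python sketch below uses a deliberately simple toy cell that realizes only 60% of each requested resistance change; the cell model, the 1% tolerance, and all names are assumptions for illustration.

```python
class ToyCell:
    """Toy resistive cell: each pulse achieves only a fraction of the request."""
    def __init__(self, resistance: float, response: float = 0.6):
        self.resistance = resistance
        self.response = response

    def pulse(self, requested_change: float) -> None:
        self.resistance += self.response * requested_change

    def read(self) -> float:
        return self.resistance

def program_and_verify(cell: ToyCell, target: float,
                       rel_tol: float = 0.01, max_pulses: int = 50) -> int:
    """Pulse until the read-back value is within rel_tol of target.

    Returns the number of pulses applied.
    """
    pulses = 0
    while abs(target - cell.read()) > rel_tol * target:
        if pulses == max_pulses:
            raise RuntimeError("target level not reached")
        cell.pulse(target - cell.read())  # pulse sized by the measured deviation
        pulses += 1
    return pulses

cell = ToyCell(resistance=200e3)
n = program_and_verify(cell, target=7e3)
print(f"{n} pulses, final resistance {cell.read():.0f} ohm")
```

Each extra pulse is an extra write to the cell, which is the endurance cost the slide points out.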

• Assume cell independence in drift errors (?)
• Data is encoded not in the programmed state but in the relative order of the states in a small group of cells
• An error in the encoding scheme is only seen when the resistivity levels of states cross each other
• Software-based error correction methodologies are then applied (slow)
• Reduction in capacity: from 2 bits/cell to 1.57 bits/cell
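Encoding data in the relative order of a group of cells can be sketched with a plain permutation code. The slides do not specify the actual code, so the 4-cell full-permutation codebook, the level values, and the power-law drift exponents below are all assumptions for illustration.

```python
from itertools import permutations

LEVELS = [7e3, 20e3, 60e3, 200e3]   # ohms, illustrative programmed levels
NU = [0.005, 0.02, 0.05, 0.10]      # illustrative drift exponent per level
CODEBOOK = list(permutations(range(len(LEVELS))))  # 24 orderings of 4 cells

def program(value: int) -> tuple:
    """Level index assigned to each cell of the group for data `value`."""
    return CODEBOOK[value]

def resistances_after(codeword: tuple, t: float) -> list:
    """Cell resistances after t seconds of power-law drift (t0 = 1 s)."""
    return [LEVELS[k] * t ** NU[k] for k in codeword]

def decode(resistances: list) -> int:
    """Recover the data from the rank order of the cells alone."""
    order = sorted(range(len(resistances)), key=lambda i: resistances[i])
    ranks = [0] * len(resistances)
    for rank, cell in enumerate(order):
        ranks[cell] = rank
    return CODEBOOK.index(tuple(ranks))

value = 17
read_back = decode(resistances_after(program(value), t=1e6))
print(f"stored {value}, decoded {read_back} after drift")
```

Decoding compares cells only with each other, so it is unaffected by drift until two levels cross. A full permutation of 4 cells stores log2(24) ≈ 4.58 bits, about 1.15 bits/cell, so the 1.57 bits/cell quoted on the slide implies a denser code than this sketch.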

• Octal base for MVL (noise, crosstalk) and/or single vs multiple storage elements
• MVL implications on error detection/correction
• Dynamic models of RRAM operation in HSPICE (as related to drift evaluation and mitigation)
• At system level, improve endurance by reducing the maximum number of writes to a cell
• System-level application modeling (for example "normally-off instantly-on" operation: combining SRAM with PCM)

• Emergence of new paradigms: resistive RAMs, non-volatile operation, multi-bit storage
• Nearly all future memories will utilize new phenomena, away from the 6T configuration
• TECHNOLOGY TIME SCALE: hybrid implementations will be dominant in the next 5-10 years
• 4Q-2014/1Q-2015 as the crucial time frame for PCM