Sensory Information Processing at Johns Hopkins University


Neuromorphic Image Processing

Ralph Etienne-Cummings

The Johns Hopkins University

Collaborators:

Kwabena Boahen, Gert Cauwenberghs, Timothy Horiuchi, M. Anthony Lewis, Philippe Pouliquen

Students:

Eugenio Culurciello, Viktor Gruev, Udayan Mallik

Sponsors:

NSF, ONR, ARL

Computational Sensory Motor Systems Lab, Johns Hopkins University

An Alternative Style of Neuromorphic Image Processing

• Traditional image processing uses pixel-serial image access, digitization, and sequential processing
  - Discrete levels, discrete time
  - High-fidelity images, large vocabulary of functions (general-purpose)
  - High power, high latency, small sensor-to-processing area ratio
• Traditional neuromorphic vision systems typically use pixel-parallel processing
  - Continuous and/or discrete levels, continuous time
  - Low-fidelity images, large pixels, small vocabulary of functions (ASICs)
  - Low power, low latency
• Computation-On-Readout (COR) vision systems use block-serial, pixel-parallel image processing (sketched below)
  - Continuous levels, discrete time
  - High-fidelity images, medium vocabulary of functions (pseudo-general-purpose)
  - Low power, medium/low latency, computation for "free"
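To make the COR idea concrete, here is a rough software model of block-serial, pixel-parallel readout. It is a sketch only, with an invented function name, block size, and example kernels, not the chip's circuitry:

```python
# Illustrative model of Computation-On-Readout (COR): pixel blocks are read
# out serially, and every pixel in a block is convolved against several
# kernels at once. Function name, block size and kernels are hypothetical.
import numpy as np

def cor_readout(image, kernels, block=8):
    """Scan the image block-serially; inside each block, apply all kernels
    "in parallel" (vectorized here to mimic the analog parallelism)."""
    h, w = image.shape
    k = kernels.shape[-1]                  # kernel size (square, odd)
    pad = k // 2
    padded = np.pad(image, pad, mode='edge')
    out = np.zeros((len(kernels), h, w))
    for r0 in range(0, h, block):          # serial scan over blocks
        for c0 in range(0, w, block):
            for r in range(r0, min(r0 + block, h)):
                for c in range(c0, min(c0 + block, w)):
                    patch = padded[r:r + k, c:c + k]
                    # every kernel sees the same patch simultaneously
                    out[:, r, c] = np.tensordot(kernels, patch,
                                                axes=([1, 2], [0, 1]))
    return out

kernels = np.stack([
    np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float),   # vertical edges
    np.array([[-1, -2, -1], [0, 0, 0], [1, 2, 1]], float),   # horizontal edges
])
edges_v, edges_h = cor_readout(np.random.rand(32, 32), kernels)
```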


Adaptive SpatioTEmpoRal Imaging (ASTERIx) Architecture

- Digitally controlled analog processing
- Image acts as memory
- Parallel execution of multiple filters
- Temporal evolution of results
- Standard fetch-decode-compute-store (RISC) architecture possible
- Competition/recurrence possible


Foveated Tracking Chip

Technology: 2 µm NWELL CMOS, 2 metal, 2 poly
Chip Size: 6.4 x 6.8 mm²
Package: 132-pin DIP
Array Sizes: Fovea: 9 x 9 @ 150 µm pitch; Periphery: 19 x 17 @ 300 µm pitch
Fill Factor: Fovea: 18%; Periphery: 34%
Transistors/Cell: Fovea: receptor + edge: 12, motion: 8; Periphery: receptor + edge + ON: 12, centroid: 15
Photosensitivity: 6 orders of magnitude
Contrast: 10 - 100%
Foveal Direction Sensitivity: 2.5 µW/cm²: 1.5 - 1.5K pixels/s; 25 µW/cm²: 3 - 4.5K pixels/s; 250 µW/cm²: 5 - 10K pixels/s
Peripheral ON-set Sensitivity: 2.5 µW/cm²: <0.1 - 63K Hz; 25 µW/cm²: <0.1 - 250K Hz; 250 µW/cm²: <0.1 - 800K Hz
Power Consumption: >10 mW @ 3V supply (at 25 µW/cm²)

• Spatially variant layout of sensors and processing elements
• Dynamically controllable spatial acuity
• Velocity measurement capabilities
• Combined high-resolution imaging and focal-plane processing


VLSI Implementation of Robotic Vision System: Single Chip Micro-Stereo System

[Figures: stereo chip layout, single-chip stereo optics, Matlab simulation of the VLSI algorithm, and measured disparity (pixel shift from right to left image) after the confidence test, showing a line disparate on the two imagers]

• A single-chip stereo vision system has been implemented
• Contains two 128 x 128 imagers
• Computes full-frame disparity in parallel
• Provides a confidence measure on the computation
• Uses a vertical template to reduce noise and computation (see the sketch below)
• Operates at 20 fps
• Uses ~30 mW @ 5V (can be reduced)
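A minimal software analogue of this pipeline, assuming SAD matching of a one-column vertical template and a best-versus-runner-up confidence rule; parameter names and the exact confidence test are illustrative guesses, not the chip's implementation:

```python
# Software sketch of the stereo idea: match a vertical template from the
# right image against horizontal shifts in the left image, take the
# SAD-minimizing shift as disparity, and reject low-confidence matches.
import numpy as np

def disparity_map(left, right, max_d=32, tmpl_h=7, conf_ratio=1.2):
    left, right = left.astype(float), right.astype(float)
    h, w = left.shape
    half = tmpl_h // 2
    disp = np.full((h, w), -1)                    # -1 = no confident match
    for r in range(half, h - half):
        for c in range(0, w - max_d):
            tmpl = right[r - half:r + half + 1, c]     # vertical template
            sads = np.array([
                np.abs(left[r - half:r + half + 1, c + d] - tmpl).sum()
                for d in range(max_d)
            ])
            best = int(sads.argmin())
            runner_up = np.partition(sads, 1)[1]
            # confidence test: the winner must clearly beat the runner-up
            if runner_up > conf_ratio * max(sads[best], 1e-9):
                disp[r, c] = best
    return disp
```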

VLSI Implementation of Robotic Vision System: Spatiotemporal Focal Plane Image Processing

[Figures: parallel processed images; biological inspiration for the GIP chip: orientation detection; spatiotemporal receptive fields; spatially processed output (orientation selectivity); temporally processed output (motion detection)]

• Implemented CMOS imagers with focal-plane spatiotemporal filters
• Realized high-resolution imaging and high-speed processing
• Consumes milliwatts of power
• Performs image processing at GOPS/mW (unmatched by any other technology)
• Used for optical flow measurement, object recognition, and adaptive optics

Color-Based Object Recognition on a Chip

[Figures: skin-tone identification; "learned" templates; synapses; smart camera chip; "Coke or Pepsi?" demo]

[Figure: fruit identification]

• Implemented a chip that contains a camera and a recognition engine
• Decomposes the image into Hue, Saturation, and Intensity (HSI)
• Creates an HSI template for each learned object
• Identifies the parts of the scene that match a template
• Used in interactive toys, aids for the blind, and robots

VLSI Implementation of Robotic Vision System: Visual Tracking

Low Noise Imaging and Motion Tracking Chip

Technology: 0.5 µm 3M CMOS
Array Size: APS: 120 (H) x 36 (V)
Pixel Size: APS: 14.7 µm x 14.7 µm
Fill Factor: APS: 16%
Power Consumption (3.3V supply): 3.2 mW
FPN (APS), dark (std. dev./full scale): pixel-pixel (within column): 0.6%; column-column: 0.7%
FPN (APS), half scale (std. dev./full scale): pixel-pixel (within column): 0.7%; column-column: 1.2%

[Figures: sample image; target tracking]

• Implemented a CMOS imager with active pixel sensor and motion tracking
• Obtains low-noise images
• Tracks multiple targets simultaneously
• Consumes milliwatts of power
• Used for optical flow measurement, target tracking, 3D mouse, and robot-assisted surgical systems

VLSI Implementation of Robotic Vision System: Ultrasonic Imaging and Tracking

[Figures: ultrasonic array processing; bearing estimation with spatiotemporal filters; MEMS front-end; bearing estimation chip; input-voltage time series (0 – 0.18 s) with change-detection flags over two sampling periods, where the target blip changes in height; bearing change detection; range change detection; bearing estimation algorithm; bearing/range mapping and novelty detection]

• Implemented an ultrasonic bearing-estimation chip and a change-detection chip
• Uses sonic flow across a microphone array to measure the bearing of a target (see the sketch below)
• Creates an internal map of the environment
• Detects changes in the structure of the environment
• Operates on milliwatts of power
• Used for surveillance and navigation
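The geometry behind the bearing estimate can be sketched in a few lines. The chip does this with analog spatiotemporal filters, so the cross-correlation below and all constants (microphone spacing, sample rate, sign convention) are illustrative assumptions only:

```python
# Bearing from "sonic flow" across a two-microphone array: recover the
# inter-microphone time delay by cross-correlation, then convert it to an
# angle using the spacing and the speed of sound.
import numpy as np

C = 343.0        # speed of sound (m/s)
D = 0.02         # microphone spacing (m) -- assumed
FS = 200_000     # sample rate (Hz) -- assumed

def bearing_from_delay(mic1, mic2):
    """Estimate bearing (degrees from broadside) from two recordings."""
    xcorr = np.correlate(mic1 - mic1.mean(), mic2 - mic2.mean(), mode='full')
    lag = xcorr.argmax() - (len(mic1) - 1)    # delay in samples
    sin_theta = np.clip(lag / FS * C / D, -1.0, 1.0)
    return np.degrees(np.arcsin(sin_theta))   # sign depends on geometry

rng = np.random.default_rng(1)
sig = rng.standard_normal(2000)                      # broadband echo
delayed = np.concatenate([np.zeros(3), sig[:-3]])    # 3-sample delay
print(bearing_from_delay(sig, delayed))              # about -15 degrees
```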

VLSI Implementation of Central Pattern Generators (CPG) for Legged Locomotion

[Figures: biologically inspired locomotion controller (descending signals, non-linear sensory feedback, motor output); adaptive locomotion controller; synapses; 10-neuron CPG chip; silicon integrate-and-fire neuron; new biped: Snappy]

• Implemented a general-purpose CPG chip
• Contains 10 fully connected neurons
• Allows 10 inputs from off-chip
• Accepts both spiking and graded neuron inputs
• Provides digitally programmable synapses
• Operates on microwatts of power
• Used to control legged locomotion (see the sketch below)
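As a flavor of how integrate-and-fire neurons can generate rhythm, here is a toy half-center oscillator: two neurons with mutual, decaying inhibition that fire in alternation under tonic drive. All constants are invented; the 10-neuron chip with programmable synapses is far more general:

```python
# Toy CPG sketch: two integrate-and-fire neurons with mutual inhibition
# produce alternating ("left/right") spiking from a tonic drive.
import numpy as np

DT, TAU, TAU_SYN, V_TH = 1e-3, 50e-3, 20e-3, 1.0
v = np.array([0.0, 0.5])          # membrane potentials (asymmetric start)
drive = np.array([1.2, 1.2])      # tonic "descending" drive
w_inhib = 0.8                     # mutual inhibitory weight
inhib = np.zeros(2)               # decaying synaptic inhibition
spikes = []

for t in range(2000):             # 2 seconds at 1 ms steps
    inhib *= np.exp(-DT / TAU_SYN)
    dv = (-v + drive - w_inhib * inhib[::-1]) * (DT / TAU)
    v = np.maximum(v + dv, 0.0)
    fired = v >= V_TH
    inhib[fired] += 1.0           # each spike inhibits the other neuron
    v[fired] = 0.0                # reset after spiking
    spikes.append(fired.copy())
```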

Outline

• Photo-transduction:
  - Active Pixel Sensors
  - Dynamic Range Enhancement
  - Current Mode
• Spatial Processing:
  - Image Filtering
• Spatiotemporal Processing:
  - Change Detection
  - Motion Detection
• Spectral Processing:
  - Color-Based Object Recognition


Photo-transduction


Conventional CMOS Cameras: Integrative Photo-detection

Simple 3-T APS (Fossum, 1992)

Integrative imagers: voltage domain; dense arrays (1.25-T); low noise; low dynamic range (~45 – 60 dB); not ideal for computation.
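The quoted ~45 – 60 dB follows directly from the ratio of usable voltage swing to the noise floor; for example, assuming a 1 V swing over a ~1 mV read-noise floor (illustrative numbers, not measurements from these chips):

$$\mathrm{DR} \;=\; 20\log_{10}\frac{V_{\text{swing}}}{V_{\text{noise}}} \;=\; 20\log_{10}\frac{1\ \mathrm{V}}{1\ \mathrm{mV}} \;=\; 60\ \mathrm{dB}$$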


Conventional CMOS Cameras: Integrative Photo-detection

- 150 million sold in 2004; 55% annual growth rate to 700 million by 2008
- Power consumption is relatively low (~10s of mW for VGA)
- 2 megapixels is probably the limit of usefulness
- Download bandwidth is a problem (service providers would like more people to download their pictures)
- There is a fear that it will represent the next technology bubble: so much hype, legal problems
- Camera phones are driving the CMOS camera market
- Small (~100 x 100 pixels) imagers with smarts (e.g., motion, color processing) have markets in toys, sensor networks, computer mice, ...


Spike-Based CMOS Cameras: Octopus

[Figures: imaging concept (Vdd_r, reset event, Ic); sample image]


Other approaches: W. Yang, "Oscillator in a Pixel," 1994; J. Harris, "Time to First Spike," 2002

Culurciello, Etienne-Cummings & Boahen, 2003
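The pixel concept can be summarized in a few lines: integrate the photocurrent on a capacitor and emit an event at each threshold crossing, so the event rate tracks light intensity. The constants below are illustrative, not the published circuit values:

```python
# Sketch of an integrate-and-fire ("Octopus"-style) pixel: brighter light
# means faster integration and therefore more frequent events.
import numpy as np

C_INT, V_TH, DT = 10e-15, 1.0, 1e-6    # 10 fF, 1 V threshold, 1 us step

def pixel_events(i_photo, duration=0.01):
    """Return event times (s) for a constant photocurrent i_photo (A)."""
    v, events = 0.0, []
    for n in range(int(duration / DT)):
        v += i_photo * DT / C_INT      # dV = I dt / C
        if v >= V_TH:                  # threshold crossed: event + reset
            events.append(n * DT)
            v = 0.0
    return events

# Event rate scales with photocurrent: ~10x the light, ~10x the events.
print(len(pixel_events(10e-12)), len(pixel_events(100e-12)))
```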

Front-End of Vision Chips: Photoreception Adaptation

Adaptive Phototransduction

(Delbruck, 1994)


After Normann & Werblin, 1974

• Time adaptive (band-pass)
• Voltage domain
• Large dynamic range (9 orders)
• Can be large pixels (caps)
• Can have mismatch

Front-End of Vision Chips: Photoreception

Current Domain Imaging

(Mead et al, 1988)

• Wide dynamic range (9 orders)
• Simple to implement (2 transistors)
• Ideal for computation (KCL)
• Poor matching (10 – 15%)
• Slow turn-off
• Non-linear transfer function

Photo sensitive elements:

Phototransistors: ~100 pA/µm²; Photodiodes: ~1 pA/µm²


How Can We Improve Current-Mode Imagers?

- Linear current-mode APS: the photodiode discharges linearly with light intensity, giving an amplified linear current output from the APS
- Incorporate noise-correction techniques at the focal plane: current-mode correlated double sampling (CDS) improves the image noise characteristics (see the sketch below)
- Easy integration with processing units: convolution, ADC, others
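In software terms, current-mode CDS amounts to sampling each pixel at reset and again after exposure, then differencing, so fixed per-pixel offsets cancel. A minimal sketch with an invented offset model:

```python
# Correlated double sampling: read each pixel once at reset and once after
# exposure; the difference removes fixed per-pixel offsets (e.g., threshold
# voltage mismatch). The offset model here is hypothetical.
import numpy as np

rng = np.random.default_rng(0)
scene = rng.uniform(0.2, 0.8, size=(64, 64))       # "true" light levels
vt_offset = rng.normal(0.0, 0.05, size=(64, 64))   # per-pixel Vt mismatch

reset_sample = 1.0 + vt_offset                     # reading at reset level
signal_sample = 1.0 - scene + vt_offset            # reading after discharge

cds_output = reset_sample - signal_sample          # offsets cancel -> scene
print(np.allclose(cds_output, scene))              # True
```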


Complete Imaging System

$$I_{\text{photo}} = \mu_p C_{OX}\,\frac{W}{L}\left[(V_{\text{photo}}-V_t)\,V_{\text{ref}}-\frac{V_{\text{ref}}^2}{2}\right]$$

$$I_{\text{reset}} = \mu_p C_{OX}\,\frac{W}{L}\left[(V_{\text{reset}}-V_t)\,V_{\text{ref}}-\frac{V_{\text{ref}}^2}{2}\right]$$

$$I_{\text{out}} = I_{\text{photo}}-I_{\text{reset}} = \mu_p C_{OX}\,\frac{W}{L}\,V_{\text{ref}}\left(V_{\text{photo}}-V_{\text{reset}}\right)$$


Pixel Vt variations are eliminated from the final current output!
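A quick symbolic check of that cancellation, treating µpCOX(W/L) as one lumped constant k:

```python
# Verify that V_t drops out of the CDS difference of the two currents above.
import sympy as sp

k, Vref, Vt, Vphoto, Vreset = sp.symbols('k V_ref V_t V_photo V_reset')
I_photo = k * ((Vphoto - Vt) * Vref - Vref**2 / 2)
I_reset = k * ((Vreset - Vt) * Vref - Vref**2 / 2)
print(sp.factor(I_photo - I_reset))   # k*V_ref*(V_photo - V_reset): no V_t
```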

[Figure: measured FPN]


- Image quality has been improved
- Non-linearity due to mobility degradation degrades performance under bright light

Spatial Processing: Image Filtering


Architectural Concept: Visual Receptive Fields


Architectural Concept: Visual Receptive Fields

[Figure: high-resolution imaging array with programmable scanning registers]


[Figures: parallel processed images; spatiotemporal receptive fields]

Etienne-Cummings, 2001

Results – Spatial Image Processing

[Figure panels: 1. vertical edge detection (3x3); 2. horizontal edge detection (3x3); 3. Laplacian filter (3x3); 4. intensity image; 6. vertical edge detection (5x5); 7. horizontal edge detection (5x5); 8. Laplacian filter (5x5); 9. Gaussian filter (5x5)]

[Enhanced imaging: 1. intensity image; 2. horizontal edges; 3. enhanced image = intensity + horizontal edge image]
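For reference, the listed filters correspond to ordinary 2-D convolutions; a software stand-in using standard kernel values (the chip's programmable coefficients are quantized to ±3.75 in 0.25 steps, which these satisfy):

```python
# Software equivalents of the focal-plane filters: edge detection, Laplacian
# and image enhancement as plain convolutions (scipy used for brevity).
import numpy as np
from scipy.ndimage import convolve

img = np.random.rand(64, 64)

vert_3x3 = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], float)
lap_3x3 = np.array([[0, -1, 0], [-1, 4, -1], [0, -1, 0]], float)

vert_edges = convolve(img, vert_3x3)       # vertical edges
horiz_edges = convolve(img, vert_3x3.T)    # horizontal edges
laplacian = convolve(img, lap_3x3)         # Laplacian
enhanced = img + horiz_edges               # intensity + horizontal edges
```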


Results – Spatial Image Processing

3 x 3 Kernels


5 x 5 Kernels

Summary

Technology: GIP v1: 1.2 µm Nwell CMOS | GIP v2: 1.5 µm Nwell CMOS
No. Transistors: GIP v1: 6K | GIP v2: 13K
Array Size: GIP v1: 16 x 16 | GIP v2: 42 x 35
Pixel Size: GIP v1: 30 µm x 30 µm | GIP v2: 20 µm x 20 µm
FPN (STD/Mean): GIP v1: 2.5% (average) | GIP v2: 2.1% (average)
Fill Factor: GIP v1: 20% | GIP v2: 35%
Dynamic Range: 1 – 6000 Lux (both)
Frame Rate: DC – 400 KHz (both)
Kernel Sizes: 2 x 2 up to the whole array (both)
Kernel Coefficients: ±3.75 in steps of 0.25 (both)
Coeff. Precision: GIP v1: intra-processor <0.5% | GIP v2: inter-processor <2.5%
Temporal Delay: GIP v1: 1% decay in 150 ms @ 800 Lux | GIP v2: NA
Power: 5 x 5 kernel: ~1 mW @ 20 kfps (both)
Computation Rate (add and multiply): 5 x 5 kernel: 1 GOPS/mW @ 20 kfps (both)

Spatiotemporal Processing: Change & Motion Detection


Motivation: Free Space Laser Communication


Motivation


• Flexible control of exposure, inter-frame delay, and read-out synchronization
• Low fixed-pattern noise on the current and previous images
• High-speed, high-resolution, high-accuracy, pitch-matched Temporal Difference Imager (TDI) (see the sketch below)
• Pipelined readout mechanism for improved read-out rate and temporal-difference accuracy
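A minimal model of the temporal-difference readout, assuming a simple per-pixel threshold for the rejection band (names and threshold are illustrative):

```python
# Temporal Difference Imager in software terms: difference the current and
# previous frames and flag ON-set / OFF-set pixels outside a rejection band.
import numpy as np

def temporal_difference(prev, curr, band=0.05):
    diff = curr.astype(float) - prev.astype(float)
    on_set = diff > band       # brightness increased beyond the band
    off_set = diff < -band     # brightness decreased beyond the band
    return diff, on_set, off_set
```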


Photo Pixel Designs

Pixel Size: TDI v1: 25 µm x 25 µm | TDI v2: 25 µm x 25 µm
Fill Factor: TDI v1: 30% | TDI v2: 50%
FPN: TDI v1: 0.5% of saturation | TDI v2: 0.15% of saturation
Technology: TDI v1: 0.5 µm (SCMOS) | TDI v2: 0.35 µm (native)

Results and Measurements


Results and Measurements


New Change Detection Chip


On-Set and Off-Set Imaging

Narrow Rejection Band


Wide Rejection Band

Video Compression


Video Reconstruction


Spectral Processing: Color Object Recognition


RGB to HSI: Why?

$$r = \frac{I_{\text{bias}}\,R}{R+G+B};\qquad g = \frac{I_{\text{bias}}\,G}{R+G+B};\qquad b = \frac{I_{\text{bias}}\,B}{R+G+B}$$

$$\text{Sat}(R,G,B) = I_{\text{bias}}\left[\,1-\min(r,g,b)\,\right]$$

$$\text{Hue}(R,G,B) = \arctan\!\left(\frac{X}{Y}\right) = \arctan\!\left(\frac{0.866\,(G-B)}{R-\frac{G+B}{2}}\right)$$


Etienne-Cummings et al., 2002
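A direct software transcription of the expressions above, with I_bias set to 1 and intensity taken as the usual (R+G+B)/3 (an assumption, since the slide does not spell it out):

```python
# RGB -> HSI conversion following the slide's normalization, saturation and
# arctangent hue formulas.
import numpy as np

def rgb_to_hsi(R, G, B, i_bias=1.0):
    total = R + G + B
    r, g, b = (i_bias * R / total, i_bias * G / total, i_bias * B / total)
    sat = i_bias * (1.0 - np.minimum(np.minimum(r, g), b))
    # atan2 keeps the hue in the correct quadrant over the full 0-360 range
    hue = np.degrees(np.arctan2(0.866 * (G - B), R - (G + B) / 2.0)) % 360.0
    intensity = total / 3.0
    return hue, sat, intensity

print(rgb_to_hsi(1.0, 0.0, 0.0))   # pure red -> hue ~0 degrees, sat 1
```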

Examples: Chroma-Based Object Identification

[Figures: skin identification; fruit identification; "learned" templates]


Chip Block Diagram

- Block-addressable color imager
- White correction and R,G,B scaling
- R,G,B normalization
- R,G,B to HSI conversion
- HSI histogramming for an image block
- Stored "learned" HSI templates
- SAD template matching


Hue Computation

$$\text{Hue}(R,G,B) = \arctan\!\left(\frac{X}{Y}\right) = \arctan\!\left(\frac{0.866\,(G-B)}{R-\frac{G+B}{2}}\right)$$

Hue Computation

[Plot: RGB-to-HSI transformation — measured hue (0 – 360°) versus chip-computed hue bins (10° resolution, 36 bins)]


Hue Based Segmentation


HSI Histogramming

- Filters saturation and intensity values
- Non-linear RGB-to-Hue transformation using analog-to-digital look-up
- Hue histogram constructed by counting the number of pixels in a block that map to each hue bin (see the sketch below)
- 36 x 12-bit template per block
- Programmable bin assignment in the next version
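A sketch of that histogramming step; the threshold values are invented, while the 36 ten-degree bins follow the slide:

```python
# Hue histogramming: pixels passing the saturation and intensity thresholds
# vote into one of 36 ten-degree bins, forming the block's signature.
import numpy as np

def hue_histogram(hue, sat, inten, sat_th=0.2, int_th=0.1):
    valid = (sat > sat_th) & (inten > int_th)    # drop unreliable pixels
    bins = (hue[valid] // 10).astype(int) % 36   # 10-degree bins, 0..35
    return np.bincount(bins, minlength=36)       # 36-entry count vector
```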


Template Matching

[Plot: template matching results — SAD score versus image segment block index, with the matching threshold marked]

$$\mathrm{SAD}_k = \sum_{i,j}\left|\,I_{i,j}-T_{i,j,k}\,\right|$$
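And the matching step in software terms; the threshold value is illustrative:

```python
# SAD template matching: compare a block's hue histogram against each stored
# template and accept the best match only if it clears the threshold.
import numpy as np

def match_template(histogram, templates, threshold=300):
    sads = np.abs(templates - histogram).sum(axis=1)  # one SAD per template
    best = int(sads.argmin())
    return best if sads[best] < threshold else None   # None = no match
```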


Color-Based Object Recognition


Summary

Technology: 0.5 µm 3M CMOS
Array Size (R,G,B): 128 (H) x 64 (V)
Chip Area: 4.25 mm x 4.25 mm
Pixel Size: 24.85 µm x 24.85 µm
Fill Factor: 20%
FPN: ~5%
Dynamic Range: >120 dB (current mode)
Region-of-Interest Size: 1 x 1 to 128 x 64
Color Current Scaling: 4 bits
Hue Bins: 36, each 10 degrees wide
Saturation: analog (~5 bits), one threshold
Intensity: analog (~5 bits), one threshold
Histogram Bin Counts: 12 bits/bin
Template Size: 432 bits (12 x 36 bits)
No. Stored Templates: 32 (13.8 Kbits SRAM)
Template Matching (SAD): 4 parallel SADs, 18-bit results
Frame Rate: array scan: ~2K fps; HSI computation: ~30 fps
Power Consumption: ~1 mW @ 30 fps on 3.3V supplies


Some Conclusions

• Block-serial, pixel-parallel focal-plane Computation-on-Readout (COR) is an alternative style of neuromorphic image processing
  - Computation for "free", high-fidelity images, compact, low power, high speed, reconfigurable, multiple parallel kernels, can be iterated
• Although COR can be used for both voltage- and current-mode imagers, current-mode image processing is better suited to focal-plane implementation
  - Linearize the photocurrent and perform CDS to remove FPN
• Many different algorithms that are compatible with standard machine vision can be implemented with COR

Computational Sensory Motor Systems Lab Johns Hopkins University