Low-power Wireless Video System

Advisor: Professor Alex Doboli

Students: Christian Austin, Artur Kasperek, Edward Safo

Objective

Establish a low-power wireless client/server streaming video system.

  Use a multimedia standard amenable to wireless networks.

Apply hardware software co-design techniques to reduce the power used by the system’s clients.

Server: runs the video server; maintains a database of MPEG-4 files.

802.11b 11 Mbit access point: streams MPEG-4 video (MPEG-4 elementary stream) from the server over the wireless link to the laptop/PDA.

Laptop/PDA: runs MPEG-4 client software.

Hardware/Software Co-Design

Design methodology that splits a computer system’s design between hardware and software in an effort to improve some feature of the system.

 Partitioning targets low power consumption in this design.

 Achieved by relocating the functionality of high power sections of code to specialized hardware.

Project Flow

Decide on a multimedia standard.

Software.

 Design software from scratch .

 Find and analyze existing software.

Hardware.

 Isolate high power sections of software for a hardware port.

 Determine a hardware architecture.

Functional testing and hardware power analysis.

 Hardware tuning for lower power consumption.

Multimedia Standard

MPEG-4 was a good match for the system’s requirements.

What is MPEG-4?

 Object based video compression and decoding standard.

 New object based compression technique compresses objects, rather than frames.

 Objects are distinct entities in a scene; information can be associated with each one.

 Builds on previous MPEG and H.263 standards.

MPEG-4 Framework

[Figure: MPEG-4 client framework. MPEG-4 data streams (network streams and local streams) enter the DMIF layer, which passes protocol independent streams to the sync layer. The sync layer delivers audio streams, video streams, and scene description streams. The audio decoder and video decoder produce audio objects and video objects, which the scene compositor combines with the scene description information into multimedia content for the monitor.]

Why Use MPEG-4?

Non-proprietary standard.

High compression makes streaming over low-bandwidth networks (e.g. wireless) practical.

Adjustable resolution coding allows for video continuity/quality trade off.

 High bit-rate yields better quality video at the expense of lost frames…

Robust error resilience over noisy channels.

Emerging standard.

 Superset of previous MPEG standards.

Object Based Compression

Video Scenes defined as a composition of objects in space at an instant in time.

  Object color defined by pixel chrominance and luminance values; shape is defined by an alpha mask.

Object and bounding rectangle called Video Object Plane (VOP).

Each object compressed separately.

 Main reason for improved compression.

Block based encoding scheme extended to handle arbitrarily shaped objects.

Compression Illustration

[Figure: an object inside its bounding rectangle, with a transparent macroblock, a boundary macroblock, and an opaque macroblock labeled.]

Transparent Macroblocks.

 Carry no information.

Boundary Macroblocks.

 Compressed using block based scheme after padding.

Opaque Macroblocks.

 Compressed as is using block based scheme.
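The three macroblock classes can be illustrated with a minimal Python sketch. It assumes a binary alpha mask stored as a list of rows and dimensions that are multiples of the macroblock size; the function name `classify_macroblocks` and the tiny 4x4 macroblock size in the example are illustrative choices, not part of the MPEG-4 reference software.

```python
def classify_macroblocks(alpha, mb_size=16):
    """Label each macroblock of a binary alpha mask.

    alpha is a list of rows; a nonzero entry means the pixel belongs
    to the object. Width and height are assumed to be multiples of
    mb_size (an assumption of this sketch).
    """
    labels = []
    for top in range(0, len(alpha), mb_size):
        row_labels = []
        for left in range(0, len(alpha[0]), mb_size):
            # Count the object pixels that fall inside this macroblock.
            inside = sum(
                1
                for y in range(top, top + mb_size)
                for x in range(left, left + mb_size)
                if alpha[y][x]
            )
            if inside == 0:
                row_labels.append("transparent")  # carries no information
            elif inside == mb_size * mb_size:
                row_labels.append("opaque")       # coded as is
            else:
                row_labels.append("boundary")     # padded before coding
        labels.append(row_labels)
    return labels


# Example with 4x4 macroblocks for brevity: one empty block,
# one half-covered block, one fully covered block.
mask = [[0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1] for _ in range(4)]
labels = classify_macroblocks(mask, mb_size=4)
print(labels)
```

Only the boundary and opaque macroblocks go on to the block based coding path, which is where the saving over coding the whole bounding rectangle comes from.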

Software Decisions

Used open source MPEG-4 client and server software.

 Darwin Streaming Server by Apple.

 MPEG4IP, an open source project at Sourceforge.

Why Open Source?

 Implementation of a video server was not an objective.

 Design of software from scratch was not practical given the time constraints.

Locating Power Intensive Code

Hardware power measurement.

 Accurate measurement requires expensive hardware.

Power measurement using software.

 Instruction level power estimation.

 SimplePower developed at Penn State.

Software profiling.

  No direct power measurements.

Begin looking for high power sections in the computationally intensive areas of the code.

 GPROF or Visual Studio.
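As a rough illustration of instruction level power estimation, the sketch below weights instruction counts by a per-instruction energy cost and ranks routines by the total. This is a generic model, not SimplePower's. The energy table, the instruction counts, and the routine names are hypothetical placeholders, not measurements from this project.

```python
# Hypothetical per-instruction energy costs in nanojoules.
# Placeholder values for illustration only, not measured data.
ENERGY_NJ = {"alu": 1.0, "mul": 3.0, "load": 5.0, "store": 5.0}


def estimate_energy_nj(instruction_counts, energy_table=ENERGY_NJ):
    """Sum (count x cost) over all instruction classes."""
    return sum(energy_table[op] * n for op, n in instruction_counts.items())


# Hypothetical instruction mixes for two decoder routines.
profiles = {
    "idct": {"alu": 4000, "mul": 4096, "load": 8192, "store": 64},
    "header_parse": {"alu": 300, "mul": 0, "load": 50, "store": 10},
}

# Rank routines from most to least estimated energy.
ranking = sorted(
    profiles,
    key=lambda name: estimate_energy_nj(profiles[name]),
    reverse=True,
)
print(ranking)
```

With a profiler alone, only the instruction or call counts are available, which is why profiling gives an indirect indication of power rather than a measurement.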

The Inverse Discrete Cosine Transform (IDCT)

Highly utilized code.

 Used each time a macroblock is decoded.

Computationally Intensive.

 Inherent nested loop structure.

High frequency of memory accesses.

 Results in elevated power consumption.

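$$f(x,y) = \frac{2}{N} \sum_{u=0}^{N-1} \sum_{v=0}^{N-1} c_u \, c_v \, F(u,v) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right]$$

$$c_u = \begin{cases} \dfrac{1}{\sqrt{2}} & u = 0 \\ 1 & \text{otherwise} \end{cases} \qquad c_v = \begin{cases} \dfrac{1}{\sqrt{2}} & v = 0 \\ 1 & \text{otherwise} \end{cases}$$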
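The nested loop structure can be seen in a direct Python transcription of the 2-D IDCT definition, assuming the standard form with scale factor 2/N and c_0 = 1/sqrt(2). This is a reference sketch for clarity, not the decoder's actual code; the names `idct_2d` and `c` are illustrative.

```python
import math

N = 8  # block size used by the decoder


def c(k):
    """Normalization factor: 1/sqrt(2) for index 0, otherwise 1."""
    return 1.0 / math.sqrt(2.0) if k == 0 else 1.0


def idct_2d(F):
    """Direct 2-D IDCT of an N x N coefficient block F[u][v]."""
    f = [[0.0] * N for _ in range(N)]
    for x in range(N):
        for y in range(N):
            total = 0.0
            for u in range(N):
                for v in range(N):
                    total += (
                        c(u) * c(v) * F[u][v]
                        * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                        * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                    )
            f[x][y] = 2.0 / N * total
    return f


# A block with only a DC coefficient decodes to a flat block.
dc_block = [[0.0] * N for _ in range(N)]
dc_block[0][0] = 8.0
pixels = idct_2d(dc_block)
print(pixels[0][0], pixels[7][7])
```

For N = 8 the innermost statement runs 8^4 = 4096 times per block, each time reading a coefficient from memory, which is consistent with the IDCT being both computationally intensive and memory intensive.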

IDCT in an MPEG-4 Decoder

An MPEG-4 decoder consists of more than the IDCT.

[Figure: MPEG-4 decoder block diagram. Shape stream (alpha mask data): shape decoder. Motion stream: motion compensation. Texture stream (macroblock data): variable length decoding, inverse scan, DC and AC prediction, inverse quantization, IDCT. The results feed VOP construction.]

Hardware Requirements

An economical FPGA with a large gate equivalence.

A fast interface to the FPGA.

 The hardware will implement a time critical function of an MPEG-4 decoder.

Peripheral memory, which the FPGA can use as a buffer for IDCT blocks.

Spartan-II 200 PCI Board

200,000 gate equivalent Xilinx Spartan-II FPGA.

32-bit PCI interface.

8 MB on-board memory.

JTAG interface.

ISP PROM.

PCI Core

PCI was the best solution for a high transfer rate interface.

Need to interface IDCT design to PCI Bus.

Xilinx LogiCore provides a PCI front end for the IDCT design.

 Abstracts the details of the PCI specification away from the IDCT design.

Hardware Implementation

IDCT hardware design considerations.

 Low power is primary concern, but design size and speed are also important.

Procedure.

 Design an IDCT architecture in terms of a functional unit block diagram.

 Code the design in VHDL.

 Write a driver with an API that maps to the hardware’s functions.

 Synthesize and place and route the design.

IDCT Architecture

Transforms an 8X8 block of DCT coefficients.

Uses onboard memory as buffer for fetching and storing inputs.

 Less CPU intervention.

Performs two 1-D IDCTs.

 First half of data path performs a 1-D IDCT on each row vector of the 8X8 input block.

 Row results are stored in an 8X8 register, transposed, and used as inputs to the second half of the data path.

 Second half of data path performs another 1-D IDCT on each column vector of its 8X8 input matrix, completing the 2-D IDCT of the block.
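The row-column decomposition can be checked with a small Python sketch: a 1-D IDCT on each row, a transpose, and a second 1-D IDCT pass. It assumes the orthonormal 1-D IDCT with scale factor sqrt(2/N), so that two passes give the 2/N factor of the 2-D transform. The function names are illustrative, and the final transpose only restores the index order in software.

```python
import math

N = 8


def idct_1d(S):
    """Orthonormal 1-D IDCT of a length-N coefficient vector."""
    out = []
    for x in range(N):
        total = 0.0
        for u in range(N):
            cu = 1.0 / math.sqrt(2.0) if u == 0 else 1.0
            total += cu * S[u] * math.cos((2 * x + 1) * u * math.pi / (2 * N))
        out.append(math.sqrt(2.0 / N) * total)
    return out


def transpose(M):
    """Swap rows and columns of a square matrix."""
    return [list(col) for col in zip(*M)]


def idct_2d_row_column(F):
    """2-D IDCT of F[u][v] as two 1-D passes with a transpose between."""
    first_pass = [idct_1d(row) for row in F]              # 1-D IDCT on rows
    second_pass = [idct_1d(row) for row in transpose(first_pass)]
    return transpose(second_pass)                         # back to f[x][y]


# Single AC coefficient at (u, v) = (1, 2).
F = [[0.0] * N for _ in range(N)]
F[1][2] = 1.0
out = idct_2d_row_column(F)
print(out[0][0])
```

Each pass costs 8 transforms of 64 multiplications, so the two passes need 1024 multiplications per block instead of the 4096 of the direct double sum.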

Architecture Block Diagram

[Figure: IDCT data path block diagram. A memory input control unit (control, data, address) and a coefficient table feed the first stage. In each stage the even inputs (d0, d2, d4, d6) and the odd inputs drive eight "4 mult/3 add" units (A0.even to A3.even and A0.odd to A3.odd) that produce x0 to x7. Four butterfly units combine the pairs (x0, x4), (x1, x5), (x2, x6), (x3, x7) into the output pairs (D0, D7), (D1, D6), (D2, D5), (D3, D4). The first stage writes to the 8X8 transpose register; the second stage repeats the same structure and writes to the memory output control unit (control, data, address). Pipeline registers Reg 1 to Reg 4 separate the stages. Insets: a "4 mult/3 add" unit takes four coefficient/data pairs into four multipliers followed by adders; a butterfly unit applies one adder and one subtractor to inputs In1 and In2. An IDCT control block with control inputs and control outputs sequences the data path.]

Architecture Features

Pipelined design for increased throughput and power reduction.

Exploits Symmetry of IDCT coefficient matrix.

 Breaks 8X8 matrix operation into two 4X4 matrix operations and butterfly operations.

Parallel multiply and addition operations perform two 4X4 matrix multiplications in parallel.

 Speeds up the IDCT’s repetitive matrix operations.
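The symmetry exploited here can be sketched in Python for one 8-point 1-D IDCT: the even-indexed and odd-indexed inputs each go through a 4X4 matrix product, and butterflies (one add, one subtract) combine the two halves into outputs n and 7-n. The names `EVEN`, `ODD`, and `mult4_add3` are illustrative. The sketch uses floating point for clarity; the arithmetic precision of the hardware is not stated in the slides.

```python
import math

N = 8


def scale(k):
    """Orthonormal 1-D IDCT weight for coefficient index k."""
    return math.sqrt(2.0 / N) * (1.0 / math.sqrt(2.0) if k == 0 else 1.0)


# 4x4 coefficient matrices: row n (output index 0..3), column j.
EVEN = [[scale(2 * j) * math.cos((2 * n + 1) * (2 * j) * math.pi / (2 * N))
         for j in range(4)] for n in range(4)]
ODD = [[scale(2 * j + 1) * math.cos((2 * n + 1) * (2 * j + 1) * math.pi / (2 * N))
        for j in range(4)] for n in range(4)]


def mult4_add3(coeffs, data):
    """One '4 mult/3 add' unit: four products summed by three adders."""
    p = [cf * d for cf, d in zip(coeffs, data)]
    return (p[0] + p[1]) + (p[2] + p[3])


def idct_1d_even_odd(X):
    """8-point IDCT via two 4x4 products and four butterflies."""
    even_in, odd_in = X[0::2], X[1::2]
    e = [mult4_add3(EVEN[n], even_in) for n in range(4)]
    o = [mult4_add3(ODD[n], odd_in) for n in range(4)]
    out = [0.0] * N
    for n in range(4):
        out[n] = e[n] + o[n]          # butterfly adder
        out[N - 1 - n] = e[n] - o[n]  # butterfly subtractor
    return out


def idct_1d_direct(X):
    """Reference: full 8x8 matrix-vector product (64 multiplications)."""
    return [sum(scale(k) * X[k] * math.cos((2 * n + 1) * k * math.pi / (2 * N))
                for k in range(N)) for n in range(N)]


X = [10.0, -3.0, 4.0, 1.0, 0.0, 2.0, -5.0, 7.0]
fast, ref = idct_1d_even_odd(X), idct_1d_direct(X)
print(max(abs(a - b) for a, b in zip(fast, ref)))
```

The decomposition works because cos((2(7-n)+1)k*pi/16) equals (-1)^k cos((2n+1)k*pi/16), so the even part is shared and the odd part only changes sign. Eight units of four multiplications give 32 multiplications per 1-D transform instead of 64.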

Power Reduction

Clock Isolation.

 Add logic to isolate sections of the design from the clock when they are not in use.

Glitch reduction.

 Balance the number of synthesized logic levels.

 Duplicate resources instead of sharing them.

 Increase the number of pipeline registers.
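Both techniques can be read against the standard first-order model of dynamic power in CMOS logic. The model is textbook background rather than something stated in the slides, and the symbols are the usual ones: switching activity factor $\alpha$, switched capacitance $C_L$, supply voltage $V_{DD}$, and clock frequency $f_{\text{clk}}$.

```latex
P_{\text{dyn}} = \alpha \, C_L \, V_{DD}^{2} \, f_{\text{clk}}
```

Under this model, clock isolation removes the clock-driven switching of idle sections, and glitch reduction lowers $\alpha$ by eliminating spurious transitions in combinational logic. Neither requires lowering the clock frequency or the supply voltage.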

Goals and Applications

Demonstrate that a low-power wireless video system is practical.

 Design for a power constrained, low bandwidth PDA.

Applications:

Interactive shopping.

 Request video of product information while shopping.

Multimedia preview.

 Preview movie before buying or renting; watch music video while previewing new album.

Any Questions?