Huffman Coding


An implementation using C++ STL (part of the material is due to the work of Mark Nelson, Dr. Dobb’s Journal, January 1996)

Why C++

No special reason: C, Java and many other languages would work as well.
The main advantage of using C++ is the presence of the STL (Standard Template Library).

Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006

What is the STL

It is a library of templates.
A class template is a class that takes a data type as a parameter (by the way, this data type can be primitive, or itself a class).
Examples: vector, queue.
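As a minimal sketch of what "a data type that is a parameter" means, the following defines a small class template and instantiates it with a primitive type and with a class type. The names Stack, demo_int and demo_string are hypothetical and chosen only for this illustration; std::vector is the STL container mentioned above.

```cpp
#include <string>
#include <vector>

// Hypothetical class template: T is the data type passed as a parameter.
template <typename T>
class Stack {
public:
    void push(const T &x) { items.push_back(x); }
    T pop() {
        T x = items.back();   // copy the most recently pushed element
        items.pop_back();     // then remove it
        return x;
    }
    bool empty() const { return items.empty(); }

private:
    std::vector<T> items;     // std::vector is itself an STL class template
};

// The same template instantiated with a primitive type...
int demo_int() {
    Stack<int> s;
    s.push(1);
    s.push(2);
    return s.pop();
}

// ...and with a class type.
std::string demo_string() {
    Stack<std::string> s;
    s.push("a");
    s.push("b");
    return s.pop();
}
```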

What is a queue?

 

FIFO (first in, first out)
Operations: push(), pop()

Very simple to implement, but sometimes we need a more sophisticated form of queue
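A short sketch of the FIFO behaviour using std::queue. Note that in the STL pop() only removes the oldest element and returns nothing, so front() is used to read it first. The function name fifo_order is hypothetical.

```cpp
#include <queue>
#include <vector>

// Pushes 3, 1, 2 and returns the elements in the order they leave the queue.
std::vector<int> fifo_order() {
    std::queue<int> q;
    q.push(3);
    q.push(1);
    q.push(2);

    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.front());  // front() reads the oldest element
        q.pop();                   // pop() removes it (returns void)
    }
    return out;
}
```

The elements come out in insertion order, regardless of their values.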


What is a priority queue?

In a priority queue each element has a priority.
New elements are added through the push() function.
The pop() function gets an element from the queue. In a priority queue this element is not the oldest one, but the element with the maximum priority.
This can be useful in certain types of tasks, for example the scheduler of an operating system.
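For comparison with a plain queue, here is a sketch of the same three insertions into a std::priority_queue. By default the STL treats the largest value as the highest priority; top() reads that element and pop() removes it. The function name priority_order is hypothetical.

```cpp
#include <queue>
#include <vector>

// Pushes 3, 1, 2 and returns the elements in the order they leave the queue.
std::vector<int> priority_order() {
    std::priority_queue<int> q;    // default ordering: largest element first
    q.push(3);
    q.push(1);
    q.push(2);

    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.top());    // top() reads the highest-priority element
        q.pop();                   // pop() removes it
    }
    return out;
}
```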


Implementation with a heap

One efficient implementation of a priority queue uses a heap.
A heap is a data structure that keeps queue elements in a partially sorted order.
Insertion and extraction can be accomplished in O(log N) time, where N is the number of elements.
The memory overhead is also almost zero.

Heap characteristics

The elements of a heap are stored in a contiguous array (let’s say, for convenience, from position 1 to position N).
The elements in the array are part of an implicit binary tree: every element x (except the root node) has a parent at location x / 2 (integer division).
Each node has a priority that is greater than or equal to that of both of its children.
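The index arithmetic can be sketched as follows, assuming the 1-based layout described above (slot 0 of the vector is left unused). The child positions 2x and 2x + 1 follow from the parent rule x / 2 and are not stated on the slide; all function names are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// 1-based positions, as in the text: heap[0] is an unused placeholder.
std::size_t parent(std::size_t x)      { return x / 2; }      // integer division
std::size_t left_child(std::size_t x)  { return 2 * x; }
std::size_t right_child(std::size_t x) { return 2 * x + 1; }

// Checks the heap property: no element has a higher priority than its parent.
bool is_max_heap(const std::vector<int> &heap) {
    for (std::size_t x = 2; x < heap.size(); ++x) {
        if (heap[parent(x)] < heap[x]) {
            return false;
        }
    }
    return true;
}
```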

Heap

[Figure: example of a heap]

Heap: push()

[Figure: push() on a heap, showing a swap]
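One common way to implement push() on an array-based heap is to append the new element and swap it with its parent until the heap property holds again. The sketch below assumes that technique and the 1-based layout (slot 0 unused); it is not taken from the slides, and heap_push is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Inserts value into a 1-based max-heap stored in a vector (heap[0] unused).
void heap_push(std::vector<int> &heap, int value) {
    if (heap.empty()) {
        heap.push_back(0);             // create the unused slot 0
    }
    heap.push_back(value);             // new element goes in the last position
    std::size_t x = heap.size() - 1;

    // Swap with the parent while the parent has a lower priority.
    while (x > 1 && heap[x / 2] < heap[x]) {
        std::swap(heap[x / 2], heap[x]);
        x /= 2;
    }
}
```

Each swap moves the element one level up the tree, so at most about log2(N) swaps are needed.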

Heap: pop()

Something similar also happens when we call the pop() function...
Details don’t matter...
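For readers who do want the details, a possible pop() is sketched here under the same assumptions as before (1-based max-heap in a vector, slot 0 unused): the last element replaces the root and is swapped downwards with its larger child. This is a common textbook approach, not necessarily what the STL does internally, and heap_pop is a hypothetical name.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Removes and returns the maximum of a 1-based max-heap (heap[0] unused).
// Precondition: the heap holds at least one element, i.e. heap.size() >= 2.
int heap_pop(std::vector<int> &heap) {
    int top = heap[1];
    heap[1] = heap.back();             // move the last element to the root
    heap.pop_back();

    std::size_t n = heap.size() - 1;   // number of elements left
    std::size_t x = 1;
    while (2 * x <= n) {
        std::size_t c = 2 * x;         // left child
        if (c + 1 <= n && heap[c + 1] > heap[c]) {
            ++c;                       // right child has the higher priority
        }
        if (heap[x] >= heap[c]) {
            break;                     // heap property restored
        }
        std::swap(heap[x], heap[c]);
        x = c;
    }
    return top;
}
```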

STL priority_queue

It implements a priority queue using a heap.
We don’t need to remember all the details about the heap and its re-sorting because... everything is automatic.
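Huffman coding needs the element with the smallest weight first, while std::priority_queue returns the largest by default. One way to reverse the order, assumed here because the node class in the following slides defines operator>, is to pass std::greater as the comparison. The function name smallest_first is hypothetical.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Returns the given weights in the order a min-first priority queue yields them.
std::vector<int> smallest_first(const std::vector<int> &weights) {
    // Third template argument: std::greater makes top() the smallest element.
    std::priority_queue<int, std::vector<int>, std::greater<int>> q;
    for (int w : weights) {
        q.push(w);
    }

    std::vector<int> out;
    while (!q.empty()) {
        out.push_back(q.top());
        q.pop();
    }
    return out;
}
```

std::greater relies on operator> of the element type, which is why a user-defined class stored in such a queue has to provide it.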

Huffman implementation - I

Class node

    class node {
    public:
        int weight;            // weight of the symbol (or of the whole subtree)
        unsigned char value;   // the symbol itself
        const node *child0;    // first child, null in a leaf
        const node *child1;    // second child, null in a leaf

Huffman implementation - II

Constructors

    // Leaf node: a symbol and its weight
    node(unsigned char c=0, int i=-1) {
        value=c;
        weight=i;
        child0=0;
        child1=0;
    }

    // Internal node: its weight is the sum of the weights of its children
    node(const node *c0, const node *c1) {
        value=0;
        weight=c0->weight + c1->weight;
        child0=c0;
        child1=c1;
    }

Huffman implementation - III

Comparison operator

    bool operator>(const node &a) const {
        return weight > a.weight;
    }
    };  // end of class node

The key loop

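    while (q.size()>1) {
        // take the two nodes with the smallest weights...
        node *child0 = new node(q.top());
        q.pop();
        node *child1 = new node(q.top());
        q.pop();
        // ...and put back a new node that has them as children
        q.push(node(child0, child1));
    }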
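Putting the pieces together, the following is a sketch of how the node class and the key loop might be combined into a function that returns a code for each symbol. Several parts are assumptions not shown on the slides: the queue declaration with std::greater, the frequency table passed as a std::map, the helper collect_codes, the function name huffman_codes, and the convention that child0 corresponds to bit 0 and child1 to bit 1. As in the loop above, nodes allocated with new are never freed, which is acceptable only in a short demonstration.

```cpp
#include <functional>
#include <map>
#include <queue>
#include <string>
#include <vector>

class node {
public:
    int weight;
    unsigned char value;
    const node *child0;
    const node *child1;

    node(unsigned char c = 0, int i = -1)
        : weight(i), value(c), child0(0), child1(0) {}
    node(const node *c0, const node *c1)
        : weight(c0->weight + c1->weight), value(0), child0(c0), child1(c1) {}

    bool operator>(const node &a) const { return weight > a.weight; }
};

// Hypothetical helper: walks the tree and records the path to every leaf,
// writing '0' when descending into child0 and '1' for child1 (assumed convention).
void collect_codes(const node *n, const std::string &prefix,
                   std::map<unsigned char, std::string> &codes) {
    if (!n->child0 && !n->child1) {                 // leaf: store its code
        codes[n->value] = prefix.empty() ? "0" : prefix;
        return;
    }
    collect_codes(n->child0, prefix + "0", codes);
    collect_codes(n->child1, prefix + "1", codes);
}

// Hypothetical driver: builds the tree from symbol frequencies.
std::map<unsigned char, std::string>
huffman_codes(const std::map<unsigned char, int> &freq) {
    // Assumed declaration: std::greater makes top() the lightest node.
    std::priority_queue<node, std::vector<node>, std::greater<node>> q;
    for (const auto &kv : freq) {
        q.push(node(kv.first, kv.second));
    }

    // The key loop: merge the two lightest nodes until one tree remains.
    while (q.size() > 1) {
        node *child0 = new node(q.top());
        q.pop();
        node *child1 = new node(q.top());
        q.pop();
        q.push(node(child0, child1));
    }

    std::map<unsigned char, std::string> codes;
    if (!q.empty()) {
        collect_codes(&q.top(), "", codes);
    }
    return codes;
}
```

With frequencies a: 5, b: 2, c: 1, the two rarest symbols are merged first, so a receives a one-bit code while b and c receive two-bit codes.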

and now, let’s have a look at the results...
