Transcript Huffman Coding
Huffman Coding
An implementation using C++ STL (part of the material is due to the work of Mark Nelson, Dr. Dobb’s Journal, January 1996)
Why C++
No special reason, also C, Java and many more The main advantage of using C++ is the presence of the STL (Standard Template Library) Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 2
What is the STL
It is a library of templates A template is a class which has a data type that is a parameter (by the way, this data type can be primitive, or itself a class) vector
What is a queue?
FIFO Operations: push(), pop()
Very simple to implement, but sometimes we need a more sophisticated form of queue
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 4
What is a priority queue?
In a priority queue each elements has a priority New elements are added through the function
push()
pop() function gets an element from the queue. In a priority queue this element is not the oldest, but the element with the maximum priority This could be useful in certain types of tasks, for example the scheduler of an operating system Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 5
What is a priority queue?
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 6
Implementation with a heap
One efficient implementation of a priority queue uses a heap A heap is a data structure that keeps queue elements in a partially-sorted order.
Insertion and extraction can be accomplished in log(N) time, where N is the number of elements Also memory overhead is almost null Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 7
Heap characteristics
The elements of a heap are stored in a contiguous array (let’s say, for convenience, from position 1 to position N) The elements in the array are part of an
x
the root node) has a parent at location
x
/ 2 Each node has a priority that is greater then or equal to both of each children Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 8
Heap
Heap: push()
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006
swap
10
Heap: pop()
Something similar also happens when we call the pop() functions...
Details don’t matter...
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 11
STL priority_queue
It implements a priority queue using a heap We don’t need to remember all the details about heap and resorting because ... all is automatic Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 12
Huffman implementation - I
Class node { public: int weight; unsigned char value; const node *child0; const node *child1; Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 13
Huffman implementation - II
Constructors node(unsigned char c=0, int i=-1) { value=c; weight=i; child0=0; child1=0; } node(const node *c0, const node *c1) { value=0; weight=c0->weight + c1->weight; child0=c0; child1=c1; } Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 14
Huffman implementation - III
Comparison operator } bool operator>(const node &a) const { return weight > a.weight; Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 15
The key loop
while (q.size()>1) { node *child0 = new node(q.top()); q.pop(); node *child1 = new node(q.top()); q.pop(); q.push(node(child0, child1)); }
and now, let’s have a look at the results...
Gabriele Monfardini - Corso di Basi di Dati Multimediali a.a. 2005-2006 16