
File Compression Using the CUDA Framework
Brandon Grant
Tomas Mann
Florida State University
Spring 2010 Multicore Programming
CIS4930
Contents
• Introduction
• Implementation
• Performance
• Conclusions
Introduction
• Why is compression important?
– Limited disk space
– Shrink bloated and redundant files
– Expedite file transfer
Introduction
• What benefits can CUDA bring to the table?
– Nvidia Compute Unified Device Architecture
– Massively parallel GPU
– Can run thousands of threads
• We want to implement a parallel lossless compression algorithm that can compress larger files faster by taking advantage of parallelism.
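As a point of reference, and not taken from the original slides, a minimal CUDA sketch of a single kernel launch shows how thousands of threads are put to work at once; the kernel name, buffer size, and block size below are illustrative assumptions.

#include <cstdio>
#include <cuda_runtime.h>

// Illustrative kernel: one thread per input byte.
__global__ void touchBytes(unsigned char *data, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;   // global thread index
    if (i < n)
        data[i] ^= 0;                                // placeholder work
}

int main() {
    const int n = 1 << 20;                           // 1 MiB of data
    unsigned char *d_data;
    cudaMalloc((void **)&d_data, n);

    // 256 threads per block; enough blocks to cover every byte,
    // i.e. 4,096 blocks and over a million threads in flight.
    int threads = 256;
    int blocks = (n + threads - 1) / threads;
    touchBytes<<<blocks, threads>>>(d_data, n);
    cudaDeviceSynchronize();

    printf("launched %d blocks of %d threads\n", blocks, threads);
    cudaFree(d_data);
    return 0;
}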
Implementation
• Modified Burrows-Wheeler Compression
– Implement a sequential version to identify individual tasks.
– Identify potentially parallelizable tasks.
– Implement parallel versions of these tasks and use them to replace their sequential counterparts.
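To make the sequential starting point concrete, here is a minimal sketch of the naive Burrows-Wheeler transform (build every rotation, sort them, emit the last column); the function name and the "banana" example are our own illustration, not the authors' code.

#include <algorithm>
#include <cstdio>
#include <string>
#include <vector>

// Naive sequential Burrows-Wheeler transform: build all rotations of s,
// sort them, and output the last column plus the index of the original
// string in the sorted rotation table.
static std::string bwt(const std::string &s, int &index) {
    const size_t n = s.size();
    std::vector<std::string> rotations(n);
    for (size_t i = 0; i < n; ++i)
        rotations[i] = s.substr(i) + s.substr(0, i);   // i-th rotation
    std::sort(rotations.begin(), rotations.end());

    std::string last(n, '\0');
    for (size_t i = 0; i < n; ++i) {
        last[i] = rotations[i].back();                 // last column
        if (rotations[i] == s)
            index = static_cast<int>(i);
    }
    return last;
}

int main() {
    int index = -1;
    std::string out = bwt("banana", index);
    printf("BWT = %s, index = %d\n", out.c_str(), index);  // nnbaaa, 3
    return 0;
}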
Implementation
• Parallel: Burrows-Wheeler Transformation
– Computes the Burrows-Wheeler code.
– Computes the index of the original string in the sorted string rotations table.
Algorithm 1: Parallel Burrows-Wheeler
1: s := string for which the thread is responsible; rank := 0
2: for each string x in the list
3:   if x < s
4:     rank = rank + 1
5:   end if
6: end for
7: output[rank] = last character of s
8: if (s == original input sequence)
9:   BW_index = rank
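A minimal CUDA sketch of Algorithm 1 might look like the following. The kernel and helper names, the launch configuration, and the "banana$" example are our own assumptions; it also assumes all rotations are distinct (e.g., the input ends with a unique sentinel such as '$') so that every thread computes a unique rank.

#include <cstdio>
#include <cuda_runtime.h>

// Returns true if cyclic rotation a of text[0..n) sorts before rotation b.
__device__ bool rotationLess(const unsigned char *text, int n, int a, int b) {
    for (int k = 0; k < n; ++k) {
        unsigned char ca = text[(a + k) % n];
        unsigned char cb = text[(b + k) % n];
        if (ca != cb) return ca < cb;
    }
    return false;   // identical rotations; assumed not to occur
}

// One thread per rotation: count how many rotations sort before "mine"
// to get its rank, then write the rotation's last character at that rank.
// Rotation 0 is the original input, so its rank is the BWT index.
__global__ void bwtKernel(const unsigned char *text, int n,
                          unsigned char *out, int *bwIndex) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i >= n) return;

    int rank = 0;
    for (int j = 0; j < n; ++j)
        if (rotationLess(text, n, j, i)) ++rank;

    out[rank] = text[(i + n - 1) % n];   // last character of rotation i
    if (i == 0) *bwIndex = rank;
}

int main() {
    const char *msg = "banana$";          // '$' acts as a unique sentinel
    const int n = 7;
    unsigned char *dText, *dOut;
    int *dIdx, idx = -1;
    cudaMalloc((void **)&dText, n);
    cudaMalloc((void **)&dOut, n);
    cudaMalloc((void **)&dIdx, sizeof(int));
    cudaMemcpy(dText, msg, n, cudaMemcpyHostToDevice);

    bwtKernel<<<(n + 255) / 256, 256>>>(dText, n, dOut, dIdx);

    unsigned char out[8] = {0};
    cudaMemcpy(out, dOut, n, cudaMemcpyDeviceToHost);
    cudaMemcpy(&idx, dIdx, sizeof(int), cudaMemcpyDeviceToHost);
    printf("BWT = %s, index = %d\n", (const char *)out, idx);  // annb$aa, 4
    cudaFree(dText); cudaFree(dOut); cudaFree(dIdx);
    return 0;
}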
Implementation
• Parallel: Huffman Coding
– ASCII table initialization
– Character occurrence counter (sketched below)
– Node sorter
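One of these pieces, the character occurrence counter, can be sketched as a 256-bin histogram built with atomicAdd, one thread per input byte. The kernel name, launch configuration, and example input below are our own illustrative assumptions, not the authors' code.

#include <cstdio>
#include <cuda_runtime.h>

// Character occurrence counter: each thread handles one input byte and
// atomically increments the frequency bin for that byte value.
__global__ void countOccurrences(const unsigned char *data, int n,
                                 unsigned int *freq) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        atomicAdd(&freq[data[i]], 1u);
}

int main() {
    const char *msg = "abracadabra";
    const int n = 11;
    unsigned char *dData;
    unsigned int *dFreq, freq[256];
    cudaMalloc((void **)&dData, n);
    cudaMalloc((void **)&dFreq, 256 * sizeof(unsigned int));
    cudaMemcpy(dData, msg, n, cudaMemcpyHostToDevice);
    cudaMemset(dFreq, 0, 256 * sizeof(unsigned int));

    countOccurrences<<<(n + 255) / 256, 256>>>(dData, n, dFreq);
    cudaMemcpy(freq, dFreq, sizeof(freq), cudaMemcpyDeviceToHost);

    printf("a=%u b=%u c=%u d=%u r=%u\n",
           freq['a'], freq['b'], freq['c'], freq['d'], freq['r']);  // 5 2 1 1 2
    cudaFree(dData);
    cudaFree(dFreq);
    return 0;
}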
Conclusions
• Our parallel algorithm did not show any significant performance advantage over the sequential version.
– The Burrows-Wheeler algorithm is optimized for single-core implementations, so significant performance gains are difficult to realize.
– Memory hierarchy restrictions in CUDA hamper performance.
Conclusions
Compression Times (Seconds)
                English.dic   World95.txt   Delaware.osm
Sequential            3.456         6.114        294.817
Parallel              4.146         6.752        270.002
Bzip2                 0.713         0.806         63.097

Decompression Times (Seconds)
                English.dic   World95.txt   Delaware.osm
Sequential            2.002         2.208         40.308
Parallel              2.747         3.005         38.658
Bzip2                 0.241         0.360          3.460

Original File Sizes (Bytes)
                English.dic   World95.txt   Delaware.osm
Size              4,067,439     2,988,578     79,648,840

Compression Ratios
                English.dic   World95.txt   Delaware.osm
Sequential            0.336         0.242          0.161
Parallel              0.336         0.242          0.161
Bzip2                 0.300         0.193          0.045
Conclusions
Burrows-Wheeler Performance
[Charts: Burrows-Wheeler transform performance for english.dic and world95.txt]
Thank you for viewing our presentation.
QUESTIONS?