Changes

Happy Valley

224 bytes removed, 14:54, 26 February 2018

→‎LZSS

=== Assignment 1 ===

==== LZSS ====

[https://en.wikipedia.org/wiki/Lempel%E2%80%93Ziv%E2%80%93Storer%E2%80%93Szymanski LZSS] (Lempel–Ziv–Storer–Szymanski) is a compression algorithm that belongs to LZ77 family. It attempts to replace a string of symbols with a reference to a dictionary location of the same string.

0.07 27.69 0.02 1 0.02 0.02 init_tree</pre>

<pre>/////////////////// ~~125MB~~ 250MB FILE ///////////////////

[obelavina@localhost gpu610]$ cat 250M.flt

Flat profile:

|}

[[File:~~https://d2mxuefqeaa7sj.cloudfront.net/s_04A25BBEA5534B427FFC3152A69B8313D1EACB7FDCEF46F064943F9A797A2EBF_1519616588207_image~~Lzss_perf.png]]

'''Analysis'''

<pre>gprof ./lzss | gprof2dot | dot | convert - out.png</pre>

[[File:~~https://d2mxuefqeaa7sj.cloudfront.net/s_04A25BBEA5534B427FFC3152A69B8313D1EACB7FDCEF46F064943F9A797A2EBF_1519616497287_out~~Lzss_calls.png]]

The call graph demonstrates that insert_node function takes the bulk of work (83.48%). delete_node takes only 10% even though it’s called nearly as many times as insert_node. It is hard to determine if LZSS is a good candidate for GPU optimisation: data throughput can be quite large but the logic of the algorithm includes some branching which can introduce some issues while porting code to CUDA. I suspect that compress function itself will need to be parallized (rather than insert_node).

Obelavina

68

edits

Changes

Happy Valley

Navigation menu

Personal tools

Namespaces

Variants

Views

More

Search

Navigation

get involved with CDOT

courses

course projects

links

Tools