Huffman Compression

Includes both an implementation of the Huffman compression method and a corresponding CLI. A few limitations hold this project back from being a practical compression tool. It is only intended as a proof of concept.

Usage

huffman [mode] [target] <freq>

`-e`	Encodes the `target` file. Expects only one argument. The encoder creates two files. The first contains the compressed data (a `.huf` file) and the frequency table is written to the other.
`-d`	Decodes the `target` file using the given `freq` table. This operation expects both the `target` file and its corresponding frequency table to have appropriate extensions (`.huf` and `.freq`, respectively). The decompressed data is written to a `.dec` file, which is human-readable.

Sample Files

Each of the files presented below can be found in the /test_files subdirectory in their uncompressed form. Files are sourced from http://textfiles.com, although their names have been changed.

File	Description	Size of `.txt` (bytes)	Size of `.huf` (bytes)
`decind.txt`	The American Declaration of Independence	~83k	~47k
`zork.txt`	An ASCII map aid for the game Zork	2670	1040
`candy.txt`	A massive list of candy recipes	~424k	~240k
`yoda.txt`	ASCII art of Yoda (in Cyrillic characters)	2500	882

Implementation

Both encoding and decoding begin with the construction of a Huffman tree, specific to the given file. First, character frequencies are compiled as an SLL of TreeNode objects. This list is then sorted and assembled into a tree by repeatedly placing the first two nodes under a new parent node until only one root node remains.

To begin encoding, an array of pointers to the tree’s leaf nodes are constructed. The target file is then read character by character. Each character is matched against the array of leaves until the corresponding node is found. The algorithm then walks up the tree until it reaches the root. Left children append a 0 to the bitstream, right children append a 1. Each character’s code is reversed prior to writing, which avoids ambiguity in the decoding process.

Compressed files are read 1 byte at a time. For each byte, the decoder starts at the root of the tree and traverses until it reaches a leaf. Left and right children are chosen based on the individual bits of the data.

(Handled) Error States

The program will fail and exit if…

an invalid command flag is given
wrong number of arguments are supplied for the intended operation
any file argument cannot be opened
the decoder recieves files without appropriate extensions

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
test_files		test_files
.gitignore		.gitignore
README.adoc		README.adoc
huffman.c		huffman.c
huffman.h		huffman.h
tree.c		tree.c
tree.h		tree.h

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Huffman Compression

Usage

Sample Files

Implementation

(Handled) Error States

About

Releases

Packages

Languages

hankotanks/huffman_encoding

Folders and files

Latest commit

History

Repository files navigation

Huffman Compression

Usage

Sample Files

Implementation

(Handled) Error States

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages