So how is this wonderful, magical Huffman tree can be created? Here is the algorithm, as taken from “The Data Compression Book”:
First count the amount of times each character appears, and assign this as a “weight” to each character, or node. Add all the nodes to a LIST.
Then, repeat these steps until there is only one node left:
Find the two nodes with the lowest weights.
Create a parent node for these two nodes. Give this parent node a weight of the sum of the two nodes.
Remove the two nodes from the list, and add the parent node.
This way, the nodes with the highest weight will be near the top of the tree, and have shorter codes. Try the algorithm on the following characters:
You should get a tree similar to the one above.
From this point, you simply get the codes for each character and send the encoded string along with the tree. So the string ‘abc’ will be 000110.
The decoder then can use the Huffman tree to decode the string by following the paths according to the string and adding a character every time it comes to one.