Back
Close
  • 142

Learning Opportunities

This puzzle can be solved using the following concepts. Practice using these concepts and improve your skills.

Statement

 Goal

Huffman Coding is an algorithm for doing data compression and it forms the basic idea behind file compression. Instead of allowing every character to occupy 8 bits in a file, we use variable-length encoding to assign each symbol a unique binary code according to the frequency of the character in the file, without any ambiguities.

To put this into perspective: Suppose a file contains a string “aabacdeade”, where frequency of characters a, b, c, d and e is 4, 1, 1, 2 and 2 respectively. We assign binary codes to each character as follows:
a --> 00	  b --> 010	  c --> 011	  d--> 10	  e--> 11 

The process of encoding can be divided into two parts:

Part 1: Building a Huffman tree
First, assume all of the characters as individual trees with frequency as their weight. Now, we use a greedy approach to find the two trees with the smallest weights. Then, join them to create a new tree with the sum of those two as its weight and repeat this process until we have a single tree remaining.

For the above example:

Step 1: [a] [d] [e] # --> Here: a = 4, d = 2, e = 2, (bc) = 2
/ \
[b] [c]

Step 2: [a] # # --> Here: a = 4, (bc) = 2, (de) = 4
/ \ / \
[b] [c] [d] [e]

Step 3: # # --> Here: (de) = 4, (a(bc)) = 6
/ \ / \
[a] # [d] [e]
/ \
[b] [c]

Step 4: # --> Here: ((de)(a(bc))) = 10
/ \
/ \
/ \
# #
/ \ / \
[a] # [d] [e]
/ \
[b] [c]


Part 2: Assigning binary codes to each symbol by traversing Huffman tree
Generally, bit ‘0’ represents the left child and bit ‘1’
represents the right child

#
0 / \ 1
/ \
/ \
/ \
# #
0 / \ 1 0 / \ 1
/ \ / \
[a] # [d] [e]
0 / \ 1
/ \
[b] [c]

Thus by going through the tree, we will come up with
a = 00, b = 010, c = 011, d = 10, e = 11


Test Case 1
n = 5
frequencies = 4 1 1 2 2
bit length for each test case in order = 2 3 3 2 2 [see above for clarification]
total bit count = 4 * 2 + 1 * 3 + 1* 3 + 2 * 2 + 2 * 2 = 22

0utput : 22



Here are some links to understand it further :
https://www.geeksforgeeks.org/huffman-coding-greedy-algo-3/
https://www.studytonight.com/data-structures/huffman-coding
Input
Line 1: A single integer N representing the number of characters
Line 2: An ordered sequence of frequency values separated by space, where
1 2 3 ... N represents char1 = 1, char2 = 2, char3 = 3 ... charN = N
Output
A single value representing the least number of bits used to store the complete file.
Constraints
1 ≤ N ≤ 3000
Example
Input
5
4 1 1 2 2
Output
22

A higher resolution is required to access the IDE