Can you solve it? Are you brainy at binary?
Last month an example of binary code was cleverly displayed on the door of 10 Downing Street, to promote London Tech Week.
A binary code is a way of encoding information (in this case, letters) in the binary digits 0 and 1, called “bits”. In the standard encoding used on the PM’s door, every letter is represented by eight bits. The first line spells L, the second T and the third W.
Today’s puzzle is about a different type of binary code: one that is used to compress data in the most efficient way possible.
1. ABACAB
Here is a code for the letters A, B and C, in which each letter is represented by two bits.
A: 00
B: 01
C: 10
This code would translate the six-letter message ABACAB into 12-bits: 000100100001
Devise a binary code that translates ABACAB into a string consisting of only 9 bits.
That’s it. I am asking you design a binary code for A, B and C that encodes, or compresses, ABACAB into just 9 bits, which is the fewest possible number of bits.
To clarify: in this code, A, B and C each have their own unique representation in binary.
Also, the code must not be ambiguous. A code is ambiguous if the encoding of two different messages produce the same bit string. For example, a code in which A is 0, B is 1, and C is 01, is not permitted because this would mean that the encodings of both AB and C are “01” . Given any bit string, it must be unambiguously decodable into a message of As, Bs and Cs.
[UPDATE: combinations of letters, such as AB, BC, or AC, are not allowed their own codes. The only codes are for A, B and C.]
2. ABACADABA
Here is a code for the letters A, B, C and D, in which each letter is represented by two bits.
A: 00
B: 01
C: 10
D: 11
This code would translate the message ABACADABA, which has nine letters, into a string consisting of 18 bits: 000100100011000100.
Devise a binary code that translates ABACADABA into a string consisting of only 15 bits.
Same rules as above. Every letter has its own unique code, and these are the only codes. The code is not ambiguous.
Today’s questions were simple to state – and also have a famous pedigree. In 1951, MIT professor Robert Fano, who along with Claude Shannon was one of the fathers of information theory, gave the general version of these questions to graduate student David Huffman:
Given a collection of letters, numbers, or symbols, find the most efficient way to represent them using a binary code.
Shannon and Fano had failed to find an answer – but Huffman found it. This type of coding is now called Huffman coding, and is widely used to compress all sorts of data files.
The answer to both of today’s puzzles is a Huffman coding. If you solve them, congratulations – you are officially smarter than Robert Fano and Claude Shannon. You will have replicated the results of one of the classic algorithms in information theory and computer science. (And if you don’t, you will have the pleasure of discovering the elegant solution to the problem.)
Usually, I reveal the answers to my puzzles at 5pm UK, which is ten hours after I set the questions.
Today, however, I’m trialling a new format, in which the solution is posted AT THE SAME TIME. If you want to read the solution now, please click the link here.
NO SPOILERS PLEASE
Instead talk about binary coding, Genesis, the Steve Miller Band, words using only A, B, C and D, and maybe some feedback about whether or not you prefer having the answers straight away.
Thanks to Pierre Chardaire, a retired computer scientist, for help with this puzzle.
I set a puzzle here every two weeks on a Monday. I’m always on the look-out for great puzzles. If you would like to suggest one, email me.
I give school talks about maths and puzzles (online and in person). If your school is interested please get in touch.