quiz

Question: On old cell phones,users typed on a numeric keypad and the phone would provide a list of words that matched these numbers. Each digit mapped to a set of 0 - 4 letters. Implement an algorithm to return a list of matching words, given a sequence of digits. You are provided a list of valid words (provided in whatever data structure you’d like). The mapping is shown in the diagram below:

EXAMPLE:
Input: 8733
Output: tree, used

(Cracking the coding interview 16.20 T9)

SOLUTION:

Brute Force

Imagine how you would solve the problem if you had to do it by hand. You’d probably try every possible value for each digit with all other possible values.

This is exactly what we do algorithmically.We take the first digit and run through all the characters that map to that digit. For each character, we add it to a prefix variable and recurse, passing the prefix downward. Once we run out of characters, we print prefix (which now contains the full word) if the string is a valid word.

This is very, very slow on large strings.

Optimized

Let’s return to thinking about how you would do this, if you were doing it by hand. Imagine the example of 33835676368 (which corresponds to development).Ifyou were doing this by hand, Ibet you’d skip over solutions that start with fftf [3383]. as no valid words start with those characters.

Ideally, we’d like our program to make the same sort of optimization: stop recursing down paths which will obviously fail. Specifically, if there are no words in the dictionary that start with prefix, stop recursing.

The Trie data structure can do this for us. Whenever we reach a string which is not a valid prefix, we exit.

Runtime is depends on what the language looks like, but this “short-circuiting” will make it run much, much faster in practice.

Most Optimal

Believe or not, we can actually make it run even faster. We just need to do a little bit of preprocessing. That’s not a big deal though. We were doing that to build the trie anyway.

This problem is asking us to list all the words represented by a particular number in T9. Instead of trying to do this “on the fly”(and going through a lot of possibilities, many of which won’t actually work), we can just do this in advance.

Our algorithm now has a few steps:

Pre-Computation:

  1. Create a hash table that maps from a sequence of digits to a list of strings.
  2. Go throug h each word in the dictionary and convert it to its T9 representation (e.g., APPLE -> 27753). Store each of these in the above hash table. For example, 8733 would map to {used, tree}

Word Lookup:

  1. Just look up the entry in the hash table and return the list.
    That’s it!

Note that it’s easy to think, “Oh, linear-that’s not that fast”. But it depends what it’s linear on. Linear on the length of the word is extremely fast. Linear on the length of the dictionary is not so fast.

Reference:
Cracking the Coding Interview, 6th Edition