I found this little challenge in a puzzle booklet and instantly thought, "I wonder if I can get a computer to figure this out for me?" Not long after that first thought, I had a second one: "I wonder if I can get a computer to give me the LONGEST chain of words!"
The source code for this project is on Github.
Version 1: Find Longest Path
I figured that finding the longest chain would probably take a lot of compute, so designing for performance was a priority.
Dictionary Words
We need a list of English words, and we need to be able to quickly check whether a word of a certain length is a real word. For this, I created a hashmap where the key is the word length and the value is a set of all words of that length. A set gives us constant-time lookups, which we definitely need.
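As an illustration, a dictionary along these lines might be loaded like this; the class name, file format (one word per line) and path are my assumptions rather than details from the project:

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class Dictionary {

    // Key: word length, value: all dictionary words of that length.
    private final Map<Integer, Set<String>> wordsByLength = new HashMap<>();

    // wordListPath points at a plain-text file with one word per line (an assumption).
    public Dictionary(String wordListPath) throws IOException {
        for (String line : Files.readAllLines(Paths.get(wordListPath))) {
            String word = line.trim().toUpperCase();
            if (!word.isEmpty()) {
                wordsByLength.computeIfAbsent(word.length(), k -> new HashSet<>()).add(word);
            }
        }
    }

    // Constant-time membership check among words of the same length.
    public boolean contains(String word) {
        return wordsByLength.getOrDefault(word.length(), Set.of()).contains(word);
    }
}
```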
Search Algorithm
Next, I needed to decide on an overall approach for finding the longest path. I decided to use a depth-first search algorithm. This algorithm is good for finding "paths" where the start and end are far apart, and it suits problems where we need to backtrack and try a new path when the current one doesn't work out. A classic example is finding a way through a maze: if one path ends in a dead end, backtrack to the previous junction and try the alternative.
There are two ways to implement a depth-first search: recursion and iteration. Recursion is (arguably) the more elegant approach, but the drawback is that if a function recurses too many times you will get a stack overflow (at least you will in Java). Iteration does not have this limitation, but it cannot backtrack on its own, so an alternative to backtracking is needed.
I chose the iterative implementation, and I handled backtracking by giving each "unit of work" its own self-contained path history. This way, if a chain of words hits the target word we keep that chain; if it hits a dead end we simply drop it.
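To make that concrete, here is a minimal sketch of what the iterative loop might look like, using the Dictionary class sketched above; the PathNode class and the findNextWords helper are fleshed out in the next two sections, and all names are mine rather than taken from the project:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class DepthFirstWordChain {

    // Each stack entry is a self-contained unit of work: a PathNode holds the
    // current word plus a reference back along the chain that produced it.
    public PathNode search(String startWord, String target, Dictionary dictionary) {
        Deque<PathNode> stack = new ArrayDeque<>();
        stack.push(new PathNode(startWord, null));

        while (!stack.isEmpty()) {
            PathNode current = stack.pop();
            if (current.getWord().equals(target)) {
                return current;                // reached the target: keep this chain
            }
            // A dead end contributes no children and is simply dropped.
            for (String next : WordNeighbours.findNextWords(current, dictionary)) {
                stack.push(new PathNode(next, current));
            }
        }
        return null;                           // every path exhausted without reaching the target
    }
}
```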
Remembering the Path
Using the iterative approach to depth-first search meant that I needed to store the complete path from the first word to the current word.
To avoid copying all previous words for each subsequent word, which costs time and memory, I implemented a linked list to store the path. Each word in the chain refers only to the previous word.
As an example, the word PROP has a path of POOP and POOR. The word COOP also has a path of POOP and POOR. There is only one copy of POOR and POOP in memory.
The upshot of this approach is that extending an existing chain is cheap: adding a new word to a chain is an O(1) operation.
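A minimal sketch of such a linked-list node, under the assumption that each node stores only its word and a reference to its predecessor (again, names are illustrative):

```java
public class PathNode {

    private final String word;
    private final PathNode previous;   // null for the starting word

    // Appending is O(1): the new node just points back at the previous one,
    // so PROP and COOP can share a single copy of the POOP -> POOR tail.
    public PathNode(String word, PathNode previous) {
        this.word = word;
        this.previous = previous;
    }

    public String getWord() {
        return word;
    }

    // Walk back along the chain to check whether a word has already been used.
    public boolean contains(String candidate) {
        for (PathNode node = this; node != null; node = node.previous) {
            if (node.word.equals(candidate)) {
                return true;
            }
        }
        return false;
    }
}
```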
Finding Next Words
From one word, there will be multiple possible words which might be the next in the chain. These possible next words can be thought of as child nodes in a tree. The mechanics of finding them are relatively easy but computationally expensive.
We set up a double loop: first we loop through every character position in the given word, then for each position we substitute every letter A-Z. If the new word is in the dictionary and hasn't already been used in the given chain, it is a new child word.
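As a rough illustration of that double loop, reusing the PathNode and Dictionary classes from the earlier sketches (the class and method names are my own, not necessarily those used in the project):

```java
import java.util.ArrayList;
import java.util.List;

public final class WordNeighbours {

    // For each position in the word, try every letter A-Z; keep candidates that
    // are real words and have not already appeared earlier in this chain.
    public static List<String> findNextWords(PathNode current, Dictionary dictionary) {
        String word = current.getWord();
        List<String> children = new ArrayList<>();

        for (int i = 0; i < word.length(); i++) {
            for (char c = 'A'; c <= 'Z'; c++) {
                if (c == word.charAt(i)) {
                    continue;                       // same word, skip
                }
                String candidate = word.substring(0, i) + c + word.substring(i + 1);
                if (dictionary.contains(candidate) && !current.contains(candidate)) {
                    children.add(candidate);
                }
            }
        }
        return children;
    }
}
```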
Result
The bad news is that this approach only worked some of the time, because there are simply too many paths to compute.
Sometimes it would find a path between two words by luck. For example, going from POOR to RICH produced a chain 632 words long.
Other times it would churn through millions of rejected paths before I lost patience and cancelled the search.
Version 2: Find Any Path
I wondered if I could make the program prefer words which are closer to the target word. This would mean we would no longer get the longest chain, but at least we would get a chain.
The program implements this preference by sorting the list of next words before adding them to the stack for processing. The list is sorted by the count of characters in the child word that match the target word (a sketch of this scoring follows the example below).
For example, if the target word is RICH, then:
- PLAY would have a count of 0
- RAIL would have a count of 1
- RICE would have a count of 3
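One way this scoring and sorting could be implemented (the class and method names are hypothetical, and the ordering convention assumes children are then pushed onto the stack in list order):

```java
import java.util.Comparator;
import java.util.List;

public final class TargetSimilarity {

    // Count the positions at which the candidate already matches the target.
    // With target RICH: PLAY -> 0, RAIL -> 1, RICE -> 3.
    public static int matchingCharacters(String candidate, String target) {
        int count = 0;
        for (int i = 0; i < candidate.length() && i < target.length(); i++) {
            if (candidate.charAt(i) == target.charAt(i)) {
                count++;
            }
        }
        return count;
    }

    // Sort ascending by match count: when the sorted children are pushed onto the
    // stack in order, the most target-like word ends up on top and is popped first.
    public static void sortForStack(List<String> children, String target) {
        children.sort(Comparator.comparingInt((String word) -> matchingCharacters(word, target)));
    }
}
```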
Result
This actually worked quite well. The program found a path to the target word for the few example test cases I gave it. This is quite interesting, because the new logic does not guarantee that a path will be found - it's more of a hint - yet in most cases one was. For example, the path from POOR to RICH was now 8 steps long.
The good news is that we generally found a path. The "bad" news is that this path is not necessarily the shortest. For that, we need to make one small yet fundamental change.
Version 3: Find Shortest Path
I needed a breadth-first search algorithm to find the shortest path between two words.
A breadth-first search uses a queue to keep track of which neighbouring node to process next, where a depth-first search uses a stack. The only changes I needed to make were the following (a sketch follows this list):
- Replace the stack with a queue; and
- Reverse the order in which new child words were added to the queue.
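A sketch of the breadth-first version, reusing the PathNode, Dictionary and findNextWords sketches from earlier; this is my rendering of the two changes above, not the project's actual code:

```java
import java.util.ArrayDeque;
import java.util.Deque;

public class BreadthFirstWordChain {

    // Same loop shape as the depth-first version, but with a FIFO queue:
    // shallower chains are fully explored before deeper ones, so the first
    // chain to reach the target is a shortest one.
    public PathNode search(String startWord, String target, Dictionary dictionary) {
        Deque<PathNode> queue = new ArrayDeque<>();
        queue.addLast(new PathNode(startWord, null));

        while (!queue.isEmpty()) {
            PathNode current = queue.removeFirst();
            if (current.getWord().equals(target)) {
                return current;
            }
            // The Version 2 sorting could still be applied here; with a queue the
            // most target-like children would be added first rather than last.
            for (String next : WordNeighbours.findNextWords(current, dictionary)) {
                queue.addLast(new PathNode(next, current));
            }
        }
        return null;
    }
}
```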
Result
Using the sample POOR to RICH example, the program found a path using only five steps.
However, finding the shortest path took a little while, because the search has to consider every child of every child, and so on, as it works down the tree of possible words. The time needed grows exponentially with the length of the shortest chain.
Conclusions
In hindsight, trying to find the longest chain between two words was probably a bit too ambitious. I was held back by the sheer amount of compute required to keep finding new words. The stack depth was around 27,000 and the word length was around 4,000, and as the program churned through all the combinations these numbers wouldn't change much.
If I had to persevere with finding the longest chain, I would introduce multithreading to the mix. The stack would be a source of work, and an execution service would delegate the finding of new child words to a thread pool. This would speed up the program, but from what I've seen there are still too many combinations to compute to really find the longest chain. A distributed compute model could help scale out and process more, but who the hell would go to such effort for such a benign problem?