Skip to main content

Clueless

Clueless is a popular style of crossword puzzle where there are no clues, only numbers.  It is also known as “Code Words” and “Crack the Code”. Each square on the crossword contains numbers and each number maps to a letter and the puzzler needs to figure out which number is which letter. The ultimate goal is to figure out the "final word", seperate from the main puzzle.

Clueless puzzle
Clueless puzzle

This post will describe how to get a computer to solve Clueless puzzles for you.

Clues

The clues themselves can be represented using integer arrays. For example {20, 21, 4, 9, 23, 25}.

That's it.  There is no need to create a two dimensional array to reproduce the layout with all the digits and black squares. There is no need to track whether the clue goes across or down. Depending how the program goes, it might not even be required to input all the clues to get to the final solution. I once solved a puzzle using only four clues.

Writing down all (or even some) of the clues is the most time consuming part of solving these puzzles with a computer.  Once the clues are entered, the computer will solve it in less than the blink of an eye.

Matching Words

The most basic thing we need to figure out is what words fit a clue, and it’s just a little bit more complicated than you’d first imagine.

Firstly, we need is a word list. There are quite a few to be found, for example on Github. I pre-grouped all the words in the word list by length so it’s quick to loop through all three letter words or four letter words, for example.

For the set of words whose length is the same as a clue, we apply a set of rules to see if a word matches.
  1. The known letters must match the word from the word list. This is done using regular expressions. The regex pattern is built from the known letters and dots for unknown letters.
  2. The list of known letters cannot contain duplicate letters. For example, let’s say we have a clue {1,2,3} and we know that 1=E and 2=R. We would need to reject ERR else the letter R would appear twice in the list of known letters. Likewise, ERE would be rejected but ERA is a match.
  3. The last rule is hard to explain but simple to demonstrate. Assume we have a clue {1,2,3,4,4} with known letters 1=T, 2=H. The rule should match THREE and reject THERE. To do this, the rule would need to temporarily assume that 3=R and 4=E and then make sure that the word fits the clue’s digits.
These rules are actually quite powerful and can find unique matches for clues where a human probably wouldn’t spot one. For example, given clue {26, 18, 26, 15, 24, 26, 19, 24, 13} and one known letter 13=E, the only matching word is AVAILABLE. Hats off to any person who can spot that.

Sure Answers

Once we have a good matching algorithm in place we can start finding matches and start solving the puzzle. We’ll start with the simple case where we find clues that have only one possible solution - these are sure answers.

Finding sure answers can be seen as a short set of steps.
  1. Find all matching words for each unsolved clue. Each clue will maintain its own set of matching words.
  2. Sort the list of clues by the number of matching words. Clues with one answer will end up at the start of the list. Assume for now that the clue at the start of the list has only one matched word.
  3. For all the clues which have one sure answer, update the known letters and mark the clue as solved.
  4. If we have updated the list of known letters, repeat steps 1-3 in case we find more sure answers using the newly discovered letters.
This set of steps will keep solving and looping through the clues until hopefully all the clues become solved and the final answer can be solved. This will usually be the case if there is a good number of clues to begin with and “the ball starts rolling”. Once a few letters are solved, the rest usually get solved pretty quickly. However, this is not always the case and that leads us to the final section on how to solve Clueless puzzles. 

Unsure Answers

Sometimes when looking for sure answers, we just don’t have any. At best maybe we have one of two matching words. In this case we need to test solving the puzzle for each possibility. One of those possibilities will lead to the puzzle being solved; the rest will lead to clues that have no matches. In the case of having no matches, we need to go back to the point where we started trying the failed possibility and then switch to the next possibility. This is called backtracking.

The program may need to test unsure answers based on previous unsure answers. The program will need to keep track of and traverse a tree of decisions, backtracking from all failed attempts until a solution is found. The traversal style used for this problem is called depth first search.

The approach I took to solving this problem is called recursion. This is a programming technique where a set of steps can call itself to apply the same set of steps on a (usually) smaller dataset.

The set of recursive steps are:
  1. Find all sure answers. At this stage the list of unsolved clues will be sorted with clues having the least number of matching words at the start of the list. All the sure answers will be solved already so the only unsolved clues left will have either zero or more than one matching words.
  2. If we haven’t taken any guesses yet or if all clues are solved, take a stab at solving the final solution. Sometimes, we know just enough known letters so that the final clue can have only one solution. If we find a solution, great! The program can finish there, else it keeps going. If the program took guesses in the past to get to this point, then it won’t try to solve the final clue because the guess could be wrong which can lead to a wrong final answer.
  3. If the first clue has no matching words, then the program must have taken a previous guess which ended up being incorrect. Backtrack and try the next guess.
  4. If the program gets to this point, the first clue must have multiple matching words. For each of the possible matching words, recursively call step 1.

In action

The Java source code for this program is published on my Github repository

Here is some sample output where the program solved the puzzle without having to take any guesses. You can see how the program keeps repeating to find sure answers until all the clues are solved.

Scanning 39 clues for sure answers
Clue D.AR can only be DEAR
  Solved 12 = E
Scanning 38 clues for sure answers
Clue ..A.ARE can only be UNAWARE
  Solved 4 = U
  Solved 2 = N
  Solved 1 = W
Clue .ARR..R can only be WARRIOR
  Solved 23 = I
  Solved 25 = O
Scanning 36 clues for sure answers
Clue ..UDIO can only be STUDIO
  Solved 20 = S
  Solved 21 = T
Clue E.UA... can only be EQUALLY
  Solved 16 = Q
  Solved 13 = L
  Solved 7 = Y
Clue DAD. can only be DADS
Clue DE.EA. can only be DEFEAT
  Solved 14 = F
Clue E...ORE can only be EXPLORE
  Solved 10 = X
  Solved 24 = P
Clue R...E. can only be RHYMES
  Solved 26 = H
  Solved 18 = M
Clue .ODERN can only be MODERN
Clue DRAWN is already solved
Clue A.RO.A. can only be ACROBAT
  Solved 3 = C
  Solved 5 = B
Clue .NEE.E. can only be SNEEZES
  Solved 6 = Z
Clue I..RO.E can only be IMPROVE
  Solved 8 = V
Clue E.E.EN can only be ELEVEN
Scanning 24 clues for sure answers
Clue OCCUPY is already solved
Clue BANANA is already solved
Clue SHRILL is already solved
Clue CHAIN is already solved
Clue .OURNEY can only be JOURNEY
  Solved 11 = J
Clue MOAN is already solved
Clue WIFE is already solved
Clue ABOVE is already solved
Clue SAMPLE is already solved
Clue RECEIVE is already solved
Clue ASPECT is already solved
Clue RUBBED is already solved
Clue A.O can only be AGO
  Solved 15 = G
Clue TRAINED is already solved
Clue AFFECTS is already solved
Clue ASIDE is already solved
Clue OBEY is already solved
Clue .EPT can only be KEPT
  Solved 17 = K
Clue OXEN is already solved
Clue COU.H can only be COUGH
Clue APPLY is already solved
Clue PULLIN. can only be PULLING
Clue CYCLED is already solved
Scanning 1 clues for sure answers
Clue AGED is already solved
Scanning 0 clues for sure answers
No more sure answers
Attempting to solve final answer
Final clue ENJOYS can only be ENJOYS

Here is a snippet of output where the program could not find sure answers for all the clues then started taking guesses.

Scanning 2 clues for sure answers
No more sure answers
Attempting to solve final answer
Branching out, trying SHRUBS, out of [SHRUBS, SHRUGS]
  Solved 23 = B
Scanning 1 clues for sure answers
Clue BALLOON is already solved
Scanning 0 clues for sure answers
No more sure answers
Attempting to solve final answer
Final clue BLAZES can only be BLAZES



Comments

Popular posts from this blog

Depth First Search - For Babies!

Old MacDonald Had A Farm

I wanted to play around with AWS Polly, which is a service that converts text into speech for you.  What better way to test this out than with an "Old MacDonald Had A Farm" song generator.  But first, I needed to do some research. Song Variations There are actually many different ways to sing this song.  I spent a few minutes listening to various versions and here are some important differences I discovered: The list of animals is different every time. The Wiggles version shortens the previous animal sounds so there is no "here a woof, there a woof, everywhere a woof woof".  They simple sing "woof woof here, woof woof there". The previous animal sounds can either be in order, reversed or left out entirely. Generator Program The code to generate the lyrics is quite simple.  I have it uploaded on Github . For every animal in the given list of animal names/sounds, it writes a verse.  Each verse has a intro lines ("Old MacDonald had a f...

Word Changer

I found this little challenge in a puzzle booklet and instantly thought "I wonder if I can get a computer to figure this out for me?". Not long after that first thought, I had a second thought "I wonder if I can get a computer to give me the LONGEST chain of words!". The source code for this project is on  Github . Version 1: Find Longest Path I figured that finding the longest chain will probably take a lot of compute, so designing for performance was a priority. Dictionary words We need a list of English words and we need to be able to quickly check if a word of a certain length is a real word. For this, I created a hashmap where the key is the length of words and the values are a set of all words of that length. A set gives us constant-time lookups, which we definitely need. Search Algorithm Next, I needed to decide an overall approach to take to find the longest path. For this, I decided to use a depth-first-search algorithm. This algorith...