A fast solver for wordle written in Python using numba.
For the maximum information policy it takes:
- less than 3 seconds to compute the optimal next word
- 30 minutes to compute the optimal starting word considering all solutions
Tests conducted on a MacBook Pro (13-inch, M1, 2020).
Run `demos/demo_solver.py`:

```
Optimal starting guess: reast
-----------------
Enter guess,code: reast,10000
Next guess: courd
-----------------
Enter guess,code: courd,20010
Next guess: crick
-----------------
Enter guess,code: crick,22200
Answer is crimp
```
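The code entered after each guess encodes the tile colours as digits, which from the demo above appears to be 2 for green, 1 for yellow and 0 for grey. Below is a minimal sketch of how such a code can be computed, assuming that convention; it is an illustration, not the package's scoring function:

```python
from collections import Counter

def feedback_code(guess: str, answer: str) -> str:
    """Return a 5-digit code: 2 = green, 1 = yellow, 0 = grey (assumed convention)."""
    code = [0] * len(guess)
    # Pool of answer letters that are not exact matches, used for yellow marks.
    remaining = Counter(a for g, a in zip(guess, answer) if g != a)
    # First pass: greens (exact position matches).
    for i, (g, a) in enumerate(zip(guess, answer)):
        if g == a:
            code[i] = 2
    # Second pass: yellows, consuming the pool so duplicates are handled correctly.
    for i, g in enumerate(guess):
        if code[i] == 0 and remaining[g] > 0:
            code[i] = 1
            remaining[g] -= 1
    return "".join(str(c) for c in code)

print(feedback_code("reast", "crimp"))  # 10000
print(feedback_code("courd", "crimp"))  # 20010
print(feedback_code("crick", "crimp"))  # 22200
```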
Run `demos/demo_game.py`:

```python
# Set up a game with a known answer and let the agent play it out.
g = Game(word="crimp", verbose=True)
agent = StandardAgent(answers, guesses)  # answers/guesses are the word lists
final_guess, _ = agent.play(g)
print(final_guess)
```
This package supports both standard and hard mode. In hard mode, any revealed hints must be used in subsequent guesses.
Use the `mode` parameter of the various Agent and Solver classes to change modes, e.g.

```python
agent = MaxInfoAgent(answers, guesses, mode="standard")
```
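As an illustration of the hard-mode constraint, a guess list can be filtered to words consistent with the hints revealed so far: green letters must be reused in place and yellow letters must appear somewhere in the word. A rough sketch assuming the digit code format shown above; the helper name is hypothetical and this is not the package's internal logic:

```python
def hard_mode_candidates(guesses, prev_guess, prev_code):
    """Keep only guesses that reuse the revealed hints (hypothetical helper)."""
    allowed = []
    for word in guesses:
        ok = True
        for i, (g, c) in enumerate(zip(prev_guess, prev_code)):
            if c == "2" and word[i] != g:   # green: letter must stay in place
                ok = False
                break
            if c == "1" and g not in word:  # yellow: letter must appear somewhere
                ok = False
                break
        if ok:
            allowed.append(word)
    return allowed

# After reast -> 10000, every later guess must contain an 'r'.
print(hard_mode_candidates(["courd", "crick", "melon"], "reast", "10000"))
# ['courd', 'crick']
```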
Don't trust how fast someone solved wordle? Check their results by seeing all remaining words, ranked by common usage.
Run either of:
- `demos/bs_plays.py`
- `demos/bs_tiles.py`
I consider an optimal start word to be the one that solves the most answers and takes the fewest plays on average.
The optimal starting word depends on the policy used and the game mode. The table below outlines the optimal starting word and performance characteristics of each policy on the original wordle answer/guess list.
| Policy | Mode | Optimal Starting Word | Mean Guesses | Failed Words |
|---|---|---|---|---|
| MaxInfo | Standard | 'reast' | 3.600431965442765 | 'goner' 'hatch' 'jaunt' 'found' 'waste' 'taunt' 'catch' 'dilly' 'boxer' |
| MaxInfo | Hard | 'reast' | 3.639635732870772 | |
| MaxSplits | Standard | 'trace' | 3.6207343412527 | |
| MaxSplits | Hard | 'salet' | 3.6259211096662334 | 'pound' 'hatch' 'watch' 'found' 'cover' 'blank' 'boxer' 'wight' |
| MaxPrune | Standard | 'laten' | 4.439469320066335 | 'sissy' 'awake' 'blush' ... 'judge' 'rower' 'shave' |
| MaxPrune | Hard | 'leant' | 3.8034934497816595 | 'dolly' 'mover' 'piper' 'water' 'foist' 'bound' 'sense' 'viper' 'rarer' 'waver' 'wreak' 'flake' 'wound' 'baste' 'tight' 'biddy' 'happy' 'fleck' 'mossy' 'hound' 'blame' 'vaunt' 'match' 'catty' 'rower' |
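As a toy illustration of this evaluation (mean guesses and failed words), the loop below plays every answer with a naive agent that simply guesses any remaining consistent word. It reuses the illustrative `feedback_code` helper sketched earlier and is not the benchmark code that produced the table:

```python
def filter_answers(answers, guess, code):
    """Keep only the answers consistent with the observed feedback."""
    return [a for a in answers if feedback_code(guess, a) == code]

def evaluate_start_word(start, answers, max_guesses=6):
    """Toy benchmark: mean guesses and failed words for a naive consistent-guess agent."""
    total, failed = 0, []
    for answer in answers:
        remaining = list(answers)
        guess = start
        for n in range(1, max_guesses + 1):
            if guess == answer:
                break
            remaining = filter_answers(remaining, guess, feedback_code(guess, answer))
            guess = remaining[0]  # naive choice; the real policies rank candidates
        else:
            failed.append(answer)
        total += n
    return total / len(answers), failed

toy_answers = ["crimp", "cramp", "crisp", "prime", "mince"]
print(evaluate_start_word("crane", toy_answers))  # (2.2, [])
```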
Failed words are often due to "lookalikes". For example, with the word `hatch` the solver will check `match`, `batch`, `patch` and `latch` first and ultimately fail.
The MaxInfo policy plays the word from the guess list that maximises the expected information content revealed about the solution, until only one solution remains. Played words are not repeated, as they would not reveal new information.
Let the outcome from making a guess, i.e. the code received, be a discrete random variable $X$ with $3^5 = 243$ possible values, for example:
- ⬜️⬜️⬜️⬜️⬜️ - no matches,
- 🟩⬜️⬜️⬜️⬜️ - only first letter matched,
- 🟩⬜️⬜️🟨⬜️ - exact match and a partial match.
For a guessed word $g$, let $S$ be the set of remaining answers and $S_x$ the subset of answers that would produce outcome $x$. The expected information content in the outcomes of $g$ is

$$E[I(g)] = -\sum_{x} p(x) \log_2 p(x),$$

where $p(x) = \frac{|S_x|}{|S|}$ is the probability of observing outcome $x$ and the sum runs over the 243 possible outcomes.
The word that divides the answer pool most evenly across the 243 bins, using as many of them as possible, has the highest entropy.
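As a sketch, the expected information of a guess can be computed directly from its outcome counts. The helper below reuses the illustrative `feedback_code` from earlier and is not the package's numba-compiled implementation:

```python
import math
from collections import Counter

def expected_information(guess, answers):
    """E[I(g)] = -sum over outcomes of p(x) * log2 p(x)."""
    counts = Counter(feedback_code(guess, a) for a in answers)
    n = len(answers)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

toy_answers = ["crimp", "cramp", "crisp", "prime", "mince"]
toy_guesses = toy_answers + ["reast", "trace"]
# The MaxInfo policy plays the guess with the highest expected information.
best = max(toy_guesses, key=lambda g: expected_information(g, toy_answers))
print(best, expected_information(best, toy_answers))
```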
The MaxSplits policy plays the word from the guess list that splits the remaining answers into the largest number of distinct outcomes.
The MaxPrune policy plays the word that reduces the pool of remaining answers as much as possible.
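A rough sketch of how these two criteria could be scored from the same outcome counts; the expected-remaining reading of MaxPrune is an assumption, not taken from the package:

```python
from collections import Counter

def num_splits(guess, answers):
    """MaxSplits score: number of distinct outcomes the guess produces."""
    return len({feedback_code(guess, a) for a in answers})

def expected_remaining(guess, answers):
    """One possible MaxPrune score: expected number of answers left after the guess
    (smaller is better). This interpretation is an assumption."""
    counts = Counter(feedback_code(guess, a) for a in answers)
    n = len(answers)
    return sum((c / n) * c for c in counts.values())

toy_answers = ["crimp", "cramp", "crisp", "prime", "mince"]
print(num_splits("crane", toy_answers), expected_remaining("crane", toy_answers))
```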
numba requires llvmlite, which in turn requires llvm version 11. The default installed version of llvm is likely more recent than version 11.
- Install llvm version 11:
  ```
  arch -arm64 brew install llvm@11
  ```
- Install llvmlite by pointing to the old llvm version:
  ```
  LLVM_CONFIG="/opt/homebrew/Cellar/llvm@11/11.1.0_3/bin/llvm-config" arch -arm64 pip install llvmlite
  ```
- Install numba:
  ```
  pip install numba
  ```
- Build scipy from source against openblas:
  ```
  brew install openblas
  pip install --no-cache --no-use-pep517 pythran cython pybind11 gast
  OPENBLAS="$(brew --prefix openblas)" pip install --no-cache --no-binary :all: --no-use-pep517 scipy
  ```
The Maximum Information policy is equivalent to the Information Gain decision policy used by tree learning algorithms such as ID3, C4.5 and CART.
A decision node consists of a guess word and the leaf nodes correspond to each of the 243 possible outcomes that the answers can be placed into.
Under the information gain policy one must select the word which results in the greatest information gain, which is defined as

$$IG(S, g) = H(S) - H(S \mid g),$$

where $S$ is the set of remaining answers and $H(S \mid g)$ is the conditional entropy due to the creation of the 243 splits by playing word $g$.

Expanding $H(S \mid g)$:

$$H(S \mid g) = \sum_{x} \frac{|S_x|}{|S|} H(S_x),$$

where $S_x$ is the subset of answers placed into outcome $x$. Since the remaining answers are equally likely, $H(S_x) = \log_2 |S_x|$ and $H(S) = \log_2 |S|$, so we can simplify as follows:

$$H(S \mid g) = \sum_{x} \frac{|S_x|}{|S|} \log_2 |S_x|.$$

Therefore

$$IG(S, g) = \log_2 |S| - \sum_{x} \frac{|S_x|}{|S|} \log_2 |S_x|.$$
The equivalence can be seen through expanding the logs of both approaches and dropping constant terms. First the criterion from the wordle policy:

$$E[I(g)] = -\sum_{x} \frac{|S_x|}{|S|} \log_2 \frac{|S_x|}{|S|} = \log_2 |S| - \sum_{x} \frac{|S_x|}{|S|} \log_2 |S_x|,$$

then the decision tree case:

$$IG(S, g) = H(S) - H(S \mid g) = \log_2 |S| - \sum_{x} \frac{|S_x|}{|S|} \log_2 |S_x|.$$

Since $\log_2 |S|$ is constant for a given turn, maximising either criterion amounts to minimising the same sum, and both policies select the same word.
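A quick numerical check of the equivalence on an arbitrary split (the bin sizes below are made up for illustration):

```python
import math

sizes = [4, 3, 2, 1]   # |S_x| for the non-empty outcome bins (illustrative values)
total = sum(sizes)     # |S|

# Wordle policy: entropy of the outcome distribution.
entropy = -sum((s / total) * math.log2(s / total) for s in sizes)

# Decision-tree policy: IG = log2|S| - sum (|S_x|/|S|) log2|S_x|,
# using H(S_x) = log2|S_x| for uniformly likely answers.
info_gain = math.log2(total) - sum((s / total) * math.log2(s) for s in sizes)

print(round(entropy, 10) == round(info_gain, 10))  # True
```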