I saw the problem on youtube. Its generic form would be given an alphabet and wordlength find the minimal amount of words which cover entirely or the largest possible subset of the given alphabet. In particular, the task is to find 5 English words with 25 distinct characters.
I use a cycle graph to find valid solutions for a given alphabetically sorted wordlist in the following manner:
- All words are repsentented as uints encoding the letters in each word as bits.
- Adjancency lists are being built as a mapping between each word and its list of neighbors - the words with different characters, only the words with higher alphabeticall order are stored.
- Cliques of the prespecified size are built by iterating over the wordlist and considering the each next word from the adjancency list of the previous (cyclic graph), while keeping track of all letters from the words added to the clique.
- All cliques are stored in a solutions container.
A similar solution was implemented here.
The code utilizes the parallel execution algorithms implemented in the C++17 standard:
λ git main* → g++ cycle_graph.cpp -o cycle_graph.o -ltbb -std=c++17 -O3 && time ./cycle_graph.o