Dev: Add trie libraries implemented in c++ #43

Merged
merged 5 commits into from
Oct 22, 2023
8 changes: 4 additions & 4 deletions README.md
@@ -3,29 +3,29 @@

Spellsolver is software that helps find the best possible word in the Spellcast Discord activity. Spellsolver uses a trie to store the valid words, then iteratively tries every possible combination of letters on the board, discarding those that don't form valid words and keeping those that do.

- Initialization of the trie structure to store valid words in single swap mode can take anywhere from 20 to 30 seconds and uses approximately 1 GB of ram memory, but allows almost all spellsolver queries to be executed in less than a second.
- Double swap mode can be enabled in config.py, but it is not recommended as it significantly increases load times (100 seconds), ram usage (3.6 GB) and query time (up to 20 seconds)
- Initialization of the trie structure to store valid words in single swap mode takes 5 seconds and uses approximately 150 MB of RAM, and almost all spellsolver queries execute in less than two seconds.
- Double swap mode can be enabled in config.py, but it is not recommended because it significantly increases load time (25 seconds), RAM usage (650 MB) and query time (up to 30 seconds)
- In case the wordlist.txt file does not exist, a new file will be automatically generated from the sources folder when starting spellsolver using any interface

A message like this will be printed on the screen while Spellsolver starts
```bash
Spellsolver v1.10 - fabaindaiz
WordValidate is being initialized, this will take several seconds
WordValidate successfully initialized (elapsed time: 25.05 seconds)
WordValidate successfully initialized (elapsed time: 4.8 seconds)
```

- #### Inside the docs folder, you will find some documents that detail the operation of spellsolver, as well as notes on how the algorithm is implemented.


### Requirements
- python3 (3.6 or later)
- marisa-trie (for storing words)
- tk (tkinter for graphicui.py)
- fastapi (for webapi.py)
- uvicorn (for webapi.py)

### TODO
- Add some spellsolver tests to avoid accidentally introducing new bugs
- Add some heuristics to reduce the load and query time of double swap mode

### Notices for contributors
- Thank you for your interest in contributing to spellsolver, any improvement will be welcome
2 changes: 1 addition & 1 deletion graphicalui.py
@@ -1,7 +1,7 @@
import tkinter as tk
from typing import Tuple

from src.interfaces.tkinterboard import TkinterBoard
from src.interfaces.tkinter.tkinterboard import TkinterBoard
from src.interfaces.baseui import BaseUI


1 change: 1 addition & 0 deletions requirements.txt
@@ -1,3 +1,4 @@
tk
marisa-trie
fastapi
uvicorn
22 changes: 6 additions & 16 deletions src/config.py
@@ -1,4 +1,4 @@
VERSION = "v1.10"
VERSION = "v1.11"
DEBUG = False

# Wordlist settings
@@ -9,23 +9,13 @@
HOST = "127.0.0.1"
PORT = 8080

# Heuristic settings
HEURISTIC = False

# Multiprocess settings
# Use multiprocessing is slower than single process
MULTIPROCESS = False

# Trie settings
# Use PATRICIA trie is slightly slower than PREFIX trie
# TRIE = "PREFIX"
# TRIE = "PATRICIA"
TRIE = "PREFIX"
# TRIE = "MARISA"
TRIE = "MARISA"

# Swap mode settings
# Make sure you have enough ram memory (and patience) for the selected swap modes
# SWAP = 0 (no swap) - Memory: 150 MB - Load: 3 sec - Query: 20 ms (mean)
# SWAP = 1 (one swap) - Memory: 990 MB - Load: 30 sec - Query: 1000 ms (mean)
# SWAP = 2 (two swap) - Memory: 3520 MB - Load: 100 sec - Query: 12000 ms (mean)
# SWAP = 0 (no swap) - Memory: 36 MB - Load: 1 sec - Query: 50 ms (mean)
# SWAP = 1 (one swap) - Memory: 150 MB - Load: 5 sec - Query: 2000 ms (mean)
# SWAP = 2 (two swap) - Memory: 650 MB - Load: 25 sec - Query: 30000 ms (mean)
SWAP = 1
# notice: it is recommended not to activate double swap (SWAP = 2)
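The growth across swap modes in the comments above follows from how many keys each word contributes to the trie. Here is a rough sketch, assuming (as in src/modules/trie/loader.py in this PR) that one key is stored for every way of replacing up to SWAP letter positions with a wildcard. `keys_per_word` is a hypothetical helper, not part of the repository, and it counts keys rather than measuring memory:

```python
from math import comb


def keys_per_word(length: int, swap: int) -> int:
    """Number of trie keys generated for one word of the given length.

    Assumption: one key per way of choosing 0..swap positions to wildcard.
    """
    return sum(comb(length, k) for k in range(swap + 1))


# Key counts for a single 8-letter word under each swap mode.
for swap in (0, 1, 2):
    print(swap, keys_per_word(8, swap))
# prints:
# 0 1
# 1 9
# 2 37
```

The count grows roughly quadratically in word length for SWAP = 2, which is consistent with the higher load time and memory figures quoted for that mode.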
8 changes: 4 additions & 4 deletions src/interfaces/baseui.py
@@ -1,7 +1,7 @@
from src.modules.wordlist.validate import WordValidate
from src.modules.gameboard.resultlist import ResultList
from src.modules.gameboard.gameboard import GameBoard
from src.spellsolver import SpellSolver
from src.modules.resultlist import ResultList
from src.modules.validate import WordValidate
from src.modules.gameboard import GameBoard
from src.utils.timer import Timer
from src.config import VERSION

@@ -35,7 +35,7 @@ def __init__(self) -> None:

print(f"Spellsolver {VERSION} - fabaindaiz")
self.timer.reset_timer()
self.validate.load_wordlist()
self.validate.init_trie()
print(
f"WordValidate successfully initialized (elapsed time: {self.timer.elapsed_seconds()} seconds)"
)
@@ -1,6 +1,6 @@
from typing import Any, Dict, Optional
from pydantic import BaseModel
from src.interfaces.baseapi import BaseRouter
from src.interfaces.fastapi.baseapi import BaseRouter
from src.interfaces.baseui import BaseUI


File renamed without changes.
8 changes: 4 additions & 4 deletions src/interfaces/board.py → src/interfaces/tkinter/board.py
@@ -2,10 +2,10 @@

from src.config import SWAP
from src.interfaces.baseui import BaseUI
from src.interfaces.boardbutton import BoardButton
from src.interfaces.boardlabel import BoardLabel
from src.interfaces.boardtile import BoardTile
from src.modules.resultlist import ResultWord
from src.interfaces.tkinter.boardbutton import BoardButton
from src.interfaces.tkinter.boardlabel import BoardLabel
from src.interfaces.tkinter.boardtile import BoardTile
from src.modules.gameboard.resultlist import ResultWord
from src.utils.utils import aux_to_indices


File renamed without changes.
File renamed without changes.
@@ -2,7 +2,7 @@
from tkinter.font import Font
from typing import List

from src.modules.gameboard import GameTile
from src.modules.gameboard.gameboard import GameTile


class BoardLabel:
File renamed without changes.
@@ -1,7 +1,7 @@
import tkinter as tk

from src.interfaces.boardentry import BoardEntry
from src.interfaces.boardmenu import BoardMenu
from src.interfaces.tkinter.boardentry import BoardEntry
from src.interfaces.tkinter.boardmenu import BoardMenu


class BoardTile:
@@ -1,4 +1,4 @@
from src.interfaces.board import Board
from src.interfaces.tkinter.board import Board


class MultHandler:
@@ -1,6 +1,6 @@
from src.interfaces.baseui import BaseUI
from src.interfaces.board import Board
from src.interfaces.multhandler import MultHandler
from src.interfaces.tkinter.multhandler import MultHandler
from src.interfaces.tkinter.board import Board


class TkinterBoard(Board):
File renamed without changes.
2 changes: 1 addition & 1 deletion src/modules/path.py → src/modules/gameboard/path.py
@@ -1,5 +1,5 @@
from typing import List, Tuple
from src.modules.gameboard import GameTile
from src.modules.gameboard.gameboard import GameTile


class Path:
@@ -1,5 +1,5 @@
from typing import Any, Dict, Generator, List, Tuple
from src.modules.gameboard import GameTile
from src.modules.gameboard.gameboard import GameTile
from src.utils.timer import Timer


20 changes: 20 additions & 0 deletions src/modules/trie/base.py
@@ -0,0 +1,20 @@
from typing import Generator
from src.modules.wordlist.wordlist import WordList


class Trie:

    def insert_trie(self, loader: WordList) -> None:
        raise NotImplementedError()

    def query_trie(self) -> "TrieQuery":
        raise NotImplementedError()


class TrieQuery:

    def get_key(self, word: str) -> str:
        raise NotImplementedError()

    def get_leaf(self, word: str) -> Generator[str, None, None]:
        raise NotImplementedError()
18 changes: 18 additions & 0 deletions src/modules/trie/loader.py
@@ -0,0 +1,18 @@
from typing import List, Generator
from itertools import combinations
from src.config import SWAP


def _word_iter(word, num):
    for t in combinations(range(len(word)), num):
        yield "".join("0" if i in t else word[i] for i in range(len(word)))

def word_iter(word: str) -> Generator[str, None, None]:
    for num in range(SWAP + 1):
        for iword in _word_iter(word, num):
            yield iword

def chunk_iter(words: Generator[str, None, None]) -> Generator[str, None, None]:
    """Insert a chunk of words into the trie and return it"""
    for word in words:
        yield from word_iter(word)
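To make the key generation in loader.py concrete, here is a minimal standalone sketch of the same idea. It assumes the swap level is passed as an argument instead of being read from `src.config`; `wildcard_variants` is a hypothetical name, not a function in the repository:

```python
from itertools import combinations
from typing import Iterator


def wildcard_variants(word: str, swap: int) -> Iterator[str]:
    """Yield `word` with every choice of 0..swap positions replaced by "0"."""
    for num in range(swap + 1):
        # Each combination is a tuple of letter positions to wildcard.
        for positions in combinations(range(len(word)), num):
            yield "".join("0" if i in positions else ch for i, ch in enumerate(word))


print(list(wildcard_variants("cat", 1)))
# ['cat', '0at', 'c0t', 'ca0']
```

With `swap=2` the same word yields seven keys, since the pairs `00t`, `0a0` and `c00` are added to the four above.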
42 changes: 42 additions & 0 deletions src/modules/trie/marisa.py
@@ -0,0 +1,42 @@
from marisa_trie import RecordTrie
from typing import Generator, List, Tuple
from src.modules.trie.base import Trie, TrieQuery
from src.modules.wordlist.wordlist import WordList
from src.modules.trie.loader import word_iter


class MarisaTrie(Trie):

    def __init__(self) -> None:
        self.trie: RecordTrie = None
        self.words: List[str] = []

    def insert_trie(self, loader: WordList) -> None:
        ind: int = 0
        trie_keys: List[str] = []
        trie_data: List[Tuple[int]] = []

        for word in loader.get_words():
            for iword in word_iter(word):
                trie_keys.append(iword)
                trie_data.append((ind,))

            self.words.append(word)
            ind += 1
        self.trie = RecordTrie("<i", zip(trie_keys, trie_data))

    def query_trie(self) -> TrieQuery:
        return MarisaTrieQuery(self)


class MarisaTrieQuery(TrieQuery):

    def __init__(self, trie: Trie) -> None:
        self.trie: MarisaTrie = trie

    def get_key(self, word: str) -> str:
        return self.trie.trie.has_keys_with_prefix(word)

    def get_leaf(self, word: str) -> Generator[str, None, None]:
        for i in self.trie.trie.get(word, []):
            yield self.trie.words[i[0]]
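The two query operations above (a prefix check to prune the search, and a lookup that maps a wildcard key back to the original words by index) can be illustrated without installing marisa-trie. The sketch below is a stand-in only: a dict and a sorted key list replace `RecordTrie`, and every name in it (`DictTrieSketch`, `wildcard_variants`) is hypothetical rather than taken from the repository.

```python
from bisect import bisect_left
from itertools import combinations
from typing import Dict, Iterable, Iterator, List


def wildcard_variants(word: str, swap: int) -> Iterator[str]:
    """Yield `word` with every choice of 0..swap positions replaced by "0"."""
    for num in range(swap + 1):
        for positions in combinations(range(len(word)), num):
            yield "".join("0" if i in positions else ch for i, ch in enumerate(word))


class DictTrieSketch:
    """Standard-library stand-in for the key -> word-index mapping."""

    def __init__(self, words: Iterable[str], swap: int) -> None:
        self.words: List[str] = list(words)
        self.index: Dict[str, List[int]] = {}
        # Store only the word's index under each key, as the PR does.
        for ind, word in enumerate(self.words):
            for key in wildcard_variants(word, swap):
                self.index.setdefault(key, []).append(ind)
        self.sorted_keys: List[str] = sorted(self.index)

    def get_key(self, prefix: str) -> bool:
        """True if any stored key starts with `prefix` (used to prune paths)."""
        pos = bisect_left(self.sorted_keys, prefix)
        return pos < len(self.sorted_keys) and self.sorted_keys[pos].startswith(prefix)

    def get_leaf(self, key: str) -> Iterator[str]:
        """Yield the original words stored under an exact key."""
        for ind in self.index.get(key, []):
            yield self.words[ind]


trie = DictTrieSketch(["cat", "cot", "dog"], swap=1)
print(trie.get_key("c0"))          # True
print(list(trie.get_leaf("c0t")))  # ['cat', 'cot']
```

Storing an integer index per key instead of the word itself keeps each record fixed-size, which is presumably why the PR pairs `RecordTrie("<i", ...)` with a separate `words` list.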
File renamed without changes.
File renamed without changes.
110 changes: 0 additions & 110 deletions src/modules/validate.py

This file was deleted.
