This package implements a double-array trie in Julia.
julia> Pkg.update()
julia> Pkg.clone("https://github.com/hshindo/DoubleArrayTrie.jl.git")
Double-array is a fast implementation that realizes a trie data structure.
Basically, it contains two internal arrays: base
and check
.
Roughly speaking, base
is an offset value for child node indices and check
is a flag to ensure that the child node exsits in a trie.
More specifically, a double-array is constructed as the following conditions hold.
- child = base[parent] + key
- check[child] == parent
In the above example,
- 3 = base[1] + 2
- check[3] == 1
- 4 = base[3] + 4
- check[4] == 3
and so on. Note that #
indicates undefined value.
Looking up a key in a trie takes O(m) time where m
is the key length.
The main type of DoubleArrayTrie.jl
is DATrie{T}
.
A type of key is Vector{Int}
and that of value is T
.
For simplicity and efficiency, the package only supports static construction, which means keys must be sorted beforehand and DATrie
does not provide any functions to append/delete key-values.
For key lookup, use get(trie, key::Vector{Int}, default)
function.
Here is a quick example.
using DoubleArrayTrie
# Please download "words.txt" from "DoubleArrayTrie/test/words.txt"
lines = open(readlines, "<path>/words.txt")
sort!(lines)
words = map(chomp, lines)
keys = map(w -> [Int(c) for c in w], words)
trie = DATrie(keys, words)
for i = 1:length(words)
v = get(trie, keys[i], "")
if v == words[i]
println(v)
else
error("Key not found: $(keys[i]).")
end
end
- Jun-ichi Aoe. An efficient digital search algorithm by using a double-array structure. IEEE Transactions on Software Engineering, Vol. 15, No. 9, pp. 1066-1077, 1989.