[ngrams] Speed up bz2 decoding
Decode at once.
behdad committed Nov 10, 2023
1 parent 7605b0d commit 5e14c60
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions ngrams.py
@@ -61,8 +61,8 @@ def extract_ngrams_from_file(filename, *kargs, **kwargs):
     import bz2
 
     # Assume harfbuzz-testing-wikipedia format
-    txtfile = bz2.open(filename + ".txt.bz2")
-    frqfile = bz2.open(filename + ".frq.bz2")
+    txtfile = bz2.open(filename + ".txt.bz2").read().splitlines()
+    frqfile = bz2.open(filename + ".frq.bz2").read().splitlines()
 
     return extract_ngrams(txtfile, *kargs, frequencies=frqfile, **kwargs)
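
The change swaps lazy, line-by-line iteration over each bz2 stream for a single read() followed by splitlines(). A minimal sketch of the two patterns follows. The sample file and the helper names read_lines_streaming and read_lines_at_once are hypothetical and are not part of ngrams.py. With bz2.open in its default binary mode, iteration yields bytes lines that keep their trailing newline, while splitlines() drops it, so downstream code sees slightly different values.

```python
import bz2
import os
import tempfile


def read_lines_streaming(path):
    """Old pattern: iterate the decompressing file object line by line."""
    with bz2.open(path) as f:  # default mode is binary ("rb")
        return [line for line in f]


def read_lines_at_once(path):
    """New pattern: decompress everything in one call, then split."""
    with bz2.open(path) as f:
        return f.read().splitlines()


def demo():
    # Hypothetical sample data, written to a temporary directory.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sample.txt.bz2")
        with bz2.open(path, "wb") as f:
            f.write(b"alpha\nbeta\ngamma\n")
        streamed = read_lines_streaming(path)
        at_once = read_lines_at_once(path)
    return streamed, at_once


streamed, at_once = demo()
```

The trade-off is memory: the at-once version holds the entire decompressed content, plus the resulting list of lines, in memory, in exchange for avoiding per-line reads through the decompressor.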
