Releases: bminixhofer/nlprule
Release 0.6.4
Internal improvements
- Decrease time it takes to load the Tokenizer by ~40% (#70).
- Tag lookup is backed by a vector instead of a hashmap now.
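As a rough illustration of why a vector-backed lookup can be cheaper than a hashmap, here is a minimal sketch. It assumes tags are addressed by small, dense integer ids; `TagId` and `VecLookup` are hypothetical names for this example and are not nlprule's actual types.

```rust
use std::collections::HashMap;

/// Hypothetical dense tag id (an assumption, not nlprule's type).
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct TagId(pub usize);

/// Lookup table indexed directly by id instead of by hash.
pub struct VecLookup {
    entries: Vec<Option<String>>,
}

impl VecLookup {
    /// Builds the table from a hashmap, sized to the largest id present.
    pub fn from_map(map: &HashMap<TagId, String>) -> Self {
        let len = map.keys().map(|id| id.0 + 1).max().unwrap_or(0);
        let mut entries = vec![None; len];
        for (id, tag) in map {
            entries[id.0] = Some(tag.clone());
        }
        VecLookup { entries }
    }

    /// A lookup is a bounds-checked index; no hash is computed.
    pub fn get(&self, id: TagId) -> Option<&str> {
        self.entries.get(id.0).and_then(|entry| entry.as_deref())
    }
}
```

The trade-off in this sketch is memory: unused ids still occupy a slot, so the approach only pays off when ids are reasonably dense.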
Breaking changes
- The tagger now returns iterators over tags instead of allocating a vector.
- Remove get_group_members function.
Release 0.6.3
Release 0.6.2
Internal improvements
- Speed up loading the Tokenizer by ~25% (#66).
Release 0.6.1
Release 0.6.0
- Fix a significant bug where text with multiple sentences would sometimes cause an error if one of the later sentences matches some pattern (#61, #63, thanks @drahnr!).
Breaking changes
- Remove multiword_tags on tokens (now part of the regular tags).
- Make fields of the Word private and add getter methods.
- Word constructor is now called new instead of new_with_tags.
New features
- Adds as_str convenience method to multiple structs (WordId, PosId, Word).
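The pattern behind these changes (private fields, a `new` constructor, getters, and an `as_str` helper) might look roughly like the sketch below. Only the names `Word`, `WordId`, `new`, and `as_str` come from the notes above; the field layout, the `tags` getter, and the constructor arguments are assumptions made for illustration.

```rust
/// Hypothetical stand-in for an interned word id.
pub struct WordId(String);

impl WordId {
    /// Borrow the underlying text without cloning.
    pub fn as_str(&self) -> &str {
        &self.0
    }
}

/// Hypothetical word with private fields; the layout is an assumption.
pub struct Word {
    text: WordId,
    tags: Vec<String>,
}

impl Word {
    /// Single constructor taking the tags directly (argument list assumed).
    pub fn new(text: &str, tags: Vec<String>) -> Self {
        Word {
            text: WordId(text.to_owned()),
            tags,
        }
    }

    /// Getter replacing direct field access.
    pub fn tags(&self) -> &[String] {
        &self.tags
    }

    /// Convenience accessor that forwards to the id.
    pub fn as_str(&self) -> &str {
        self.text.as_str()
    }
}
```

Keeping fields private in this way lets the internal representation change later without another breaking release.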
Release 0.5.3
- CI failed for Release 0.5.2
Release 0.5.2
Release 0.5.1
Breaking changes
- Changes the focus from Vec<Token> to Sentence (#54). pipe and sentencize return iterators over Sentence / IncompleteSentence now.
- Removes the special SENT_START token (now only used internally). Each token corresponds to at least one character in the input text now.
- Makes the fields of Token and IncompleteToken private and adds getter methods (#54).
- char_span and byte_span are replaced by a Span struct which keeps track of char and byte indices at the same time (#54). To e.g. get the byte range, use token.span().byte().
- Spans are now relative to the input text, no longer to sentence boundaries (#53, thanks @drahnr!).
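To illustrate why tracking char and byte indices together is useful, here is a minimal sketch of such a span. Only `Span` and `byte()` are named in the notes above; the `char()` getter, the field names, and the `from_byte_range` constructor are assumptions made for this example.

```rust
use std::ops::Range;

/// Hypothetical span carrying both index kinds, relative to the full input text.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Span {
    bytes: Range<usize>,
    chars: Range<usize>,
}

impl Span {
    /// Derives the char range from a byte range into `text`.
    /// Panics if the byte range does not fall on char boundaries.
    pub fn from_byte_range(text: &str, bytes: Range<usize>) -> Self {
        let char_start = text[..bytes.start].chars().count();
        let char_len = text[bytes.clone()].chars().count();
        Span {
            bytes,
            chars: char_start..char_start + char_len,
        }
    }

    /// Byte range, suitable for slicing a &str.
    pub fn byte(&self) -> Range<usize> {
        self.bytes.clone()
    }

    /// Char range, suitable for editors that count characters.
    pub fn char(&self) -> Range<usize> {
        self.chars.clone()
    }
}
```

For the text "Zoë is here.", the word "is" sits at bytes 5..7 but chars 4..6, because "ë" occupies two bytes in UTF-8. Keeping both ranges avoids recomputing one from the other at every use.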
New features
- The regex backend can now be chosen from Oniguruma or fancy-regex with the features regex-onig and regex-fancy. regex-onig is the default.
- nlprule now compiles to WebAssembly. WebAssembly support is guaranteed for future versions and tested in CI.
- A new selector API to select individual rules (details documented in nlprule::rule::id). For example:
use nlprule::{Tokenizer, Rules, rule::id::Category};
use std::convert::TryInto;

let mut rules = Rules::new("path/to/en_rules.bin")?;

// disable rules named "confusion_due_do" in category "confused_words"
rules
    .select_mut(
        &Category::new("confused_words")
            .join("confusion_due_do")
            .into(),
    )
    .for_each(|rule| rule.disable());

// disable all grammar rules
rules
    .select_mut(&Category::new("grammar").into())
    .for_each(|rule| rule.disable());

// a string syntax where slashes are the separator is also supported
rules
    .select_mut(&"confused_words/confusion_due_do".try_into()?)
    .for_each(|rule| rule.enable());
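The slash-separated string syntax relies on a fallible conversion from a string. A simplified sketch of how such a conversion could be implemented with `TryFrom<&str>` follows. The `Selector`, `SelectorError`, and `matches` definitions here are hypothetical and much simpler than whatever nlprule::rule::id actually provides.

```rust
use std::convert::TryFrom;

/// Hypothetical selector: a path of segments such as category/rule.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Selector {
    parts: Vec<String>,
}

/// Hypothetical error for malformed selector strings.
#[derive(Debug, PartialEq, Eq)]
pub struct SelectorError(String);

impl TryFrom<&str> for Selector {
    type Error = SelectorError;

    /// Splits on '/' and rejects empty segments.
    fn try_from(value: &str) -> Result<Self, Self::Error> {
        let parts: Vec<String> = value.split('/').map(str::to_owned).collect();
        if parts.iter().any(|part| part.is_empty()) {
            return Err(SelectorError(format!("empty segment in {:?}", value)));
        }
        Ok(Selector { parts })
    }
}

impl Selector {
    /// True if the selector is a prefix of the given id path.
    pub fn matches(&self, id: &[&str]) -> bool {
        id.len() >= self.parts.len()
            && self.parts.iter().zip(id).all(|(a, b)| a.as_str() == *b)
    }
}
```

Implementing `TryFrom<&str>` is what makes `"a/b".try_into()?` available at the call site, since the standard library derives `TryInto` from it automatically.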
Release 0.5.0
- Superseded by 0.5.1. The release script for 0.5.0 did not finish.
Release 0.4.6
Breaking changes
- .validate() in nlprule-build now returns a Result<()> to encourage calling it after .postprocess().
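One plausible reading of this change is that a method returning `Result<()>` can only end a call chain, which naturally places it after every other step. The sketch below shows that shape with a hypothetical `Binaries` type; it is not nlprule-build's actual builder, and the `build`, `postprocess`, and `run` signatures and the `String` error type are assumptions.

```rust
/// Hypothetical build artifact; not nlprule-build's actual type.
#[derive(Debug)]
pub struct Binaries {
    bytes: Vec<u8>,
}

impl Binaries {
    /// Produces the artifact (always succeeds in this sketch).
    pub fn build(bytes: Vec<u8>) -> Result<Self, String> {
        Ok(Binaries { bytes })
    }

    /// Returns Self, so further steps can still be chained.
    pub fn postprocess(mut self) -> Result<Self, String> {
        self.bytes.shrink_to_fit();
        Ok(self)
    }

    /// Returns (), so nothing can be chained after it.
    pub fn validate(self) -> Result<(), String> {
        if self.bytes.is_empty() {
            Err("binary is empty".to_string())
        } else {
            Ok(())
        }
    }
}

/// The only chain that type-checks ends with validate().
pub fn run(bytes: Vec<u8>) -> Result<(), String> {
    Binaries::build(bytes)?.postprocess()?.validate()
}
```

Because `Result` is marked `#[must_use]` in the standard library, ignoring the value returned by `validate()` also triggers a compiler warning.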
Fixes
- Fixes an error where Cursor position in nlprule-build was not reset appropriately.
- Use fs_err everywhere for better error messages.