Releases: segment-any-text/wtpsplit
Release 0.5.3
- Updates Rust dependencies
- Adds support for Russian (ru) and Ukrainian (uk)
Release 0.5.2
- Split sequence data is now stored in the ONNX file instead of being hardcoded: #21
- Added verbose argument to the split(..) method of the Python bindings to display a progress bar
- Retrained Chinese model with properly removed punctuation
- Retrained German model with Compound Splitting as additional split level
- docs.rs documentation now has all features enabled
- Added methods to get the levels of the current models:
  - Python: splitter.get_levels()
  - JS: splitter.getLevels()
  - Rust: splitter.logic().split_sequence().get_levels()
- NNSplit now has a website with demo, benchmarks and metrics! https://bminixhofer.github.io/nnsplit/
Release 0.5.1
Introduce model versioning: With the new model architecture, old Rust releases broke because models were always fetched from the master branch. Sorry! Now they are versioned along with the library so this won't happen again. Please upgrade to this version to use the new models.
Update German and English models.
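To illustrate why pinning models to the library version avoids this kind of breakage, here is a minimal sketch. The URL layout, the `model_url` helper and the `example.invalid` host are hypothetical and are not the project's actual download scheme.

```python
# Hypothetical layout: <base>/<ref>/models/<language>/model.onnx
def model_url(base: str, ref: str, language: str) -> str:
    """Build a download URL for a model at a given git ref (illustrative only)."""
    return f"{base}/{ref}/models/{language}/model.onnx"


LIBRARY_VERSION = "0.5.1"  # assumed to match the installed library release
BASE = "https://example.invalid/repo"  # placeholder host, not a real endpoint

# Unpinned: resolves to whatever is currently on the default branch,
# so a new model architecture can break an older library release.
unpinned = model_url(BASE, "master", "de")

# Pinned: an older library release keeps resolving to the models it was built for.
pinned = model_url(BASE, f"v{LIBRARY_VERSION}", "de")
```

With the pinned form, publishing new models under a new version tag leaves existing installations untouched.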
Release 0.5.0
Add five new languages:
- Norwegian
- Swedish
- Turkish
- Chinese
- French
Retrain all models with a new downsampling trick, which improves accuracy significantly at roughly the same speed.
Release 0.4.12
Add missing sigmoid to JS.
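For context, a sigmoid maps raw model outputs (logits) to probabilities in (0, 1). Without it, a probability threshold would be compared against unbounded scores. The following is a minimal Python sketch of that step, not the library's JS code; the function names and the 0.5 threshold are assumptions for illustration.

```python
import math


def sigmoid(logit: float) -> float:
    """Map a raw score (logit) to a probability between 0 and 1."""
    # Branch on the sign so exp() is only ever called on a non-positive
    # argument, which avoids overflow for inputs of large magnitude.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    exp_logit = math.exp(logit)
    return exp_logit / (1.0 + exp_logit)


def split_decisions(logits, threshold=0.5):
    """Return True for each position whose probability exceeds the threshold."""
    return [sigmoid(x) > threshold for x in logits]
```

Comparing raw logits against 0.5 instead would misclassify any position whose logit lies between 0 and 0.5, since those already correspond to probabilities above one half.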
Release 0.4.10 - Better JS docs and tested Node.js support
Remove outdated release instructions (now in CI :) )
Release 0.4.9
Testing release CI.
Release 0.4.8
Testing release CI.