Commit
build minified version of ZH module, npm audit fix, add ZH to readme
MihaiValentin committed Jun 17, 2021
1 parent 1b55cc8 commit 38431b7
Showing 6 changed files with 1,404 additions and 7 deletions.
5 changes: 3 additions & 2 deletions README.md
@@ -19,6 +19,7 @@ Lunr Languages is a [Lunr](http://lunrjs.com/) addon that helps you search in do
  * ![](https://raw.githubusercontent.com/madebybowtie/FlagKit/master/Assets/PNG/TH.png) Thai
  * ![](https://raw.githubusercontent.com/madebybowtie/FlagKit/master/Assets/PNG/VN.png) Vietnamese
  * ![](https://raw.githubusercontent.com/madebybowtie/FlagKit/master/Assets/PNG/IQ.png) Arabic
+ * ![](https://raw.githubusercontent.com/madebybowtie/FlagKit/master/Assets/PNG/CN.png) Chinese
  * [Contribute with a new language](CONTRIBUTING.md)

  Lunr Languages is compatible with Lunr version `0.6`, `0.7`, `1.0` and `2.X`.
@@ -151,10 +152,10 @@ Searching inside documents is not as straight forward as using `indexOf()`, sinc

  # Technical details & Credits

- I've created this project by compiling and wrapping stemmers toghether with stop words from various sources so they can be directly used with all the current versions of Lunr.
+ I've created this project by compiling and wrapping stemmers toghether with stop words from various sources ([including users contributions](https://github.com/MihaiValentin/lunr-languages/pulls?q=is%3Apr)) so they can be directly used with all the current versions of Lunr.

  * <https://github.com/fortnightlabs/snowball-js> (the stemmers for all languages, ported from snowball-js)
  * <https://github.com/brenes/stopwords-filter> (the stop words list for the other languages)
  * <http://chasen.org/~taku/software/TinySegmenter/> (the tinyseg Tiny Segmente Japanese tokenizer)

- I am providing code in the repository to you under an open source license. Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook)
+ I am providing code in the repository to you under an [open source license](LICENSE). Because this is my personal repository, the license you receive to my code is from me and not my employer (Facebook)
2 changes: 2 additions & 0 deletions build/build.js
@@ -124,6 +124,8 @@ var list = [
      wordCharacters: wordCharacters('Latin')
    }, {
      locale: 'vi',
+   }, {
+     locale: 'zh',
    }
  ];

10 changes: 6 additions & 4 deletions lunr.zh.js
@@ -86,24 +86,26 @@

  lunr.zh.tokenizer = function(obj) {
    if (!arguments.length || obj == null || obj == undefined) return []
-   if (Array.isArray(obj)) return obj.map(function (t) { return isLunr2 ? new lunr.Token(t.toLowerCase()) : t.toLowerCase() })
+   if (Array.isArray(obj)) return obj.map(function(t) {
+     return isLunr2 ? new lunr.Token(t.toLowerCase()) : t.toLowerCase()
+   })

    nodejiebaDictJson && nodejieba.load(nodejiebaDictJson)

    var str = obj.toString().trim().toLowerCase();
    var tokens = [];

-   nodejieba.cut(str, true).forEach(function (seg) {
+   nodejieba.cut(str, true).forEach(function(seg) {
      tokens = tokens.concat(seg.split(' '))
    })

-   tokens = tokens.filter(function (token) {
+   tokens = tokens.filter(function(token) {
      return !!token;
    });

    var fromIndex = 0

-   return tokens.map(function (token, index) {
+   return tokens.map(function(token, index) {
      if (isLunr2) {
        var start = str.indexOf(token, fromIndex)
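
Note on the hunk above: the diff view cuts off right after the `str.indexOf(token, fromIndex)` lookup, so the rest of the mapping step is not visible here. As a rough illustration of the pattern only (normalize, segment, drop empty tokens, then locate each token with a moving cursor), here is a dependency-free sketch. `naiveSegment` is a hypothetical whitespace splitter standing in for `nodejieba.cut`, `tokenizeWithPositions` is an illustrative name, and both the cursor update and the returned object shape are assumptions rather than the file's actual code.

```javascript
// Hypothetical stand-in for a real segmenter such as nodejieba.cut.
// It only splits on whitespace so the sketch runs without dependencies.
function naiveSegment(text) {
  return text.split(/\s+/);
}

// Normalize the input, segment it, drop empty tokens, and record where
// each token starts in the normalized string.
function tokenizeWithPositions(input, segment) {
  if (input == null) return [];

  var str = String(input).trim().toLowerCase();
  var tokens = segment(str).filter(function(token) {
    return !!token;
  });

  var fromIndex = 0;
  return tokens.map(function(token, index) {
    // Searching from the cursor keeps a repeated token from resolving
    // to its first occurrence every time.
    var start = str.indexOf(token, fromIndex);
    // Assumption: move the cursor past the matched token. The truncated
    // hunk does not show how the original updates fromIndex.
    fromIndex = start + token.length;
    // Assumption: a plain object instead of a lunr.Token instance.
    return { str: token, index: index, position: [start, token.length] };
  });
}

var result = tokenizeWithPositions('  Foo bar foo ', naiveSegment);
// result positions: [0, 3], [4, 3], [8, 3]
```

The second `foo` resolves to offset 8 rather than 0 because the search starts at the cursor, which is the reason for carrying `fromIndex` across iterations.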

1 change: 1 addition & 0 deletions min/lunr.zh.min.js

Generated file; diff not rendered by default.