Does Flask-WhooshAlchemy not support Chinese? #46

Closed
ToonoW opened this issue Apr 26, 2016 · 12 comments

Comments

@ToonoW

ToonoW commented Apr 26, 2016

I have set up Flask-WhooshAlchemy in my project.

Querying the database with English text works fine.

But when I query with Chinese text, I always get an empty list.
Please help.

Here is my model:

class Post(db.Model):
    __tablename__ = 'posts'
    __searchable__ = ['title']  # these fields will be indexed by whoosh

    id = db.Column(db.Integer, primary_key=True)
    title = db.Column(db.Text)

Here is what I ran and the output I got:

>>> p = Post(title='my third and last post 失意时行不行')
>>> db.session.add(p)
>>> db.session.commit()
>>> Post.query.whoosh_search('my').all()
[<app.models.Post object at 0x10629f630>]
>>> Post.query.whoosh_search('行不行').all()
[]
@chenbotao828

Haha, I ran into the same problem. @ToonoW, have you solved it?

@ToonoW
Author

ToonoW commented May 19, 2016

@chenbotao828 I eventually got it working through a workaround. It is usable, although the word segmentation is not ideal. See http://www.v2ex.com/t/274600#reply5

@ToonoW ToonoW closed this as completed May 19, 2016
@ToonoW ToonoW reopened this May 19, 2016
@chenbotao828

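@ToonoW I tried the method from the link, but it still does not seem to work. The behavior is the same as before: Chinese text is not segmented. Is there anything else that needs to be configured?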

@ToonoW
Author

ToonoW commented May 19, 2016

@chenbotao828 Are you using Python 3? If so, you need the modified version of whoosh from that overseas developer.

@chenbotao828

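@ToonoW I am on Python 2, and I followed the Chinese version of his tutorial step by step. English search worked before. With Chinese search now, I can find a whole sentence such as "我爱北京天安门" if I search for the entire sentence, but searching for "北京" returns nothing. Do I need to create something like a tokenizer myself?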
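
The symptom described above (the whole sentence matches, a word inside it does not) is what one would expect from a regex-based word tokenizer. A minimal sketch, assuming a pattern in the spirit of Whoosh's default analyzer; the `tokenize` helper is hypothetical and uses only Python 3's `re` module, not Whoosh itself:

```python
import re

# Assumption: a word pattern similar to what a regex tokenizer would use.
# In Python 3, \w matches CJK characters, so an unbroken run of Chinese
# text is captured as ONE token.
WORD = re.compile(r"\w+(\.?\w+)*")

def tokenize(text):
    """Hypothetical helper: split text into lowercase regex 'words'."""
    return [m.group(0).lower() for m in WORD.finditer(text)]

tokens = tokenize("我爱北京天安门")
print(tokens)             # the whole sentence comes back as a single token
print("北京" in tokens)    # False: the word was never indexed on its own
```

Under this assumption, only an exact match on the full sentence can hit the index, which is why a segmenting analyzer is needed for Chinese.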

@ToonoW
Author

ToonoW commented May 19, 2016

For word segmentation, use jieba. My fix was specific to Python 3; on Python 2 there are fewer problems.

@chenbotao828

chenbotao828 commented May 19, 2016

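OK, here are the details:
Python version: 2.7
I had been following the foreign author's tutorial all the way up to the full-text search chapter.
English works fine, but Chinese text is not segmented automatically.
Then, following the method in the link:

  1. whooshalchemyplus is installed and has replaced whooshalchemy
  2. jieba: I added the analyzer in models.py, just two lines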
from jieba.analysis import ChineseAnalyzer
....
__analyzer__ = ChineseAnalyzer() 
...

Is there anything else that needs to be set up? Asking @ToonoW
PS: typing on my phone, please excuse the formatting
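
To illustrate what a segmenting analyzer contributes in this setup, here is a toy sketch of forward maximum matching against a tiny hand-written dictionary. It is not jieba's algorithm or API; `VOCAB` and `segment` are hypothetical names used only for illustration:

```python
# Toy forward-maximum-matching segmenter (hypothetical; NOT how jieba works).
VOCAB = {"我", "爱", "北京", "天安门"}  # tiny illustrative dictionary
MAX_LEN = max(len(word) for word in VOCAB)

def segment(text):
    """Greedily take the longest dictionary word at each position."""
    tokens, i = [], 0
    while i < len(text):
        # Try the longest candidate first; fall back to a single character.
        for size in range(min(MAX_LEN, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in VOCAB:
                tokens.append(piece)
                i += size
                break
    return tokens

print(segment("我爱北京天安门"))  # ['我', '爱', '北京', '天安门']
```

Once the sentence is split this way, "北京" becomes a token of its own and can be matched by a search for that word alone.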

@ToonoW
Author

ToonoW commented May 19, 2016

Did you do this when initializing the app?

from flask_whooshalchemyplus import whoosh_index

# Post is the table (model) that needs full-text search
whoosh_index(app, Post)

The __searchable__ field also needs to be added to every table you want to search.

@chenbotao828

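Yes, I did. I followed the tutorial step by step, deleted all posts, and entered them again. If I had skipped that step, English search presumably would not work either. My computer is not with me right now, so I will try again tomorrow. Thanks, @ToonoW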

@chenbotao828

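@ToonoW It is solved now. Previously I had only deleted all Posts; this time I reset the database completely, and Chinese searches now return segmented results. The segmentation quality is also fairly good. Thanks!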

@ToonoW
Author

ToonoW commented May 20, 2016

@chenbotao828 Ah, I forgot to tell you: only newly added data is written to the full-text search index, so only new data can be found by full-text search.
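
The point about new data can be pictured with a toy inverted index. This is a simplified model, not Flask-WhooshAlchemy's internals; `ToyIndex` and the sample `rows` are hypothetical:

```python
from collections import defaultdict

class ToyIndex:
    """Hypothetical inverted index that is only updated when add() is called."""

    def __init__(self, tokenize):
        self.tokenize = tokenize
        self.postings = defaultdict(set)

    def add(self, doc_id, text):
        for token in self.tokenize(text):
            self.postings[token].add(doc_id)

    def search(self, term):
        return sorted(self.postings.get(term, set()))

rows = {1: "old post", 2: "new post"}  # row 1 existed before indexing was set up
index = ToyIndex(str.split)

index.add(2, rows[2])            # only the newly committed row reaches the index
print(index.search("post"))      # [2]

for doc_id, text in rows.items():  # re-adding every row makes old data searchable
    index.add(doc_id, text)
print(index.search("post"))      # [1, 2]
```

In this model, rows stored before the index (or the new analyzer) was in place stay invisible to search until they are written again, which is consistent with the full database reset fixing the problem above.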

@ToonoW ToonoW closed this as completed May 20, 2016
@fanne

fanne commented Jun 27, 2016

https://segmentfault.com/q/1010000005811334 This is the problem I ran into, and I do not know what causes it.
@chenbotao828
@ToonoW
