ValueError: cannot copy sequence with size 37 to array axis with dimension 36 #3
Comments
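I ran into the same problem. For example, 36 is the length of the first sample in the batch, and the error is raised whenever a later sample is longer than 36. I don't know how to fix it; if you have any ideas, please email [email protected] |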
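The failure mode described above can be reproduced in isolation. This is a minimal sketch, not the repository's collate code: `try_fill` is a hypothetical helper, and it assumes the batch array is sized from one sample (36) and a longer Python list (37) is then copied into a row. The exact wording of the `ValueError` depends on the NumPy version.

```python
import numpy as np


def try_fill(row_len, sample):
    """Copy `sample` into a zero row of width `row_len`.

    Returns the error text if NumPy rejects the copy, or None on success.
    (Hypothetical helper for illustration only.)
    """
    batch = np.zeros((1, row_len), dtype=np.int64)
    try:
        # A Python list whose length differs from the row width cannot be copied in.
        batch[0][:] = sample
    except ValueError as err:
        return str(err)
    return None


print(try_fill(36, [1] * 36))  # fits: the row width equals the sample length
print(try_fill(36, [1] * 37))  # one element too many: NumPy raises ValueError
```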
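@whyalwaysonline Not solved yet; I've given up for now. |
Sorry, I've been busy these past couple of days. I'll look into this issue next week. |
@hemingkx Thanks, I appreciate it. |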
sentences.append((self.tokenizer.convert_tokens_to_ids(words), token_start_idxs)) |
Found the cause: when the data contains English words such as “Air Jordan”, the space is dropped during tokenization, so the sizes no longer match. |
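To see how a dropped space produces an off-by-one mismatch, here is a small sketch. `toy_tokenize` is a hypothetical stand-in that only mimics the behaviour described in this comment (whitespace yields no token); it is not the tokenizer used in the repository, and the `B-X`/`I-X` labels are placeholders.

```python
def toy_tokenize(char):
    """Hypothetical stand-in tokenizer: whitespace produces no token at all."""
    return [] if char.isspace() else [char.lower()]


def count_tokens(chars):
    """Total number of tokens produced for a list of characters."""
    return sum(len(toy_tokenize(c)) for c in chars)


chars = list("Air Jordan")
# One label per character, including the space (placeholder tag names).
labels = ["B-X"] + ["I-X"] * (len(chars) - 1)

n_labels = len(labels)
n_tokens = count_tokens(chars)
print(n_labels, n_tokens)  # the label count exceeds the token count by one
```

Checking `len(labels)` against the token count for every sample before batching is a cheap way to find the offending lines in a dataset.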
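How should this be fixed, then? |
Solved: just remove the spaces from the data. |
@whyalwaysonline> Found the cause: when the data contains English words such as “Air Jordan”, the space is dropped during tokenization, so the sizes no longer match. |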
😂 I don't know.
---Original---
Date: Fri, Apr 22, 2022 21:53 PM
Subject: Re: [hemingkx/CLUENER2020] ValueError: cannot copy sequence with size 37 to array axis with dimension 36 (#3)
Found the cause: when the data contains English words such as “Air Jordan”, the space is dropped during tokenization, so the sizes no longer match.
I removed the spaces but still get this error. What should I do?
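The simplest fix is to replace each space with an underscore “_”. Removing only the spaces without also removing their corresponding labels leaves the text and labels misaligned and triggers this error. My training data also mixes Chinese and English; replacing spaces with underscores solved it, and the final model performed very well. |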
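Both options mentioned in this comment can be sketched as a preprocessing step. The helper names `replace_spaces` and `drop_spaces_with_labels` are hypothetical, and the sketch assumes each sample is a list of characters with one label per character; adapt it to the actual data format.

```python
def replace_spaces(chars, placeholder="_"):
    """Swap every whitespace character for a placeholder; labels stay untouched."""
    return [placeholder if c.isspace() else c for c in chars]


def drop_spaces_with_labels(chars, labels):
    """Remove whitespace characters together with their labels."""
    if len(chars) != len(labels):
        raise ValueError("chars and labels must have the same length")
    kept = [(c, l) for c, l in zip(chars, labels) if not c.isspace()]
    return [c for c, _ in kept], [l for _, l in kept]


chars = list("Air Jordan")
labels = ["B-X"] + ["I-X"] * (len(chars) - 1)  # placeholder tag names

print("".join(replace_spaces(chars)))  # Air_Jordan
new_chars, new_labels = drop_spaces_with_labels(chars, labels)
print(len(new_chars), len(new_labels))  # 9 9
```

Replacing keeps the sequence length unchanged, so the label file needs no edits; dropping requires rewriting text and labels together.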
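Hello. After switching to BIEOS tags, my test data has no labels, so I gave every character a temporary O label.
I then ran the model and got the following error. Please advise!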