-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Refactor Japanese ます, んかった form #1385
Conversation
CodSpeed Performance ReportMerging #1385 will not alter performanceComparing Summary
Benchmarks breakdown
|
cc @Lyroxide for review. If no review in 3 days I'll glance over it |
Any reason for including ませば? It's only used in kobun, where ませば is not "-ba of ます", but the "-ba of 未然形 of まし". I'd wager that ませば doesn't even exist in modern texts. ますれば seems fine though. ませんかった threw me off at first but it seems legitimate according to 大辞林・日国 Is it possible to remove "-n past" but chain "-ta" with "-n"? Didn't had a chance to review when this was added initially. |
Thanks for review. Let's remove ませば for consistency since we are not going to include the まする form. |
I also agree with chaining ませんかった is the only exception |
Remove |
1. Refactor masu form
Problems
Currently there are many wrong conjugations with masu form, for example:
Verbs end with ます also is also treated as polite ます form
Some may not appear in real text, but it can create wrong impressions on Japanese grammar for learners, especially when they start to output on their own.
Solution
Remove
v5m
, addmasu
andmasen
conditions.masu form is a special form in Japanese with different rules of conjugation and we should not treat it as a godan verb.
Follow the conjugation rules in https://en.wiktionary.org/wiki/Appendix:Japanese_verbs
Only add universal rules: ません、ませんでした、ました、ましたら、まして、ます、ますれば、ませば、ましょう
past
as out condition (make sure it's consistent with other かった ending), update name to '-n past'.Will update some tests to make sure there are no false positives.
Maybe we should have a refactor PR to update the language transform test logic too. Currently it only checks if a transformation is correct, it does not check if a term has any extra wrong transformation. For testing false positive, we have to know it in advance and explicitly add it to the test. This makes it hard to track the effect when we're adding new forms to the language.