Checksum error since 5.7 #38

Open
Write opened this issue Mar 31, 2022 · 7 comments

Write commented Mar 31, 2022

Hi. Using the latest container on linux/amd64 (5.11.0-49-generic #55-Ubuntu SMP Wed Jan 12 17:36:34 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux), I get a checksum error related to the ngrams, whichever language is requested. Reverting to 5.6-dockerupdate-3 works just fine.

Example on 5.6 (working):
[Screenshot: CleanShot 2022-04-01 at 00 59 31]

On 5.7:

At first I thought my French ngrams were corrupted, but the result is the same in English, and I re-downloaded all ngrams just in case.

[Screenshot: CleanShot 2022-04-01 at 01 02 13]
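
Editor's note: a re-download can be compared against the copy the container actually mounts by hashing both trees. The following is a minimal sketch, not something used in this thread; the function names are made up for illustration, and it assumes both copies are reachable as ordinary directories.

```python
import hashlib
import os


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root):
    """Map each file path (relative to root) to its SHA-256 digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = sha256_of_file(full)
    return manifest


def diff_manifests(left, right):
    """Return the sorted relative paths that are missing or differ between two manifests."""
    return sorted(
        path
        for path in set(left) | set(right)
        if left.get(path) != right.get(path)
    )


# Hypothetical usage (both paths are placeholders):
#   mounted = build_manifest("/ngrams")
#   fresh = build_manifest("/tmp/ngrams-fresh")
#   print(diff_manifests(mounted, fresh))
```

An empty result means the two copies are byte-identical, which would suggest the files on disk are not the cause of the error.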

The following configuration is passed to LanguageTool:
languageModel=/ngrams
+ java -Xms512m -Xmx1g -cp languagetool-server.jar org.languagetool.server.HTTPServer --port 8010 --public --allow-origin '*' --config config.properties
2022-03-31 23:00:54.436 +0000 INFO  org.languagetool.server.DatabaseAccessOpenSource Not setting up database access, dbDriver is not configured
2022-03-31 23:00:54 +0000 WARNING: running in HTTP mode, consider running LanguageTool behind a reverse proxy that takes care of encryption (HTTPS)
2022-03-31 23:00:54 +0000 WARNING: running in public mode, LanguageTool API can be accessed without restrictions!
2022-03-31 23:00:54 +0000 Setting up thread pool with 10 threads
2022-03-31 23:00:55 +0000 Starting LanguageTool 5.7 (build date: 2022-03-30 13:58:36 +0000, 35d0d40) server on http://localhost:8010...
2022-03-31 23:00:55 +0000 Server started
2022-03-31 23:00:57.496 +0000 INFO  org.languagetool.server.LanguageToolHttpHandler Handling POST /v2/check
2022-03-31 23:01:02.143 +0000 ERROR org.languagetool.server.LanguageToolHttpHandler An error has occurred: 'java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=97ec8ffc actual=901b5b3c (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/ngrams/fr/1grams/_16.fdx"))), detected: fr', sending HTTP code 500. Access from 172.18.0.1, HTTP user agent: curl/7.74.0, User agent param: null, Referrer: null, language: fr, h: 1, r: 1, time: 4656text length: 8, m: ALL, l: DEFAULT, Stacktrace follows:java.lang.RuntimeException: java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=97ec8ffc actual=901b5b3c (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/ngrams/fr/1grams/_16.fdx"))), detected: fr
        at org.languagetool.server.TextChecker.checkText(TextChecker.java:496)
        at org.languagetool.server.ApiV2.handleCheckRequest(ApiV2.java:173)
        at org.languagetool.server.ApiV2.handleRequest(ApiV2.java:84)
        at org.languagetool.server.LanguageToolHttpHandler.handle(LanguageToolHttpHandler.java:185)
        at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:77)
        at jdk.httpserver/sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:82)
        at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:80)
        at jdk.httpserver/sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:730)
        at jdk.httpserver/com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:77)
        at jdk.httpserver/sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:699)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
        at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.util.concurrent.ExecutionException: java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=97ec8ffc actual=901b5b3c (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/ngrams/fr/1grams/_16.fdx")))
        at java.base/java.util.concurrent.FutureTask.report(FutureTask.java:122)
        at java.base/java.util.concurrent.FutureTask.get(FutureTask.java:191)
        at org.languagetool.server.TextChecker.checkText(TextChecker.java:477)
        ... 12 more
Caused by: java.lang.RuntimeException: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=97ec8ffc actual=901b5b3c (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/ngrams/fr/1grams/_16.fdx")))
        at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel.getCachedLuceneSearcher(LuceneSingleIndexLanguageModel.java:186)
        at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel.addIndex(LuceneSingleIndexLanguageModel.java:118)
        at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel.<init>(LuceneSingleIndexLanguageModel.java:93)
        at org.languagetool.languagemodel.LuceneLanguageModel.<init>(LuceneLanguageModel.java:65)
        at org.languagetool.Language.initLanguageModel(Language.java:180)
        at org.languagetool.language.French.getLanguageModel(French.java:149)
        at org.languagetool.JLanguageTool.activateLanguageModelRules(JLanguageTool.java:594)
        at org.languagetool.server.Pipeline.activateLanguageModelRules(Pipeline.java:103)
        at org.languagetool.server.PipelinePool.createPipeline(PipelinePool.java:121)
        at org.languagetool.server.PipelinePool.getPipeline(PipelinePool.java:78)
        at org.languagetool.server.TextChecker.getPipelineResults(TextChecker.java:789)
        at org.languagetool.server.TextChecker.getRuleMatches(TextChecker.java:743)
        at org.languagetool.server.TextChecker.lambda$checkText$4(TextChecker.java:460)
        at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
        ... 3 more
Caused by: org.apache.lucene.index.CorruptIndexException: checksum failed (hardware problem?) : expected=97ec8ffc actual=901b5b3c (resource=BufferedChecksumIndexInput(MMapIndexInput(path="/ngrams/fr/1grams/_16.fdx")))
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:334)
        at org.apache.lucene.codecs.CodecUtil.checkFooter(CodecUtil.java:364)
        at org.apache.lucene.codecs.compressing.CompressingStoredFieldsReader.<init>(CompressingStoredFieldsReader.java:140)
        at org.apache.lucene.codecs.compressing.CompressingStoredFieldsFormat.fieldsReader(CompressingStoredFieldsFormat.java:121)
        at org.apache.lucene.codecs.lucene50.Lucene50StoredFieldsFormat.fieldsReader(Lucene50StoredFieldsFormat.java:173)
        at org.apache.lucene.index.SegmentCoreReaders.<init>(SegmentCoreReaders.java:117)
        at org.apache.lucene.index.SegmentReader.<init>(SegmentReader.java:65)
        at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:58)
        at org.apache.lucene.index.StandardDirectoryReader$1.doBody(StandardDirectoryReader.java:50)
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:731)
        at org.apache.lucene.index.StandardDirectoryReader.open(StandardDirectoryReader.java:50)
        at org.apache.lucene.index.DirectoryReader.open(DirectoryReader.java:63)
        at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel$LuceneSearcher.<init>(LuceneSingleIndexLanguageModel.java:241)
        at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel$LuceneSearcher.<init>(LuceneSingleIndexLanguageModel.java:229)
        at org.languagetool.languagemodel.LuceneSingleIndexLanguageModel.getCachedLuceneSearcher(LuceneSingleIndexLanguageModel.java:182)
        ... 16 more

2022-03-31 23:01:02.171 +0000 INFO  org.languagetool.server.LanguageToolHttpHandler Handled request in 4685ms; sending code 500
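
Editor's note: for anyone trying to reproduce this, the failing request in the log above (a POST to /v2/check with language fr, against port 8010) can be approximated as follows. This is a minimal sketch in Python rather than the curl call that was actually used; the function names and the sample text are made up for illustration, and only the endpoint, port, and language come from the log.

```python
import urllib.error
import urllib.parse
import urllib.request


def build_check_request(base_url, text, language):
    """Build a form-encoded POST to LanguageTool's /v2/check endpoint."""
    body = urllib.parse.urlencode({"text": text, "language": language}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/v2/check",
        data=body,
        headers={"Content-Type": "application/x-www-form-urlencoded"},
        method="POST",
    )


def send_check(base_url, text, language, timeout=30):
    """Send the request and return (HTTP status, response body as text)."""
    request = build_check_request(base_url, text, language)
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.read().decode("utf-8", "replace")
    except urllib.error.HTTPError as error:
        # A failing server (such as the HTTP 500 in the log) ends up here.
        return error.code, error.read().decode("utf-8", "replace")


# Hypothetical usage against a local container:
#   status, body = send_check("http://localhost:8010", "Bonjour.", "fr")
#   print(status)
```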
Erikvl87 self-assigned this Apr 1, 2022

Erikvl87 (Owner) commented Apr 7, 2022

Hi @Write, could you try increasing both the minimum and maximum heap sizes and see if that solves the issue?

Write (Author) commented Apr 7, 2022

Tried Xms1G and Xmx2g, up to Xmx3g; unfortunately, the error still occurs.

gerroon commented Apr 21, 2022

I have this issue as well; I believe it was introduced with the latest update. It is broken as is and can't be used with the browser extension.

gerroon commented Apr 21, 2022

The 5.6-dockerupdate-3 version seems to work. I rolled back to it.

Write (Author) commented Apr 24, 2022

Now I have the issue with 5.6-dockerupdate-3 too. Absolutely no clue 🤷

Well, adding user: 0:0 to run as root fixed it for me on 5.6-dockerupdate-3. However, I tried the same on 5.7 and still get the same error.
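
Editor's note: that running as root helped on 5.6-dockerupdate-3 could point at file ownership or permissions on the mounted ngrams volume, although nothing in this thread confirms that. A minimal sketch for checking it from inside the container, as the user the server runs as, follows; the function names are made up for illustration.

```python
import os
import stat


def unreadable_entries(root):
    """Return sorted paths under root that the current user cannot read or traverse."""
    problems = set()

    def remember(error):
        # os.walk reports directories it could not list through this callback.
        if error.filename:
            problems.add(error.filename)

    for dirpath, dirnames, filenames in os.walk(root, onerror=remember):
        for name in dirnames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK | os.X_OK):
                problems.add(path)
        for name in filenames:
            path = os.path.join(dirpath, name)
            if not os.access(path, os.R_OK):
                problems.add(path)
    return sorted(problems)


def owner_and_mode(path):
    """Return (uid, gid, mode string such as '-rw-r--r--') for one path."""
    info = os.stat(path)
    return info.st_uid, info.st_gid, stat.filemode(info.st_mode)


# Hypothetical usage:
#   print(os.getuid(), os.getgid())
#   print(unreadable_entries("/ngrams"))
#   print(owner_and_mode("/ngrams/fr/1grams/_16.fdx"))
```

An empty list from unreadable_entries would make a plain permission problem unlikely.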

Erikvl87 (Owner) commented

Sorry, due to personal circumstances I have not been able to spend much time on this.

I was never able to reproduce this, and I am not sure the issue lies within this dockerized version of LanguageTool. Are you perhaps running this on a Synology NAS? Are the disks OK? Are you able to do a full drive scan and check for bad sectors?
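
Editor's note: a narrower check than a full drive scan is to recompute the checksum that the stack trace shows failing in CodecUtil.checkFooter. The sketch below assumes the footer layout I believe Lucene has used since 4.8 (big-endian magic, algorithm id, then a CRC32 stored as a 64-bit value covering every preceding byte); that layout is my assumption, not something stated in this thread, and the function names are made up. Lucene's own CheckIndex tool remains the authoritative check.

```python
import os
import struct
import zlib

# Assumed footer layout (big-endian):
#   int32 magic, int32 algorithm id (0 = CRC32), int64 checksum
FOOTER_MAGIC = 0xC02893E8  # bitwise complement of Lucene's CODEC_MAGIC, 0x3FD76C17
FOOTER_LENGTH = 16


def check_lucene_footer(path, chunk_size=1 << 20):
    """Return (stored_crc, computed_crc) for one index file.

    Raises ValueError when the file does not end in a recognisable footer.
    """
    size = os.path.getsize(path)
    if size < FOOTER_LENGTH:
        raise ValueError(f"{path}: too short to contain a footer")
    with open(path, "rb") as handle:
        # The CRC covers every byte except the 8-byte stored checksum itself.
        remaining = size - 8
        computed = 0
        while remaining > 0:
            chunk = handle.read(min(chunk_size, remaining))
            if not chunk:
                raise ValueError(f"{path}: file shrank while reading")
            computed = zlib.crc32(chunk, computed)
            remaining -= len(chunk)
        handle.seek(size - FOOTER_LENGTH)
        magic, algorithm, stored = struct.unpack(">IIQ", handle.read(FOOTER_LENGTH))
    if magic != FOOTER_MAGIC or algorithm != 0:
        raise ValueError(f"{path}: no Lucene footer found")
    return stored, computed


def find_mismatches(root):
    """Yield (path, stored_crc, computed_crc) for files whose checksum disagrees."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            try:
                stored, computed = check_lucene_footer(path)
            except ValueError:
                continue  # not a footer-bearing file
            if stored != computed:
                yield path, stored, computed


# Hypothetical usage, printing in the same shape as the error message:
#   for path, stored, computed in find_mismatches("/ngrams"):
#       print(path, f"expected={stored:x} actual={computed:x}")
```

If this reports no mismatches from the host but the server still fails inside the container, the bytes on disk are probably fine and the difference lies in how the container reads them.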

Write (Author) commented Jun 29, 2022

No idea. I'm not using a Synology NAS, only SSDs, and everything checks out fine.
My only guess is that there is some sort of issue with RAM allocation and the JVM. For now I'm using another image (image: ghcr.io/someone-stole-my-name/docker-languagetool) with - JAVAOPTIONS=-Xms512M -Xmx2G, and it works so far.
