Committer queue is not fully processed #16
Comments
When using Does that address your issue?
Thanks for your reply, Pascal. I came across this issue while using the Solr Committer. Since it extends AbstractFileQueueCommitter (Committer Core), I'm posting the issue here. commit() is called by commitIfReady() (AbstractCommitter), which in turn is called by the "add" and "remove" methods in AbstractCommitter. commitIfReady() also checks the queue size, so commit() is only run from there. However, the queue size limits in commitIfReady() and commit() are the same, and multiple threads can call these methods concurrently, so the queue grows and grows: items keep being added while a commit takes time to process. A quick solution would be to remove the queue limit in the commit() method. A lot happens in commit(), mainly rereading the complete directory of the file queue and iterating over filesToCommit. I would suggest building a more robust commit queue, maybe based on events. Something like RxJava could help make the committer-core more pluggable so people can implement their own queues. What do you think?
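The effect described above can be sketched with a small, single-threaded model. This is not the committer-core code: the class `CappedCommitSketch`, the method `simulate`, and the parameter `arrivalsDuringCommit` (standing in for documents added by other threads while a commit runs) are all hypothetical names made up for illustration. It only assumes what the comment states: a commit is triggered when the queue reaches the limit, and each commit handles at most that many items.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical, simplified model of the behaviour discussed in this issue.
// It is NOT the Norconex implementation; all names here are assumptions.
class CappedCommitSketch {

    /**
     * Feeds totalDocs items through a queue and returns how many are left over.
     *
     * @param queueSize            threshold that triggers a commit (and caps it, if capCommit)
     * @param arrivalsDuringCommit items assumed to arrive from other threads during one commit
     * @param capCommit            true: each commit removes at most queueSize items;
     *                             false: each commit drains everything currently queued
     */
    static int simulate(int totalDocs, int queueSize,
                        int arrivalsDuringCommit, boolean capCommit) {
        Queue<Integer> queue = new ArrayDeque<>();
        int next = 0;
        while (next < totalDocs) {
            queue.add(next++);
            // Stand-in for commitIfReady(): only commit once the threshold is reached.
            if (queue.size() >= queueSize) {
                // Model other threads adding items while the commit is in progress.
                for (int i = 0; i < arrivalsDuringCommit && next < totalDocs; i++) {
                    queue.add(next++);
                }
                // Stand-in for commit(): capped or uncapped batch.
                int batch = capCommit ? Math.min(queueSize, queue.size()) : queue.size();
                for (int i = 0; i < batch; i++) {
                    queue.poll();
                }
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        System.out.println("left over, capped commit:   " + simulate(1000, 10, 15, true));
        System.out.println("left over, uncapped commit: " + simulate(1000, 10, 15, false));
    }
}
```

In this model the backlog only builds up when more items arrive during a commit than one capped batch removes. With the cap lifted, the leftover at the end stays below the threshold, which is the intuition behind the quick solution suggested above.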
If it can't keep up right now, you may have to slow it down, unfortunately. I agree the queue could be improved, and I am already sold on the idea of being able to supply your own queue. I am marking this as a feature request. The Committers will be seriously revisited in the next major version, and something like RxJava will be given consideration. Have you used RxJava in a few projects yourself? Any examples?
I have some experience with Reactor, which is another Reactive Streams framework. It's also used by the Spring framework. It would be my first choice, as it targets Java 8 and integrates easily with Kafka and RabbitMQ.
Hi Pascal, I'm also facing this issue. After crawling 174K files, there are 12,000 files left unprocessed in the commiter-queue folder.
I fixed it last year. I can come up with a pull request tomorrow.
Thanks for the prompt response, Jeroen. I will check it out. That will be in the committer-core, right?
Yes, it's the committer-core. I need a little extra time to write some unit tests, and I have some changes in my fork that I haven't committed yet. In the meantime, you can take a look at what I did: https://github.com/jsteggink/committer-core
Hi Jeroen, thanks for the update. As you mentioned, the current code in your fork is not your final submission. Looking forward to the final version.
@jsteggink Jeroen, any update? If needed, I can help with the tests.
The committer queue is not fully processed because each commit is capped by the queueSize property. Since the queue can grow larger than queueSize and is only processed when a commit is called, the file queue grows and grows.
https://github.com/Norconex/committer-core/blob/master/norconex-committer-core/src/main/java/com/norconex/committer/core/AbstractFileQueueCommitter.java#L175