Arakoon dies when --copy-db-to-head is run while --optimize-db is already running #203

Open
jtorreke opened this issue Sep 7, 2017 · 1 comment

Comments

@jtorreke
Member

jtorreke commented Sep 7, 2017

The scenario: run --optimize-db on a running member. While it is still executing, launch --copy-db-to-head against the same instance. Both commands bail out and the arakoon member crashes.

Output of the --optimize-db command:

root@NY1SRV0008:/mnt/ssd1/arakoon/ny1-hddbackend04-nsm_04# arakoon --optimize-db ny1-hddbackend04-nsm_04 172.17.16.11 26450
Uncaught exception:

  End_of_file

Raised at file "src/core/lwt.ml", line 805, characters 16-23
Called from file "src/unix/lwt_main.ml", line 34, characters 8-18
Called from file "src/main/arakoon.ml" (inlined), line 517, characters 21-136
Called from file "src/main/arakoon.ml", line 612, characters 7-23
Called from file "src/main/arakoon.ml", line 626, characters 9-16
root@NY1SRV0008:/mnt/ssd1/arakoon/ny1-hddbackend04-nsm_04#

Arakoon's log file:

2017-08-31 10:43:39 632236 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078398 - info - copy_db_to_head tlogs_to_keep:10
2017-08-31 10:43:39 632248 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078399 - info - quiesce_db: Pushing quiesce request
2017-08-31 10:43:39 632255 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078400 - info - quiesce_db: waiting for quiesce request to be completed
2017-08-31 10:43:39 632306 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078401 - fatal - Exception in fsm thread: (Failure "Store already quiesced. Blocking second attempt")
2017-08-31 10:43:39 632353 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078402 - fatal - going down: (Failure "Store already quiesced. Blocking second attempt")
2017-08-31 10:43:39 632360 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078403 - fatal - after pick
2017-08-31 10:43:39 632425 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078404 - info - going to drop outgoing connection as well: Lwt.Canceled
2017-08-31 10:43:39 632449 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078405 - info - going to drop outgoing connection as well: Lwt.Canceled
2017-08-31 10:43:39 632462 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078406 - info - going to drop outgoing connection as well: Lwt.Canceled
2017-08-31 10:43:39 632469 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078407 - info - waiting for 3 client_threads
2017-08-31 10:43:39 632521 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078408 - info - waiting for 2 client_threads
2017-08-31 10:43:39 632612 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078409 - warning - exception while closing, too little too late: Unix.Unix_error(Unix.EBADF, "check_descriptor", "")
2017-08-31 10:43:39 632641 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078410 - warning - exception while closing, too little too late: Unix.Unix_error(Unix.EBADF, "check_descriptor", "")
2017-08-31 10:43:39 632652 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078411 - warning - exception while closing, too little too late: Unix.Unix_error(Unix.EBADF, "check_descriptor", "")
2017-08-31 10:43:39 632672 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078412 - info - messaging_172.17.16.11_62: closing
2017-08-31 10:43:39 632686 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078413 - info - messaging_172.17.16.11_50: closing
2017-08-31 10:43:39 632696 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078414 - info - messaging_172.17.16.11_57: closing
2017-08-31 10:43:39 632745 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078415 - info - Exception in client thread messaging_172.17.16.11_62: Lwt.Canceled
2017-08-31 10:43:39 632755 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078416 - info - Exception in client thread messaging_172.17.16.11_50: Lwt.Canceled
2017-08-31 10:43:39 632762 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078417 - info - Exception in client thread messaging_172.17.16.11_57: Lwt.Canceled
2017-08-31 10:43:39 632775 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078418 - info - waiting for 1 client_threads
2017-08-31 10:43:39 632796 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078419 - info - shutting down server on port 26451
2017-08-31 10:43:41 070204 -0400 - NY1SRV0008 - 7929/0000 - arakoon - 2078420 - info - Crash log dumped
ovs-arakoon-ny1-hddbackend04-nsm_04.service: Main process exited, code=exited, status=1/FAILURE
@wimpers wimpers added this to the Roadmap milestone Nov 27, 2017
@wimpers

wimpers commented Nov 27, 2017

Set to Roadmap, as this should only occur when the commands are run manually. We already protect against this internally. Marking it "won't fix" might be a bit too drastic in this case.
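
For anyone who does run these commands by hand, one way to avoid overlapping them is to serialize them behind a per-node lock file. The following is only a minimal sketch of that idea, not the internal protection mentioned above: the names `maintenance_lock`, `run_maintenance` and `MaintenanceBusy`, as well as the lock file location, are hypothetical, and it assumes a POSIX system where `fcntl.flock` is available.

```python
import fcntl
import os
import subprocess
import tempfile
from contextlib import contextmanager

# Hypothetical lock location: one lock file per Arakoon node.
LOCK_DIR = tempfile.gettempdir()


class MaintenanceBusy(RuntimeError):
    """Raised when another maintenance command already holds the node's lock."""


@contextmanager
def maintenance_lock(node_name, lock_dir=LOCK_DIR):
    """Hold an exclusive, non-blocking advisory lock for one node."""
    path = os.path.join(lock_dir, f"arakoon-maintenance-{node_name}.lock")
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        try:
            # LOCK_NB makes a second caller fail immediately instead of waiting.
            fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
        except BlockingIOError:
            raise MaintenanceBusy(
                f"a maintenance command is already running on {node_name}"
            ) from None
        yield
    finally:
        # Closing the descriptor also releases the lock if it was acquired.
        os.close(fd)


def run_maintenance(node_name, argv):
    """Run a maintenance command only if no other one holds the node's lock."""
    with maintenance_lock(node_name):
        return subprocess.call(argv)
```

With this wrapper, the command from the report would be launched as `run_maintenance("ny1-hddbackend04-nsm_04", ["arakoon", "--optimize-db", "ny1-hddbackend04-nsm_04", "172.17.16.11", "26450"])`. A second maintenance command started through the same wrapper while the first is still running raises `MaintenanceBusy` before it ever contacts the node, so the second quiesce request is never sent. The lock is advisory, so it only helps if every manual invocation goes through the wrapper.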

@JeffreyDevloo JeffreyDevloo modified the milestones: Backlog, Icebox Nov 7, 2018