Neofs-cli control command hangs up if called immediately after storage node start #1688

Closed
vdomnich-yadro opened this issue Aug 15, 2022 · 2 comments
Labels
bug Something isn't working U3 Regular

Comments

@vdomnich-yadro
Contributor

vdomnich-yadro commented Aug 15, 2022

Minor issue, apparently related to a race condition on startup.
We worked around it in tests by adding a small delay between starting the node and calling the neofs-cli command.
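For illustration, the delay workaround could look roughly like the sketch below. This is not the actual test code: `call_after_delay`, `start_node`, and the 2-second default are hypothetical names and values chosen for the example.

```python
import subprocess
import time
from typing import Callable, Sequence


def call_after_delay(
    start_node: Callable[[], None],
    command: Sequence[str],
    delay_s: float = 2.0,
) -> subprocess.CompletedProcess:
    """Start the node, wait delay_s seconds, then run command once."""
    start_node()
    # Fixed pause so the node can finish initializing. The default is an
    # assumption and would have to be tuned per environment.
    time.sleep(delay_s)
    return subprocess.run(list(command), capture_output=True, text=True)
```

In the dev-env, `start_node` could wrap the docker SDK call that starts `s02`, and `command` would be the `neofs-cli ... control set-status --status online` invocation as an argument list.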

Steps to Reproduce (for bugs)

Given a dev-env environment with all 4 storage nodes running and online:

echo 'password: ""' > config.yaml

neofs-cli --endpoint s02.neofs.devenv:8081 -w services/storage/wallet02.json -c config.yaml control set-status --status offline

# Tick epoch
sleep 10s
bin/tick.sh

# Remove all data from storage volume
docker stop s02
sleep 5s
rm -rf /var/lib/docker/volumes/storage_storage_s02/_data/*

# Start the container using Python (it seems to be faster than the docker start command, which is necessary to reproduce the race condition)
python3 -c "import docker; d = docker.APIClient(); d.start('s02');"
sleep 0.4s  # 0.4s was chosen experimentally: by then the gRPC API is up, but morph notifications are not working yet

# This command hangs:
neofs-cli --endpoint s02.neofs.devenv:8081 -w neofs-dev-env/services/storage/wallet02.json -c config.yaml control set-status --status online

Current Behavior

Once the set-status command from the script times out, all subsequent neofs-cli control calls fail as well. The docker container has to be restarted to bring the node back to a normal state.
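To keep a test from blocking on the hung call, the command can be run with a client-side timeout. A minimal sketch, assuming the hang is detected purely by elapsed time; `finished_in_time` and the 30-second default are hypothetical and not part of neofs-cli or the test suite:

```python
import subprocess
from typing import Sequence


def finished_in_time(command: Sequence[str], timeout_s: float = 30.0) -> bool:
    """Return True if command exited within timeout_s, False if it hung.

    The exit code is not inspected; only hanging is detected.
    """
    try:
        subprocess.run(
            list(command), capture_output=True, text=True, timeout=timeout_s
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising.
        return False
    return True
```

This only reports the hang. It does not repair the node, which per the description above still needs a container restart.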

Expected Behavior

The set-status command from the script above does not hang. Or, at least, it does not leave the node in a broken state, and subsequent set-status calls work.

Regression

No

Your Environment

  • Version used:
    v0.31.0
  • Server setup and configuration:
    Local devenv
  • Operating System and version (uname -a):
    Linux ubuntu 4.15.0-189-generic #200-Ubuntu SMP Wed Jun 22 19:53:37 UTC 2022 x86_64 x86_64 x86_64 GNU/Linux
@vdomnich-yadro
Contributor Author

If we increase the delay between the Python code that starts the container and the neofs-cli control command, everything works.
Attached are logs for the hanging case (log-bad.txt) and for the case with the increased delay, where everything works (log-good.txt).

@roman-khimov
Member

Should be OK now with reordered init, #2585.

@roman-khimov closed this as not planned on Dec 21, 2023

4 participants