Sync incomplete #600

Open
nickbaileysmg opened this issue Nov 12, 2019 · 7 comments

Comments

@nickbaileysmg

I'm using the command line to sync a lot of small files to B2. A typical sync is 600,000 files and a couple hundred GB in size. Occasionally a sync will go through without an issue, but most of the time I will see 503 errors. I know how B2 works, so this is usually not concerning, but on a large sync with 10-15 threads running I will sometimes see 503s for a couple of minutes straight with no successful uploads. At the same time I see the "updated files" count increasing, and when the sync does finish I receive an "ERROR: sync is incomplete". If they are 503 errors, shouldn't the sync be reattempting the connection?

@ppolewicz
Collaborator

Sync is reattempting it, but after 4 or 5 failed attempts it gives up. The cloud service must have been down for a while, I guess.

I think we should change the retry mechanism to try for a longer period of time (minutes) instead of giving up after a certain number of attempts...

@nickbaileysmg
Author

Yeah, in my use case I would prefer that it keep trying to sync the files for as long as it takes. Maybe a way to make it wait and reattempt every x amount of time until it completes a successful upload, then continue on.
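
For illustration only, a minimal sketch of the behaviour described here: retry a single upload at a fixed interval until it eventually succeeds. `upload_one_file` is a hypothetical callable standing in for whatever the sync machinery runs per file; none of this is b2sdk API.

```python
import time

def retry_until_success(upload_one_file, wait_seconds=30):
    # Keep retrying forever, sleeping a fixed interval between attempts.
    while True:
        try:
            return upload_one_file()
        except Exception as exc:  # a real version would catch only retryable errors (e.g. 503)
            print(f"upload failed ({exc}); retrying in {wait_seconds}s")
            time.sleep(wait_seconds)
```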

@dumbasPL

Any updates on this? I'm getting `b2.exception.MaxRetriesExceeded: FAILED to upload after 5 tries. Encountered exceptions: 503 service_unavailable no tomes available` and the sync fails completely. This is happening almost every day.

@ppolewicz
Collaborator

The number of retries was increased a few versions ago; perhaps if you update to the latest version the problem will go away?

As for reworking the retry mechanism, it's important to understand what the user expects before I can make any changes to the code. Adding an --infinite-retries switch isn't very hard, but then a failing daily backup job could, for some reason, never terminate and keep using memory while another job starts the next day, and so on until the memory of the entire server is exhausted. I think it's practical to put an end to it at some point, but how would users want to define the condition that ends the patience of the retry mechanism?

Would something like --infinite-retries-for 86400 be appropriate?
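
For illustration, a rough sketch of what such a switch could mean in practice: retry freely, but give up once a wall-clock budget is spent. `retry_with_deadline` and its arguments are hypothetical, not part of the B2 CLI.

```python
import time

def retry_with_deadline(operation, retry_for_seconds=86400, wait_seconds=30):
    # Retry until success or until the wall-clock budget runs out,
    # so a stuck daily job cannot run forever.
    deadline = time.monotonic() + retry_for_seconds
    while True:
        try:
            return operation()
        except Exception:  # a real version would catch only retryable errors
            if time.monotonic() >= deadline:
                raise  # retry budget exhausted; surface the last error
            time.sleep(wait_seconds)
```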

@dumbasPL

dumbasPL commented Jun 16, 2023

I think implementing an exponential backoff (no point spamming all the time if it's not gonna work anyway) and a retry limit would be a good solution.

Would something like --infinite-retries-for 86400 be appropriate?

It's not really infinite if it has a timeout, is it? So maybe just --retry-for? --max-retries would also work, but I think a time-based limit makes way more sense in a tool like this.

Edit: Another thing to consider is whether the retry limit should apply to a single failure or to the entire process. One way to do this would be to terminate the process if the time since the start is greater than the limit and a failure occurred. The problem with this approach is that it has the potential to break huge one-off transfers. Maybe we could use a combination of the two approaches? Something like:

  1. If only --max-retries is present, then on each failure retry up to x times and terminate if the limit is exceeded.
  2. If only --retry-for is present, then retry until the total time reaches the limit; once it does, retry for an additional (insert default value for max-retries here) attempts before terminating.
  3. If both are present, it's the same as nr 2, but we use the --max-retries value after the time limit expires.

Or am I just overcomplicating things?
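
A rough sketch of that combined rule, with the exponential backoff mentioned earlier in this comment; the flag semantics and the default of 5 attempts are assumptions taken from this thread, not the CLI's actual behaviour.

```python
import random
import time

def retry(operation, max_retries=None, retry_for=None, default_max_retries=5):
    start = time.monotonic()
    attempt = 0
    attempts_after_deadline = 0
    delay = 1.0
    while True:
        try:
            return operation()
        except Exception:  # a real version would catch only retryable errors
            attempt += 1
            limit = max_retries if max_retries is not None else default_max_retries
            if retry_for is None:
                # case 1: only --max-retries — give up after the attempt limit
                if attempt > limit:
                    raise
            elif time.monotonic() - start >= retry_for:
                # cases 2 and 3: once the time limit has expired, allow only a
                # bounded tail of attempts (--max-retries if given, else the default)
                attempts_after_deadline += 1
                if attempts_after_deadline > limit:
                    raise
            # exponential backoff with jitter, capped at 64 seconds between attempts
            time.sleep(delay + random.uniform(0, 1))
            delay = min(delay * 2, 64)
```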

@samueldashadrach

samueldashadrach commented Aug 28, 2024

Rerunning the sync command worked for me; it found the missing files and uploaded them. You could consider adding a help message after "ERROR: sync is incomplete" asking the user to rerun the same sync command.

@samueldashadrach

P.S. Before implementing a retry flag, please consider whether a poor network is the only reason sync fails. For me, all the failures were due to this error:

```
b2sdk._internal.exception.UploadTokenUsedConcurrently: More than one concurrent upload using auth token 4_005e2085a4bde050000000000_01b6a814_4c92f3_uplg_dXgloeWUIYG3GO1gIGEuJcv8Mpw=
b2_upload(/home/samueldashadrach/downloads/LG-allepubs-targroups/994.tar, main/994.tar, 1724578690873): UploadTokenUsedConcurrently() More than one concurrent upload using auth token 4_005e2085a4bde050000000000_01b6a814_4c92f3_uplg_dXgloeWUIYG3GO1gIGEuJcv8Mpw=
```

Until you've enumerated all the possible errors, I'm not sure how to identify which ones should be handled by the CLI tool and which ones should be left to the user to debug.
