uploader: update exponential backoff timeouts
In the current architecture, a catalyst-uploader instance is launched to upload each segment. At any given time, multiple processes (pids) can be running, each attempting to write to S3 storage. If there's an outage at the storage provider, the exponential backoff retry logic kicks in and retries the uploads. When multiple instances of catalyst-uploader are running, the retries tend to happen at roughly the same time in short bursts, which quickly hits the kernel pthread_create limits. When this happens, the pods eventually become CPU/memory bound and may stop responding.

To reduce the impact of this, the following changes are being made:

* reduce # of retries from 7 to 4
* set initial interval to 30s to space out the retry attempts
* set max interval to 2min to space out even further

Note that this only reduces the probability of running into the same issue and is not a true fix. A proper fix would require a rearchitecture of how catalyst-uploader works in conjunction with Mist.