First off, lovely tool. Use it daily to make life better in our group. Thanks for your work on this project.
Until July 2, our automated Jenkins job that backs up R53 zones to S3 was working normally with the script below. It started failing on the morning of July 3. The job runs once a day and keeps the past 30 days of backups in S3.
We have 182 hosted zones in our AWS account.
#!/usr/bin/env bash

# Enter Bash "strict mode"
set -o errexit   # Exit immediately on any non-zero exit status
set -o nounset   # Trigger an error when expanding unset variables
set -o pipefail  # Prevent errors in a pipeline from being masked
IFS=$'\n\t'      # Internal Field Separator controls Bash word splitting

# Declare backup path & master zone files
BACKUP_PATH="$(date +%F)"
ZONES_FILE="all-zones.txt"
DNS_FILE="all-dns.txt"

echo "Backing up Route53: ${BACKUP_PATH}"

# Create date-stamped backup directory and enter it
mkdir -p "$BACKUP_PATH"
cd "$BACKUP_PATH"

# Create a list of all hosted zones
cli53 list --debug --format text > "$ZONES_FILE" 2>&1

# Create a list of domain names only
sed '/Name:/!d' "$ZONES_FILE" | cut -d: -f2 | sed 's/^..//' | sed 's/.\{3\}$//' > "$DNS_FILE"

# Create backup files for each domain
while read -r line; do
  cli53 export --debug --full "$line" > "$line.txt"
done < "$DNS_FILE"

cd ../
tar czvf "${BACKUP_PATH}.tgz" "$BACKUP_PATH"
aws s3 cp "${BACKUP_PATH}.tgz" "s3://<bucket-name>/route53/${BACKUP_PATH}.tgz"
rm -rf "$BACKUP_PATH"

# Prune any tgz files older than 30 days
find . -maxdepth 1 -name '*.tgz' -type f -mtime +30 -exec rm -f {} \;

# Exit Bash "strict mode"
set +o errexit
set +o nounset
set +o pipefail
exit 0
Expected behaviour
I expected the list and export commands to complete without error.
Actual behaviour
cli53 exceeds the Route53 API rate limit, receives a 400 Bad Request (throttling) response, and retries continually, which keeps the rate limit in effect. (Route 53's documented limit is five API requests per second per account, so exporting 182 zones back-to-back exceeds it quickly.) The Jenkins job had been running for over 6 hours before I discovered it was the reason no one could make changes to zones and records. It was effectively maintaining the ban.
Have you checked if the documentation has the information you require?
Yes. I've googled, read what I could find, and tried a "sleep 5" between commands. Once the rate limit is exceeded it remains in effect for an unknown time before the "ban" is lifted. AWS uses the term "throttled", but you are effectively unable to use the API for at least 30 minutes.
Could you contribute a fix or help testing with this issue?
I'd love to, but I don't know Go yet. It would be lovely to have an option to turn off retries and fail on the first error, or an option for a delay between requests. Lovelier still would be a retry/delay/backoff strategy.
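In the meantime, the backoff idea can be approximated on the caller's side by wrapping each cli53 invocation in a retry helper that sleeps for exponentially increasing delays between attempts. This is a sketch, not part of cli53; the retry_with_backoff name and the attempt/delay limits are my own invention.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of cli53): retry a command with
# exponential backoff, giving up after a fixed number of attempts.
retry_with_backoff() {
  local max_attempts=5
  local delay=2      # seconds before the first retry
  local attempt=1
  until "$@"; do
    if (( attempt >= max_attempts )); then
      echo "Giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "Attempt ${attempt} of '$*' failed; retrying in ${delay}s" >&2
    sleep "$delay"
    delay=$(( delay * 2 ))     # delays grow 2, 4, 8, 16...
    attempt=$(( attempt + 1 ))
  done
}

# Example: replace the bare export in the backup loop with
#   retry_with_backoff cli53 export --debug --full "$line" > "$line.txt"
```

Because a command tested by "until" does not trigger errexit, the helper composes safely with the strict-mode settings in the script above.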
Issue type
cli53 version (cli53 --version)
0.8.12 and 0.8.15
OS / Platform
Linux 386
Darwin