"Download all my photos"-button for face detection #3173
Comments
This would be really nice to have. There's also #2770. I think we could tackle both of them by writing an AWS Lambda that zips a list of S3 objects and saves the resulting archive to S3. The original problems for #2770 then no longer apply. I think it would be nice to keep track of those archives with a model, for zipped albums as well as face detection photos. Then we can:
We could keep track of the exact files that need to be included in an archive by hashing the photos (filenames or even content hashes) and storing that hash on the zip model. That makes it easy and cheap to check whether the stored archive is up to date. For the UI, when an up-to-date archive already exists, a link to it can just be rendered. For face detection, if an archive doesn't exist yet, we can have JS do an API call to trigger the creation, and then either poll until it's done or keep the response hanging while we wait for the Lambda to complete within the request cycle.
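As a rough illustration of the up-to-date check described above, here is a minimal sketch that works on plain strings instead of a Django queryset. The function names `archive_digest` and `archive_is_up_to_date` and the sample digests are assumptions made for this example. It also assumes every per-photo digest has the same length, as hex content hashes do, so concatenating them is unambiguous.

```python
import hashlib


def archive_digest(photo_digests):
    """Return one SHA-1 hex digest over the sorted per-photo digests."""
    combined = hashlib.sha1()
    # Sorting makes the result independent of the order the photos arrive in.
    for photo_digest in sorted(photo_digests):
        combined.update(photo_digest.encode())
    return combined.hexdigest()


def archive_is_up_to_date(stored_digest, photo_digests):
    """True when the stored archive covers exactly the current set of photos."""
    return stored_digest == archive_digest(photo_digests)


# Hypothetical per-photo digests, for illustration only.
photos = ["b2f1", "a9c4", "d077"]
stored = archive_digest(photos)

print(archive_is_up_to_date(stored, reversed(photos)))   # True: order does not matter
print(archive_is_up_to_date(stored, photos + ["e5aa"]))  # False: a new photo invalidates it
```

With something like this, the view only needs one cheap comparison to decide between rendering a link to the existing archive and triggering a new one.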
See https://www.antstack.com/blog/create-zip-using-lambda-with-files-streamed-from-s3/ for some inspiration for a Lambda.
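@LucAngevare here's some inspiration (definitely not ready to copy-paste) for how parts of the backend could look (this is for doing an album; face detection would be similar, of course):

import hashlib
import json
import logging

import boto3
from django.conf import settings
from django.db import models

logger = logging.getLogger(__name__)


class DownloadZip(models.Model):
    # secure_token is assumed to be an existing project helper that returns a random token.
    token = models.CharField(
        max_length=40,
        default=secure_token,
        editable=False,
        help_text="Token used by a Lambda to authenticate "
        "to the API to submit encoding(s) for this source.",
    )

    # A SHA-1 hex digest is exactly 40 characters long.
    digest = models.CharField(
        max_length=40,
        editable=False,
        help_text="Digest of the (digests of the) photos in the ZIP file.",
    )

    # Stays empty until the ZIP file has actually been created.
    file = models.FileField(null=True, upload_to="downloads/zip/")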


def _get_download_zip_digest(photos):
    """Return a digest over the digests of the provided queryset of photos.

    To make the digest reproducible, the queryset of photos is ordered by the
    photos' individual digests. The SHA1 digest over these represents the files encoded
    in a ZIP file and can be used to check if the ZIP file is up to date.
    """
    digest = hashlib.sha1()
    for photo_digest in photos.values_list("_digest", flat=True).order_by("_digest"):
        digest.update(photo_digest.encode())
    return digest.hexdigest()


def create_download_zip(album: Album):
    photos = album.photo_set.all()
    zip_digest = _get_download_zip_digest(photos)

    album.download_zip = DownloadZip(digest=zip_digest)
    album.download_zip.save()
    album.save()

    # Trigger creating the ZIP file.
    if settings.PHOTOS_ZIP_LAMBDA_ARN is None:
        logger.warning(
            "No ZIP Lambda ARN has been configured. ZIP file will be created locally."
        )
        # TODO: local version or maybe celery task (open and zip the files, save it and process the fact that it's done).
    else:
        s3_client = boto3.client(
            service_name="s3",
            aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
            aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
        )

        # Get a presigned request that allows the lambda to upload the ZIP file directly to S3.
        presigned_post_data = s3_client.generate_presigned_post(
            settings.AWS_STORAGE_BUCKET_NAME,
            "downloads/album/" + album.slug + ".zip",
            Fields={"acl": settings.AWS_DEFAULT_ACL, "Content-Type": "application/zip"},
            Conditions=[
                {"acl": settings.AWS_DEFAULT_ACL},
                {"Content-Type": "application/zip"},
            ],
            ExpiresIn=3600,
        )

        lambda_client = boto3.client(
            service_name="lambda",
            aws_access_key_id=settings.AWS_ACCESS_KEY_ID,
            aws_secret_access_key=settings.AWS_SECRET_ACCESS_KEY,
        )

        # Invoke the lambda to create the ZIP file.
        # Uses the same ARN setting that is checked above, not the face detection one.
        response = lambda_client.invoke(
            FunctionName=settings.PHOTOS_ZIP_LAMBDA_ARN,
            InvocationType="Event",
            Payload=json.dumps(
                {
                    "api_url": settings.BASE_URL,
                    "token": album.download_zip.token,
                    "upload": presigned_post_data,
                    "files": [
                        {"url": photo.file.url, "name": f"{i:04d}"}
                        for i, photo in enumerate(photos)
                    ],
                }
            ),
        )

        # Asynchronous ("Event") invocations return 202 when the request is accepted.
        if response["StatusCode"] != 202:
            raise Exception("Lambda response was not 202.")

It'd probably be pretty easy to start off without an actual AWS Lambda and make it run in the webserver (as a Celery task). Whenever an album is saved, we can then do simply:
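For the local route, a minimal sketch of the zipping step itself might look like the following. It uses only the standard library and works on in-memory bytes. The helper name `build_zip`, the sample contents, and the `.jpg` suffix on the numbered names are assumptions for illustration, not part of the snippet above.

```python
import io
import zipfile


def build_zip(files):
    """Zip (name, content) pairs in memory and return the archive bytes."""
    buffer = io.BytesIO()
    # Photos are usually already compressed, so store them without deflating.
    with zipfile.ZipFile(buffer, mode="w", compression=zipfile.ZIP_STORED) as archive:
        for name, content in files:
            archive.writestr(name, content)
    return buffer.getvalue()


# Hypothetical stand-ins for the contents of two photo files.
photos = [b"first photo bytes", b"second photo bytes"]
data = build_zip((f"{i:04d}.jpg", content) for i, content in enumerate(photos))

with zipfile.ZipFile(io.BytesIO(data)) as archive:
    print(archive.namelist())  # ['0000.jpg', '0001.jpg']
```

In a Celery task, the returned bytes could then presumably be wrapped in Django's `ContentFile` and saved to the `file` field of the `DownloadZip` model. For large albums, writing to a temporary file instead of memory would likely be the safer choice.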
Is your feature request related to a problem? Please describe.
I want to download all photos of me
Describe the solution you'd like
A "Download all my photos"-button
Motivation
Describe alternatives you've considered
Additional context