Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cached assets on s3 for all datasets #1882

Merged
merged 1 commit into from
Sep 28, 2023

Conversation

AndreaFrancis
Copy link
Contributor

No description provided.

@codecov-commenter
Copy link

codecov-commenter commented Sep 28, 2023

Codecov Report

Attention: 2 lines in your changes are missing coverage. Please review.

Comparison is base (3c6c79b) 87.65% compared to head (40bb174) 82.07%.
Report is 2 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1882      +/-   ##
==========================================
- Coverage   87.65%   82.07%   -5.59%     
==========================================
  Files          22       13       -9     
  Lines         875      926      +51     
==========================================
- Hits          767      760       -7     
- Misses        108      166      +58     
Flag Coverage Δ
services_admin ?
services_rows 84.16% <100.00%> (?)
services_search 78.55% <33.33%> (?)

Flags with carried forward coverage won't be shown. Click here to find out more.

Files Coverage Δ
services/rows/src/rows/routes/rows.py 57.47% <100.00%> (ø)
services/rows/tests/routes/test_rows.py 93.13% <100.00%> (ø)
services/search/src/search/routes/search.py 53.60% <33.33%> (ø)

... and 32 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@severo severo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK. I was wondering if it's OK to delete the alternative option (fielsystem). But I think it's easy to set it back if needed, right?

Can we also deprecate some environment variables at the same time, or are they all needed yet

@AndreaFrancis AndreaFrancis requested a review from a team September 28, 2023 16:47
Copy link
Member

@lhoestq lhoestq left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

let's go !

@AndreaFrancis
Copy link
Contributor Author

OK. I was wondering if it's OK to delete the alternative option (fielsystem). But I think it's easy to set it back if needed, right?

I think so, but the problem is with DNS configuration, not sure if there is a way to redirect some datasets to EFS and others to S3.
Maybe @rtrompier could help us confirm.

@AndreaFrancis AndreaFrancis merged commit eefd206 into main Sep 28, 2023
@AndreaFrancis AndreaFrancis deleted the cached-assets-s3-all-datasets branch September 28, 2023 16:56
@severo
Copy link
Collaborator

severo commented Sep 28, 2023

Oh, yes, I hadn't this in mind. OK to do that. Should we remove /cached-assets from the nginx reverse proxy configuration?

@severo
Copy link
Collaborator

severo commented Sep 28, 2023

(I'm wondering how we will handle the tests: end to end, unit)

@AndreaFrancis
Copy link
Contributor Author

Oh, yes, I hadn't this in mind. OK to do that. Should we remove /cached-assets from the nginx reverse proxy configuration?

I think so, I created this issue for tracking #1885

@AndreaFrancis AndreaFrancis mentioned this pull request Sep 28, 2023
4 tasks
@severo
Copy link
Collaborator

severo commented Sep 29, 2023

Currently the docker compose files don't use S3. I think it means we cannot use /rows and /search with docker for images and audio, right? We should be testing this in the e2e tests, I think.

Should we keep the switch between directory and S3 in the code, to support docker compose (and thus: the init tests and the e2e tests)? The alternatives are: a) to include S3 in docker compose (and thus: use or mock S3 for tests), b) to remove docker compose completely and only use k8s/helm (it still means that we need S3 for the tests)

@rtrompier
Copy link
Collaborator

I think so, but the problem is with DNS configuration, not sure if there is a way to redirect some datasets to EFS and others to S3.

Nop we can't. The switch is based on url path in cloudfront.
But if you can have a different path between datasets, it should be possible.

@lhoestq
Copy link
Member

lhoestq commented Sep 29, 2023

Another alternative: use a a storage agnostic tool like fsspec to store the images (currently we use boto if I'm not mistaken).
In prod it would point to s3://bucket-name and in docker-compose it would point to a volume shared with the reverse proxy

@severo
Copy link
Collaborator

severo commented Sep 29, 2023

Another alternative: use a a storage agnostic tool like fsspec to store the images (currently we use boto if I'm not mistaken). In prod it would point to s3://bucket-name and in docker-compose it would point to a volume shared with the reverse proxy

I like this option -> I created #1887

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants