Download directly from S3 for faster query times #65

jankaifer · 2024-10-15T14:26:06Z

Opportunity

Whenever you run a query with Athena, it saves the result as a .csv file in an S3 bucket. You can download this file like any other S3 object.

Another observation is that the Athena API returns at most 1000 rows per page. This has a significant performance impact if you try to download 100k+ rows: it takes 100+ requests, even if the result is just a few MB.
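To make the request count concrete, here is a rough sketch of the arithmetic. The function names and the 100 ms round-trip time are assumptions for illustration only, not measurements and not part of athenadriver.

```go
package main

import (
	"fmt"
	"time"
)

// athenaMaxPageSize is the page limit mentioned above (1000 rows per page).
const athenaMaxPageSize = 1000

// pagesNeeded returns how many paginated requests are required to fetch
// rowCount rows when each response carries at most pageSize rows.
func pagesNeeded(rowCount, pageSize int) int {
	if rowCount <= 0 || pageSize <= 0 {
		return 0
	}
	return (rowCount + pageSize - 1) / pageSize // ceiling division
}

// latencyFloor estimates the time spent purely on round trips,
// assuming the pages are fetched one after another.
func latencyFloor(requests int, roundTrip time.Duration) time.Duration {
	return time.Duration(requests) * roundTrip
}

func main() {
	requests := pagesNeeded(100_000, athenaMaxPageSize)
	// 100ms is an assumed cross-continent round trip, not a measured value.
	fmt.Println(requests, latencyFloor(requests, 100*time.Millisecond))
}
```

With these assumed numbers, the paginated path spends several seconds on latency alone before any transfer time is counted.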

In our case, we query Athena from a different region (and a different continent), so the latency alone on those 100+ requests adds up to multiple seconds.

Downloading from S3 is a single request, which is faster. There are almost no downsides.
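As a minimal sketch of the first step, the result location reported by Athena (the `OutputLocation` field of the query execution's result configuration, an `s3://bucket/key` string) can be split into the bucket and key that an S3 `GetObject` call expects. The helper name `splitS3URI` and the example location are hypothetical; this is not the code from my PR.

```go
package main

import (
	"fmt"
	"strings"
)

// splitS3URI breaks an "s3://bucket/key" location into its bucket and key.
// Hypothetical helper for illustration; it does not talk to AWS.
func splitS3URI(location string) (string, string, error) {
	const scheme = "s3://"
	if !strings.HasPrefix(location, scheme) {
		return "", "", fmt.Errorf("not an s3:// location: %q", location)
	}
	rest := strings.TrimPrefix(location, scheme)
	bucket, key, found := strings.Cut(rest, "/")
	if !found || bucket == "" || key == "" {
		return "", "", fmt.Errorf("location %q is missing a bucket or key", location)
	}
	return bucket, key, nil
}

func main() {
	// Made-up location; a real one comes from the query execution metadata.
	bucket, key, err := splitS3URI("s3://example-results/athena/abc123.csv")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(bucket, key)
}
```

The bucket and key can then be passed to a single S3 download, and the body parsed as CSV with the first row as the header.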

Result

After I implemented fetching directly from S3, we observed a significant speed-up in our query times. For queries that return ~100k rows, the time went from 38 seconds to just 18 seconds, which is more than a 2x improvement. The gain is even larger for queries that return more rows (in some places it was a 4x speed-up).

Request

It would be nice if some form of S3 fetching were implemented upstream. I have opened a PR with my implementation, but it's not in a mergeable state right now. I will not have time to clean it up and create a proper PR, but I wanted to share my code anyway in case it helps someone, or in case someone finds the time to properly integrate that functionality into the athenadriver API.

jankaifer mentioned this issue Oct 15, 2024

Add support for downloading result from s3 #66

Open
