Feat: increase data serialization speed #85
Use pandas to_json to serialize the dataframe. On a test dataframe of 10_000 x 3 this sped up the whole rendering process by over 10x, from 2.5 sec to 0.22 sec. This also reduces the complexity and browser processing of nested grids, as there is no need to parse the JSON for the row details.
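For readers who want to see the shape of the change, here is a minimal sketch and not the PR's actual code. The function names, the `orient="records"` layout, and the use of `to_dict` as a stand-in for the older Python-level serialization are all assumptions made for illustration.

```python
import json

import pandas as pd


def rows_via_python(df: pd.DataFrame) -> str:
    """Hypothetical baseline: build the records in Python, then dump them."""
    records = df.to_dict(orient="records")
    # default=str is a crude fallback for values json cannot encode natively.
    return json.dumps(records, default=str)


def rows_via_to_json(df: pd.DataFrame) -> str:
    """Let pandas serialize the whole frame in one call."""
    # orient="records" yields a JSON array with one object per row.
    return df.to_json(orient="records")


if __name__ == "__main__":
    demo = pd.DataFrame({"id": [1, 2], "name": ["a", "b"], "score": [0.5, 1.5]})
    print(rows_via_to_json(demo))
```

Both helpers return a JSON string that parses to one object per row, so on simple frames like the demo they can be compared directly with `json.loads`. Any speed difference should be measured on your own data.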
@thunderbug1 tagging you as this is similar to a pull request of yours that is still open. Have you or @luke321321 heard back from PablocFonseca on this performance improvement? Looks great tbh :)
@marduk2 ah I didn't realise there was an almost duplicate PR in #62. I haven't heard from @PablocFonseca
I've been busy over the past months, with little time to work on ag-grid.
Just checked.
Thanks for the context. That's a brilliant speedup (similar to pandas).
I'd be happy to review the PR if you'd like a second pair of eyes on it.
…On Wed, 6 Jul 2022, 00:21 Pablo Fonseca wrote:
Just checked.
The issue is that pandas converts tz-aware datetimes to UTC when serializing to JSON;
this is a known issue, pandas-dev/pandas#46730.
I did that first implementation as a workaround. I have just changed it
to use a new method: _cast_tz_aware_date_columns_to_iso8601. Below are the
initial results I got when serializing a 1_000_000 x 8 df on my modest
computer.
[image: Screenshot 2022-07-05 at 20 17 30]
<https://user-images.githubusercontent.com/17606067/177432578-d16c7139-b557-4c2c-abf4-bd98351f5eb8.png>
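The workaround described above can be pictured roughly as follows. This is a sketch under stated assumptions, not the body of _cast_tz_aware_date_columns_to_iso8601, which is not shown in this thread. The helper name `cast_tz_aware_columns_to_iso8601` and the choice to map missing values to None are hypothetical.

```python
import json

import pandas as pd


def cast_tz_aware_columns_to_iso8601(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy whose tz-aware datetime columns are ISO 8601 strings.

    Hypothetical helper: turning the values into strings before to_json
    keeps each timestamp's original UTC offset in the output.
    """
    out = df.copy()
    for col in out.columns:
        # DatetimeTZDtype marks columns that carry timezone information.
        if isinstance(out[col].dtype, pd.DatetimeTZDtype):
            out[col] = [
                ts.isoformat() if pd.notna(ts) else None for ts in out[col]
            ]
    return out


if __name__ == "__main__":
    demo = pd.DataFrame(
        {"when": pd.to_datetime(["2022-07-05T20:17:30-03:00"]), "n": [1]}
    )
    print(cast_tz_aware_columns_to_iso8601(demo).to_json(orient="records"))
```

Naive datetime columns and all other dtypes pass through unchanged, so the helper can be applied to the whole frame just before serialization.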
Any updates on this?
Use pandas to_json() to serialize the dataframe instead of the custom serialization. On a test dataframe of length 10_000 this sped up the conversion to JSON by 10x. I've tested the outputs and they are consistent with the previous get_row_data() function for most use cases. There is one slight change for nested dataframes: the nested dataframe is already in JSON format and so doesn't have to be unpacked in JS. This makes it even easier for users, as the previous behavior was confusing.
Test results