Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

simplify row building and avoid costly applymap #62

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

thunderbug1
Copy link
Contributor

not sure if there was a special reason not to use the pandas to_json but for me it improved performance significantly and it also makes the code simpler.
... it also handles the nested dataframe case nicely without the need to convert it to a JSON string seperately.

not sure if there was a special reason not to use the pandas to_json but for me it improved performance significantly and it also makes the code simpler.
... it also handles the nested dataframe case nicely without the need to convert it to a JSON string seperately.
@cmayoracurzio
Copy link

Haven't been able to test this yet, but super interested in this potential performance improvement!

@thunderbug1
Copy link
Contributor Author

Since I sometimes run into issues with the data packet of the dataframe being too large, I also thought about possible alternatives to JSON as well.
Maybe the arrow IPC format could be used insead of JSON. It would look something like this: https://gist.github.com/camerondavison/cbe43c326f37ab0fe34680baa960634e

However, I haven't made any experiments in that direction yet.

@thunderbug1
Copy link
Contributor Author

Since I sometimes run into issues with the data packet of the dataframe being too large, I also thought about possible alternatives to JSON as well. Maybe the arrow IPC format could be used insead of JSON. It would look something like this: https://gist.github.com/camerondavison/cbe43c326f37ab0fe34680baa960634e

However, I haven't made any experiments in that direction yet.

The pandas to_json method also comes with a compression option. I think we could probably easily use that to minimize the size of the json tranmitted over the wire, especially since it is record oriented it should have quite a big effect.
We just have to figure out how to unpack it on the typescript side again, but I don't think that that would be a big issue.

@stantont
Copy link

stantont commented Apr 5, 2022

I just made a similar change myself in my local code in order to make the ag-grid Sparklines work. I was going to suggest the change but I see it has already been suggested.

I have a list in a column and the list is getting turned into a string by the existing code. So the Pandas column

"change": [[1, 2, 3], [4, 5, 6]]

would get turned into a JSON string column instead of a list of numbers, which the Sparklines can't use

"change": "[1, 2, 3]" 
//...
"change": "[4, 5, 6]"

This change made the JSON have lists of numbers and allowed the Sparklines to work in the cell.

"change": [1, 2, 3]
//...
"change": [4, 5, 6]

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants