pandas dataframes #7

Open
chrhck opened this issue Mar 19, 2020 · 3 comments
Labels: enhancement (New feature or request)

@chrhck
Collaborator

chrhck commented Mar 19, 2020

Dataframes are awesome. Do we want to support them?
This would mean either switching over from numpy recarrays completely or adding an abstraction layer that can handle both.
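
A minimal sketch of what such an abstraction layer might look like, assuming all the rest of the code needs is column access by name. The helper names `get_column` and `field_names` and the example fields `energy` and `dec` are made up for illustration and are not part of any existing API:

```python
import numpy as np
import pandas as pd


def get_column(data, name):
    """Return column `name` as a 1d ndarray from a DataFrame or a recarray."""
    if isinstance(data, pd.DataFrame):
        return data[name].to_numpy()
    # Structured arrays and recarrays both support field access by name.
    return np.asarray(data[name])


def field_names(data):
    """Return the column/field names of either container as a list."""
    if isinstance(data, pd.DataFrame):
        return list(data.columns)
    return list(data.dtype.names)


# Hypothetical example data held in both container types.
rec = np.rec.fromarrays(
    [np.array([1.0, 2.0, 3.0]), np.array([0.1, 0.2, 0.3])],
    names="energy,dec",
)
df = pd.DataFrame.from_records(rec)

# The same call works regardless of the container type.
energy_from_rec = get_column(rec, "energy")
energy_from_df = get_column(df, "energy")
```

Code written against helpers like these would not need to know which container it was given, at the cost of one extra indirection per column access.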

@chrhck chrhck added the enhancement New feature or request label Mar 19, 2020
@HansN87
Contributor

HansN87 commented Mar 19, 2020

Dataframes are awesome! I am in favor of supporting them. Completely switching probably requires some performance comparisons (found this: http://gouthamanbalaraman.com/blog/numpy-vs-pandas-comparison.html).

@martwo
Collaborator

martwo commented Mar 19, 2020

SkyLLH doesn't use numpy recarrays internally. It's all 1d ndarrays. I would be surprised if DataFrames were faster than that.

@chrhck
Collaborator Author

chrhck commented Mar 20, 2020

From the link @HansN87 posted above, it seems DataFrames are always quicker for more than 500k rows, and potentially quicker for more than 50k rows, although they have a larger memory footprint.
For now, I think a good starting point would be to add support for DataFrames in user code (e.g. when analyzing trials).
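
As a rough sketch of what that could look like on the user side, assuming trial results are available as equal-length 1d ndarrays. The column names `n_sig` and `ts` and the random values are placeholders for illustration, not real SkyLLH output:

```python
import numpy as np
import pandas as pd

# Placeholder trial results: one 1d ndarray per quantity (assumed layout).
rng = np.random.default_rng(0)
trials = {
    "n_sig": rng.integers(0, 3, size=1000),  # hypothetical injected signal count
    "ts": rng.chisquare(1, size=1000),       # hypothetical test-statistic values
}

# A dict of equal-length 1d arrays converts directly into a DataFrame.
trials_df = pd.DataFrame(trials)

# Typical trial analysis: per-group statistics of the test statistic.
summary = trials_df.groupby("n_sig")["ts"].agg(["count", "median"])
print(summary)
```

Since the conversion happens only at the boundary, the internal 1d ndarrays stay untouched and only the analysis step pays for the DataFrame overhead.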

@chrhck chrhck self-assigned this Mar 20, 2020