feat(Pyclient): use column metadata when processing data from CSV API #4502

Open
YpeZ opened this issue Nov 18, 2024 · 0 comments
Labels
emx2_python_client enhancement New feature or request

YpeZ commented Nov 18, 2024

Issue

Currently, the Pyclient's get method implicitly converts the CSV data it retrieves into a pandas DataFrame. During this conversion, pandas infers the type of each column from its contents. This can lead to unwanted results, such as integer values being cast to strings, or string values being cast to floats. The handling of NA values is often unpredictable too.
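
A minimal sketch of the kind of inference described above, using made-up data rather than output from the Pyclient. The column names and values are assumptions for illustration only. It compares pandas' default type inference with reading every column as plain text.

```python
import io

import pandas as pd

# Made-up CSV content; not taken from any EMX2 table.
csv_text = "id,latitude,count\n001,52.10,3\n002,4.50,\n"

# Default behaviour: pandas infers a dtype per column from the contents.
inferred = pd.read_csv(io.StringIO(csv_text))

# Reading everything as text keeps the values exactly as they appear in the CSV.
raw = pd.read_csv(io.StringIO(csv_text), dtype=str, keep_default_na=False)

print(inferred.dtypes)
print(raw)
```

With default inference, the id column loses its leading zeros, the latitude column loses its trailing zeros, and the count column becomes a float column because of the empty cell.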

Solution

The Pyclient already contains functionality for working with column metadata. This metadata can be used to ensure that the conversion happens as expected.
A new parsing function that applies this metadata must be implemented within the get method of the Pyclient.
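
One possible shape for such a parsing function, as a rough sketch only. The function name parse_csv_with_metadata, the PANDAS_DTYPES mapping and the column_types dictionary are hypothetical; the Pyclient's actual metadata objects will look different. The idea is to pass an explicit dtype per column to pandas instead of relying on inference.

```python
import io

import pandas as pd

# Hypothetical mapping from column types in the metadata to pandas nullable dtypes.
PANDAS_DTYPES = {"STRING": "string", "INT": "Int64", "DECIMAL": "Float64"}


def parse_csv_with_metadata(csv_text, column_types):
    """Read CSV text using an explicit dtype per column.

    column_types is an assumed dict of column name -> column type name;
    unknown column types fall back to text.
    """
    dtypes = {
        name: PANDAS_DTYPES.get(col_type, "string")
        for name, col_type in column_types.items()
    }
    return pd.read_csv(io.StringIO(csv_text), dtype=dtypes)


# Usage with made-up data and made-up metadata.
csv_text = "id,latitude,count\n001,52.10,3\n002,4.50,\n"
column_types = {"id": "STRING", "latitude": "STRING", "count": "INT"}
df = parse_csv_with_metadata(csv_text, column_types)
print(df.dtypes)
```

Using the nullable dtypes means an empty cell in an integer column becomes a missing value without turning the whole column into floats.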

Alternatives

No response

Additional context

When processing data from the National Node staging areas into BBMRI-ERIC tables, this unexpected behaviour was encountered (e.g. latitude values were interpreted as floats and zeros were removed from the values). To circumvent this, the Pyclient's get method was copied into the BBMRI-ERIC publish package and adjusted so that no pandas DataFrame is involved in the process. A second function, reset_data_types, was added to reset the data types.
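
For illustration, a pandas-free approach along these lines could look roughly like the sketch below. This is not the code from the BBMRI-ERIC publish package and not the actual reset_data_types function; read_rows, CASTERS and the column_types dictionary are hypothetical names, and the data is made up.

```python
import csv
import io


def _to_bool(value):
    # Assumes booleans arrive as the text "true" or "false".
    return value.strip().lower() == "true"


# Hypothetical mapping from column type names to Python casting functions.
CASTERS = {"STRING": str, "INT": int, "DECIMAL": float, "BOOL": _to_bool}


def read_rows(csv_text, column_types):
    """Parse CSV text into a list of dicts, casting values per column type.

    Empty cells become None; columns without metadata stay as text.
    """
    rows = []
    for raw_row in csv.DictReader(io.StringIO(csv_text)):
        row = {}
        for name, value in raw_row.items():
            if not value:
                row[name] = None
            else:
                caster = CASTERS.get(column_types.get(name, "STRING"), str)
                row[name] = caster(value)
        rows.append(row)
    return rows


# Usage with made-up data and made-up metadata.
csv_text = "id,latitude,count\n001,52.10,3\n002,4.50,\n"
column_types = {"id": "STRING", "latitude": "STRING", "count": "INT"}
rows = read_rows(csv_text, column_types)
print(rows)
```

Because every value starts out as text and is only cast when the metadata says so, values such as latitudes keep the exact form they had in the CSV.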
