Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add JSONPath parser support #44

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

TyShkan
Copy link
Contributor

@TyShkan TyShkan commented Mar 25, 2023

Closes #42

I struggled to run unittests due to dependencies conflicts, so I ran the test config with Meltano:

    config:
      tables:
        - "name": "json_sample_multiple_records"
          "path": "file://../../data/test/"
          "pattern": "^sample\\.json$"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "format": "json"
        - "name": "json_sample_one_record"
          "path": "file://../../data/test/"
          "pattern": "^one-row-sample\\.json$"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "format": "json"
        - "name": "jsonl_sample_multiple_records"
          "path": "file://../../data/test/"
          "pattern": "sample-jsonl\\.json"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "format": "jsonl"
          "universal_newlines": True
        - "name": "jsonl_sample_one_record"
          "path": "file://../../data/test/"
          "pattern": "one-row-sample-jsonl\\.json"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "format": "jsonl"
        - "name": "jsonl_sample_multiple_records_detect"
          "path": "file://../../data/test/"
          "pattern": "^sample\\.jsonl$"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "format": "detect"
        - "name": "jsonl_sample_one_record_detect"
          "path": "file://../../data/test/"
          "pattern": "^one-row-sample\\.jsonl$"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "format": "detect"
        - "name": "json_sample_nested_path"
          "path": "file://../../data/test/"
          "pattern": "^sample_path\\.json$"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "json_path": "data"
          "format": "json"
        - "name": "json_sample_deep_nested_path"
          "path": "file://../../data/test/"
          "pattern": "sample_deep_path\\.json"
          "start_date": "2017-05-01T00:00:00Z"
          "key_properties": [ "id" ]
          "json_path": "data.response[*]"
          "format": "json"

with some testing files included: test.zip

@menzenski menzenski self-requested a review March 28, 2023 14:48
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can these test cases be added as net-new test cases, rather than updating existing tests?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decided to clean it up a bit to remove a confusion and also added one new test for JSONPath case to the end of the file.

All of the tests call only one function json_handler.get_row_iterator which expects only one option (json_path) from the configuration. Table specs dict had excel config which is a bit confusing for json handler test, so I updated it, and regrouped specs and their related tests to make them more transparent. And I still missed "badnewlines" name of the first table spec (:

If you consider changes in old tests as a bad practise, I could rollback my changes and add new tests on top of the old ones.

Copy link
Contributor Author

@TyShkan TyShkan Mar 30, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've managed to build env with working dependencies and run tests. Some of the old tests don't work though, but it's not caused by the changes in this pull request and related to Excel-files.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clean test branch converted to poetry & GitHub workflow running test on different python versions committed here: https://github.com/TyShkan/tap-spreadsheets-anywhere/commits/poetry

@TyShkan TyShkan requested a review from menzenski March 30, 2023 14:31
@jcbmllgn
Copy link

jcbmllgn commented May 1, 2023

Hello 👋 any update on when this might make it into main? This would be very useful for me!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Extend "json_path" config option with JSONPath parser for deep nested data
3 participants