Connector/Source: Publish the *request* object in the record transformation like the config, stream_slice, stream_interval, etc. are #50395
Comments
At the record extraction step, add to each record the service field $root, holding a reference to the object that the record is extracted from:
* the root response object, when parsing JSON format
* the original record, when parsing JSONL format

More service fields could be added in the future. The service fields are available in the record's filtering and transform steps.

Avoid:
* reusing the maps/dictionaries produced, thus avoiding building cyclic structures
* transforming the service fields in the Flatten transformation.

Explicitly clean up the service field(s) after the transform step, thus making them:
* local to the filter and transform steps
* not visible to the next mapping and store steps (as they should be)
* not visible in the tests beyond the test_record_selector (as they should be)

This allows the record transformation logic to define its "local variables" to reuse some interim calculations.

The contract of body parsing seems irregular in representing the cases of bad JSON, no JSON and empty JSON. It cannot be unified, as that irregularity is already relied upon.

Update the development environment setup documentation
* to organize and present the setup steps explicitly
* to avoid misunderstandings and wasted effort.

Update CONTRIBUTING.md to
* collect and organize the knowledge on running the tests locally
* state the actual testing steps
* clarify and make explicit the procedures and steps.

The unit, integration, and acceptance tests in exactly this version succeed under Fedora 41, while one of them fails under Oracle Linux 8.7. This is not related to the contents of this PR. The integration tests of the CDK fail due to a missing `secrets/config.json` file for the Shopify source. See airbytehq#197
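The lifecycle of the $root service field described above (attach at extraction, use in the filter and transform steps, remove afterwards) could be sketched roughly as follows. This is an illustrative sketch only, not the CDK implementation: the function names `extract_records`, `select_records` and `add_issue_key`, and the sample response body, are hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]

SERVICE_ROOT = "$root"  # service field name used in the description above


def extract_records(root: Any, path: List[str]) -> List[Record]:
    """Hypothetical extraction step: walk `path` and attach $root to each record."""
    node = root
    for key in path:
        node = node[key]
    items = node if isinstance(node, list) else [node]
    # Copy every extracted dict instead of reusing it, so that adding a
    # reference to the root never creates a cycle inside the response itself.
    return [{**item, SERVICE_ROOT: root} for item in items]


def select_records(
    root: Any,
    path: List[str],
    keep: Callable[[Record], bool],
    transform: Callable[[Record], None],
) -> List[Record]:
    """Hypothetical selector: filter, transform, then clean up the service field."""
    selected = []
    for record in extract_records(root, path):
        if keep(record):
            transform(record)               # may read record["$root"]
            record.pop(SERVICE_ROOT, None)  # not visible to later steps
            selected.append(record)
    return selected


def add_issue_key(record: Record) -> None:
    # Example transformation: copy a value from the root response into the record.
    record["issue_key"] = record[SERVICE_ROOT]["key"]


body = {"key": "PROJ-1", "items": [{"id": 1}, {"id": 2}]}
records = select_records(body, ["items"], lambda r: r["id"] > 1, add_issue_key)
print(records)  # [{'id': 2, 'issue_key': 'PROJ-1'}]
```

Because the records are copies, the original response body is left untouched, and the service field never reaches the mapping and store steps.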
Replaced with airbytehq/airbyte-python-cdk#214
The JIRA API can surprise you with another case that is impossible to store in (second-level) tables:
Storing the elements of the change log (path: histories[].items[]) is not possible even with the change suggested above, as each such element requires the timestamp from the parent histories[*] and the issue key from the root JSON object.
Suggestion
Pros and cons
Discussed in #49971
Originally posted by rpopov December 20, 2024
Status in Airbyte 1.2.0
The record transformation allows removing existing fields from the record and adding new fields by calculating them using:
Example
The JIRA /issue response:
In JIRA the list of custom fields is highly dynamic, and it makes no practical sense to have a single record per issue, combining in it all standard and custom fields.
Instead, the JIRA issue can be represented using 2 tables:
ISSUE 1 --- * CUSTOM_FIELD
in one-to-many / master-detail relation.
Turning the response into a list of records, one per custom field, could be done by iterating over the schema.* sub-list in the response, while taking the values from the fields map and the human-readable names from the names map.
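In plain Python, the intended result could look roughly like the sketch below. It assumes the response carries top-level `schema`, `fields` and `names` maps keyed by field id, and that custom field ids start with `customfield_`. The function name `custom_field_records`, the output column names and the sample issue are hypothetical, chosen only for illustration.

```python
from typing import Any, Dict, List


def custom_field_records(issue: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn one issue response into one record per custom field."""
    fields = issue.get("fields", {})
    names = issue.get("names", {})
    records = []
    # Iterate over the schema entries; values and names come from sibling maps.
    for field_id, field_schema in issue.get("schema", {}).items():
        if not field_id.startswith("customfield_"):  # assumed id convention
            continue
        records.append(
            {
                "issue_key": issue.get("key"),        # link to the ISSUE row
                "field_id": field_id,
                "field_name": names.get(field_id),    # human-readable name
                "field_type": field_schema.get("type"),
                "value": fields.get(field_id),
            }
        )
    return records


# Made-up sample in the assumed shape, not a real JIRA response.
issue = {
    "key": "PROJ-1",
    "names": {"summary": "Summary", "customfield_10010": "Story Points"},
    "schema": {
        "summary": {"type": "string"},
        "customfield_10010": {"type": "number"},
    },
    "fields": {"summary": "Fix login", "customfield_10010": 5},
}
rows = custom_field_records(issue)
```

Each resulting row needs three sibling maps of the same response at once, which is why access to the whole response object matters for this use case.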
Problem
In this configuration, the fields and names maps are not accessible. They could be, if the response object were available in the Transformations' context. The response object already exists in the context of the Pagination section.
Suggestion