Exclude Compound Fields when using the bulk-api #42

Open
s7clarke10 opened this issue Aug 3, 2023 · 2 comments

Comments

@s7clarke10
Contributor

s7clarke10 commented Aug 3, 2023

If you use the Bulk v1 API you cannot extract compound fields. You must exclude them from your list of ingested fields.

"CRITICAL Error syncing ContactPointAddress: InvalidBatch : Failed to process query: FUNCTIONALITY_NOT_ENABLED: Selecting compound data not supported in Bulk Query", "level": "info", "timestamp": "2023-08-02T10:03:30.226398Z"}

I believe compound fields could be excluded automatically by first doing a look-up against the Salesforce Data Dictionary and checking, for the table being ingested, whether any of its columns are compound fields, e.g. "compoundFieldName": "Coordinates__c". In this example, the compound field Coordinates__c should be excluded when using a Bulk API method.

To find out more about the Salesforce Data Dictionary and how to dump it, see this GitHub repo: https://github.com/s7clarke10/get-salesforce-data-dictionary.
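As a rough illustration of the look-up idea above, here is a minimal Python sketch. It assumes the data dictionary for one table is available as a list of dicts with `name`, `type` and `compoundFieldName` keys; the helper names and the sample rows are hypothetical and are not taken from the tap. A real implementation may need to refine which parent fields are excluded.

```python
# Hypothetical helpers: the input shape (dicts with "name", "type",
# "compoundFieldName") is an assumption made for this sketch.

def compound_parent_names(fields):
    """Collect the names of compound fields referenced by their components."""
    parents = set()
    for field in fields:
        parent = field.get("compoundFieldName")
        if parent:
            parents.add(parent)
    return parents


def bulk_safe_field_names(fields):
    """Return field names with the compound parent fields left out."""
    excluded = compound_parent_names(fields)
    return [f["name"] for f in fields if f["name"] not in excluded]


# Made-up sample rows for illustration only.
sample_fields = [
    {"name": "Id", "type": "id", "compoundFieldName": None},
    {"name": "Coordinates__c", "type": "location", "compoundFieldName": None},
    {"name": "Coordinates__Latitude__s", "type": "double",
     "compoundFieldName": "Coordinates__c"},
    {"name": "Coordinates__Longitude__s", "type": "double",
     "compoundFieldName": "Coordinates__c"},
]

safe_names = bulk_safe_field_names(sample_fields)
```

With the sample rows, `Coordinates__c` is dropped while its latitude and longitude components stay in the list.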

@dlouseiro
Contributor

@s7clarke10 I experienced the same issue and opened a PR to fix it.

Feel free to test it out on your side to see if it fixes your issue as well, as it definitely fixed mine!

@s7clarke10
Contributor Author

s7clarke10 commented Jan 27, 2024 via email

edgarrmondragon pushed a commit that referenced this issue Jun 28, 2024
In the current implementation of the tap, fields described with
`type=address` are correctly excluded when using `api_type=BULK`.

However, this is not the case for geolocation fields (described with
`type=location`).

I'm not a Salesforce specialist per se, but know that the changes I'm
applying on this branch solved the issue described
[here](#42).

In my specific case I had a field like this:

```
                    "Geolocation__c": {
                        "type": [
                            "number",
                            "object",
                            "null"
                        ],
                        "properties": {
                            "longitude": {
                                "type": [
                                    "null",
                                    "number"
                                ]
                            },
                            "latitude": {
                                "type": [
                                    "null",
                                    "number"
                                ]
                            }
                        }
                    },
```

As it's a compound field, split over multiple properties (`longitude`
and `latitude`), the parent field `Geolocation__c` does not have much
value as the relevant information is propagated in the "sub-fields"
(`latitude` and `longitude`).

So in this PR I ensure that `Geolocation__c` is marked as `unsupported`
in the schema metadata (snippet below), while the "sub-fields" are still
queried correctly.

Snippet of the excluded schema after this PR:
```
                {
                    "breadcrumb": [
                        "properties",
                        "Geolocation__c"
                    ],
                    "metadata": {
                        "inclusion": "unsupported",
                        "unsupported-description": "cannot query compound address fields with bulk API"
                    }
                },
```
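The rule behind that metadata entry can be sketched in Python as follows. This is only an illustration under assumptions: `COMPOUND_TYPES` and `field_metadata` are hypothetical names, not the tap's actual code, and `available` is used as the default inclusion value for every other field.

```python
# Hypothetical sketch: names and defaults here are assumptions.
COMPOUND_TYPES = {"address", "location"}


def field_metadata(field_name, field_type, api_type):
    """Build a metadata entry, flagging compound fields under the Bulk API."""
    entry = {
        "breadcrumb": ["properties", field_name],
        "metadata": {"inclusion": "available"},
    }
    if api_type == "BULK" and field_type in COMPOUND_TYPES:
        entry["metadata"] = {
            "inclusion": "unsupported",
            "unsupported-description":
                "cannot query compound address fields with bulk API",
        }
    return entry


geo_entry = field_metadata("Geolocation__c", "location", "BULK")
lat_entry = field_metadata("Geolocation__Latitude__s", "double", "BULK")
```

Here `geo_entry` is marked `unsupported`, while `lat_entry` stays `available` because its type is not a compound one.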

Snippet of the "sub-fields" in the schema:

```
                    "Geolocation__Latitude__s": {
                        "type": [
                            "null",
                            "number"
                        ]
                    },
                    "Geolocation__Longitude__s": {
                        "type": [
                            "null",
                            "number"
                        ]
                    },
```
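To show how an `unsupported` parent can be skipped while its "sub-fields" are still queried, here is a small Python sketch. The function names, the `Account` object name and the sample metadata are assumptions for illustration; the tap's real query builder may differ.

```python
# Hypothetical sketch: builds a query from catalog-style metadata entries.

def selected_field_names(metadata_entries):
    """Return field names whose inclusion is not 'unsupported'."""
    names = []
    for entry in metadata_entries:
        breadcrumb = entry["breadcrumb"]
        # Skip stream-level entries, whose breadcrumb is not a field path.
        if len(breadcrumb) != 2:
            continue
        if entry["metadata"].get("inclusion") == "unsupported":
            continue
        names.append(breadcrumb[1])
    return names


def build_query(object_name, metadata_entries):
    """Assemble a SELECT statement over the supported fields only."""
    fields = ", ".join(selected_field_names(metadata_entries))
    return "SELECT {} FROM {}".format(fields, object_name)


# Made-up metadata for illustration only.
sample_metadata = [
    {"breadcrumb": [], "metadata": {}},
    {"breadcrumb": ["properties", "Id"],
     "metadata": {"inclusion": "automatic"}},
    {"breadcrumb": ["properties", "Geolocation__c"],
     "metadata": {"inclusion": "unsupported"}},
    {"breadcrumb": ["properties", "Geolocation__Latitude__s"],
     "metadata": {"inclusion": "available"}},
    {"breadcrumb": ["properties", "Geolocation__Longitude__s"],
     "metadata": {"inclusion": "available"}},
]

query = build_query("Account", sample_metadata)
```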
nezd pushed a commit to dext/tap-salesforce that referenced this issue Sep 26, 2024