Stream schema does not seem to be respected in produced records #11
Comments
Indeed, this is something @edgarrmondragon and I have been discussing as of late. We probably should be excluding undeclared subproperties, but as of today I think they get included or excluded based on the
@edgarrmondragon, fyi, as related to recent discussions over on the SDK. I was previously thinking
Note: all of the above concerns properties and subproperties in the stream's catalog schema, and not necessarily the metadata selection. Meaning, omitting a child node's selection metadata would still cause the node to default to the parent value. The implicit removal only applies if a node is completely unknown to, or undeclared by, the catalog.
@laurentS - does this sound like it meets your expectations as well? Meaning, as a tap developer, you'd have confidence that nothing undeclared in the schema will slip downstream to the target?
Thanks, both.
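To make the two rules in the note above concrete, here is a rough Python sketch. It is not the SDK's code: `resolve_selected`, `keep_node`, the simplified path tuples, and the "selected by default at the root" behaviour are all assumptions made up for illustration.

```python
from typing import Dict, Set, Tuple

# Simplified path to a node, e.g. ("user", "login"). Hypothetical stand-in
# for a catalog breadcrumb, not the SDK's actual representation.
Path = Tuple[str, ...]


def resolve_selected(path: Path, explicit: Dict[Path, bool]) -> bool:
    """Return the nearest explicit selection on the path or its ancestors.

    Assumption: if nothing is set anywhere up the tree, the node is selected.
    """
    while True:
        if path in explicit:
            return explicit[path]
        if not path:
            return True
        path = path[:-1]  # fall back to the parent node


def keep_node(path: Path, declared: Set[Path], explicit: Dict[Path, bool]) -> bool:
    """Undeclared nodes are always dropped; declared ones follow selection."""
    if path not in declared:
        return False
    return resolve_selected(path, explicit)


# Illustrative catalog: "user" is declared with two subproperties.
declared = {("user",), ("user", "login"), ("user", "id")}
# Only "user" and "user.id" carry explicit selection metadata.
explicit = {("user",): True, ("user", "id"): False}

inherits_parent = keep_node(("user", "login"), declared, explicit)
explicitly_off = keep_node(("user", "id"), declared, explicit)
undeclared = keep_node(("user", "avatar_url"), declared, explicit)
```

Here `user.login` has no selection metadata of its own and inherits from `user`, `user.id` is explicitly deselected, and `user.avatar_url` is dropped only because the catalog never declares it.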
That could work. If we're gonna walk the entire JSON schema tree to figure out which props are declared, it might also make sense to update our
I'm a bit light on the metadata part of the singer spec, so I'll chime in with my "user's" perspective. What I'm seeing:
I'm not sure this addresses your questions exactly, but my feeling from thinking through it is that if a field is not declared in the schema, it should probably not appear in records 🙂
I might be misunderstanding how the schema definition works for a stream, but this bothers me.
With the following schema (from issue_comments in 71b07b7):
I am seeing the following records:
A number of the nested *_url fields are present in the record, although they are excluded from the schema definition. It looks like a call to https://gitlab.com/meltano/sdk/-/blob/main/singer_sdk/helpers/_singer.py#L23 from pop_deselected_record_properties causes the field to be included because user is included, and somehow the details of the nested object do not appear in the selection mask.
I suspect this is a bug in the SDK, but I might have misunderstood how the code is supposed to work 🤔
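For reference, this is roughly the behaviour I would expect, as a minimal Python sketch. `prune_undeclared` is a hypothetical helper, not something in the SDK, and the schema and record below are made-up stand-ins rather than the real issue_comments ones.

```python
from typing import Any


def prune_undeclared(value: Any, schema: dict) -> Any:
    """Return a copy of `value` keeping only properties declared in `schema`.

    Objects whose schema lists no "properties" are passed through unchanged,
    since there is nothing to compare them against.
    """
    properties = schema.get("properties")
    if isinstance(value, dict) and properties is not None:
        return {
            key: prune_undeclared(child, properties[key])
            for key, child in value.items()
            if key in properties
        }
    items = schema.get("items")
    if isinstance(value, list) and isinstance(items, dict):
        return [prune_undeclared(child, items) for child in value]
    return value


# Made-up schema: only "id" and "user.login" are declared.
schema = {
    "type": "object",
    "properties": {
        "id": {"type": "integer"},
        "user": {
            "type": "object",
            "properties": {"login": {"type": "string"}},
        },
    },
}

# Made-up record carrying extra *_url fields at both levels.
record = {
    "id": 1,
    "user": {"login": "octocat", "avatar_url": "https://example.com/a.png"},
    "html_url": "https://example.com/comment/1",
}

pruned = prune_undeclared(record, schema)
```

With this, the nested `avatar_url` and the top-level `html_url` are both dropped, while `user.login` survives because the schema declares it.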