Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add schema naming strategy for Kafka Connect #126

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

averemee-si
Copy link

Issue #, if available: 125
#125

Description of changes:
Added a new implementation of AWSSchemaNamingStrategy that makes AWS Glue Schema names same as Kafka Connect schema names. Default AWSSchemaNamingStrategy implementation makes schema names same as topic name and this cause strange naming and versioning in AWS Glue Schema Registry for Kafka Connect records created using both key and value schemas.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

import org.apache.avro.Schema;

@Slf4j
public class AWSSchemaNamingStrategyConnectNameImpl implements AWSSchemaNamingStrategy {
Copy link
Contributor

@blacktooth blacktooth Jan 11, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for submitting this PR!

Few comments:

  1. In my opinion this is also useful for use-cases beyond Kafka connect. We can drop 'Connect' from the class name. Also, we support multiple data-formats like JSON / Protobuf (future) but logic here seems to be specific to Avro, I suggest either of the following suggestions:
    1.1 Add a TODO / issue and a note that this needs to be improved to support other data formats. I can take up the improvement later. Or
    1.2 Feel free to implement it for JSON format if you can. :)

Name suggestion: RecordSchemaNameNamingStrategy or better.

  1. Can we also support Avro SpecificRecord?
  2. In order to have predictable behavior, can we throw an exception if fullName is null or it's not an Avro record.
  3. Were you able to test this end-to-end from serializer to de-serializer? It would be great if you can add a unit-test.

Thanks again for the PR!

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for review and superb suggestions!

  1. I will rename it to AWSSchemaNamingStrategyRecordSchemaNameImpl
    Unfortunately it seems to me that JSON Schema converter ignores value of both key.converter.schemaNameGenerationClass and value.converter.schemaNameGenerationClass parameters. So I'm unable to overload default implementation, unfortunately.
  2. Yes, this works with key.converter.avroRecordType=SPECIFIC_RECORD and value.converter.avroRecordType=SPECIFIC_RECORD
  3. Will be done.
  4. Currently I'm testing in my Oracle RDBMS source and sink connector environment. I will try to backport some record examples for use in end-to-end unit test.

Regards,
Aleksei

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Thanks! We can drop Impl from the name.
  2. Should the code be updated as well? From the code, it seems like it only works if the passed object is GenericRecord.

Thank you!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this PR @averemee-si !

Would it be possible for you to implement the comments and resubmit this PR ?

@MolochkoVitaly
Copy link

Hi @blacktooth @averemee-si
Could you please share status or any details on this PR?
It would be super cool to have this PR-feature in master

@averemee-si
Copy link
Author

Hi @blacktooth @averemee-si
Could you please share status or any details on this PR?
It would be super cool to have this PR-feature in master

Hi Vitaly,

This naming strategy is also available in separate repo - aws-glue-schema-naming

Regards,
Aleksei

@blacktooth
Copy link
Contributor

@averemee-si Do you want to update this PR with suggested feedback?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants