Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add additional timestamp sources #298

Draft
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

mkeskells
Copy link
Contributor

from a header value
from a data field
via a custom extractors

@mkeskells mkeskells requested review from a team as code owners September 26, 2024 12:30
@mkeskells mkeskells marked this pull request as draft September 26, 2024 12:30
@mkeskells mkeskells requested a review from Claudenw October 4, 2024 09:57
@mkeskells mkeskells force-pushed the i-297-custom-record-grouping branch from ea0cd33 to 74049bb Compare October 8, 2024 12:19
Copy link
Contributor

@Claudenw Claudenw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry it took so long to get back to this review.


* @return this
*/
public Builder configuration(final String configuration) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method does not make sens to me.

There should be a setType(Type type) method and a setArgs(String... args) method. Worry about configuration packing outside of this class. I know that you are thinking that you can put HEADER:field into a configuration but do the unpacking outside of this class, then you don't have to jump through hoops to change the split pattern or similar gyrations.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

happy to separate the type and parameter, but its not varargs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think its probably better to allow all of the additional class parameters to be able to be specified via additional parameters - which become available with #300. That way its explicit, documented and extensible

Comment on lines 156 to +164
this.zoneId = zoneId;
this.type = type;
this.dataExtractor = dataExtractor;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are no checks for any of these being null.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There doesn't need to be. The caller validates. Adding null checks at all levels just blurs the concerns IMO

Comment on lines +167 to +174
public ZonedDateTime time(SinkRecord record) {
return fromRawTime(dataExtractor.extractDataFrom(record));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you make this an abstract method then you can simply override it in your implementations and not pollute this class with methods that are only used by one instance. This will also make the code easier to read and maintain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This allows reuse. Its can still be overridden if a subclass needs it but doesn't require that. What is here is a default, available to be

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are two separate concerns here. The location of the date like field, and the conversion of that to a ZonedDateTime - e.g. 982b347 adds java.util.Date as a source conversion
This isn't polluting the api, its promoting reuse, and when extending this library to allow external code to use it, I believe its best to enable extension code to access to common and shared concepts, like implicit type conversion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not a class that just does the conversions. I think that classes should try to do one thing and do it very well. A class to convert time formats means they are reusable elsewhere in the chain.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so you want o have the Extraction as one component in the TimstampSource?
So roughly an extractor and a convertor?

that not the way that the existing TimestampSources work

e.g.
for event

        @Override
        public ZonedDateTime time(final SinkRecord record) {
            return ZonedDateTime.ofInstant(Instant.ofEpochMilli(record.timestamp()), zoneId);
        }

does 3 things locate, convert and add zoneId

Do you want to change the split of the functionality for all of these existing cases? or have I missed what you are proposing/unhappy about

final String params = parts.length > 1 ? parts[1] : null;
try {
final Class<?> clazz = Class.forName(className);
return (TimestampSource) clazz.getConstructor(String.class, ZoneId.class).newInstance(params, zoneId);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The constructor argument seems backwards. ZoneId is required for all instances so shouldn't that be first and then the params?

Copy link
Contributor Author

@mkeskells mkeskells Oct 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by definition both are required parameters in the signature

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think that its probably better to move all of the configuration to separate properties, once #300 is merged. That allows for external classes to add to the configuration parameters, and read them. That way its not limited in the configuration, and self documenting

Thoughts?

Mike Skells added 4 commits November 4, 2024 13:56
from a header value
from a data field
via a custom extractors

Remove a few simple classes and make a DataExtractor to read things from the `sinkRecord`
and few tidyups
@mkeskells mkeskells force-pushed the i-297-custom-record-grouping branch from d59ca8e to 9318097 Compare November 4, 2024 13:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants