Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add sql + ingestion compatibility for first/last on numeric values #15607

Conversation

ankit0811
Copy link
Contributor

@ankit0811 ankit0811 commented Dec 29, 2023

Follow-up on 14462

Description

Enables sql querying for numeric last and first value. Instead of returning pair object as a string, now latest, latest_by and earliest, earilest_by will return the actual numeric value for column type lastNumeric and firstNumeric

Release note

SQL compatibility for numeric last and first column types.
Ingestion UI now provides option for first and last aggregation as well.


Key changed/added classes in this PR
  • EarliestLatestAnySqlAggregator
  • MetricSpec

This PR has:

  • been self-reviewed.
  • using the concurrency checklist (Remove this item if the PR doesn't have any relation to concurrency.)
  • added documentation for new or modified features or behaviors.
  • a release note entry in the PR description.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

@github-actions github-actions bot added Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 labels Dec 29, 2023
@LakshSingla LakshSingla self-requested a review January 2, 2024 05:52
@LakshSingla
Copy link
Contributor

Tested out the changes locally, and went through the test cases and the code changes. Thank you for your continued efforts in the area 🚀

@LakshSingla LakshSingla self-requested a review January 9, 2024 21:07
if (type instanceof RowSignatures.ComplexSqlType) {
ColumnType complexColumnType = ((RowSignatures.ComplexSqlType) type).getColumnType();
String complexTypeName = complexColumnType.getComplexTypeName();
if (complexTypeName != null && (complexTypeName.equals(SerializablePairLongLongComplexMetricSerde.TYPE_NAME) || complexTypeName.equals(SerializablePairLongFloatComplexMetricSerde.TYPE_NAME) || complexTypeName.equals(SerializablePairLongDoubleComplexMetricSerde.TYPE_NAME))) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this correct? Shouldn't we be returning the scalar return type then? Why are we not doing this for string as well?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can return the same for String but since its return type is already string which is handled in the below if condition, I dint bother changing this

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SELECT LATEST_BY("last_float_added", "__time") + 4.0 FROM "wikiticker_long_string_last_first"

This doesn't work because the return type is incorrect.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the input type is complex, it should return the scalar value for that complex type (float for longFloatPair etc), else if the input is numeric, it should return that numeric type, else it should return varchar (in unknown complex, string and any other sql type cases)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for catching this @LakshSingla
updated the PR

Comment on lines +142 to +149
'longFirst',
'longLast',
'doubleFirst',
'doubleLast',
'floatFirst',
'floatLast',
'stringFirst',
'stringLast',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps this change isn't required. KNOWN_TYPES already contains the first/last aggregations.

@LakshSingla LakshSingla merged commit 355c2f5 into apache:master Jan 10, 2024
83 checks passed
@LakshSingla LakshSingla added this to the 29.0.0 milestone Jan 29, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area - Batch Ingestion Area - MSQ For multi stage queries - https://github.com/apache/druid/issues/12262 Area - Querying Area - Web Console
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants