[Iceberg] Add Iceberg metadata table $metadata_log_entries #24302
base: master
Thanks for adding this feature. Overall it looks good to me, except for one small issue with timestamp with time zone and a few nits.
.add(new ColumnMetadata("timestamp", TIMESTAMP_WITH_TIME_ZONE))
.add(new ColumnMetadata("file", VARCHAR))
.add(new ColumnMetadata("latest_snapshot_id", BIGINT))
.add(new ColumnMetadata("latest_schema_id", BIGINT))
nit: Should this type be INTEGER?
{
    InMemoryRecordSet.Builder table = InMemoryRecordSet.builder(COLUMNS);

    TableMetadata metadata = ((org.apache.iceberg.BaseTable) icebergTable).operations().current();
nit: use static import
Long snapshotId = null;
Snapshot snapshot = null;
try {
    snapshotId = SnapshotUtil.snapshotIdAsOfTime(icebergTable, entry.timestampMillis());
Suggested change:
- snapshotId = SnapshotUtil.snapshotIdAsOfTime(icebergTable, entry.timestampMillis());
+ snapshotId = snapshotIdAsOfTime(icebergTable, entry.timestampMillis());
nit: I know this code comes from the Iceberg library, but we can still use static imports wherever possible.
private void addRow(InMemoryRecordSet.Builder table, ConnectorSession session, long timestampMillis, String fileLocation, Long snapshotId, Snapshot snapshot)
{
    table.addRow(packDateTimeWithZone(timestampMillis, session.getSqlFunctionProperties().getTimeZoneKey()),
Should we consider the case where session.getSqlFunctionProperties().isLegacyTimestamp() is false? As I understand it, in that case we should use UTC as the time zone key. Please let me know if I have misunderstood.
Thanks for your review, @hantangwangd.
Currently, with and without isLegacyTimestamp, the timestamp output in metadata_log_entries looks the same:
presto:iceberg_schema> set session legacy_timestamp=true;
SET SESSION
presto:iceberg_schema> select * from "region_legacy$metadata_log_entries";
timestamp | file | latest_snapshot_id | latest_schema_id | latest_s>
--------------------------------------+---------------------------------------------------------------------------------------------------------------------------------------------+---------------------+------------------+--------->
2025-01-02 22:39:00.666 Asia/Kolkata | hdfs://localhost:9000/user/hive/warehouse/iceberg_schema.db/region_legacy/metadata/00000-26e6389a-ab54-455b-b5ce-6648e241ce29.metadata.json | 7341611993609958569 | 0 | >
2025-01-02 22:39:12.478 Asia/Kolkata | hdfs://localhost:9000/user/hive/warehouse/iceberg_schema.db/region_legacy/metadata/00001-15c85566-fcf1-413d-8115-5fc4376426cf.metadata.json | 8958386941531340808 | 0 | >
(2 rows)
presto:iceberg_schema> set session legacy_timestamp=false;
SET SESSION
presto:iceberg_schema> select * from "region_nolegacy$metadata_log_entries";
timestamp | file | latest_snapshot_id | latest_schema_id | latest>
--------------------------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+---------------------+------------------+------->
2025-01-02 22:40:50.877 Asia/Kolkata | hdfs://localhost:9000/user/hive/warehouse/iceberg_schema.db/region_nolegacy/metadata/00000-1de7672a-8a9e-4249-afe8-d526b094ca57.metadata.json | 1517277585224920583 | 0 | >
2025-01-02 22:41:03.948 Asia/Kolkata | hdfs://localhost:9000/user/hive/warehouse/iceberg_schema.db/region_nolegacy/metadata/00001-8bf52dfe-2601-4ce8-bb3c-30ac435573ea.metadata.json | 2705037583472111886 | 0 | >
(2 rows)
It looks like both cases use my local time and time zone, since the session object carries the local time zone.
Could you please help me understand whether this is unexpected?
My mistake, I confused the result column type, timestamp with time zone, with timestamp. The isLegacyTimestamp property applies to the timestamp type, so there is no need to consider it here.
@Test
public void testMetadataLogTable()
{
    try {
        assertUpdate("CREATE TABLE test_table_metadatalog (id1 BIGINT, id2 BIGINT)");
        assertQuery("SELECT count(*) FROM \"test_table_metadatalog$metadata_log_entries\"", "VALUES 1");
        // metadata file created at table creation
        assertQuery("SELECT latest_snapshot_id FROM \"test_table_metadatalog$metadata_log_entries\"", "VALUES NULL");

        assertUpdate("INSERT INTO test_table_metadatalog VALUES (0, 00), (1, 10), (2, 20)", 3);
        Table icebergTable = loadTable("test_table_metadatalog");
        Snapshot latestSnapshot = icebergTable.currentSnapshot();
        assertQuery("SELECT count(*) FROM \"test_table_metadatalog$metadata_log_entries\"", "VALUES 2");
        assertQuery("SELECT latest_snapshot_id FROM \"test_table_metadatalog$metadata_log_entries\" order by timestamp DESC limit 1", "values " + latestSnapshot.snapshotId());
    }
    finally {
        assertUpdate("DROP TABLE IF EXISTS test_table_metadatalog");
    }
}
Would it be feasible to add some test cases covering different time zones and legacyTimestamp settings, and verify the output of the timestamp column?
@hantangwangd Could you please give me an example of the kind of test cases that would fit here for different time zones?
I looked at other metadata tables with a timestamp column but couldn't find a similar example.
Referring to Iceberg's test case, I think we can add tests similar to the following code:
Session session = sessionWithTimezone(zoneId);
assertUpdate(session, "CREATE TABLE test_table_metadatalog (id1 BIGINT, id2 BIGINT)");
assertQuery(session, "SELECT count(*) FROM \"test_table_metadatalog$metadata_log_entries\"", "VALUES 1");
Table icebergTable = loadTable("test_table_metadatalog");
TableMetadata tableMetadata = ((HasTableOperations) icebergTable).operations().current();
ZonedDateTime zonedDateTime1 = ZonedDateTime.ofInstant(Instant.ofEpochMilli(tableMetadata.lastUpdatedMillis()), ZoneId.of(zoneId));
String metadataFileLocation1 = "file:" + tableMetadata.metadataFileLocation();
assertUpdate(session, "INSERT INTO test_table_metadatalog VALUES (0, 00), (1, 10), (2, 20)", 3);
tableMetadata = ((HasTableOperations) icebergTable).operations().refresh();
ZonedDateTime zonedDateTime2 = ZonedDateTime.ofInstant(Instant.ofEpochMilli(tableMetadata.lastUpdatedMillis()), ZoneId.of(zoneId));
String metadataFileLocation2 = "file:" + tableMetadata.metadataFileLocation();
Snapshot latestSnapshot = tableMetadata.currentSnapshot();
MaterializedResult result = getQueryRunner().execute(session, "SELECT * FROM \"test_table_metadatalog$metadata_log_entries\"");
assertThat(result).hasSize(2);
assertThat(result)
.anySatisfy(row -> assertThat(row)
.isEqualTo(new MaterializedRow(MaterializedResult.DEFAULT_PRECISION, zonedDateTime1, metadataFileLocation1, null, null, null)))
.anySatisfy(row -> assertThat(row)
.isEqualTo(new MaterializedRow(MaterializedResult.DEFAULT_PRECISION, zonedDateTime2, metadataFileLocation2, latestSnapshot.snapshotId(), latestSnapshot.schemaId(), latestSnapshot.sequenceNumber())));
And test it under different zoneIds.
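As a rough illustration of what "different zoneIds" means for the expected values, the sketch below uses only java.time and no Presto or Iceberg classes. The class name, the list of zones, and the epoch value are assumptions chosen for illustration. It shows that the expected timestamp with time zone cell is the same instant rendered in each session zone, which is what zonedDateTime1 and zonedDateTime2 compute in the snippet above.

```java
import java.time.Instant;
import java.time.ZoneId;
import java.time.ZonedDateTime;
import java.util.Arrays;
import java.util.List;

final class ZoneExpectationSketch
{
    // Hypothetical set of zones to exercise; any valid zone IDs would do.
    static final List<String> ZONE_IDS = Arrays.asList("UTC", "Asia/Kolkata", "America/Los_Angeles");

    // Expected value for a given session zone: the same instant, rendered in that zone.
    static ZonedDateTime expectedTimestamp(long epochMillis, String zoneId)
    {
        return ZonedDateTime.ofInstant(Instant.ofEpochMilli(epochMillis), ZoneId.of(zoneId));
    }

    public static void main(String[] args)
    {
        long epochMillis = 0L; // arbitrary instant, for illustration only
        for (String zoneId : ZONE_IDS) {
            System.out.println(zoneId + " -> " + expectedTimestamp(epochMillis, zoneId));
        }
    }
}
```

In an actual test, the loop body would presumably create a session for each zone and run the assertions from the snippet above against it.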
One minor thing: I agree with @hantangwangd that we should make sure this works with proper TZ configuration. Otherwise LGTM.
@Override
public RecordCursor cursor(ConnectorTransactionHandle transactionHandle, ConnectorSession session, TupleDomain<Integer> constraint)
{
    InMemoryRecordSet.Builder table = InMemoryRecordSet.builder(COLUMNS);
Rather than using the builder, I would recommend using the public constructor and passing an iterator. This reduces memory pressure on the coordinator by streaming records rather than aggregating them all in memory at once. The overall footprint of this table shouldn't be too large, but an iterator-based approach to generating the records is not difficult to implement.
When generating records, you can use Java's Stream and map operations and call .iterator() at the end.
List<MetadataLogEntry> metadataLogEntries = metadata.previousFiles();

processMetadataLogEntries(table, session, metadataLogEntries);
addLatestMetadataEntry(table, session, metadata);
To add the latest entry, I think you can just use Stream.concat with Stream.of().
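A minimal sketch of the suggested shape, in plain Java with no Presto or Iceberg dependencies. LogEntry is a hypothetical stand-in for Iceberg's MetadataLogEntry, and the two-column row layout is simplified for illustration. It is not the PR's implementation. The rows are produced lazily as the iterator is consumed, with the current metadata file appended via Stream.concat and Stream.of.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.stream.Stream;

final class MetadataLogRowsSketch
{
    // Hypothetical stand-in for a metadata log entry (timestamp plus file location).
    static final class LogEntry
    {
        final long timestampMillis;
        final String file;

        LogEntry(long timestampMillis, String file)
        {
            this.timestampMillis = timestampMillis;
            this.file = file;
        }
    }

    // Previous entries first, then the current entry; each row is built
    // only when the iterator is advanced, not collected up front.
    static Iterator<List<Object>> rows(List<LogEntry> previous, LogEntry current)
    {
        return Stream.concat(previous.stream(), Stream.of(current))
                .map(entry -> Arrays.<Object>asList(entry.timestampMillis, entry.file))
                .iterator();
    }

    public static void main(String[] args)
    {
        Iterator<List<Object>> rows = rows(
                Arrays.asList(new LogEntry(1L, "00000.metadata.json")),
                new LogEntry(2L, "00001.metadata.json"));
        while (rows.hasNext()) {
            System.out.println(rows.next());
        }
    }
}
```

Whether the record set's constructor accepts an Iterator directly or needs an Iterable wrapper (for example, a lambda returning the iterator) depends on its signature, which this sketch does not assume.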
LGTM! (docs)
Pulled the branch and ran a local doc build; it looks good. Thank you for the documentation!
Description
Add Iceberg metadata table $metadata_log_entries
Motivation and Context
Add Iceberg metadata table $metadata_log_entries
This makes it possible to inspect metadata changes on an Iceberg table. See https://iceberg.apache.org/docs/latest/spark-queries/#metadata-log-entries
Impact
Iceberg Connector
Test Plan
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.