
Fix CTAS for non-hdfs storages, also fixes multi storage cases #256

Open · wants to merge 5 commits into base: main
Conversation

@vikrambohra (Collaborator) commented Nov 19, 2024

Summary

This PR introduces the following changes:

  1. Fix CTAS for non-HDFS storage types
    While extracting the UUID from a snapshot, the code constructs the database path without the endpoint (scheme) when checking whether it is a prefix of the manifestList that is part of the snapshot.

Example
ManifestList (from snapshot): s3://bucket-name/database/table-uuid/file.avro
Database prefix: bucket-name/database (not a prefix of the above)

Fix: Strip the endpoint (scheme) from the manifest list by resolving the correct storage from the tableLocation

After the fix
ManifestList stripped: bucket-name/database/table-uuid/file.avro
Database prefix: bucket-name/database (is a prefix of the above)

  2. Fix the multiple-storage scenario
    The code assumes that the storage is always the cluster default. This fails when the default is a storage without a scheme (HDFS) and the db.table is stored in a storage with a scheme (S3, BlobFS).

Fix: Resolve the correct storage by extracting the tableLocation from the table properties and checking the scheme (endpoint).

  3. Adds a method to the Storage interface to check whether the tableLocation exists.
  4. Adds a method to StorageClient to check whether a specified path exists.
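The scheme-stripping step described in item 1 can be sketched with plain `java.net.URI`; the helper name below is hypothetical, not the PR's actual code:

```java
import java.net.URI;

public class SchemeUtil {
    // Hypothetical helper: strips the scheme (endpoint) from a storage path so it
    // can be compared against a scheme-less database prefix. Paths without a
    // scheme (HDFS/local) are returned unchanged.
    public static String stripScheme(String path) {
        URI uri = URI.create(path);
        if (uri.getScheme() == null) {
            return path; // already scheme-less, e.g. /data/openhouse/db
        }
        // s3://bucket-name/database/... -> bucket-name/database/...
        return uri.getAuthority() + uri.getPath();
    }
}
```

With this, the stripped manifestList location starts with the database prefix again, so the UUID extraction's prefix check passes for S3 and BlobFS locations as well as HDFS ones.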

Changes

  • Client-facing API Changes
  • Internal API Changes
  • Bug Fixes
  • New Features
  • Performance Improvements
  • Code Style
  • Refactoring
  • Documentation
  • Tests

For all the boxes checked, please include additional details of the changes made in this pull request.

Testing Done

  • Manually Tested on local docker setup. Please include commands ran, and their output.
  • Added new tests for the changes made.
  • Updated existing tests to reflect the changes made.
  • No tests added or updated. Please explain why. If unsure, please feel free to ask for help.
  • Some other form of testing like staging or soak time in production. Please explain.
  1. Updated TableUUIDGeneratorTest
  2. Added TableUUIDGeneratorMultiStorageTest

For all the boxes checked, include a detailed description of the testing done for the changes made in this pull request.

Additional Information

  • Breaking Changes
  • Deprecations
  • Large PR broken into smaller PRs, and PR plan linked in the description.

For all the boxes checked, include additional details of the changes made in this pull request.

@HotSushi (Collaborator) left a comment:

Agree with Storage API changes. One feedback on not introducing iceberg at rest layer.

public boolean pathExists(String path) {
try {
return fs.exists(new Path(path));
} catch (IOException e) {
Collaborator:

does fs.exists throw an IOException for non-existing paths? That's weird.

Collaborator:

Looks like it returns true or false: https://github.com/apache/hadoop/blob/cd2cffe73f909a106ba47653acf525220f2665cf/hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/fs/FileSystem.java#L1860

But could throw an IOException if getFileStatus() throws an IOException.

I am wondering if it could throw an IOException in scenarios like HDFS is unavailable but the file exists?

storageManager, dbIdFromProps, tblIdFromProps, tableUUIDProperty);
if (TableType.REPLICA_TABLE != tableType && !doesPathExist(previousPath)) {
log.error("Previous tableLocation: {} doesn't exist", previousPath);
Storage storage = catalog.resolveStorage(TableIdentifier.of(dbIdFromProps, tblIdFromProps));
Collaborator:

resolveStorage will use selectStorage if the entry doesn't exist in HTS. That code is not guaranteed to be idempotent during migration; let's avoid using selectStorage again in the consecutive COMMIT/PutSnapshot call.

Can we instead use tableLocation and derive the storage from there? i.e.:
if s3://a/b/c -> S3Storage
if abs://a/b/c -> ABSStorage
if a/b/c -> HDFSStorage
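The scheme-to-storage mapping suggested above could look roughly like this; all names here are illustrative, not the actual OpenHouse classes:

```java
import java.net.URI;

public class StorageResolver {
    enum StorageType { S3, ABS, HDFS }

    // Sketch of the suggested approach: derive the storage type purely from the
    // tableLocation's scheme instead of calling selectStorage again, so repeated
    // COMMIT/PutSnapshot calls stay idempotent.
    public static StorageType fromLocation(String tableLocation) {
        String scheme = URI.create(tableLocation).getScheme();
        if (scheme == null) {
            return StorageType.HDFS; // scheme-less paths are assumed to be HDFS
        }
        switch (scheme) {
            case "s3":  return StorageType.S3;
            case "abs": return StorageType.ABS;
            default:
                throw new IllegalArgumentException("Unknown scheme: " + scheme);
        }
    }
}
```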

Collaborator (Author):

Done, please check. There is a catch though: local and HDFS storage have the same paths.


@Override
public boolean pathExists(String path) {
// TODO: Support pathExists on ADLS

Collaborator (Author):

Can we do this as part of separate PR? DLFC is not initialized: #148

Collaborator:

Sure

Collaborator:

nit: should we then throw a not-implemented exception instead?

Collaborator (Author):

UnsupportedOperationException is what we usually throw in such cases. I've rephrased the exception message to mention it's not implemented yet.

@@ -55,4 +56,11 @@ public S3Client getNativeClient() {
public StorageType.Type getStorageType() {
return S3_TYPE;
}

@Override
public boolean pathExists(String path) {
Collaborator:

What is the intended usage of this existence check?

  1. Do you want to check whether the bucket exists or not?
  2. Or, do you want to check whether any blob for the table has been created yet or not?

The two checks above are very different. For 1 you should check the existence of the bucket. For 2 you should check the existence of objects with a prefix, the way you are doing it now.

I would imagine that you intend to check that the bucket or root location has been created, hence 1 and not 2. Is that right?

Collaborator (Author):

The intention is to check whether the directory exists, so it is check 2.
The current code passes the absolute path up to the table directory:
-> blobfs://bucket_name/data/openhouse/db/t-uuid (blobfs)
-> s3://bucket_name/data/openhouse/db/t-uuid (s3)
-> /data/openhouse/db/t-uuid (hdfs)

Collaborator (Author):

This is no longer valid. See #256 (comment)

@vikrambohra (Author) commented Nov 22, 2024

@HotSushi @jainlavina Addressed the comments. Some changes in the latest commit:

  1. Removed tableLocationExists() from the Storage API; we don't need to construct the table location since we fetch it from the table properties.
  2. Changed pathExists in the StorageClient API to fileExists to be explicit about the check, since we now check the absolute path of the metadata.json file (table location) instead of only the path up to the table directory.

* @param path absolute path to a file including scheme
* @return true if path exists else false
*/
boolean fileExists(String path);
Collaborator:

Suggested change
boolean fileExists(String path);
boolean exists(Path path);

* @return true if path exists else false
*/
@Override
public boolean fileExists(String path) {
Collaborator:

this could be a default impl and go into BaseStorageClient

Collaborator (Author):

Let's keep BaseStorageClient FileSystem-agnostic.

* scheme. Scheme is not prefix for local and hdfs storage. See:
* https://github.com/linkedin/openhouse/issues/121
*
* @param path absolute path to a file including scheme
Collaborator:

So this comment is not always true, because the scheme may not have been passed along for HDFS files?
In that case, maybe clarify that the path may or may not have a scheme specified. If the scheme is not specified, it is assumed to be HDFS; for all other storage types, the scheme must be specified.

Collaborator (Author):

Updated

public Storage getStorageFromPath(String path) {
for (Storage storage : storages) {
if (storage.isConfigured()) {
if (StorageType.LOCAL.equals(storage.getType())) {
Collaborator:

Shouldn't this be a fallback only if the path does not start with any other configured endpoint?
What if the path is an S3 storage path, but local storage is also configured for some other tables?
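A minimal sketch of the ordering this comment is asking for, with hypothetical names rather than OpenHouse's real API: every scheme-bearing endpoint is tried first, and the scheme-less storage is only a last-resort fallback:

```java
import java.util.Map;

public class PathStorageMatcher {
    // Illustrative sketch: endpointToType maps a configured endpoint prefix
    // (e.g. "s3://bucket") to a storage type name. The scheme-less LOCAL
    // storage is chosen only when no configured endpoint is a prefix of path.
    public static String resolve(String path, Map<String, String> endpointToType) {
        for (Map.Entry<String, String> entry : endpointToType.entrySet()) {
            if (!entry.getKey().isEmpty() && path.startsWith(entry.getKey())) {
                return entry.getValue(); // e.g. "s3://bucket" matched -> "S3"
            }
        }
        return "LOCAL"; // fallback only after all endpoints were checked
    }
}
```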


@@ -48,4 +49,20 @@ public FileSystem getNativeClient() {
public StorageType.Type getStorageType() {
return HDFS_TYPE;
}

/**
* Checks if the path exists on the backend storage. Scheme is not prefix for local and hdfs
Collaborator:

nit: I am having a hard time parsing the language "scheme is not prefix". Would you please consider rephrasing it to something like "Scheme is not specified in the path for local and hdfs storage"?

public boolean fileExists(String path) {
try {
HeadObjectRequest headObjectRequest =
HeadObjectRequest.builder().bucket(getRootPrefix()).key(path).build();
Collaborator:

Did you check the HeadObject API spec to confirm that passing the full path, including the scheme and bucket name, as the key will work? It usually expects the object key without the bucket name.
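For reference, a full s3:// location can be split into the separate bucket and key that HeadObject expects; this parsing helper is a sketch, not the PR's code:

```java
import java.net.URI;

public class S3PathParser {
    // Sketch: split "s3://bucket-name/db/t-uuid/metadata.json" into
    // { "bucket-name", "db/t-uuid/metadata.json" }. S3 object keys carry
    // neither the scheme, the bucket name, nor a leading slash.
    public static String[] parse(String s3Location) {
        URI uri = URI.create(s3Location);
        if (!"s3".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Not an s3:// location: " + s3Location);
        }
        String bucket = uri.getAuthority();
        String key = uri.getPath().replaceFirst("^/", "");
        return new String[] {bucket, key};
    }
}
```

With the AWS SDK v2 builder shown in the snippet above, the two parts would then go into `.bucket(bucket)` and `.key(key)` separately.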
