
[Bug]: S3 paths with special characters are not escaped #10163

Open
c-thiel opened this issue Jan 5, 2025 · 7 comments

Comments

@c-thiel
Contributor

c-thiel commented Jan 5, 2025

What happened

Maybe related to #8524

In our deployment, a user created a namespace with a question mark in its name. Since then we cannot run GC anymore. I managed to reproduce it in our tests.
GC fails with:

Caused by: java.lang.RuntimeException: Failed to get manifest files for ICEBERG_TABLE tests.a.C?mpl Name .table, content-ID 0749c4e9-ac98-4c82-ad2f-1bc387f5045f at commit 4696d1290522386c8ee4db51c64c1fd3ed567351a0026a5fefb6154faa1629d2 via s3://wyrm-lakehouse-pytest/tests.a.C?mpl Name /table_2559f350-e7c7-4216-b29e-5033091a4d06/metadata/00000-6cc479b6-0667-4c01-93e8-d59ed46c4c4c.metadata.json
...
Caused by: org.apache.iceberg.exceptions.RuntimeIOException: Failed to open file: s3://wyrm-lakehouse-pytest/tests.a.C?mpl Name /table_2559f350-e7c7-4216-b29e-5033091a4d06/metadata/snap-6014480353781191283-1-fe723339-fd23-4db3-a534-95f1de851af9.avro

The file exists. It seems to me that the ? is not escaped, and thus the rest of the path is interpreted as a query argument.
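
To illustrate why an unescaped ? would break the path (a generic java.net.URI demo, not Nessie's actual code path; bucket and key are made up, and the space in the real name would additionally make this single-argument parse throw):

    import java.net.URI;

    public class UnescapedQuestionMark {
        public static void main(String[] args) {
            // java.net.URI treats an unescaped '?' as the start of the query
            // component, so everything after it stops being part of the key.
            URI uri = URI.create("s3://my-bucket/ns?mpl/table/metadata/snap.avro");
            System.out.println(uri.getPath());  // /ns   <- key truncated at '?'
            System.out.println(uri.getQuery()); // mpl/table/metadata/snap.avro
        }
    }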

Is there a flag to make GC altogether more robust? I don't want it to abort halfway through because of a single failing table.

How to reproduce it

  1. Create a namespace "a" with a subnamespace "C?mpl3 Name " (I used Spark)
  2. Create a table in it called "table"
  3. Drop the table
  4. Run Nessie gc

Nessie server type (docker/uber-jar/built from source) and version

docker

Client type (Ex: UI/Spark/pynessie ...) and version

None - Nessie gc

Additional information

Nessie 0.99.0

@dimas-b
Member

dimas-b commented Jan 6, 2025

Sorry, @c-thiel , I was not able to reproduce this problem with the current main code (local build) 🤔

Could you share your full exception stack trace and exact gc CLI arguments (minus secret stuff, of course)?

@c-thiel
Contributor Author

c-thiel commented Jan 8, 2025

Hm, strange.

Here is the callstack:

java.lang.RuntimeException: java.lang.RuntimeException: java.lang.RuntimeException: Failed to get manifest files for ICEBERG_TABLE tests.2023-10-13-101415-edd293e4-1eee-47d7-b964-171340e13ae9.a.C?mpl Name .table, content-ID 0749c4e9-ac98-4c82-ad2f-1bc387f5045f at commit f150347568bc34bfd27b611d6588a00bf472e7bd76c477b3b96864deddf3ed5d via s3://my-bucket/tests.2023-10-13-101415-edd293e4-1eee-47d7-b964-171340e13ae9.a.C?mpl Name /table_2559f350-e7c7-4216-b29e-5033091a4d06/metadata/00000-6cc479b6-0667-4c01-93e8-d59ed46c4c4c.metadata.json
        at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
        at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:540)
        at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567)
        at java.base/java.util.concurrent.ForkJoinTask.join(ForkJoinTask.java:653)
        at java.base/java.util.concurrent.ForkJoinPool.invoke(ForkJoinPool.java:2822)
        at org.projectnessie.gc.expire.local.DefaultLocalExpire.expire(DefaultLocalExpire.java:75)
        at org.projectnessie.gc.tool.cli.commands.BaseRepositoryCommand.expire(BaseRepositoryCommand.java:250)
        at org.projectnessie.gc.tool.cli.commands.MarkAndSweep.call(MarkAndSweep.java:57)
        at org.projectnessie.gc.tool.cli.commands.BaseRepositoryCommand.call(BaseRepositoryCommand.java:84)
        at org.projectnessie.gc.tool.cli.commands.BaseCommand.call(BaseCommand.java:26)
        at org.projectnessie.gc.tool.cli.commands.BaseCommand.call(BaseCommand.java:21)
        at picocli.CommandLine.executeUserObject(CommandLine.java:2045)
        at picocli.CommandLine.access$1500(CommandLine.java:148)
        at picocli.CommandLine$RunLast.executeUserObjectOfLastSubcommandWithSameParent(CommandLine.java:2465)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2457)
        at picocli.CommandLine$RunLast.handle(CommandLine.java:2419)
        at picocli.CommandLine$AbstractParseResultHandler.execute(CommandLine.java:2277)
        at picocli.CommandLine$RunLast.execute(CommandLine.java:2421)
        at picocli.CommandLine.execute(CommandLine.java:2174)
        at org.projectnessie.gc.tool.cli.CLI.runMain(CLI.java:104)
        at org.projectnessie.gc.tool.cli.CLI.runMain(CLI.java:75)
        at org.projectnessie.gc.tool.cli.CLI.main(CLI.java:70)
Caused by: java.lang.RuntimeException: java.lang.RuntimeException: Failed to get manifest files for ICEBERG_TABLE tests.2023-10-13-101415-edd293e4-1eee-47d7-b964-171340e13ae9.a.C?mpl Name .table, content-ID 0749c4e9-ac98-4c82-ad2f-1bc387f5045f at commit f150347568bc34bfd27b611d6588a00bf472e7bd76c477b3b96864deddf3ed5d via s3://my-bucket/tests.2023-10-13-101415-edd293e4-1eee-47d7-b964-171340e13ae9.a.C?mpl Name /table_2559f350-e7c7-4216-b29e-5033091a4d06/metadata/00000-6cc479b6-0667-4c01-93e8-d59ed46c4c4c.metadata.json
        at java.base/jdk.internal.reflect.DirectConstructorHandleAccessor.newInstance(DirectConstructorHandleAccessor.java:62)
        at java.base/java.lang.reflect.Constructor.newInstanceWithCaller(Constructor.java:502)
        at java.base/java.lang.reflect.Constructor.newInstance(Constructor.java:486)
        at java.base/java.util.concurrent.ForkJoinTask.getThrowableException(ForkJoinTask.java:540)
        at java.base/java.util.concurrent.ForkJoinTask.reportException(ForkJoinTask.java:567)
        at java.base/java.util.concurrent.ForkJoinTask.invoke(ForkJoinTask.java:670)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateParallel(ReduceOps.java:927)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:233)
        at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:657)
        at org.projectnessie.gc.expire.local.DefaultLocalExpire.expireInForkJoinPool(DefaultLocalExpire.java:96)
        at java.base/java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1456)
        at java.base/java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:387)
        at java.base/java.util.concurrent.ForkJoinPool$WorkQueue.topLevelExec(ForkJoinPool.java:1312)
        at java.base/java.util.concurrent.ForkJoinPool.scan(ForkJoinPool.java:1843)
        at java.base/java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1808)
        at java.base/java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:188)
Caused by: java.lang.RuntimeException: Failed to get manifest files for ICEBERG_TABLE tests.2023-10-13-101415-edd293e4-1eee-47d7-b964-171340e13ae9.a.C?mpl Name .table, content-ID 0749c4e9-ac98-4c82-ad2f-1bc387f5045f at commit f150347568bc34bfd27b611d6588a00bf472e7bd76c477b3b96864deddf3ed5d via s3://my-bucket/tests.2023-10-13-101415-edd293e4-1eee-47d7-b964-171340e13ae9.a.C?mpl Name /table_2559f350-e7c7-4216-b29e-5033091a4d06/metadata/00000-6cc479b6-0667-4c01-93e8-d59ed46c4c4c.metadata.json
        at org.projectnessie.gc.iceberg.IcebergContentToFiles.lambda$extractTableFiles$0(IcebergContentToFiles.java:170)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:273)
        at java.base/java.util.stream.Streams$StreamBuilderImpl.forEachRemaining(Streams.java:411)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.StreamSpliterators$WrappingSpliterator.forEachRemaining(StreamSpliterators.java:310)
        at java.base/java.util.stream.Streams$ConcatSpliterator.forEachRemaining(Streams.java:735)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ForEachOps$ForEachOp.evaluateSequential(ForEachOps.java:151)
        at java.base/java.util.stream.ForEachOps$ForEachOp$OfRef.evaluateSequential(ForEachOps.java:174)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.forEach(ReferencePipeline.java:596)
        at java.base/java.util.stream.ReferencePipeline$7$1.accept(ReferencePipeline.java:276)
        at java.base/java.util.HashMap$KeySpliterator.forEachRemaining(HashMap.java:1715)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
        at java.base/java.util.stream.ReduceOps$5.evaluateSequential(ReduceOps.java:258)
        at java.base/java.util.stream.ReduceOps$5.evaluateSequential(ReduceOps.java:248)
        at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
        at java.base/java.util.stream.ReferencePipeline.count(ReferencePipeline.java:709)
        at org.projectnessie.gc.expire.PerContentDeleteExpired.identifyLiveFiles(PerContentDeleteExpired.java:136)
        at org.projectnessie.gc.expire.PerContentDeleteExpired.expire(PerContentDeleteExpired.java:73)
        at org.projectnessie.gc.expire.local.DefaultLocalExpire.expireSingleContent(DefaultLocalExpire.java:109)
        at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
        at java.base/java.util.concurrent.ConcurrentHashMap$KeySpliterator.forEachRemaining(ConcurrentHashMap.java:3573)
        at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
        at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:960)
        at java.base/java.util.stream.ReduceOps$ReduceTask.doLeaf(ReduceOps.java:934)
        at java.base/java.util.stream.AbstractTask.compute(AbstractTask.java:327)
        at java.base/java.util.concurrent.CountedCompleter.exec(CountedCompleter.java:754)
        ... 5 more
Caused by: org.apache.iceberg.exceptions.RuntimeIOException: Failed to open file: s3://my-bucket/tests.2023-10-13-101415-edd293e4-1eee-47d7-b964-171340e13ae9.a.C?mpl Name /table_2559f350-e7c7-4216-b29e-5033091a4d06/metadata/snap-6014480353781191283-1-fe723339-fd23-4db3-a534-95f1de851af9.avro
        at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:104)
        at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:77)
        at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:37)
        at org.apache.iceberg.relocated.com.google.common.collect.Iterables.addAll(Iterables.java:333)
        at org.apache.iceberg.relocated.com.google.common.collect.Lists.newLinkedList(Lists.java:242)
        at org.apache.iceberg.ManifestLists.read(ManifestLists.java:45)
        at org.apache.iceberg.BaseSnapshot.cacheManifests(BaseSnapshot.java:146)
        at org.apache.iceberg.BaseSnapshot.allManifests(BaseSnapshot.java:164)
        at org.projectnessie.gc.iceberg.IcebergContentToFiles.allManifests(IcebergContentToFiles.java:268)
        at org.projectnessie.gc.iceberg.IcebergContentToFiles.allManifestsAndDataFiles(IcebergContentToFiles.java:254)
        at org.projectnessie.gc.iceberg.IcebergContentToFiles.lambda$extractTableFiles$0(IcebergContentToFiles.java:155)
        ... 37 more
Caused by: org.apache.avro.InvalidAvroMagicException: Not an Avro data file
        at org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:78)
        at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:102)
        ... 47 more

The gc command is:

        - gc
        - --uri
        - http://....:19120/api/v2
        - --inmemory
        - --default-cutoff
        - PT80H
        - --iceberg
        - $(S3_CREDENTIALS)

The credential has access to the file. I am on version 0.99; if main doesn't have the problem, we could just try an update and see whether it persists.

@dimas-b
Member

dimas-b commented Jan 8, 2025

@c-thiel : From the stack trace it looks like this is not a problem with S3 URI handling, but might be a problem with the file contents. Could you double-check in the S3 UI/CLI whether that Avro file is correct?
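
If you download the object (quoting the key carefully so the shell/CLI does not mangle the ? and space), a quick check is to look at its first four bytes: every Avro data file starts with the magic bytes 'O', 'b', 'j', 0x01, which is exactly what InvalidAvroMagicException complains about. A small hypothetical helper (not part of Nessie) for a local copy of the file:

    import java.io.InputStream;
    import java.nio.file.Files;
    import java.nio.file.Path;
    import java.util.Arrays;

    public class AvroMagicCheck {
        // Every Avro data file starts with the 4-byte magic 'O', 'b', 'j', 0x01.
        private static final byte[] AVRO_MAGIC = {'O', 'b', 'j', 1};

        public static void main(String[] args) throws Exception {
            // args[0]: path to the locally downloaded .avro object
            try (InputStream in = Files.newInputStream(Path.of(args[0]))) {
                byte[] head = in.readNBytes(4);
                System.out.println(Arrays.equals(head, AVRO_MAGIC)
                    ? "looks like an Avro data file"
                    : "NOT an Avro data file, first bytes: " + Arrays.toString(head));
            }
        }
    }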

@dimas-b
Member

dimas-b commented Jan 8, 2025

I tried Spark + AWS and created a top-level namespace with a ? and a space in the name.

Then Iceberg fails to insert on the Spark side.

spark-sql ()> create namespace `ns?test abc`;

then

use `ns?test abc`

then

spark-sql (`ns?test abc`)> create table t2 (n string)
                         > ;
Time taken: 1.609 seconds
spark-sql (`ns?test abc`)> insert into t2 values ('t2');
25/01/08 10:57:35 WARN SnapshotProducer: Failed to load committed table metadata or during cleanup, skipping further cleanup
org.apache.iceberg.exceptions.RuntimeIOException: Failed to open file: s3://EXAMPLE/ns?test abc/t2_11c36b10-a8c1-477a-8a91-b90733dad13f/metadata/snap-4983815506953660385-1-b650452a-4aba-45ae-b417-c9a3adc99462.avro
	at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:104)
	at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:77)
	at org.apache.iceberg.avro.AvroIterable.iterator(AvroIterable.java:37)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterables.addAll(Iterables.java:333)
	at org.apache.iceberg.relocated.com.google.common.collect.Lists.newLinkedList(Lists.java:258)
	at org.apache.iceberg.ManifestLists.read(ManifestLists.java:45)
	at org.apache.iceberg.BaseSnapshot.cacheManifests(BaseSnapshot.java:146)
	at org.apache.iceberg.BaseSnapshot.allManifests(BaseSnapshot.java:164)
	at org.apache.iceberg.SnapshotProducer.commit(SnapshotProducer.java:451)
	at org.apache.iceberg.spark.source.SparkWrite.commitOperation(SparkWrite.java:233)
	at org.apache.iceberg.spark.source.SparkWrite$BatchAppend.commit(SparkWrite.java:301)
	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2(WriteToDataSourceV2Exec.scala:399)
	at org.apache.spark.sql.execution.datasources.v2.V2TableWriteExec.writeWithV2$(WriteToDataSourceV2Exec.scala:359)
	at org.apache.spark.sql.execution.datasources.v2.AppendDataExec.writeWithV2(WriteToDataSourceV2Exec.scala:225)
	at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run(WriteToDataSourceV2Exec.scala:337)
	at org.apache.spark.sql.execution.datasources.v2.V2ExistingTableWriteExec.run$(WriteToDataSourceV2Exec.scala:336)
	at org.apache.spark.sql.execution.datasources.v2.AppendDataExec.run(WriteToDataSourceV2Exec.scala:225)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result$lzycompute(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.result(V2CommandExec.scala:43)
	at org.apache.spark.sql.execution.datasources.v2.V2CommandExec.executeCollect(V2CommandExec.scala:49)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.$anonfun$applyOrElse$1(QueryExecution.scala:107)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$6(SQLExecution.scala:125)
	at org.apache.spark.sql.execution.SQLExecution$.withSQLConfPropagated(SQLExecution.scala:201)
	at org.apache.spark.sql.execution.SQLExecution$.$anonfun$withNewExecutionId$1(SQLExecution.scala:108)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.execution.SQLExecution$.withNewExecutionId(SQLExecution.scala:66)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:107)
	at org.apache.spark.sql.execution.QueryExecution$$anonfun$eagerlyExecuteCommands$1.applyOrElse(QueryExecution.scala:98)
	at org.apache.spark.sql.catalyst.trees.TreeNode.$anonfun$transformDownWithPruning$1(TreeNode.scala:461)
	at org.apache.spark.sql.catalyst.trees.CurrentOrigin$.withOrigin(origin.scala:76)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDownWithPruning(TreeNode.scala:461)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.org$apache$spark$sql$catalyst$plans$logical$AnalysisHelper$$super$transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning(AnalysisHelper.scala:267)
	at org.apache.spark.sql.catalyst.plans.logical.AnalysisHelper.transformDownWithPruning$(AnalysisHelper.scala:263)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.plans.logical.LogicalPlan.transformDownWithPruning(LogicalPlan.scala:32)
	at org.apache.spark.sql.catalyst.trees.TreeNode.transformDown(TreeNode.scala:437)
	at org.apache.spark.sql.execution.QueryExecution.eagerlyExecuteCommands(QueryExecution.scala:98)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted$lzycompute(QueryExecution.scala:85)
	at org.apache.spark.sql.execution.QueryExecution.commandExecuted(QueryExecution.scala:83)
	at org.apache.spark.sql.Dataset.<init>(Dataset.scala:220)
	at org.apache.spark.sql.Dataset$.$anonfun$ofRows$2(Dataset.scala:100)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.Dataset$.ofRows(Dataset.scala:97)
	at org.apache.spark.sql.SparkSession.$anonfun$sql$4(SparkSession.scala:691)
	at org.apache.spark.sql.SparkSession.withActive(SparkSession.scala:900)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:682)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:713)
	at org.apache.spark.sql.SparkSession.sql(SparkSession.scala:744)
	at org.apache.spark.sql.SQLContext.sql(SQLContext.scala:651)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLDriver.run(SparkSQLDriver.scala:68)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processCmd(SparkSQLCLIDriver.scala:501)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1(SparkSQLCLIDriver.scala:619)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.$anonfun$processLine$1$adapted(SparkSQLCLIDriver.scala:613)
	at scala.collection.Iterator.foreach(Iterator.scala:943)
	at scala.collection.Iterator.foreach$(Iterator.scala:943)
	at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
	at scala.collection.IterableLike.foreach(IterableLike.scala:74)
	at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
	at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.processLine(SparkSQLCLIDriver.scala:613)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver$.main(SparkSQLCLIDriver.scala:310)
	at org.apache.spark.sql.hive.thriftserver.SparkSQLCLIDriver.main(SparkSQLCLIDriver.scala)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at java.base/jdk.internal.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:75)
	at java.base/jdk.internal.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:52)
	at java.base/java.lang.reflect.Method.invoke(Method.java:580)
	at org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
	at org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1029)
	at org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:194)
	at org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:217)
	at org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:91)
	at org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1120)
	at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1129)
	at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: org.apache.iceberg.shaded.org.apache.avro.InvalidAvroMagicException: Not an Avro data file
	at org.apache.iceberg.shaded.org.apache.avro.file.DataFileReader.openReader(DataFileReader.java:79)
	at org.apache.iceberg.avro.AvroIterable.newFileReader(AvroIterable.java:102)
	... 74 more
Time taken: 2.957 seconds

So, my guess is that it's a bug in Iceberg code 🤷
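
For what it's worth, plain java.net.URI shows the two behaviors side by side: the single-argument form re-interprets ? as the query delimiter (as sketched above), while the multi-argument constructors percent-encode characters that are illegal in a path. Whether Iceberg's S3 path handling hits the former is only my guess; this is a generic illustration reusing the EXAMPLE bucket from the trace:

    import java.net.URI;
    import java.net.URISyntaxException;

    public class EscapeDemo {
        public static void main(String[] args) throws URISyntaxException {
            // The multi-argument constructor quotes illegal path characters,
            // so '?' and ' ' survive as %3F and %20 instead of being re-parsed.
            URI escaped = new URI("s3", "EXAMPLE", "/ns?test abc/t2/metadata/snap.avro", null);
            System.out.println(escaped.toASCIIString());
            // s3://EXAMPLE/ns%3Ftest%20abc/t2/metadata/snap.avro
        }
    }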

@dimas-b
Member

dimas-b commented Jan 8, 2025

@c-thiel : Does this use case work with a JDBC Iceberg catalog?

@c-thiel
Contributor Author

c-thiel commented Jan 8, 2025

@c-thiel : Does this use case work with a JDBC Iceberg catalog?

I haven't actually tried. Thanks for investigating, @dimas-b.

I don't really care much about this specific use case, but if someone manages to create a broken table (a very small fraction), it would be great if Nessie continued with GC instead of aborting. I'm not really sure how to get GC running normally again without touching MongoDB.

@dimas-b
Member

dimas-b commented Jan 8, 2025

Nessie GC uses the Iceberg jars for metadata access, very similarly to how Spark uses them.

So it is conceivable that, for example, Python might be able to write something that Iceberg Java would not be able to read 🤷
