
RewriteTablePathUtil doesn't work with v3 deletes #13671

@nastra

Apache Iceberg version

1.9.2 (latest release)

Query engine

None

Please describe the bug 🐞

When running TestRewriteTablePathsAction with format-version=3, the tests fail with the following exception:

Caused by: java.lang.IllegalArgumentException: Content offset is required for DV
	at org.apache.iceberg.relocated.com.google.common.base.Preconditions.checkArgument(Preconditions.java:141)
	at org.apache.iceberg.FileMetadata$Builder.build(FileMetadata.java:271)
	at org.apache.iceberg.RewriteTablePathUtil.writeDeleteFileEntry(RewriteTablePathUtil.java:498)
	at org.apache.iceberg.RewriteTablePathUtil.lambda$rewriteDeleteManifest$5(RewriteTablePathUtil.java:436)
	at java.base/java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:197)
	at java.base/java.util.Iterator.forEachRemaining(Iterator.java:133)
	at java.base/java.util.Spliterators$IteratorSpliterator.forEachRemaining(Spliterators.java:1845)
	at java.base/java.util.stream.AbstractPipeline.copyInto(AbstractPipeline.java:509)
	at java.base/java.util.stream.AbstractPipeline.wrapAndCopyInto(AbstractPipeline.java:499)
	at java.base/java.util.stream.ReduceOps$ReduceOp.evaluateSequential(ReduceOps.java:921)
	at java.base/java.util.stream.AbstractPipeline.evaluate(AbstractPipeline.java:234)
	at java.base/java.util.stream.ReferencePipeline.reduce(ReferencePipeline.java:657)
	at org.apache.iceberg.RewriteTablePathUtil.rewriteDeleteManifest(RewriteTablePathUtil.java:444)
	at org.apache.iceberg.spark.actions.RewriteTablePathSparkAction.writeDeleteManifest(RewriteTablePathSparkAction.java:606)
	at org.apache.iceberg.spark.actions.RewriteTablePathSparkAction.lambda$toManifests$f0b4715c$1(RewriteTablePathSparkAction.java:547)

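This happens because the code in writeDeleteFileEntry at https://github.com/apache/iceberg/blob/main/core/src/main/java/org/apache/iceberg/RewriteTablePathUtil.java#L493-L498 doesn't carry over the DV (deletion vector) fields when it rebuilds the delete file entry, so FileMetadata.Builder.build() rejects the result.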

We need to carry over contentOffset / contentSizeInBytes / referencedDataFile (most likely in the copy() method). We also need to update TestRewriteTablePathsAction so that it runs against format versions 2 and above.
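
To illustrate the failure mode and the proposed direction, here is a minimal, self-contained sketch. It deliberately does not use the Iceberg classes: DeleteEntry, Builder, copyWithoutDvFields, swapPrefix and rewrite are hypothetical stand-ins invented for this example, and only the "Content offset is required for DV" message is taken from the stack trace above. Rewriting the referenced data file path with the same prefix is also an assumption of the sketch, not something confirmed for the actual fix.

```java
// Stand-in model only: every type and method here is hypothetical and is NOT the Iceberg API.
final class DvCopySketch {
  private DvCopySketch() {}

  /** Hypothetical, simplified view of a delete-file manifest entry. */
  static final class DeleteEntry {
    final String path;
    final boolean isDv;
    final String referencedDataFile; // null when not set
    final Long contentOffset;        // null when not set
    final Long contentSizeInBytes;   // null when not set

    DeleteEntry(String path, boolean isDv, String referencedDataFile,
                Long contentOffset, Long contentSizeInBytes) {
      this.path = path;
      this.isDv = isDv;
      this.referencedDataFile = referencedDataFile;
      this.contentOffset = contentOffset;
      this.contentSizeInBytes = contentSizeInBytes;
    }
  }

  /** Hypothetical builder whose build() enforces the DV invariants. */
  static final class Builder {
    private String path;
    private boolean isDv;
    private String referencedDataFile;
    private Long contentOffset;
    private Long contentSizeInBytes;

    // Copies only the generic fields, mirroring the lossy copy described in this issue.
    Builder copyWithoutDvFields(DeleteEntry source) {
      this.path = source.path;
      this.isDv = source.isDv;
      return this;
    }

    // Copies the generic fields plus the three DV-specific ones.
    Builder copy(DeleteEntry source) {
      copyWithoutDvFields(source);
      this.referencedDataFile = source.referencedDataFile;
      this.contentOffset = source.contentOffset;
      this.contentSizeInBytes = source.contentSizeInBytes;
      return this;
    }

    Builder withPath(String newPath) {
      this.path = newPath;
      return this;
    }

    Builder withReferencedDataFile(String newReferencedDataFile) {
      this.referencedDataFile = newReferencedDataFile;
      return this;
    }

    DeleteEntry build() {
      if (isDv) {
        require(contentOffset != null, "Content offset is required for DV");
        require(contentSizeInBytes != null, "Content size is required for DV");
        require(referencedDataFile != null, "Referenced data file is required for DV");
      }
      return new DeleteEntry(path, isDv, referencedDataFile, contentOffset, contentSizeInBytes);
    }

    private static void require(boolean condition, String message) {
      if (!condition) {
        throw new IllegalArgumentException(message);
      }
    }
  }

  // Replaces sourcePrefix at the start of path with targetPrefix.
  static String swapPrefix(String path, String sourcePrefix, String targetPrefix) {
    if (!path.startsWith(sourcePrefix)) {
      throw new IllegalArgumentException("Path " + path + " does not start with " + sourcePrefix);
    }
    return targetPrefix + path.substring(sourcePrefix.length());
  }

  // Rewrites one entry; the DV fields survive because copy() carries them over.
  static DeleteEntry rewrite(DeleteEntry source, String sourcePrefix, String targetPrefix) {
    Builder builder =
        new Builder().copy(source).withPath(swapPrefix(source.path, sourcePrefix, targetPrefix));
    if (source.referencedDataFile != null) {
      // Assumption of this sketch: the referenced data file moves under the same prefix.
      builder.withReferencedDataFile(
          swapPrefix(source.referencedDataFile, sourcePrefix, targetPrefix));
    }
    return builder.build();
  }
}
```

In this model, building from copyWithoutDvFields() on a DV entry throws the same IllegalArgumentException message as in the trace, while rewrite() succeeds because the offset, size and referenced data file are carried through the copy.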

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time

Metadata

Assignees

No one assigned

    Labels

    beginner: Issues for apache iceberg beginners, enjoy to contribute!
    bug: Something isn't working
    good first issue: Good for newcomers
