
rewrite_data_files generates java.lang.OutOfMemoryError #13674

@Chehai

Description

Apache Iceberg version

1.7.1

Query engine

Spark

Please describe the bug 🐞

Hello, I'd like some help with a compaction OOM in Spark. Calling rewrite_data_files on a relatively large partition in an AWS Glue 5.0 (Spark 3.5.4, Iceberg 1.7.1) job with 2 R.8X workers (256 GB each) consistently fails with java.lang.OutOfMemoryError. This looks similar to the closed issue #10054.

CALL system.rewrite_data_files(
  table => 'some_db.some_table',
  where => "(partition_id = 'some_partition_id')",
  strategy => 'binpack',
  options => map(
    'partial-progress.enabled', 'true',
    'rewrite-job-order', 'bytes-asc',
    'target-file-size-bytes', '134217728',
    'partial-progress.max-commits', '50',
    'partial-progress.max-failed-commits', '1000',
    'max-file-group-size-bytes', '536870912',
    'max-concurrent-file-group-rewrites', '1',
    'min-input-files', '10'
  )
)
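
For context, in a Glue job the procedure is usually issued through the SparkSession. A minimal sketch, assuming a session with the Iceberg catalog already configured (as a Glue 5.0 job with Iceberg enabled provides) and keeping the placeholder table and partition names from above:

import org.apache.spark.sql.SparkSession;

public class CompactPartition {
  public static void main(String[] args) {
    // Assumes catalog/warehouse settings are supplied by the Glue job config.
    SparkSession spark = SparkSession.builder()
        .appName("rewrite-data-files")
        .getOrCreate();

    // Same procedure call as above, trimmed to a couple of options.
    spark.sql("""
        CALL system.rewrite_data_files(
          table => 'some_db.some_table',
          where => "(partition_id = 'some_partition_id')",
          strategy => 'binpack',
          options => map(
            'partial-progress.enabled', 'true',
            'target-file-size-bytes', '134217728'
          )
        )
        """).show();
  }
}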

Partition:

partition=Row(Row(partition_id='some_partition_id'), spec_id=0, record_count=45984621, file_count=589, position_delete_record_count=5, position_delete_file_count=271, equality_delete_record_count=17, equality_delete_file_count=585)

Stacktrace:

WARN	2025-07-24T20:57:01,498	226477	org.apache.spark.scheduler.TaskSetManager	[task-result-getter-1]	72	Lost task 2.0 in stage 1.0 (TID 3) (172.34.121.38 executor 1): java.lang.OutOfMemoryError: Java heap space
	at java.base/java.util.HashMap.resize(HashMap.java:702)
	at java.base/java.util.HashMap.putVal(HashMap.java:661)
	at java.base/java.util.HashMap.put(HashMap.java:610)
	at java.base/java.util.HashSet.add(HashSet.java:221)
	at org.apache.iceberg.util.StructLikeSet.add(StructLikeSet.java:102)
	at org.apache.iceberg.util.StructLikeSet.add(StructLikeSet.java:32)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterators.addAll(Iterators.java:366)
	at org.apache.iceberg.relocated.com.google.common.collect.Iterables.addAll(Iterables.java:333)
	at org.apache.iceberg.data.BaseDeleteLoader.loadEqualityDeletes(BaseDeleteLoader.java:110)
	at org.apache.iceberg.data.DeleteFilter.applyEqDeletes(DeleteFilter.java:190)
	at org.apache.iceberg.data.DeleteFilter.eqDeletedRowFilter(DeleteFilter.java:220)
	at org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader$ColumnBatchLoader.applyEqDelete(ColumnarBatchReader.java:230)
	at org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader$ColumnBatchLoader.loadDataToColumnBatch(ColumnarBatchReader.java:104)
	at org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader.read(ColumnarBatchReader.java:72)
	at org.apache.iceberg.spark.data.vectorized.ColumnarBatchReader.read(ColumnarBatchReader.java:44)
	at org.apache.iceberg.parquet.VectorizedParquetReader$CachedFileIterator.next(VectorizedParquetReader.java:272)
	at org.apache.iceberg.spark.source.BaseReader.next(BaseReader.java:171)
	at org.apache.spark.sql.execution.datasources.v2.PartitionIterator.hasNext(DataSourceRDD.scala:120)
	at org.apache.spark.sql.execution.datasources.v2.MetricsIterator.hasNext(DataSourceRDD.scala:158)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.$anonfun$hasNext$1$adapted(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1$$Lambda$1336/0x00007f6230b92000.apply(Unknown Source)
	at scala.Option.exists(Option.scala:376)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.advanceToNextIter(DataSourceRDD.scala:97)
	at org.apache.spark.sql.execution.datasources.v2.DataSourceRDD$$anon$1.hasNext(DataSourceRDD.scala:63)
	at org.apache.spark.InterruptibleIterator.hasNext(InterruptibleIterator.scala:37)
	at scala.collection.Iterator$$anon$10.hasNext(Iterator.scala:460)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.columnartorow_nextBatch_0$(Unknown Source)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.processNext(Unknown Source)
	at org.apache.spark.sql.execution.BufferedRowIterator.hasNext(BufferedRowIterator.java:35)
	at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage1.hasNext(Unknown Source)
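
The trace shows the heap filling while BaseDeleteLoader copies equality delete records into an in-memory StructLikeSet (Iterables.addAll -> StructLikeSet.add -> HashMap.resize) that is then held for the scan task. A minimal sketch of that access pattern, using plain JDK collections and hypothetical stand-in types (DeleteKey, readDeleteFile) rather than Iceberg's real API:

import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: DeleteKey and readDeleteFile are hypothetical stand-ins
// for Iceberg's projected StructLike rows and its delete-file reader.
public class EqualityDeleteSketch {

  // Stands in for one projected row from an equality delete file.
  record DeleteKey(List<Object> fields) {}

  // Placeholder for the streaming reader that BaseDeleteLoader wraps.
  static Iterable<DeleteKey> readDeleteFile(String path) {
    return List.of();
  }

  // Mirrors the pattern in the trace: every equality delete record is drained
  // into a single in-memory set, so heap usage grows with the total delete
  // record count rather than being bounded per batch.
  static Set<DeleteKey> loadEqualityDeletes(List<String> deleteFiles) {
    Set<DeleteKey> deletes = new HashSet<>();
    for (String file : deleteFiles) {
      for (DeleteKey key : readDeleteFile(file)) {
        deletes.add(key); // the HashMap.resize frames in the trace fire here
      }
    }
    return deletes; // retained while the data files are filtered against it
  }
}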

Willingness to contribute

  • I can contribute a fix for this bug independently
  • I would be willing to contribute a fix for this bug with guidance from the Iceberg community
  • I cannot contribute a fix for this bug at this time
