Spark writers should set the `sort_order_id` data file entry in manifests to the write ordering

### Apache Iceberg version

1.9.0

### Query engine

Spark

### Please describe the bug 🐞

When writing files with Spark (I'm using 3.5.6) and Iceberg 1.9.0 to a table that declares a sort order, I would expect the `sort_order_id` field to be set for the written data files in the manifests ([Data File Entry Manifest Spec](https://iceberg.apache.org/spec/#data-file-fields)) whenever Spark writes in an ordered manner, e.g. when it is not using the fanout writer. Honestly, I would expect this even when the table doesn't declare a sort order. Currently, this field is never set when writing data files with Spark.

[Per the Iceberg Table Spec on Sorting](https://iceberg.apache.org/spec/#sorting):
> A data or delete file is associated with a sort order by the sort order's id within [a manifest](https://iceberg.apache.org/spec/#manifests). Therefore, the table must declare all the sort orders for lookup. A table could also be configured with a default sort order id, indicating how the new data should be sorted by default. Writers should use this default sort order to sort the data on write, but are not required to if the default order is prohibitively expensive, as it would be for streaming writes.
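
To make the expected writer behavior concrete, here is a minimal sketch in plain Python of how a writer could pick the id to record. It is not Iceberg or Spark code: `SortOrder`, `SortField`, and `sort_order_id_for_write` are hypothetical names for illustration, sort fields are simplified to (column, direction, null order) without transforms, and requiring an exact match against a declared order is an assumption. The only spec fact relied on is that order id 0 is reserved for the unsorted order.

```python
from dataclasses import dataclass
from typing import Sequence, Tuple

# The table spec reserves sort order id 0 for the unsorted order.
UNSORTED_ORDER_ID = 0

# Hypothetical, simplified sort field: (column, direction, null order).
# Real sort fields also carry a transform, which this sketch ignores.
SortField = Tuple[str, str, str]


@dataclass(frozen=True)
class SortOrder:
    """Hypothetical stand-in for a sort order declared in table metadata."""
    order_id: int
    fields: Tuple[SortField, ...]


def sort_order_id_for_write(
    table_orders: Sequence[SortOrder],
    write_ordering: Sequence[SortField],
) -> int:
    """Return the id of the declared order that exactly matches the ordering
    the writer applied, or the unsorted id when nothing matches."""
    wanted = tuple(write_ordering)
    if not wanted:
        # The writer did not sort (e.g. a fanout-style write).
        return UNSORTED_ORDER_ID
    for order in table_orders:
        if order.order_id != UNSORTED_ORDER_ID and order.fields == wanted:
            return order.order_id
    # The ordering is not one the table declares, so it cannot be looked up.
    return UNSORTED_ORDER_ID
```

With this shape, a sorted write to a table whose default order is `ts ASC NULLS FIRST` would record that order's id on every data file it produces, while an unsorted write would keep recording the unsorted id.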

I realize that this is an optional field, so it's not required to be set. However, setting it could unlock performance optimizations in the future. For example, I have a feature in an Iceberg fork that I'd love to contribute after this one: it reports file ordering to Spark during scans by implementing the `SupportsReportOrdering` interface, which lets the query optimizer eliminate redundant sorts.
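
As a rough illustration of why the field matters on the read side, the sketch below shows the kind of check a scan could make before reporting an ordering. It is plain Python, not the fork's implementation: `reportable_sort_order_id` is a hypothetical helper, and it assumes a group of files is only reportable when every file carries the same non-zero `sort_order_id`. It also glosses over the fact that files sorted individually are not globally sorted when concatenated, so a real implementation would need to account for how files are grouped into tasks.

```python
from typing import Iterable, Optional

# The table spec reserves sort order id 0 for the unsorted order.
UNSORTED_ORDER_ID = 0


def reportable_sort_order_id(
    file_sort_order_ids: Iterable[Optional[int]],
) -> Optional[int]:
    """Return the single sort order id shared by all files, or None when the
    files are unsorted, have no id recorded, or disagree with each other."""
    ids = set(file_sort_order_ids)
    if len(ids) != 1:
        # No files at all, or a mix of different orders.
        return None
    (only_id,) = ids
    if only_id is None or only_id == UNSORTED_ORDER_ID:
        # A missing or zero id means the order is treated as unsorted.
        return None
    return only_id
```

Because Spark never sets `sort_order_id` today, every file would carry a missing id and a check like this could never succeed, which is why the writer-side fix has to come first.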

### Willingness to contribute

- [x] I can contribute a fix for this bug independently
- [x] I would be willing to contribute a fix for this bug with guidance from the Iceberg community
- [ ] I cannot contribute a fix for this bug at this time

Spark writers should set the `sort_order_id` data file entry in manifests to the write ordering #13634
