Skip to content

Conversation

@Fokko
Copy link
Contributor

@Fokko Fokko commented Sep 30, 2025

Rationale for this change

Fixed the tests in #1500, but kept the commits so @JasperHG90 gets all the credits for the great work!

Closes #1500
Closes #1245

Are these changes tested?

Are there any user-facing changes?

@Fokko Fokko requested a review from kevinjqliu September 30, 2025 19:20
Comment on lines +89 to +109
def asc(
self, source_column_name: str, transform: Transform[Any, Any], null_order: NullOrder = NullOrder.NULLS_LAST
) -> UpdateSortOrder:
"""Add a sort field with ascending order."""
return self._add_sort_field(
source_id=self._column_name_to_id(source_column_name),
transform=transform,
direction=SortDirection.ASC,
null_order=null_order,
)

def desc(
self, source_column_name: str, transform: Transform[Any, Any], null_order: NullOrder = NullOrder.NULLS_LAST
) -> UpdateSortOrder:
"""Add a sort field with descending order."""
return self._add_sort_field(
source_id=self._column_name_to_id(source_column_name),
transform=transform,
direction=SortDirection.DESC,
null_order=null_order,
)
Copy link
Contributor

@jayceslesar jayceslesar Sep 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have these be separate methods instead of an argument/parameter for a sort field?

I guess I would rather just do

table.update_sort_order([SortField1, SortField2, ...])

Ideally whatever abstraction in the above syntax also takes care of the _column_name_to_id conversion

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think its done this way to provider a "builder" for the desired sort order, and to check against previously set sort orders

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it definitely is cleaner than making some static method of the sort idea need to be aware of the schema of the table to map field ids

Comment on lines +1276 to +1278

Sort orders can only be updated by adding a new sort order. They cannot be deleted or modified.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the name of the method should be set_sort_order then, as update_sort_order possibly implies that it could be a modification?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think either is fine. theres a updateSortOrderSchema function in java https://grep.app/search?f.repo=apache%2Ficeberg&q=updatesort

)
def test_update_sort_order(catalog: Catalog, format_version: str, table_schema_simple: Schema) -> None:
simple_table = _simple_table(catalog, table_schema_simple, format_version)
simple_table.update_sort_order().asc("foo", IdentityTransform(), NullOrder.NULLS_FIRST).desc(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should use context manager here? to test the auto commit path

@Fokko
Copy link
Contributor Author

Fokko commented Oct 2, 2025

Maybe we should get this in, to credit @JasperHG90's work, and you can polish the remaining item :)

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

the update_sort_order api is used to build the desired sort order. if it matches a previous sort order, the previous sort order id will be reused. otherwise, we'll create a new sort order with an incremented id

@kevinjqliu kevinjqliu merged commit 7e5cd32 into apache:main Oct 3, 2025
10 checks passed
@kevinjqliu
Copy link
Contributor

Thank you @JasperHG90 for working on this! Appreciate all the efforts here :)
Thanks @gabeiglio @jayceslesar and @Fokko for the review.

This pr sets a great foundation, we can continue to refine it. I personally want to see an integration with our spark tests

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[feat] Support update table's sort order

5 participants