Skip to content

Conversation

@ox
Copy link
Contributor

@ox ox commented Aug 22, 2025

Closes #2342

Rationale for this change

The last attempt at fixing this (#2343) did not actually end up serializing None values to null. This was b/c the AssertRefSnapshotId requirement was often a child object of some other Request object, which was actually getting the model_dump_json() call. This meant that the exclude_none=True override would remove any Nones before any of the Field-specific configurations would be considered (like exclude=False or the model's own model_dump_json method).

I then investigated setting some ConfigDict settings on the model, but that also did not have any useful configurations.

I looked into setting exclude_none=True at all of the model_json_dump() callsites but there were too many and it would be easy to forget to set that value every time. The issue is that many values are supposed to be absent to instruct the REST Catalog about what the client wants (things like filtering, for instance).

The solution I ended up with was allowing snapshot_id to be a required int with a default value of -1, and adding a json-specific field serializer to write out null when dumping to json. This seems to work

The solution that worked and allowed for snapshot_id to be set to None (as it's done in Ray Data), was to define a model_serializer method and always return the fields. This was based on conversations in pydantic/pydantic#7326.

Are these changes tested?

Yes

Are there any user-facing changes?

Actually allow snapshot-id to be null for assert-ref-snapshot-id requirements

@ox ox marked this pull request as ready for review August 22, 2025 21:33
Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! slight nit on making a few changes for maintainability :)

Comment on lines 732 to 736
return {
"type": self.type,
"ref": self.ref,
"snapshot-id": self.snapshot_id,
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: is there a way to call super() or the default serializer? we just want to explicitly override for snapshot-id.

otherwise, we'd have to remember to change to this function everytime we add a new variable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes! Added the mode='wrap' param that allows that

Copy link
Contributor

@kevinjqliu kevinjqliu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! very clean, very nice

Copy link
Contributor

@Fokko Fokko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for following up here @ox 🙌

@kevinjqliu kevinjqliu merged commit 4f02298 into apache:main Aug 25, 2025
10 checks passed
@kevinjqliu
Copy link
Contributor

Thanks for the follow up PR @ox and thanks @Fokko for the review

@ox ox deleted the allow-null-snapshot-id branch August 26, 2025 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

missing snapshot-id for assert-ref-snapshot-id requirement marshals to invalid JSON

3 participants