-
Notifications
You must be signed in to change notification settings - Fork 412
implement stageOnly Commit #2269
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
pyiceberg/table/update/snapshot.py
Outdated
| branch: str = MAIN_BRANCH, | ||
| stage_only: bool = False, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think API wise, this makes more sense. In the case branch is None then we don't set the ref, if it is not None then we set the ref.
| branch: str = MAIN_BRANCH, | |
| stage_only: bool = False, | |
| branch: Optional[str] = MAIN_BRANCH |
@kevinjqliu WDYT?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that make sense to me. Instead of setting stage_only=True, it would be branch=None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sense, thanks for taking a look!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looking at it a bit more, I think that's the way forward.
iceberg-python/pyiceberg/table/__init__.py
Lines 434 to 441 in 4b961f7
| def update_snapshot(self, snapshot_properties: Dict[str, str] = EMPTY_DICT, branch: Optional[str] = None) -> UpdateSnapshot: | |
| """Create a new UpdateSnapshot to produce a new snapshot for the table. | |
| Returns: | |
| A new UpdateSnapshot | |
| """ | |
| if branch is None: | |
| branch = MAIN_BRANCH |
We would change that into:
def update_snapshot(self, snapshot_properties: Dict[str, str] = EMPTY_DICT, branch: Optional[str] = MAIN_BRANCH) -> UpdateSnapshot:
"""Create a new UpdateSnapshot to produce a new snapshot for the table.
Returns:
A new UpdateSnapshot
"""There are a couple more places where we need to change the default to MAIN_BRANCH. Let me know what you think!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this makes sense.
I will take another deeper look to make sure it is backward compatible.
9bec165 to
884eca9
Compare
|
@stevie9868 Thanks for resuming the work on this. I did a quick check over the changes, and it looks good to me. Could you resolve the conflicts? For some reason, the CI did not trigger. |
dad5a1e to
0250e24
Compare
8e1a846 to
08dee72
Compare
|
Thanks for the quick review, and I have rebased my changes to resolve the conflicts. |
Fokko
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left one minor comment, but apart from that, this looks good to me! 🙌 Thanks for working on this @stevie9868
|
Thanks @stevie9868 |
Rationale for this change
In java, snapshotProducer can create stageOnly snapshot meaning only the snapshot is created but the snapshot is not set to a ref.
This is a prerequisite to support wap.id in py-iceberg
Are these changes tested?
Yes, tests are added.
Are there any user-facing changes?
By default, it will stay with the current existing behavior.