Add BigQuery Metastore Catalog #2068
@rambleraptor user here. I tried BigQuery Metastore recently, when it was announced that it had added a REST API interface. I'm just wondering: why do you need to add dedicated support to pyiceberg if it already uses a REST API?
Edit: looks like that is supported via
are special credentials needed to run this in CI?
For the integration tests, yes. I'm unclear how credentials work on the current AWS test, so I could use some pointers on that. If we need real cloud resources to run these integration tests, they should be owned by Iceberg.
Unlike AWS + mock_aws, GCP doesn't have a full mock implementation that we can use.
talatuyarer
left a comment
@rambleraptor I dropped a few comments. Overall, I tested this catalog in my local environment and it works as expected. LGTM. Please address the comments before merging.
We merged a lot of library updates this morning. Could you rebase the PR?
@kevinjqliu rebased!
Thanks for the PR, @rambleraptor. I see that the BigQuery Metastore catalog was merged on the Java side (apache/iceberg#12808).
Since the Java side has already been merged, I think it makes sense to move this forward here as well. From @talatuyarer I understand that BigQuery Metastore doesn't have a REST endpoint (yet), in contrast to BigLake (confusing naming :)
@Fokko @kevinjqliu yeah, that's the perfect description of the situation. Our product naming is...less than ideal.
Thank you @Fokko and @kevinjqliu. Yes, there are two independent products. BQ does not support the REST catalog. BigQuery Metastore, which we implement in this PR, has been a GA product since this January: https://cloud.google.com/blog/products/data-analytics/introducing-bigquery-metastore-fully-managed-metadata-service
I added a pytest.mark.skipif to ensure that integration tests are not run if credentials are not present. I also filed #2368 to get the integration tests running + clean up the GCP integration tests in general.
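For readers unfamiliar with the pattern, here is a minimal sketch of gating an integration test on credentials with `pytest.mark.skipif`. It is not the code from this PR: the helper `gcp_credentials_present`, the marker name `requires_gcp`, the test name, and the choice of the `GOOGLE_APPLICATION_CREDENTIALS` environment variable as the signal are all assumptions made for illustration.

```python
import os
from typing import Mapping

import pytest


def gcp_credentials_present(environ: Mapping[str, str] = os.environ) -> bool:
    """Return True when a non-empty credentials path is configured.

    Assumption: credentials are signalled via GOOGLE_APPLICATION_CREDENTIALS.
    The PR may check something else.
    """
    return bool(environ.get("GOOGLE_APPLICATION_CREDENTIALS"))


# Reusable marker: the condition is evaluated once, at import/collection time.
requires_gcp = pytest.mark.skipif(
    not gcp_credentials_present(),
    reason="GCP credentials not configured; skipping BigQuery integration tests",
)


@requires_gcp
def test_bigquery_integration_placeholder() -> None:
    # Hypothetical test body; a real test would talk to BigQuery here.
    assert gcp_credentials_present()
```

Defining the marker once and reusing it keeps the skip condition in a single place, so every integration test is skipped (rather than failed) on machines and CI runners without credentials.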
It looks like some tests are still running:
Tests should be fixed! |
Rationale for this change
This PR brings BigQuery Metastore support to Python after it was merged into the Java implementation.
This allows Iceberg catalog functionality to be backed by BigQuery. It supports creating/deleting/listing namespaces (datasets in BigQuery terminology), creating/deleting/listing tables, and registering tables.
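To make the list of supported operations concrete, here is a toy, in-memory sketch of the same lifecycle: create a namespace (a dataset in BigQuery terms), register a table, list both, then drop them. `InMemoryCatalog` and `exercise_catalog` are hypothetical stand-ins written for this illustration, not the class added by this PR. The method names follow the general shape of PyIceberg's catalog interface, but the signatures are simplified, and the namespace, table, and `gs://` path are made up.

```python
from typing import Dict, List, Tuple


class InMemoryCatalog:
    """Hypothetical stand-in for a catalog; nothing here calls BigQuery."""

    def __init__(self) -> None:
        # namespace name -> {table name -> metadata file location}
        self._namespaces: Dict[str, Dict[str, str]] = {}

    def create_namespace(self, namespace: str) -> None:
        if namespace in self._namespaces:
            raise ValueError(f"namespace already exists: {namespace}")
        self._namespaces[namespace] = {}

    def list_namespaces(self) -> List[Tuple[str, ...]]:
        return [(name,) for name in sorted(self._namespaces)]

    def register_table(self, identifier: Tuple[str, str], metadata_location: str) -> None:
        # Registering points the catalog at existing table metadata.
        namespace, table = identifier
        self._namespaces[namespace][table] = metadata_location

    def list_tables(self, namespace: str) -> List[Tuple[str, str]]:
        return [(namespace, table) for table in sorted(self._namespaces[namespace])]

    def drop_table(self, identifier: Tuple[str, str]) -> None:
        namespace, table = identifier
        del self._namespaces[namespace][table]

    def drop_namespace(self, namespace: str) -> None:
        if self._namespaces[namespace]:
            raise ValueError(f"namespace not empty: {namespace}")
        del self._namespaces[namespace]


def exercise_catalog(catalog: InMemoryCatalog):
    """Run the create/register/list/drop lifecycle and return what was listed."""
    catalog.create_namespace("demo_dataset")
    catalog.register_table(
        ("demo_dataset", "events"),
        "gs://example-bucket/events/metadata/v1.metadata.json",  # made-up path
    )
    namespaces = catalog.list_namespaces()
    tables = catalog.list_tables("demo_dataset")
    catalog.drop_table(("demo_dataset", "events"))
    catalog.drop_namespace("demo_dataset")
    return namespaces, tables, catalog.list_namespaces()
```

Calling `exercise_catalog(InMemoryCatalog())` walks through every operation once and leaves the catalog empty again, which is roughly the shape an integration test for a real catalog would take.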
This is my first PR of size to iceberg-python, so any advice would be appreciated!
Are these changes tested?
Integration and unit tests included.
Are there any user-facing changes?
Introduces a new Catalog type.