From 086aef537873e9ed29d4520222e3403b19d85fed Mon Sep 17 00:00:00 2001
From: Peng-Jui Wang
Date: Sun, 6 Jul 2025 15:13:30 -0700
Subject: [PATCH 1/6] improve REST catalog documentation with examples and configuration details

---
 mkdocs/docs/configuration.md | 70 ++++++++++++++++++++++++++++++++++--
 1 file changed, 68 insertions(+), 2 deletions(-)

diff --git a/mkdocs/docs/configuration.md b/mkdocs/docs/configuration.md
index bc514e39af..1bec16df53 100644
--- a/mkdocs/docs/configuration.md
+++ b/mkdocs/docs/configuration.md
@@ -357,7 +357,7 @@ catalog:
 #### Headers in RESTCatalog
-To configure custom headers in RESTCatalog, include them in the catalog properties with the prefix `header.`. This
+To configure custom headers in RESTCatalog, include them in the catalog properties with `header.`. This
 ensures that all HTTP requests to the REST service include the specified headers.
 ```yaml
@@ -372,7 +372,73 @@ Specific headers defined by the RESTCatalog spec include:
 | Key | Options | Default | Description |
 | ------------------------------------ | ------------------------------------- | -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `header.X-Iceberg-Access-Delegation` | `{vended-credentials,remote-signing}` | `vended-credentials` | Signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms |
+| `header.X-Iceberg-Access-Delegation` | `{vended-credentials,remote-signing}` | `vended-credentials` | Signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms.
When using `vended-credentials`, the server provides temporary credentials to the client. When using `remote-signing`, the server signs requests on behalf of the client. |
+
+#### Authentication Options
+- **SigV4**: For AWS services that require SigV4 signing.
+- **Token**: Use the `token` property to pass a bearer token for services that accept token-based authentication.
+- **Credential**: Use the `credential` property with format `client_id:client_secret` for authentication.
+- **OAuth2**: Use the `oauth2-server-uri` property to specify a custom OAuth2 endpoint for client credentials authentication.
+
+#### Common Integrations & Examples
+
+##### Glue (AWS)
+```yaml
+catalog:
+  s3_tables_catalog:
+    type: rest
+    uri: https://glue..amazonaws.com/iceberg
+    warehouse: :s3tablescatalog/
+    rest.sigv4-enabled: true
+    rest.signing-name: glue
+    rest.signing-region:
+```
+
+##### Unity Catalog (Databricks)
+```yaml
+catalog:
+  unity_catalog:
+    type: rest
+    uri: https:///api/2.1/unity-catalog/iceberg-rest
+    warehouse:
+    token:
+```
+
+##### R2 Data Catalog (Cloudflare)
+```yaml
+catalog:
+  r2_catalog:
+    type: rest
+    uri:
+    warehouse:
+    token:
+```
+
+##### Lakekeeper
+```yaml
+catalog:
+  lakekeeper_catalog:
+    type: rest
+    uri:
+    warehouse:
+    credential: :
+    oauth2-server-uri: http://localhost:30080/realms//protocol/openid-connect/token
+    scope: lakekeeper
+```
+
+##### Polaris (Snowflake)
+```yaml
+catalog:
+  polaris_catalog:
+    type: rest
+    uri: https://.snowflakecomputing.com/polaris/api/catalog
+    warehouse:
+    credential: :
+    header.X-Iceberg-Access-Delegation: vended-credentials
+    scope: PRINCIPAL_ROLE:ALL
+    token-refresh-enabled: true
+    py-io-impl: pyiceberg.io.fsspec.FsspecFileIO
+```
 ### SQL Catalog

From 136a7bea025fd9f7062ebe6ad620d834d4d65ccc Mon Sep 17 00:00:00 2001
From: Peng-Jui Wang
Date: Sun, 6 Jul 2025 21:43:19 -0700
Subject: [PATCH 2/6] adjust the position of the auth configuration in the docs

---
 mkdocs/docs/configuration.md | 45
++++++++++++++++++------------------
 1 file changed, 23 insertions(+), 22 deletions(-)

diff --git a/mkdocs/docs/configuration.md b/mkdocs/docs/configuration.md
index 1bec16df53..435e727ab2 100644
--- a/mkdocs/docs/configuration.md
+++ b/mkdocs/docs/configuration.md
@@ -339,21 +339,15 @@ catalog:
 | Key | Example | Description |
 | ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
-| uri | | URI identifying the REST Server |
+| uri | | URI identifying the REST Server |
 | ugi | t-1234:secret | Hadoop UGI for Hive client. |
-| credential | t-1234:secret | Credential to use for OAuth2 credential flow when initializing the catalog |
-| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header |
 | scope | openid offline corpds:ds:profile | Desired scope of the requested security token (default : catalog) |
 | resource | rest_catalog.iceberg.com | URI for the target resource or service |
 | audience | rest_catalog | Logical name of target resource or service |
-| rest.sigv4-enabled | true | Sign requests to the REST Server using AWS SigV4 protocol |
-| rest.signing-region | us-east-1 | The region to use when SigV4 signing a request |
-| rest.signing-name | execute-api | The service signing name to use when SigV4 signing a request |
-| oauth2-server-uri | | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |
-| snapshot-loading-mode | refs | The snapshots to return in the body of the metadata. Setting the value to `all` would return the full set of snapshots currently valid for the table. Setting the value to `refs` would load all snapshots referenced by branches or tags. |
-| warehouse | myWarehouse | Warehouse location or identifier to request from the catalog service. May be used to determine server-side overrides, such as the warehouse location.
|
+| snapshot-loading-mode | refs | The snapshots to return in the body of the metadata. Setting the value to `all` would return the full set of snapshots currently valid for the table. Setting the value to `refs` would load all snapshots referenced by branches or tags. |
+| warehouse | myWarehouse | Warehouse location or identifier to request from the catalog service. May be used to determine server-side overrides, such as the warehouse location. |
+| `header.X-Iceberg-Access-Delegation` | `vended-credentials` | Signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms. When using `vended-credentials`, the server provides temporary credentials to the client. When using `remote-signing`, the server signs requests on behalf of the client. (default: `vended-credentials`) |
-
 #### Headers in RESTCatalog
@@ -368,21 +362,28 @@ catalog:
     header.content-type: application/vnd.api+json
 ```
-Specific headers defined by the RESTCatalog spec include:
-
-| Key | Options | Default | Description |
-| ------------------------------------ | ------------------------------------- | -------------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
-| `header.X-Iceberg-Access-Delegation` | `{vended-credentials,remote-signing}` | `vended-credentials` | Signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms. When using `vended-credentials`, the server provides temporary credentials to the client. When using `remote-signing`, the server signs requests on behalf of the client. |
 #### Authentication Options
 - **SigV4**: For AWS services that require SigV4 signing.
-- **Token**: Use the `token` property to pass a bearer token for services that accept token-based authentication.
-- **Credential**: Use the `credential` property with format `client_id:client_secret` for authentication.
-- **OAuth2**: Use the `oauth2-server-uri` property to specify a custom OAuth2 endpoint for client credentials authentication.
+- **OAuth2**: For services that require OAuth2 authentication.
+  - **Bearer Token**: Use the `token` property to pass a bearer token directly for services that accept token-based authentication.
+  - **Client Credentials**: Use the `credential` property with the format `client_id:client_secret` to perform the OAuth2 client credentials flow. Optionally, use the `oauth2-server-uri` property to specify a custom OAuth2 endpoint for client credentials authentication.
+
+| Key | Example | Description |
+| ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
+| rest.sigv4-enabled | true | Sign requests to the REST Server using AWS SigV4 protocol |
+| rest.signing-region | us-east-1 | The region to use when SigV4 signing a request |
+| rest.signing-name | execute-api | The service signing name to use when SigV4 signing a request |
+| oauth2-server-uri | | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |
+| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header |
+| credential | t-1234:secret | Credential to use for OAuth2 credential flow when initializing the catalog |
+
+
+
 #### Common Integrations & Examples
-##### Glue (AWS)
+##### AWS Glue
 ```yaml
 catalog:
   s3_tables_catalog:
@@ -394,7 +395,7 @@ catalog:
     rest.signing-region:
 ```
-##### Unity Catalog (Databricks)
+##### Unity Catalog
 ```yaml
 catalog:
   unity_catalog:
@@ -404,7 +405,7 @@ catalog:
     token:
 ```
-##### R2 Data Catalog (Cloudflare)
+##### R2 Data Catalog
 ```yaml
 catalog:
   r2_catalog:
@@ -426,7
+427,7 @@ catalog:
     scope: lakekeeper
 ```
-##### Polaris (Snowflake)
+##### Apache Polaris
 ```yaml
 catalog:
   polaris_catalog:

From 09a3846577a18fd1ba2d96ebf2721fb8e9b2b729 Mon Sep 17 00:00:00 2001
From: Peng-Jui Wang
Date: Tue, 8 Jul 2025 00:00:48 -0700
Subject: [PATCH 3/6] refactor auth options in doc

---
 mkdocs/docs/configuration.md | 24 ++++++++++++------------
 1 file changed, 12 insertions(+), 12 deletions(-)

diff --git a/mkdocs/docs/configuration.md b/mkdocs/docs/configuration.md
index 435e727ab2..28933a2c2c 100644
--- a/mkdocs/docs/configuration.md
+++ b/mkdocs/docs/configuration.md
@@ -340,12 +340,8 @@ catalog:
 | Key | Example | Description |
 | ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
 | uri | | URI identifying the REST Server |
-| ugi | t-1234:secret | Hadoop UGI for Hive client. |
-| scope | openid offline corpds:ds:profile | Desired scope of the requested security token (default : catalog) |
-| resource | rest_catalog.iceberg.com | URI for the target resource or service |
-| audience | rest_catalog | Logical name of target resource or service |
-| snapshot-loading-mode | refs | The snapshots to return in the body of the metadata. Setting the value to `all` would return the full set of snapshots currently valid for the table. Setting the value to `refs` would load all snapshots referenced by branches or tags. |
 | warehouse | myWarehouse | Warehouse location or identifier to request from the catalog service. May be used to determine server-side overrides, such as the warehouse location. |
+| snapshot-loading-mode | refs | The snapshots to return in the body of the metadata. Setting the value to `all` would return the full set of snapshots currently valid for the table. Setting the value to `refs` would load all snapshots referenced by branches or tags.
|
 | `header.X-Iceberg-Access-Delegation` | `vended-credentials` | Signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms. When using `vended-credentials`, the server provides temporary credentials to the client. When using `remote-signing`, the server signs requests on behalf of the client. (default: `vended-credentials`) |
@@ -364,19 +360,23 @@ catalog:
 #### Authentication Options
-- **SigV4**: For AWS services that require SigV4 signing.
-- **OAuth2**: For services that require OAuth2 authentication.
-  - **Bearer Token**: Use the `token` property to pass a bearer token directly for services that accept token-based authentication.
-  - **Client Credentials**: Use the `credential` property with the format `client_id:client_secret` to perform the OAuth2 client credentials flow. Optionally, use the `oauth2-server-uri` property to specify a custom OAuth2 endpoint for client credentials authentication.
+##### OAuth2
+| Key | Example | Description |
+| ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
+| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header |
+| oauth2-server-uri | | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |
+| credential | client_id:client_secret | Credential to use for OAuth2 credential flow when initializing the catalog |
+| scope | openid offline corpds:ds:profile | Desired scope of the requested security token (default : catalog) |
+| resource | rest_catalog.iceberg.com | URI for the target resource or service |
+| audience | rest_catalog | Logical name of target resource or service |
+
+##### SigV4
 | Key | Example | Description |
 | ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
 | rest.sigv4-enabled | true | Sign requests to the REST Server using AWS SigV4 protocol |
 | rest.signing-region | us-east-1 | The region to use when SigV4 signing a request |
 | rest.signing-name | execute-api | The service signing name to use when SigV4 signing a request |
-| oauth2-server-uri | | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |
-| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header |
-| credential | t-1234:secret | Credential to use for OAuth2 credential flow when initializing the catalog |

From 710426ed47731317dcc1e204fe8a4f108ad75e25 Mon Sep 17 00:00:00 2001
From: Kevin Liu
Date: Tue, 8 Jul 2025 07:38:01 -0700
Subject: [PATCH 4/6] Update mkdocs/docs/configuration.md

---
 mkdocs/docs/configuration.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/mkdocs/docs/configuration.md b/mkdocs/docs/configuration.md
index 28933a2c2c..8f5a9c076a 100644
---
a/mkdocs/docs/configuration.md
+++ b/mkdocs/docs/configuration.md
@@ -345,9 +345,9 @@ catalog:
 | `header.X-Iceberg-Access-Delegation` | `vended-credentials` | Signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms. When using `vended-credentials`, the server provides temporary credentials to the client. When using `remote-signing`, the server signs requests on behalf of the client. (default: `vended-credentials`) |
-#### Headers in RESTCatalog
+#### Headers in REST Catalog
-To configure custom headers in RESTCatalog, include them in the catalog properties with `header.`. This
+To configure custom headers in REST Catalog, include them in the catalog properties with `header.`. This
 ensures that all HTTP requests to the REST service include the specified headers.
 ```yaml

From 8cbb3a7833ec8cf0bf18c9a42eee868c5f4acfe3 Mon Sep 17 00:00:00 2001
From: Kevin Liu
Date: Tue, 8 Jul 2025 07:38:06 -0700
Subject: [PATCH 5/6] Update mkdocs/docs/configuration.md

---
 mkdocs/docs/configuration.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/mkdocs/docs/configuration.md b/mkdocs/docs/configuration.md
index 8f5a9c076a..8598bbee0b 100644
--- a/mkdocs/docs/configuration.md
+++ b/mkdocs/docs/configuration.md
@@ -364,8 +364,9 @@ catalog:
 ##### OAuth2
 | Key | Example | Description |
 | ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
-| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header |
 | oauth2-server-uri | | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |
+|
+| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header ```
 | credential | client_id:client_secret | Credential to use for OAuth2 credential flow when
initializing the catalog |
 | scope | openid offline corpds:ds:profile | Desired scope of the requested security token (default : catalog) |
 | resource | rest_catalog.iceberg.com | URI for the target resource or service |

From 0055ac1ffdeb7c829aebb19c4f82f0bd6b388c3a Mon Sep 17 00:00:00 2001
From: Kevin Liu
Date: Tue, 8 Jul 2025 07:40:42 -0700
Subject: [PATCH 6/6] make lint

---
 mkdocs/docs/configuration.md | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/mkdocs/docs/configuration.md b/mkdocs/docs/configuration.md
index 8598bbee0b..a39b9300ea 100644
--- a/mkdocs/docs/configuration.md
+++ b/mkdocs/docs/configuration.md
@@ -344,7 +344,6 @@ catalog:
 | snapshot-loading-mode | refs | The snapshots to return in the body of the metadata. Setting the value to `all` would return the full set of snapshots currently valid for the table. Setting the value to `refs` would load all snapshots referenced by branches or tags. |
 | `header.X-Iceberg-Access-Delegation` | `vended-credentials` | Signal to the server that the client supports delegated access via a comma-separated list of access mechanisms. The server may choose to supply access via any or none of the requested mechanisms. When using `vended-credentials`, the server provides temporary credentials to the client. When using `remote-signing`, the server signs requests on behalf of the client. (default: `vended-credentials`) |
-
 #### Headers in REST Catalog
 To configure custom headers in REST Catalog, include them in the catalog properties with `header.`.
This
@@ -358,21 +357,21 @@ catalog:
     header.content-type: application/vnd.api+json
 ```
-
 #### Authentication Options
 ##### OAuth2
+
 | Key | Example | Description |
 | ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
 | oauth2-server-uri | | Authentication URL to use for client credentials authentication (default: uri + 'v1/oauth/tokens') |
-|
-| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header ```
+| token | FEW23.DFSDF.FSDF | Bearer token value to use for `Authorization` header |
 | credential | client_id:client_secret | Credential to use for OAuth2 credential flow when initializing the catalog |
 | scope | openid offline corpds:ds:profile | Desired scope of the requested security token (default : catalog) |
 | resource | rest_catalog.iceberg.com | URI for the target resource or service |
 | audience | rest_catalog | Logical name of target resource or service |
 ##### SigV4
+
 | Key | Example | Description |
 | ------------------- | -------------------------------- | -------------------------------------------------------------------------------------------------- |
 | rest.sigv4-enabled | true | Sign requests to the REST Server using AWS SigV4 protocol |
@@ -381,10 +380,10 @@ catalog:
-
 #### Common Integrations & Examples
 ##### AWS Glue
+
 ```yaml
 catalog:
   s3_tables_catalog:
@@ -397,6 +396,7 @@ catalog:
 ```
 ##### Unity Catalog
+
 ```yaml
 catalog:
   unity_catalog:
@@ -407,6 +407,7 @@ catalog:
 ```
 ##### R2 Data Catalog
+
 ```yaml
 catalog:
   r2_catalog:
@@ -417,6 +418,7 @@ catalog:
 ```
 ##### Lakekeeper
+
 ```yaml
 catalog:
   lakekeeper_catalog:
@@ -429,6 +431,7 @@ catalog:
 ```
 ##### Apache Polaris
+
 ```yaml
 catalog:
   polaris_catalog: