diff --git a/README.md b/README.md
index 520a6b154..65b3f56a9 100644
--- a/README.md
+++ b/README.md
@@ -27,6 +27,12 @@ create pandas series and data frames.
Check out the GoodData Pandas [documentation](https://gooddata-pandas.readthedocs.io/en/latest/)
to learn more and get started.

+### GoodData Pipelines
+
+The [gooddata-pipelines](./gooddata-pipelines/) package provides tools for managing the lifecycle of GoodData Cloud resources, such as workspaces, users, user groups, and permissions.
+
+Check out the GoodData Pipelines [documentation](https://www.gooddata.com/docs/python-sdk/latest/pipelines-overview/) to learn more and get started.
+
### GoodData FlexConnect

The [gooddata-flexconnect](./gooddata-flexconnect) package is the foundation for writing custom FlexConnect data sources.
@@ -45,5 +51,6 @@ into PostgreSQL as foreign tables that you can then query using SQL.
Check out the GoodData Foreign Data Wrapper [documentation](https://gooddata-fdw.readthedocs.io/en/latest/)
to learn more and get started.

## Contributing
+
If you would like to improve, extend or fix a feature in the repository, read and follow the [Contributing guide](./CONTRIBUTING.md).
diff --git a/docs/content/en/latest/pipelines/provisioning/_index.md b/docs/content/en/latest/pipelines/provisioning/_index.md
index 9429ab271..7235d0026 100644
--- a/docs/content/en/latest/pipelines/provisioning/_index.md
+++ b/docs/content/en/latest/pipelines/provisioning/_index.md
@@ -15,7 +15,7 @@ Resources you can provision using GoodData Pipelines:
- [Users](users/)
- [User Groups](user_groups/)
- [Workspace Permissions](workspace-permissions/)
-
+- [User Data Filters](user_data_filters/)

## Workflow Types

@@ -30,8 +30,8 @@ The provisioning types employ different algorithms and expect different structur
Full load provisioning aims to fully synchronize the state of your GoodData instance with the provided input. This workflow will create new resources and update existing ones based on the input. Any resources existing on GoodData Cloud not included in the input will be deleted.

-{{% alert color="warning" title="Full loads are destrucitve"%}}
-Full load provisioning will delete any existing resources not included in your input data. Test in non-production environment.
+{{% alert color="warning" title="Full loads are destructive" %}}
+Full load provisioning will delete any existing resources not included in your input data. Test in a non-production environment.
{{% /alert %}}

### Incremental Load
@@ -40,14 +40,20 @@ During incremental provisioning, the algorithm will only interact with resources

### Workflow Comparison

-| **Aspect** | **Full Load** | **Incremental Load** |
-|------------|---------------|----------------------|
-| **Scope** | Synchronizes entire state | Only specified resources |
+| **Aspect**   | **Full Load**                 | **Incremental Load**                             |
+| ------------ | ----------------------------- | ------------------------------------------------ |
+| **Scope**    | Synchronizes entire state     | Only specified resources                         |
| **Deletion** | Deletes unspecified resources | Only deletes resources marked `is_active: False` |
-| **Use Case** | Complete environment setup | Targeted updates |
+| **Use Case** | Complete environment setup    | Targeted updates                                 |

## Usage

+You can use either resource-specific Provisioner objects or a generic function to handle the provisioning logic.
+
+The generic function validates the data, creates a provisioner instance, and runs the provisioning under the hood, reducing boilerplate code. On the other hand, the resource-specific approach is more explicit about the expected data structures.
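+
+For orientation, here is a minimal sketch contrasting the two styles for a user full load. It assumes the placeholder `host` and `token` values used in the example below, and an empty `raw_data` list standing in for your source records:
+
+```python
+from gooddata_pipelines import (
+    UserFullLoad,
+    UserProvisioner,
+    WorkflowType,
+    provision,
+)
+
+host = "http://localhost:3000"
+token = "some_user_token"
+raw_data: list[dict] = []  # your source records
+
+# Resource-specific style: validate explicitly, then run the provisioner yourself
+provisioner = UserProvisioner.create(host=host, token=token)
+validated_data = UserFullLoad.from_list_of_dicts(raw_data)
+provisioner.full_load(validated_data)
+
+# Generic style: validation and provisioner creation happen under the hood
+provision(raw_data, WorkflowType.USER_FULL_LOAD, host, token)
+```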
+
+### Provisioner Objects
+
Regardless of workflow type or resource being provisioned, the typical usage follows these steps:

1. Initialize the provisioner
@@ -56,9 +62,38 @@ Regardless of workflow type or resource being provisioned, the typical usage fol
1. Run the selected provisioning method (`.full_load()` or `.incremental_load()`) with your validated data
-
Check the [resource pages](#supported-resources) for detailed instructions and examples of workflow implementations.

+### Generic Function
+
+You can also use a generic provisioning function:
+
+```python
+from gooddata_pipelines import WorkflowType, provision
+```
+
+The function requires the following arguments:
+
+| Name          | Description                                                |
+| ------------- | ---------------------------------------------------------- |
+| data          | Raw data as a list of dictionaries                         |
+| workflow_type | Enum indicating the provisioned resource and workflow type |
+| host          | URL of your GoodData instance                              |
+| token         | GoodData Personal Access Token                             |
+| logger        | Logger object to subscribe to the logs _[optional]_        |
+
+The function will validate the raw data against the model corresponding to the selected `workflow_type` value. Note that the function only supports resources listed in the `WorkflowType` enum.
+
+To see the expected data structure, check out the pages dedicated to individual resources. The raw dictionaries should have the same structure as the validation models outlined there.
+
+To run the provisioning, call the function with its required arguments:
+
+```python
+provision(raw_data, WorkflowType.WORKSPACE_INCREMENTAL_LOAD, host, token)
+```
+
## Logs

By default, the provisioners operate silently. To monitor progress and troubleshoot issues, you can subscribe to the emitted logs using the `.subscribe()` method on the `logger` property of the provisioner instance.

@@ -89,3 +124,60 @@ provisioner.logger.subscribe(logger)
# Continue with the provisioning
...
```
+
+## Example
+
+Here is an example of workspace provisioning using the generic function:
+
+```python
+import logging
+
+# Import the WorkflowType enum and the generic function from GoodData Pipelines
+from gooddata_pipelines import WorkflowType, provision
+
+# Optional: set up a logger to pass to the function. The logger will be subscribed
+# to the logs emitted by the provisioning scripts.
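+# Without a logger, the provisioning runs silently (see the Logs section above).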
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+
+host = "http://localhost:3000"
+token = "some_user_token"
+
+# Prepare your raw data
+raw_data: list[dict] = [
+    {
+        "parent_id": "parent_workspace_id",
+        "workspace_id": "workspace_id_1",
+        "workspace_name": "Workspace 1",
+        "workspace_data_filter_id": "wdf__id",
+        "workspace_data_filter_values": ["wdf_value_1"],
+        "is_active": True,
+    },
+    {
+        "parent_id": "parent_workspace_id",
+        "workspace_id": "workspace_id_2",
+        "workspace_name": "Workspace 2",
+        "workspace_data_filter_id": "wdf__id",
+        "workspace_data_filter_values": ["wdf_value_2"],
+        "is_active": True,
+    },
+    {
+        "parent_id": "parent_workspace_id",
+        "workspace_id": "workspace_id_3",
+        "workspace_name": "Workspace 3",
+        "workspace_data_filter_id": "wdf__id",
+        "workspace_data_filter_values": ["wdf_value_3"],
+        "is_active": True,
+    },
+]
+
+# Run the provisioning function
+provision(
+    data=raw_data,
+    workflow_type=WorkflowType.WORKSPACE_INCREMENTAL_LOAD,
+    host=host,
+    token=token,
+    logger=logger,
+)
+```
diff --git a/docs/content/en/latest/pipelines/provisioning/user_groups.md b/docs/content/en/latest/pipelines/provisioning/user_groups.md
index 16dd862ad..194b1198e 100644
--- a/docs/content/en/latest/pipelines/provisioning/user_groups.md
+++ b/docs/content/en/latest/pipelines/provisioning/user_groups.md
@@ -10,6 +10,8 @@ User groups enable you to organize users and manage permissions at scale by assi
You can provision user groups using full or incremental load methods. Each of these methods requires a specific input type.

+{{% alert color="info" %}} This section covers usage with manual data validation. Alternatively, you can use the generic provisioning function; read more about it on the [Provisioning](../#generic-function) page. {{% /alert %}}
+
## Usage

Start by importing and initializing the UserGroupProvisioner.
@@ -26,10 +28,10 @@ provisioner = UserGroupProvisioner.create(host=host, token=token)
```
-
Then validate your data using an input model corresponding to the provisioned resource and selected workflow type, i.e., `UserGroupFullLoad` if you intend to run the provisioning in full load mode, or `UserGroupIncrementalLoad` if you want to provision incrementally.

The models expect the following fields:
+
- **user_group_id**: ID of the user group.
- **user_group_name**: Name of the user group.
- **parent_user_groups**: A list of parent user group IDs.
@@ -130,7 +132,6 @@ provisioner.full_load(validated_data)
```
-
### Incremental Load

```python
diff --git a/docs/content/en/latest/pipelines/provisioning/users.md b/docs/content/en/latest/pipelines/provisioning/users.md
index e192db9fe..d2ffa0fab 100644
--- a/docs/content/en/latest/pipelines/provisioning/users.md
+++ b/docs/content/en/latest/pipelines/provisioning/users.md
@@ -4,11 +4,12 @@ linkTitle: "Users"
weight: 2
---
-
User provisioning allows you to create, update, or delete user profiles in your GoodData environment.

You can provision users using full or incremental load methods. Each of these methods requires a specific input type.

+{{% alert color="info" %}} This section covers usage with manual data validation. Alternatively, you can use the generic provisioning function; read more about it on the [Provisioning](../#generic-function) page. {{% /alert %}}
+
## Usage

Start by importing and initializing the UserProvisioner.
@@ -25,7 +26,6 @@ provisioner = UserProvisioner.create(host=host, token=token)
```
-
Then validate your data using an input model corresponding to the provisioned resource and selected workflow type, i.e., `UserFullLoad` if you intend to run the provisioning in full load mode, or `UserIncrementalLoad` if you want to provision incrementally.

The models expect the following fields:
@@ -147,7 +147,6 @@ provisioner.full_load(validated_data)
```
-
### Incremental Load

```python
diff --git a/docs/content/en/latest/pipelines/provisioning/workspace-permissions.md b/docs/content/en/latest/pipelines/provisioning/workspace-permissions.md
index d71757322..b5caea785 100644
--- a/docs/content/en/latest/pipelines/provisioning/workspace-permissions.md
+++ b/docs/content/en/latest/pipelines/provisioning/workspace-permissions.md
@@ -8,6 +8,8 @@ Workspace permission provisioning allows you to create, update, or delete user p
You can provision workspace permissions using full or incremental load methods. Each of these methods requires a specific input type.

+{{% alert color="info" %}} This section covers usage with manual data validation. Alternatively, you can use the generic provisioning function; read more about it on the [Provisioning](../#generic-function) page. {{% /alert %}}
+
## Usage

Start by importing and initializing the PermissionProvisioner.
@@ -24,10 +26,10 @@ provisioner = PermissionProvisioner.create(host=host, token=token)
```
-
Then validate your data using an input model corresponding to the provisioned resource and selected workflow type, i.e., `PermissionFullLoad` if you intend to run the provisioning in full load mode, or `PermissionIncrementalLoad` if you want to provision incrementally.

The models expect the following fields:
+
- **permission**: Permission you want to grant, e.g., `VIEW`, `ANALYZE`, `MANAGE`.
- **workspace_id**: ID of the workspace the permission will be applied to.
- **entity_id**: ID of the entity (user or user group) which will receive the permission.
@@ -138,7 +140,6 @@ provisioner.full_load(validated_data)
```
-
### Incremental Load

```python
diff --git a/docs/content/en/latest/pipelines/provisioning/workspaces.md b/docs/content/en/latest/pipelines/provisioning/workspaces.md
index a3da061a9..b3308b5e8 100644
--- a/docs/content/en/latest/pipelines/provisioning/workspaces.md
+++ b/docs/content/en/latest/pipelines/provisioning/workspaces.md
@@ -12,6 +12,7 @@ See [Multitenancy: One Platform, Many Customers](https://www.gooddata.com/resour
You can provision child workspaces using full or incremental load methods. Each of these methods requires a specific input type.

+{{% alert color="info" %}} This section covers usage with manual data validation. Alternatively, you can use the generic provisioning function; read more about it on the [Provisioning](../#generic-function) page. {{% /alert %}}

## Usage

@@ -29,7 +30,6 @@ provisioner = WorkspaceProvisioner.create(host=host, token=token)
```
-
Then validate your data using an input model corresponding to the provisioned resource and selected workflow type, i.e., `WorkspaceFullLoad` if you intend to run the provisioning in full load mode, or `WorkspaceIncrementalLoad` if you want to provision incrementally.
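+
+For instance, assuming `raw_data` is your list of input dictionaries and that the workspace models offer the same `from_list_of_dicts` helper shown in the package README, a full-load validation sketch could look like this:
+
+```python
+validated_data = WorkspaceFullLoad.from_list_of_dicts(raw_data)
+```
+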
The models expect the following fields:
@@ -93,7 +93,6 @@ Now with the provisioner initialized and your data validated, you can run the pr
provisioner.full_load(validated_data)
```
-
## Workspace Data Filters

If you want to apply Workspace Data Filters to a child workspace, the filter must be set up on the parent workspace before you run the provisioning.
diff --git a/gooddata-pipelines/README.md b/gooddata-pipelines/README.md
index f79ea2b7b..1079a9c77 100644
--- a/gooddata-pipelines/README.md
+++ b/gooddata-pipelines/README.md
@@ -28,33 +28,37 @@ The provisioning module exposes _Provisioner_ classes reflecting the different e
```python
import os
+import logging
+
from csv import DictReader
from pathlib import Path

# Import the Entity Provisioner class and corresponding model from gooddata_pipelines library
from gooddata_pipelines import UserFullLoad, UserProvisioner
-from gooddata_pipelines.logger.logger import LogObserver
-
-# Optionally, subscribe a standard Python logger to the LogObserver
-import logging
-logger = logging.getLogger(__name__)
-LogObserver().subscribe(logger)

# Create the Provisioner instance - you can also create the instance from a GDC yaml profile
provisioner = UserProvisioner(
    host=os.environ["GDC_HOSTNAME"], token=os.environ["GDC_AUTH_TOKEN"]
)

+# Optional: set up logging and subscribe to logs emitted by the provisioner
+logging.basicConfig(level=logging.INFO)
+logger = logging.getLogger(__name__)
+provisioner.logger.subscribe(logger)
+
# Load your data from your data source
source_data_path: Path = Path("path/to/some.csv")
source_data_reader = DictReader(source_data_path.read_text().splitlines())
source_data = [row for row in source_data_reader]

-# Validate your input data with
+# Validate your input data
full_load_data: list[UserFullLoad] = UserFullLoad.from_list_of_dicts(
    source_data
)
+
+# Run the provisioning
provisioner.full_load(full_load_data)
```

## Bugs & Requests
@@ -64,5 +68,5 @@
or request features.

## Changelog

-See [Github releases](https://github.com/gooddata/gooddata-python-sdk/releases) for released versions
+See [GitHub releases](https://github.com/gooddata/gooddata-python-sdk/releases) for released versions and a list of changes.
diff --git a/gooddata-pipelines/gooddata_pipelines/__init__.py b/gooddata-pipelines/gooddata_pipelines/__init__.py
index a2c4792df..37ae75da0 100644
--- a/gooddata-pipelines/gooddata_pipelines/__init__.py
+++ b/gooddata-pipelines/gooddata_pipelines/__init__.py
@@ -51,6 +51,10 @@
)
from .provisioning.entities.workspaces.workspace import WorkspaceProvisioner

+# -------- Generic Provisioning --------
+from .provisioning.generic.config import WorkflowType
+from .provisioning.generic.provision import provision
+
__all__ = [
    "BackupManager",
    "BackupRestoreConfig",
@@ -79,5 +83,7 @@
    "CustomFieldDefinition",
    "ColumnDataType",
    "CustomFieldType",
+    "provision",
+    "WorkflowType",
    "__version__",
]
diff --git a/gooddata-pipelines/gooddata_pipelines/provisioning/generic/__init__.py b/gooddata-pipelines/gooddata_pipelines/provisioning/generic/__init__.py
new file mode 100644
index 000000000..37d863d60
--- /dev/null
+++ b/gooddata-pipelines/gooddata_pipelines/provisioning/generic/__init__.py
@@ -0,0 +1 @@
+# (C) 2025 GoodData Corporation
diff --git a/gooddata-pipelines/gooddata_pipelines/provisioning/generic/config.py b/gooddata-pipelines/gooddata_pipelines/provisioning/generic/config.py
new file mode 100644
index 000000000..4b994aba2
--- /dev/null
+++ b/gooddata-pipelines/gooddata_pipelines/provisioning/generic/config.py
@@ -0,0 +1,118 @@
+# (C) 2025 GoodData Corporation
+
+from enum import Enum
+from typing import Type, TypeAlias
+
+import attrs
+
+from gooddata_pipelines.provisioning.entities.users.models.permissions import (
+    PermissionFullLoad,
+    PermissionIncrementalLoad,
+)
+from gooddata_pipelines.provisioning.entities.users.models.user_groups import (
+    UserGroupFullLoad,
+    UserGroupIncrementalLoad,
+)
+from gooddata_pipelines.provisioning.entities.users.models.users import (
+    UserFullLoad,
+    UserIncrementalLoad,
+)
+from gooddata_pipelines.provisioning.entities.users.permissions import (
+    PermissionProvisioner,
+)
+from gooddata_pipelines.provisioning.entities.users.user_groups import (
+    UserGroupProvisioner,
+)
+from gooddata_pipelines.provisioning.entities.users.users import UserProvisioner
+from gooddata_pipelines.provisioning.entities.workspaces.models import (
+    WorkspaceFullLoad,
+    WorkspaceIncrementalLoad,
+)
+from gooddata_pipelines.provisioning.entities.workspaces.workspace import (
+    WorkspaceProvisioner,
+)
+
+ValidationModel: TypeAlias = (
+    PermissionFullLoad
+    | PermissionIncrementalLoad
+    | UserFullLoad
+    | UserIncrementalLoad
+    | UserGroupFullLoad
+    | UserGroupIncrementalLoad
+    | WorkspaceFullLoad
+    | WorkspaceIncrementalLoad
+)
+
+Provisioner: TypeAlias = (
+    PermissionProvisioner
+    | UserProvisioner
+    | UserGroupProvisioner
+    | WorkspaceProvisioner
+)
+
+
+class LoadType(str, Enum):
+    """Provisioning mode: full synchronization or incremental updates."""
+
+    FULL = "full"
+    INCREMENTAL = "incremental"
+
+
+class WorkflowType(str, Enum):
+    """Supported combinations of provisioned resource and load type."""
+
+    WORKSPACE_FULL_LOAD = "workspace_full_load"
+    WORKSPACE_INCREMENTAL_LOAD = "workspace_incremental_load"
+    USER_FULL_LOAD = "user_full_load"
+    USER_INCREMENTAL_LOAD = "user_incremental_load"
+    USER_GROUP_FULL_LOAD = "user_group_full_load"
+    USER_GROUP_INCREMENTAL_LOAD = "user_group_incremental_load"
+    PERMISSION_FULL_LOAD = "permission_full_load"
+    PERMISSION_INCREMENTAL_LOAD = "permission_incremental_load"
+
+
+@attrs.define
+class ProvisioningConfig:
+    """Pairs a validation model with its provisioner class and load type."""
+
+    validation_model: Type[ValidationModel]
+    provisioner_class: Type[Provisioner]
+    load_type: LoadType
+
+
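+# Maps each WorkflowType to the validation model, provisioner class, and load
+# type consumed by the generic provision() function.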
+PROVISIONING_CONFIG = {
+    WorkflowType.WORKSPACE_FULL_LOAD: ProvisioningConfig(
+        validation_model=WorkspaceFullLoad,
+        provisioner_class=WorkspaceProvisioner,
+        load_type=LoadType.FULL,
+    ),
+    WorkflowType.WORKSPACE_INCREMENTAL_LOAD: ProvisioningConfig(
+        validation_model=WorkspaceIncrementalLoad,
+        provisioner_class=WorkspaceProvisioner,
+        load_type=LoadType.INCREMENTAL,
+    ),
+    WorkflowType.USER_FULL_LOAD: ProvisioningConfig(
+        validation_model=UserFullLoad,
+        provisioner_class=UserProvisioner,
+        load_type=LoadType.FULL,
+    ),
+    WorkflowType.USER_INCREMENTAL_LOAD: ProvisioningConfig(
+        validation_model=UserIncrementalLoad,
+        provisioner_class=UserProvisioner,
+        load_type=LoadType.INCREMENTAL,
+    ),
+    WorkflowType.USER_GROUP_FULL_LOAD: ProvisioningConfig(
+        validation_model=UserGroupFullLoad,
+        provisioner_class=UserGroupProvisioner,
+        load_type=LoadType.FULL,
+    ),
+    WorkflowType.USER_GROUP_INCREMENTAL_LOAD: ProvisioningConfig(
+        validation_model=UserGroupIncrementalLoad,
+        provisioner_class=UserGroupProvisioner,
+        load_type=LoadType.INCREMENTAL,
+    ),
+    WorkflowType.PERMISSION_FULL_LOAD: ProvisioningConfig(
+        validation_model=PermissionFullLoad,
+        provisioner_class=PermissionProvisioner,
+        load_type=LoadType.FULL,
+    ),
+    WorkflowType.PERMISSION_INCREMENTAL_LOAD: ProvisioningConfig(
+        validation_model=PermissionIncrementalLoad,
+        provisioner_class=PermissionProvisioner,
+        load_type=LoadType.INCREMENTAL,
+    ),
+}
diff --git a/gooddata-pipelines/gooddata_pipelines/provisioning/generic/provision.py b/gooddata-pipelines/gooddata_pipelines/provisioning/generic/provision.py
new file mode 100644
index 000000000..d1669e7ae
--- /dev/null
+++ b/gooddata-pipelines/gooddata_pipelines/provisioning/generic/provision.py
@@ -0,0 +1,49 @@
+# (C) 2025 GoodData Corporation
+
+from typing import Any
+
+from gooddata_pipelines.logger import LoggerLike
+from gooddata_pipelines.provisioning.generic.config import (
+    PROVISIONING_CONFIG,
+    LoadType,
+    WorkflowType,
+)
+
+
+def provision(
+    data: list[dict[str, Any]],
+    workflow_type: WorkflowType,
+    host: str,
+    token: str,
+    logger: LoggerLike | None = None,
+) -> None:
+    """Generic provisioning function accepting raw data and workflow type.
+
+    The function will validate data based on the selected workflow type and run
+    the corresponding provisioning in full or incremental mode.
+
+    Args:
+        data: List of dictionaries containing the data to be provisioned.
+        workflow_type: The type of workflow to run.
+        host: The host of the GoodData platform.
+        token: The token for the GoodData platform.
+        logger: The logger to use for logging.
+    """
+
+    if workflow_type not in PROVISIONING_CONFIG:
+        raise ValueError(f"Invalid workflow type: {workflow_type}")
+
+    config = PROVISIONING_CONFIG[workflow_type]
+
+    provisioner = config.provisioner_class.create(host, token)
+    validated_data: list = [config.validation_model(**item) for item in data]
+
+    if logger:
+        provisioner.logger.subscribe(logger)
+
+    if config.load_type == LoadType.FULL:
+        provisioner.full_load(validated_data)
+    elif config.load_type == LoadType.INCREMENTAL:
+        provisioner.incremental_load(validated_data)
+    else:
+        raise ValueError(f"Invalid load type: {config.load_type}")