feat: add gooddata-pipelines package #1074
Changes from all commits
4b86db9
995b893
be50cea
02c1978
e61d4c6
9342674
0f42e09
3a45cf7
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -48,3 +48,4 @@ jobs: | |
| - 'gooddata-dbt/**' | ||
| - 'gooddata-flight-server/**' | ||
| - 'gooddata-flexconnect/**' | ||
| - 'gooddata-pipelines/**' | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,4 +1,5 @@ | ||
| .idea | ||
| .vscode | ||
| *.iml | ||
| .env | ||
| .env.test | ||
|
|
||
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,37 @@ | ||
| # Distribution / packaging | ||
| **venv/* | ||
| **/__pycache__/* | ||
| **.env | ||
| **.log | ||
| .Python | ||
| env/ | ||
| build/ | ||
| develop-eggs/ | ||
| dist/ | ||
| downloads/ | ||
| eggs/ | ||
| .eggs/ | ||
| lib/ | ||
| lib64/ | ||
| parts/ | ||
| sdist/ | ||
| var/ | ||
| *.egg-info/ | ||
| .installed.cfg | ||
| *.egg | ||
|
|
||
|
|
||
| # Unit test / coverage reports | ||
| htmlcov/ | ||
| .tox/ | ||
| .coverage | ||
| .coverage.* | ||
| .cache | ||
| nosetests.xml | ||
| coverage.xml | ||
| *,cover | ||
| .hypothesis/ | ||
| .python-version | ||
| .pytest_cache | ||
|
|
||
| .log | ||
Large diffs are not rendered by default.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,16 @@ | ||
| # (C) 2025 GoodData Corporation | ||
|
|
||
| # Skip tests if running Python 3.9 from CI (gooddata-pipelines doesn't support py39) | ||
| ifeq ($(TEST_ENVS),py39) | ||
| .PHONY: test-ci | ||
| test-ci: | ||
| @echo "Skipping tests for Python 3.9 - gooddata-pipelines doesn't support this version" | ||
| @exit 0 | ||
|
|
||
| .PHONY: test | ||
| test: | ||
| @echo "Skipping tests for Python 3.9 - gooddata-pipelines doesn't support this version" | ||
| @exit 0 | ||
| else | ||
| include ../project_common.mk | ||
| endif |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,63 @@ | ||
| # GoodData Pipelines | ||
|
|
||
| A high-level library for automating the lifecycle of GoodData Cloud (GDC). | ||
|
|
||
| You can use the package to manage the following resources in GDC: | ||
|
|
||
| 1. Provisioning (create, update, delete) | ||
| - User profiles | ||
| - User Groups | ||
| - User/Group permissions | ||
| - User Data Filters | ||
| - Child workspaces (incl. Workspace Data Filter settings) | ||
| 1. _[PLANNED]:_ Backup and restore of workspaces | ||
| 1. _[PLANNED]:_ Custom fields management | ||
| - extend the Logical Data Model of a child workspace | ||
|
|
||
| If you would rather use a ready-made script than incorporate a library into your own program, have a look at [GoodData Productivity Tools](https://github.com/gooddata/gooddata-productivity-tools). | ||
|
|
||
| ## Provisioning | ||
|
|
||
| Entities can be managed in either a _full load_ or an _incremental_ way. | ||
|
|
||
| Full load means that the input data should represent the full and complete desired state of GDC after the script has finished. For example, you would include specification of all child workspaces you want to exist in GDC in the input data for workspace provisioning. Any workspaces present in GDC and not defined in the source data (i.e., your input) will be deleted. | ||
|
|
||
| On the other hand, the incremental load treats the source data as instructions for a specific change, e.g., a creation or a deletion of a specific workspace. You can specify which workspaces you would want to delete or create, while the rest of the workspaces already present in GDC will remain as they are, ignored by the provisioning script. | ||
|
|
||
| The provisioning module exposes _Provisioner_ classes reflecting the different entities. Typical usage involves importing the Provisioner class together with the input data model that matches the class and the planned provisioning method: | ||
|
|
||
| ```python | ||
| import logging | ||
| import os | ||
| from csv import DictReader | ||
| from pathlib import Path | ||
|
|
||
| # Import the Entity Provisioner class and corresponding model from gooddata_pipelines library | ||
| from gooddata_pipelines import UserFullLoad, UserProvisioner | ||
|
|
||
| # Optional: you can set up logging and subscribe it to the Provisioner | ||
| from utils.logger import setup_logging | ||
|
|
||
| setup_logging() | ||
| logger = logging.getLogger(__name__) | ||
|
|
||
| # Create the Provisioner instance - you can also create the instance from a GDC yaml profile | ||
| provisioner = UserProvisioner( | ||
| host=os.environ["GDC_HOSTNAME"], token=os.environ["GDC_AUTH_TOKEN"] | ||
| ) | ||
|
|
||
| # Optional: subscribe to logs | ||
| provisioner.logger.subscribe(logger) | ||
|
|
||
| # Load your data from your data source | ||
| source_data_path: Path = Path("path/to/some.csv") | ||
| source_data_reader = DictReader(source_data_path.read_text().splitlines()) | ||
| source_data = [row for row in source_data_reader] | ||
|
|
||
| # Validate your input data with the model, then run the full load | ||
| full_load_data: list[UserFullLoad] = UserFullLoad.from_list_of_dicts( | ||
| source_data | ||
| ) | ||
| provisioner.full_load(full_load_data) | ||
| ``` | ||
|
|
||
| Ready-made scripts covering the basic use cases can be found in the [GoodData Productivity Tools](https://github.com/gooddata/gooddata-productivity-tools) repository. |
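The difference between the two load modes described in the README can be illustrated with plain set arithmetic. This is a minimal sketch, not the package's implementation: `Plan`, `plan_full_load`, and `plan_incremental_load` are hypothetical names, and workspaces are reduced to string identifiers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """Hypothetical container for the actions a load would perform."""

    create: set[str]
    delete: set[str]
    keep: set[str]


def plan_full_load(existing: set[str], desired: set[str]) -> Plan:
    """Full load: the input is the complete desired state."""
    return Plan(
        create=desired - existing,
        delete=existing - desired,  # anything missing from the input is removed
        keep=existing & desired,
    )


def plan_incremental_load(
    existing: set[str], to_create: set[str], to_delete: set[str]
) -> Plan:
    """Incremental load: the input lists specific changes only."""
    delete = existing & to_delete
    return Plan(
        create=to_create - existing,
        delete=delete,
        keep=existing - delete,  # everything not mentioned is left alone
    )


existing = {"ws_a", "ws_b", "ws_c"}
full = plan_full_load(existing, desired={"ws_a", "ws_d"})
incremental = plan_incremental_load(existing, to_create={"ws_d"}, to_delete={"ws_b"})
```

With the same starting state, the full load deletes both `ws_b` and `ws_c` because neither appears in the input, while the incremental load deletes only `ws_b` and leaves `ws_c` untouched.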
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,24 @@ | ||
| # TODO | ||
|
|
||
| A list of outstanding tasks, features, or technical debt to be addressed in this project. | ||
|
|
||
| ## Features | ||
|
|
||
| - [ ] Workspace restore | ||
|
|
||
| ## Refactoring / Debt | ||
|
|
||
| - [ ] Integrate with GoodDataApiClient | ||
| - [ ] Consider replacing the SdkMethods wrapper with direct calls to the SDK methods | ||
| - [ ] Consider using orjson library instead of json | ||
| - [ ] Cleanup custom exceptions | ||
| - [ ] Improve test coverage. Write missing unit tests for legacy code (e.g., user data filters) | ||
|
|
||
| ## Documentation | ||
|
|
||
| - [ ] Improve package README | ||
| - [ ] Workspace provisioning | ||
| - [ ] User provisioning | ||
| - [ ] User group provisioning | ||
| - [ ] Permission provisioning | ||
| - [ ] User data filter provisioning |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,59 @@ | ||
| # (C) 2025 GoodData Corporation | ||
|
|
||
| from ._version import __version__ | ||
|
|
||
| # -------- Backup and Restore -------- | ||
| from .backup_and_restore.backup_manager import BackupManager | ||
| from .backup_and_restore.models.storage import ( | ||
| BackupRestoreConfig, | ||
| StorageType, | ||
| ) | ||
| from .backup_and_restore.storage.local_storage import LocalStorage | ||
| from .backup_and_restore.storage.s3_storage import S3Storage | ||
|
|
||
| # -------- Provisioning -------- | ||
| from .provisioning.entities.user_data_filters.models.udf_models import ( | ||
| UserDataFilterFullLoad, | ||
| ) | ||
| from .provisioning.entities.user_data_filters.user_data_filters import ( | ||
| UserDataFilterProvisioner, | ||
| ) | ||
| from .provisioning.entities.users.models.permissions import ( | ||
| PermissionFullLoad, | ||
| PermissionIncrementalLoad, | ||
| ) | ||
| from .provisioning.entities.users.models.user_groups import ( | ||
| UserGroupFullLoad, | ||
| UserGroupIncrementalLoad, | ||
| ) | ||
| from .provisioning.entities.users.models.users import ( | ||
| UserFullLoad, | ||
| UserIncrementalLoad, | ||
| ) | ||
| from .provisioning.entities.users.permissions import PermissionProvisioner | ||
| from .provisioning.entities.users.user_groups import UserGroupProvisioner | ||
| from .provisioning.entities.users.users import UserProvisioner | ||
| from .provisioning.entities.workspaces.models import WorkspaceFullLoad | ||
| from .provisioning.entities.workspaces.workspace import WorkspaceProvisioner | ||
|
|
||
| __all__ = [ | ||
| "BackupManager", | ||
| "BackupRestoreConfig", | ||
| "StorageType", | ||
| "LocalStorage", | ||
| "S3Storage", | ||
| "WorkspaceFullLoad", | ||
| "WorkspaceProvisioner", | ||
| "UserIncrementalLoad", | ||
| "UserGroupIncrementalLoad", | ||
| "PermissionFullLoad", | ||
| "PermissionIncrementalLoad", | ||
| "UserFullLoad", | ||
| "UserGroupFullLoad", | ||
| "UserProvisioner", | ||
| "UserGroupProvisioner", | ||
| "PermissionProvisioner", | ||
| "UserDataFilterProvisioner", | ||
| "UserDataFilterFullLoad", | ||
| "__version__", | ||
| ] | ||
|
Comment on lines +39 to +59
Contributor
What is the purpose of this
Contributor
Author
I'm using those to stop the language server from reporting unused imports. The original function of these was to explicitly state what should be imported when using |
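As general Python background to this thread (not a reconstruction of the truncated comment): a module's `__all__` list determines which names a wildcard import binds. The sketch below builds a throwaway module in memory; `demo_pkg`, `public_name`, and `helper_name` are hypothetical names used only for illustration.

```python
import sys
import types

# Build a throwaway module that defines two names but exports only one.
demo = types.ModuleType("demo_pkg")
exec(
    "public_name = 1\nhelper_name = 2\n__all__ = ['public_name']\n",
    demo.__dict__,
)
sys.modules["demo_pkg"] = demo

# Run a wildcard import in an isolated namespace and inspect what was bound.
namespace: dict[str, object] = {}
exec("from demo_pkg import *", namespace)

exported = sorted(name for name in namespace if not name.startswith("__"))
```

Only `public_name` ends up in `exported`; `helper_name` stays reachable through explicit imports but is skipped by the wildcard form.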
||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,7 @@ | ||
| # (C) 2025 GoodData Corporation | ||
| from importlib import metadata | ||
|
|
||
| try: | ||
| __version__ = metadata.version("gooddata-pipelines") | ||
| except metadata.PackageNotFoundError: | ||
| __version__ = "unknown-version" |
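The fallback in `_version.py` relies on `importlib.metadata.version` raising `PackageNotFoundError` when the distribution is not installed, for example when the code runs from a source checkout. A minimal sketch of the same pattern as a reusable function follows; `resolve_version` and the distribution name passed to it are hypothetical.

```python
from importlib import metadata


def resolve_version(distribution_name: str, fallback: str = "unknown-version") -> str:
    """Return the installed version of a distribution, or a fallback string."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        # The distribution has no installed metadata in this environment.
        return fallback


# A deliberately made-up name that should not be installed anywhere.
missing = resolve_version("surely-not-an-installed-distribution-0xdeadbeef")
```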
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| # (C) 2025 GoodData Corporation | ||
|
|
||
| from .gooddata_api_wrapper import GoodDataApi | ||
|
|
||
| __all__ = ["GoodDataApi"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,41 @@ | ||
| # (C) 2025 GoodData Corporation | ||
|
|
||
| """Exception class for Panther operations. | ||
|
|
||
| This module defines the internally used `GoodDataApiException` class, which is used | ||
| to handle exceptions that occur during operations related to the Panther SDK or | ||
| GoodData Cloud API. | ||
| """ | ||
|
|
||
|
|
||
| class GoodDataApiException(Exception): | ||
| """Exception raised during Panther operations. | ||
|
|
||
| This exception is used to indicate errors that occur during operations | ||
| related to interactions with the GoodData Python SDK or GoodData Cloud API. | ||
| It can include additional context provided through keyword arguments. | ||
| """ | ||
|
|
||
| def __init__(self, message: str, **kwargs: str) -> None: | ||
| """Raise a PantherException with a message and optional context. | ||
|
|
||
| Args: | ||
| message (str): The error message describing the exception. | ||
| **kwargs: Additional context for the exception, such as HTTP status, | ||
| API endpoint, and HTTP method or any other relevant information. | ||
| """ | ||
|
|
||
| super().__init__(message) | ||
| self.error_message: str = message | ||
|
|
||
| # Set default values for attributes. | ||
| # TODO: Consider if the defaults for these are still needed | ||
| # - the values were necessary for log schema implementations, which | ||
| # are not used anymore. | ||
| self.http_status: str = "500 Internal Server Error" | ||
| self.api_endpoint: str = "NA" | ||
| self.http_method: str = "NA" | ||
|
|
||
| # Set attributes from kwargs. | ||
| for key, value in kwargs.items(): | ||
| setattr(self, key, value) |
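To show how the keyword arguments interact with the defaults, the sketch below repeats the class from the diff above (docstrings omitted) so that it runs on its own, then raises it once with context and once without. The `describe` helper and the endpoint string are hypothetical and not part of the package.

```python
class GoodDataApiException(Exception):
    """Copy of the class from the diff, repeated so the sketch is self-contained."""

    def __init__(self, message: str, **kwargs: str) -> None:
        super().__init__(message)
        self.error_message: str = message
        self.http_status: str = "500 Internal Server Error"
        self.api_endpoint: str = "NA"
        self.http_method: str = "NA"
        # Keyword arguments override the defaults or add new attributes.
        for key, value in kwargs.items():
            setattr(self, key, value)


def describe(exc: GoodDataApiException) -> str:
    """Hypothetical helper that formats the exception context for a log line."""
    return f"{exc.http_method} {exc.api_endpoint} -> {exc.http_status}: {exc.error_message}"


try:
    raise GoodDataApiException(
        "Workspace not found",
        http_status="404 Not Found",
        api_endpoint="/hypothetical/endpoint",
        http_method="GET",
    )
except GoodDataApiException as exc:
    overridden = describe(exc)

defaulted = describe(GoodDataApiException("Something failed"))
```

Because `setattr` accepts any key, a misspelled keyword such as `http_stats` would silently create a new attribute instead of overriding the default, which may be worth keeping in mind alongside the TODO about the defaults.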