Commit d030129

feat: add gooddata-pipelines package (#1074)
feat: add gooddata-pipelines package
1 parent 43e5242 commit d030129

112 files changed (+9142, -7 lines)

.github/workflows/rw-collect-changes.yaml

Lines changed: 1 addition & 0 deletions
@@ -48,3 +48,4 @@ jobs:
 - 'gooddata-dbt/**'
 - 'gooddata-flight-server/**'
 - 'gooddata-flexconnect/**'
+- 'gooddata-pipelines/**'

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -1,4 +1,5 @@
 .idea
+.vscode
 *.iml
 .env
 .env.test

OSS LICENSES/LICENSE (gooddata-pipelines).txt

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

gooddata-pipelines/.gitignore

Lines changed: 37 additions & 0 deletions
@@ -0,0 +1,37 @@
+# Distribution / packaging
+**venv/*
+**/__pycache__/*
+**.env
+**.log
+.Python
+env/
+build/
+develop-eggs/
+dist/
+downloads/
+eggs/
+.eggs/
+lib/
+lib64/
+parts/
+sdist/
+var/
+*.egg-info/
+.installed.cfg
+*.egg
+
+
+# Unit test / coverage reports
+htmlcov/
+.tox/
+.coverage
+.coverage.*
+.cache
+nosetests.xml
+coverage.xml
+*,cover
+.hypothesis/
+.python-version
+.pytest_cache
+
+.log

gooddata-pipelines/LICENSE.txt

Lines changed: 1 addition & 0 deletions
Large diffs are not rendered by default.

gooddata-pipelines/Makefile

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
# (C) 2025 GoodData Corporation
2+
3+
# Skip tests if running Python 3.9 from CI (gooddata-pipelines doesn't support py39)
4+
ifeq ($(TEST_ENVS),py39)
5+
.PHONY: test-ci
6+
test-ci:
7+
@echo "Skipping tests for Python 3.9 - gooddata-pipelines doesn't support this version"
8+
@exit 0
9+
10+
.PHONY: test
11+
test:
12+
@echo "Skipping tests for Python 3.9 - gooddata-pipelines doesn't support this version"
13+
@exit 0
14+
else
15+
include ../project_common.mk
16+
endif

gooddata-pipelines/README.md

Lines changed: 63 additions & 0 deletions
@@ -0,0 +1,63 @@
+# GoodData Pipelines
+
+A high-level library for automating the lifecycle of GoodData Cloud (GDC).
+
+You can use the package to manage the following resources in GDC:
+
+1. Provisioning (create, update, delete)
+   - User profiles
+   - User Groups
+   - User/Group permissions
+   - User Data Filters
+   - Child workspaces (incl. Workspace Data Filter settings)
+1. _[PLANNED]:_ Backup and restore of workspaces
+1. _[PLANNED]:_ Custom fields management
+   - extend the Logical Data Model of a child workspace
+
+If you would rather use a ready-made script than incorporate a library into your own program, have a look at [GoodData Productivity Tools](https://github.com/gooddata/gooddata-productivity-tools).
+
+## Provisioning
+
+The entities can be managed in either a _full load_ or an _incremental_ way.
+
+Full load means that the input data should represent the full and complete desired state of GDC after the script has finished. For example, the input data for workspace provisioning would specify every child workspace you want to exist in GDC. Any workspaces present in GDC and not defined in the source data (i.e., your input) will be deleted.
+
+Incremental load, on the other hand, treats the source data as instructions for a specific change, e.g., the creation or deletion of a specific workspace. You specify which workspaces you want to create or delete, while the rest of the workspaces already present in GDC remain as they are, ignored by the provisioning script.
+
+The provisioning module exposes _Provisioner_ classes reflecting the different entities. Typical usage involves importing the Provisioner class and the input data model that matches the class and the planned provisioning method:
+
+```python
+import logging
+import os
+from csv import DictReader
+from pathlib import Path
+
+# Import the Entity Provisioner class and corresponding model from gooddata_pipelines library
+from gooddata_pipelines import UserFullLoad, UserProvisioner
+
+# Optional: set up logging (setup_logging is your own helper) and subscribe it to the Provisioner
+from utils.logger import setup_logging
+
+setup_logging()
+logger = logging.getLogger(__name__)
+
+# Create the Provisioner instance - you can also create the instance from a GDC yaml profile
+provisioner = UserProvisioner(
+    host=os.environ["GDC_HOSTNAME"], token=os.environ["GDC_AUTH_TOKEN"]
+)
+
+# Optional: subscribe to logs
+provisioner.logger.subscribe(logger)
+
+# Load your data from your data source
+source_data_path: Path = Path("path/to/some.csv")
+source_data = list(DictReader(source_data_path.read_text().splitlines()))
+
+# Validate your input data with the model, then run the full load
+full_load_data: list[UserFullLoad] = UserFullLoad.from_list_of_dicts(
+    source_data
+)
+provisioner.full_load(full_load_data)
+```
+
+Ready-made scripts covering the basic use cases can be found in the [GoodData Productivity Tools](https://github.com/gooddata/gooddata-productivity-tools) repository.
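
The difference between the two load modes described in the README can be illustrated with plain set arithmetic. The sketch below is not how gooddata-pipelines implements provisioning; `Plan`, `plan_full_load`, and `plan_incremental` are hypothetical names introduced only for illustration, and entities are reduced to string IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    """IDs to create, update, and delete (hypothetical helper type)."""

    create: frozenset[str]
    update: frozenset[str]
    delete: frozenset[str]


def plan_full_load(desired: set[str], existing: set[str]) -> Plan:
    """Input is the complete desired state: anything missing from it is deleted."""
    return Plan(
        create=frozenset(desired - existing),
        update=frozenset(desired & existing),
        delete=frozenset(existing - desired),
    )


def plan_incremental(
    upserts: set[str], deletes: set[str], existing: set[str]
) -> Plan:
    """Input is a set of specific changes: entities not mentioned are left alone."""
    return Plan(
        create=frozenset(upserts - existing),
        update=frozenset(upserts & existing),
        delete=frozenset(deletes & existing),
    )


existing = {"ws_a", "ws_b", "ws_c"}
full = plan_full_load({"ws_a", "ws_d"}, existing)
incremental = plan_incremental({"ws_d"}, {"ws_b"}, existing)

print(sorted(full.delete))         # ['ws_b', 'ws_c']
print(sorted(incremental.delete))  # ['ws_b']
```

With the same starting state, the full load removes every workspace absent from the input, while the incremental load removes only the one it was told to delete.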

gooddata-pipelines/TODO.md

Lines changed: 24 additions & 0 deletions
@@ -0,0 +1,24 @@
+# TODO
+
+A list of outstanding tasks, features, or technical debt to be addressed in this project.
+
+## Features
+
+- [ ] Workspace restore
+
+## Refactoring / Debt
+
+- [ ] Integrate with GoodDataApiClient
+- [ ] Consider replacing the SdkMethods wrapper with direct calls to the SDK methods
+- [ ] Consider using orjson library instead of json
+- [ ] Cleanup custom exceptions
+- [ ] Improve test coverage. Write missing unit tests for legacy code (e.g., user data filters)
+
+## Documentation
+
+- [ ] Improve package README
+- [ ] Workspace provisioning
+- [ ] User provisioning
+- [ ] User group provisioning
+- [ ] Permission provisioning
+- [ ] User data filter provisioning

Lines changed: 59 additions & 0 deletions
@@ -0,0 +1,59 @@
+# (C) 2025 GoodData Corporation
+
+from ._version import __version__
+
+# -------- Backup and Restore --------
+from .backup_and_restore.backup_manager import BackupManager
+from .backup_and_restore.models.storage import (
+    BackupRestoreConfig,
+    StorageType,
+)
+from .backup_and_restore.storage.local_storage import LocalStorage
+from .backup_and_restore.storage.s3_storage import S3Storage
+
+# -------- Provisioning --------
+from .provisioning.entities.user_data_filters.models.udf_models import (
+    UserDataFilterFullLoad,
+)
+from .provisioning.entities.user_data_filters.user_data_filters import (
+    UserDataFilterProvisioner,
+)
+from .provisioning.entities.users.models.permissions import (
+    PermissionFullLoad,
+    PermissionIncrementalLoad,
+)
+from .provisioning.entities.users.models.user_groups import (
+    UserGroupFullLoad,
+    UserGroupIncrementalLoad,
+)
+from .provisioning.entities.users.models.users import (
+    UserFullLoad,
+    UserIncrementalLoad,
+)
+from .provisioning.entities.users.permissions import PermissionProvisioner
+from .provisioning.entities.users.user_groups import UserGroupProvisioner
+from .provisioning.entities.users.users import UserProvisioner
+from .provisioning.entities.workspaces.models import WorkspaceFullLoad
+from .provisioning.entities.workspaces.workspace import WorkspaceProvisioner
+
+__all__ = [
+    "BackupManager",
+    "BackupRestoreConfig",
+    "StorageType",
+    "LocalStorage",
+    "S3Storage",
+    "WorkspaceFullLoad",
+    "WorkspaceProvisioner",
+    "UserIncrementalLoad",
+    "UserGroupIncrementalLoad",
+    "PermissionFullLoad",
+    "PermissionIncrementalLoad",
+    "UserFullLoad",
+    "UserGroupFullLoad",
+    "UserProvisioner",
+    "UserGroupProvisioner",
+    "PermissionProvisioner",
+    "UserDataFilterProvisioner",
+    "UserDataFilterFullLoad",
+    "__version__",
+]
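
A package `__init__` that re-exports many names through `__all__` can drift out of sync with its imports. One way to guard against that, sketched here as a generic check rather than anything present in this commit, is to compare `__all__` against the attributes the module actually defines. `missing_exports` is a hypothetical helper, and the demo runs on a throwaway module so it does not need gooddata-pipelines installed.

```python
import types


def missing_exports(module: types.ModuleType) -> list[str]:
    """Return names listed in __all__ that the module does not actually define."""
    exported = getattr(module, "__all__", [])
    return [name for name in exported if not hasattr(module, name)]


# Throwaway module standing in for a real package.
demo = types.ModuleType("demo")
demo.UserProvisioner = object
demo.__all__ = ["UserProvisioner", "UserFullLoad"]

print(missing_exports(demo))  # ['UserFullLoad']
```

In a unit test, the same helper could be applied to the imported package and asserted to return an empty list.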

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
+# (C) 2025 GoodData Corporation
+from importlib import metadata
+
+try:
+    __version__ = metadata.version("gooddata-pipelines")
+except metadata.PackageNotFoundError:
+    __version__ = "unknown-version"
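
The version lookup above relies on the standard-library `importlib.metadata` module, which raises `PackageNotFoundError` when the distribution is not installed (for example, when running from a source checkout). A minimal sketch of the same pattern as a reusable function follows; `resolve_version` and the distribution name in the demo are hypothetical and chosen only for illustration.

```python
from importlib import metadata


def resolve_version(dist_name: str, fallback: str = "unknown-version") -> str:
    """Return the installed version of a distribution, or a fallback string."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        # The distribution has no installed metadata in this environment.
        return fallback


# A made-up name that is assumed not to be installed anywhere.
print(resolve_version("surely-not-an-installed-distribution-0xdeadbeef"))
```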
