# GoodData Pipelines

A high-level library for automating the lifecycle of GoodData Cloud (GDC).

You can use the package to manage the following resources in GDC:

1. Provisioning (create, update, delete)
   - User profiles
   - User Groups
   - User/Group permissions
   - User Data Filters
   - Child workspaces (incl. Workspace Data Filter settings)
1. _[PLANNED]:_ Backup and restore of workspaces
1. _[PLANNED]:_ Custom fields management
   - Extend the Logical Data Model of a child workspace

If you are not interested in incorporating the library into your own program, but would rather use a ready-made script, consider having a look at [GoodData Productivity Tools](https://github.com/gooddata/gooddata-productivity-tools).

## Provisioning

The entities can be managed in either a _full load_ or an _incremental_ way.

Full load means that the input data should represent the full and complete desired state of GDC after the script has finished. For example, the input data for workspace provisioning would include specifications of all child workspaces you want to exist in GDC. Any workspaces present in GDC but not defined in the source data (i.e., your input) will be deleted. A hypothetical full-load input is sketched below.
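
A minimal sketch of such an input, assuming a simple list of dicts; the column names (`parent_id`, `workspace_id`, `workspace_name`) are illustrative assumptions, not the library's required schema:

```python
# Hypothetical full-load input: the COMPLETE desired set of child workspaces.
# The column names below are illustrative assumptions - check the
# gooddata_pipelines workspace models for the fields they actually expect.
full_load_rows = [
    {"parent_id": "prod", "workspace_id": "ws_tenant_a", "workspace_name": "Tenant A"},
    {"parent_id": "prod", "workspace_id": "ws_tenant_b", "workspace_name": "Tenant B"},
]
# After a full load with these rows, any other child workspace of the parent
# not listed above would be deleted.
```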

On the other hand, an incremental load treats the source data as instructions for specific changes, e.g., the creation or deletion of a specific workspace. You specify which workspaces you want to create or delete, while the rest of the workspaces already present in GDC remain as they are, ignored by the provisioning script. A hypothetical incremental input is sketched below.
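
A minimal sketch of an incremental input, assuming each row carries a deletion flag; the flag name (`is_deleted`) and the columns are assumptions for illustration:

```python
# Hypothetical incremental input: each row is an instruction for one change.
# The "is_deleted" flag and the column names are assumptions for illustration;
# consult the library's incremental models for the actual schema.
incremental_rows = [
    # Create (or update) one workspace...
    {"parent_id": "prod", "workspace_id": "ws_tenant_c", "workspace_name": "Tenant C", "is_deleted": False},
    # ...and delete another. Workspaces not mentioned here are left untouched.
    {"parent_id": "prod", "workspace_id": "ws_tenant_b", "workspace_name": "Tenant B", "is_deleted": True},
]
```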

The provisioning module exposes _Provisioner_ classes reflecting the different entities. Typical usage involves importing the Provisioner class together with the input data model corresponding to that entity and the planned provisioning method:

```python
import logging
import os
from csv import DictReader
from pathlib import Path

# Import the entity Provisioner class and the corresponding model from the
# gooddata_pipelines library
from gooddata_pipelines import UserFullLoad, UserProvisioner

# Optional: you can set up logging and subscribe it to the Provisioner
# (utils.logger here stands for your own logging setup helper)
from utils.logger import setup_logging

setup_logging()
logger = logging.getLogger(__name__)

# Create the Provisioner instance - you can also create the instance from
# a GDC yaml profile
provisioner = UserProvisioner(
    host=os.environ["GDC_HOSTNAME"], token=os.environ["GDC_AUTH_TOKEN"]
)

# Optional: subscribe to logs
provisioner.logger.subscribe(logger)

# Load the data from your data source
source_data_path: Path = Path("path/to/some.csv")
source_data_reader = DictReader(source_data_path.read_text().splitlines())
source_data = [row for row in source_data_reader]

# Validate the input data with the full load model
full_load_data: list[UserFullLoad] = UserFullLoad.from_list_of_dicts(
    source_data
)

# Run the provisioning
provisioner.full_load(full_load_data)
```
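
The same pattern should apply to the other entities. As a rough sketch, assuming workspace provisioning mirrors the `User*` naming convention shown above (verify the class names against the library before use):

```python
import os

# Assumed names mirroring UserFullLoad / UserProvisioner - verify that these
# classes exist in your version of gooddata_pipelines.
from gooddata_pipelines import WorkspaceFullLoad, WorkspaceProvisioner

provisioner = WorkspaceProvisioner(
    host=os.environ["GDC_HOSTNAME"], token=os.environ["GDC_AUTH_TOKEN"]
)

# Validate raw rows (e.g., parsed from a CSV) against the workspace model,
# then run the full load.
workspace_rows = [
    {"parent_id": "prod", "workspace_id": "ws_tenant_a", "workspace_name": "Tenant A"},
]
workspace_data = WorkspaceFullLoad.from_list_of_dicts(workspace_rows)
provisioner.full_load(workspace_data)
```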

Ready-made scripts covering the basic use cases can be found in the [GoodData Productivity Tools](https://github.com/gooddata/gooddata-productivity-tools) repository.