@@ -10,7 +10,7 @@ You can use the package to manage the following resources in GDC:
 - User/Group permissions
 - User Data Filters
 - Child workspaces (incl. Workspace Data Filter settings)
-1. _[PLANNED]:_ Backup and restore of workspaces
+1. Backup and restore of workspaces
 1. _[PLANNED]:_ Custom fields management
 - extend the Logical Data Model of a child workspace
 
@@ -33,7 +33,7 @@ import logging
 from csv import DictReader
 from pathlib import Path
 
-# Import the Entity Provisioner class and corresponding model from gooddata_pipelines library
+# Import the Entity Provisioner class and corresponding model from the gooddata_pipelines library
 from gooddata_pipelines import UserFullLoad, UserProvisioner
 
 # Create the Provisioner instance - you can also create the instance from a GDC yaml profile
@@ -70,3 +70,156 @@ or request features.
 
 See [GitHub releases](https://github.com/gooddata/gooddata-python-sdk/releases) for released versions
 and a list of changes.
+
+## Backup and restore of workspaces
+The backup and restore module allows you to create snapshots of GoodData Cloud workspaces and restore them later. This is useful for:
+
+- Creating backups before major changes
+- Migrating workspaces between environments
+- Disaster recovery scenarios
+- Copying workspace configurations
+
+### Backup
+
+The module supports three backup modes:
+
+1. **List of workspaces** - Backup specific workspaces by providing a list of workspace IDs
+2. **Workspace hierarchies** - Backup a workspace and all its direct and indirect children
+3. **Entire organization** - Backup all workspaces in the organization
+
+Each backup includes:
+- Workspace declarative model (logical data model, analytics model, permissions)
+- User data filters
+- Filter views
+- Automations
+
+#### Storage Options
+
+Backups can be stored in:
+- **Local storage** - Save backups to a local directory
+- **S3 storage** - Upload backups to an AWS S3 bucket
+
+#### Basic Usage
+
+```python
+import logging
+import os
+
+from gooddata_pipelines import BackupManager
+from gooddata_pipelines.backup_and_restore.models.storage import (
+    BackupRestoreConfig,
+    LocalStorageConfig,
+    StorageType,
+)
+from gooddata_pipelines.logger.logger import LogObserver
+
+# Optionally, subscribe a standard Python logger to the LogObserver
+logger = logging.getLogger(__name__)
+LogObserver().subscribe(logger)
+
+# Configure backup storage
+config = BackupRestoreConfig(
+    storage_type=StorageType.LOCAL,
+    storage=LocalStorageConfig(),
+    batch_size=10,  # Number of workspaces to process in one batch
+    api_calls_per_second=10,  # Rate limit for API calls
+)
+
+# Create the BackupManager instance
+backup_manager = BackupManager.create(
+    config=config,
+    host=os.environ["GDC_HOSTNAME"],
+    token=os.environ["GDC_AUTH_TOKEN"],
+)
+
+# Back up specific workspaces
+workspace_ids = ["workspace1", "workspace2", "workspace3"]
+backup_manager.backup_workspaces(workspace_ids=workspace_ids)
+
+# Or read workspace IDs from a CSV file
+backup_manager.backup_workspaces(path_to_csv="workspaces.csv")
+
+# Back up workspace hierarchies (workspace + all children)
+backup_manager.backup_hierarchies(workspace_ids=["parent_workspace"])
+
+# Back up the entire organization
+backup_manager.backup_entire_organization()
+```
+
+#### Using S3 Storage
+
+```python
+import os
+
+from gooddata_pipelines import BackupManager
+from gooddata_pipelines.backup_and_restore.models.storage import (
+    BackupRestoreConfig,
+    S3StorageConfig,
+    StorageType,
+)
+
+# Configure S3 storage with explicit credentials
+config = BackupRestoreConfig(
+    storage_type=StorageType.S3,
+    storage=S3StorageConfig(
+        bucket="my-backup-bucket",
+        backup_path="gooddata-backups/",
+        aws_access_key_id=os.environ["AWS_ACCESS_KEY_ID"],
+        aws_secret_access_key=os.environ["AWS_SECRET_ACCESS_KEY"],
+        aws_default_region="us-east-1",
+    ),
+)
+
+# Or use an AWS profile
+config = BackupRestoreConfig(
+    storage_type=StorageType.S3,
+    storage=S3StorageConfig(
+        bucket="my-backup-bucket",
+        backup_path="gooddata-backups/",
+        profile="my-aws-profile",
+    ),
+)
+
+backup_manager = BackupManager.create(
+    config=config,
+    host=os.environ["GDC_HOSTNAME"],
+    token=os.environ["GDC_AUTH_TOKEN"],
+)
+
+backup_manager.backup_workspaces(workspace_ids=["workspace1"])
+```
+
+#### Using GoodData Profile
+You can also create the BackupManager from a GoodData profile file:
+```python
+from pathlib import Path
+
+from gooddata_pipelines import BackupManager
+
+# Reuses the storage `config` object from the previous examples
+backup_manager = BackupManager.create_from_profile(
+    config=config,
+    profile="production",
+    profiles_path=Path.home() / ".gooddata" / "profiles.yaml",
+)
+```
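+
+The profile file follows the format used elsewhere in the GoodData Python SDK: each top-level key names a profile holding the connection details. A minimal sketch of what `profiles.yaml` could look like, with placeholder values:
+
+```yaml
+# ~/.gooddata/profiles.yaml
+production:
+  host: https://example.cloud.gooddata.com
+  token: <your-api-token>
+```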
+
+#### CSV File Format
+When providing workspace IDs via a CSV file, the file should have a `workspace_id` column:
+```csv
+workspace_id
+workspace1
+workspace2
+workspace3
+```
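+
+If you build the list of workspace IDs programmatically, the standard library can write this file for you. A small sketch with placeholder IDs:
+
+```python
+import csv
+
+workspace_ids = ["workspace1", "workspace2", "workspace3"]  # placeholder IDs
+
+# Write a CSV with the single workspace_id column expected by backup_workspaces
+with open("workspaces.csv", "w", newline="") as f:
+    writer = csv.DictWriter(f, fieldnames=["workspace_id"])
+    writer.writeheader()
+    writer.writerows({"workspace_id": ws_id} for ws_id in workspace_ids)
+```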
+
+#### Configuration Options
+
+The `BackupRestoreConfig` class accepts the following parameters:
+- `storage_type` - Type of storage (`StorageType.LOCAL` or `StorageType.S3`)
+- `storage` - Storage-specific configuration (`LocalStorageConfig` or `S3StorageConfig`)
+- `batch_size` (optional, default: 10) - Number of workspaces to process in one batch
+- `api_calls_per_second` (optional, default: 10) - Rate limit for API calls to avoid throttling
+- `api_page_size` (optional, default: 500) - Page size for paginated API calls
+
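+For example, a configuration tuned for a larger organization might lower the request rate while making batches cheaper to retry. The values below are illustrative, not recommendations:
+
+```python
+from gooddata_pipelines.backup_and_restore.models.storage import (
+    BackupRestoreConfig,
+    LocalStorageConfig,
+    StorageType,
+)
+
+config = BackupRestoreConfig(
+    storage_type=StorageType.LOCAL,
+    storage=LocalStorageConfig(),
+    batch_size=5,  # smaller batches make a failed batch cheaper to retry
+    api_calls_per_second=5,  # stay well below the API rate limit
+    api_page_size=500,  # page size for paginated API calls (the default)
+)
+```
+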
+#### Error Handling and Retries
+
+The backup process includes automatic retry logic with exponential backoff. If a batch fails, it will retry up to 3 times before failing completely. Individual workspace errors are logged but don't stop the entire backup process.
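+
+Retries happen inside the backup run itself, so a calling script only needs to handle the case where a batch still fails after the last attempt. A minimal sketch, assuming a failed backup surfaces as an ordinary exception and reusing the `backup_manager` from the examples above:
+
+```python
+import logging
+
+logger = logging.getLogger(__name__)
+
+try:
+    backup_manager.backup_workspaces(workspace_ids=["workspace1", "workspace2"])
+except Exception:
+    # Reached only after the built-in retries are exhausted
+    logger.exception("Backup failed after all retry attempts")
+    raise
+```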
+
+### Restore
+
+Note: Restore functionality is currently in development.