Commit 4773086: docs(gooddata-pipelines): backup and restore documentation
1 parent 9f73542

4 files changed, +399 -0 lines changed

Lines changed: 15 additions & 0 deletions

---
title: "Backup & Restore"
linkTitle: "Backup & Restore"
weight: 2
no_list: true
---

The Backup & Restore module lets you create snapshots of GoodData Cloud workspaces and restore them later. It is useful for:

- Backing up before major changes
- Migrating workspaces across environments
- Disaster recovery
- Cloning workspace configurations

Backup and restore share common configuration objects, documented on the [Configuration](configuration/) page. For detailed, step-by-step instructions, see the [Backup](backup/) and [Restore](restore/) guides.

Lines changed: 147 additions & 0 deletions

---
title: "Workspace Backup"
linkTitle: "Workspace Backup"
weight: 2
---

Workspace Backup allows you to create backups of one or more workspaces. Backups can be stored either locally or uploaded to an S3 bucket.

The backup stores the following definitions:

- Logical Data Model
- Analytics Model
- User Data Filters
- Filter Views
- Automations

## Usage

Import and initialize the `BackupManager` and `BackupRestoreConfig` from GoodData Pipelines:

```python
from gooddata_pipelines import BackupManager, BackupRestoreConfig

host = "http://localhost:3000"
token = "some_user_token"

# Create your customized backup configuration or use the default values
config = BackupRestoreConfig(storage_type="local")

# Initialize the BackupManager with your configuration and GoodData Cloud credentials
backup_manager = BackupManager.create(config=config, host=host, token=token)

# Run a backup method. For example, `backup_entire_organization` backs up all workspaces in GoodData Cloud.
backup_manager.backup_entire_organization()
```

## Configuration

See [Configuration](/latest/pipelines/backup_and_restore/configuration/) for details on how to set up the configuration object.

## Backup Methods

You can use one of these methods to back up your workspaces:

### Back up specific workspaces

This method allows you to back up specific workspaces. You can supply the list of their IDs either directly or by specifying a path to a CSV file.

#### Usage with direct input:

```python
workspace_ids = ["workspace_1", "workspace_2", "workspace_3"]

backup_manager.backup_workspaces(workspace_ids=workspace_ids)
```

#### Usage with a CSV:

```python
path_to_csv = "path/to/local/file.csv"

backup_manager.backup_workspaces(path_to_csv=path_to_csv)
```

### Back up workspace hierarchies

This method accepts a list of parent workspace IDs and creates a backup of each workspace within their hierarchy. That includes the parent workspace and both its direct and indirect children (i.e., the children of child workspaces, and so on). The IDs can be provided either directly as a list or as a path to a CSV file containing the IDs.

#### Usage with direct input:

```python
parent_workspace_ids = ["parent_1", "parent_2", "parent_3"]

backup_manager.backup_hierarchies(workspace_ids=parent_workspace_ids)
```

#### Usage with a CSV:

```python
path_to_csv = "path/to/local/file.csv"

backup_manager.backup_hierarchies(path_to_csv=path_to_csv)
```

### Back up entire organization

Create a backup of all workspaces within the GoodData organization. The method requires no arguments.

```python
backup_manager.backup_entire_organization()
```

### Input CSV Format

When using a CSV as input for backup, the following format is expected (a sketch for generating such a file follows the table):

| **workspace_id** |
| ---------------- |
| parent_1         |
| parent_2         |
| parent_3         |
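
As a quick illustration, a matching input file could be generated with Python's standard `csv` module; the file path below is just a placeholder:

```python
import csv

# Write an input CSV in the expected single-column format.
with open("path/to/local/file.csv", "w", newline="") as file:
    writer = csv.writer(file)
    writer.writerow(["workspace_id"])  # header row
    writer.writerows([["parent_1"], ["parent_2"], ["parent_3"]])  # one workspace ID per row
```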

## Example

Here is a full example of a workspace backup process:

```python
import logging
import os

from gooddata_pipelines import (
    BackupManager,
    BackupRestoreConfig,
    S3StorageConfig,
    StorageType,
)

# Create the storage configuration
s3_storage_config = S3StorageConfig.from_aws_profile(
    backup_path="backup_folder", bucket="backup_bucket", profile="dev"
)

# Create the backup configuration
config = BackupRestoreConfig(storage_type=StorageType.S3, storage=s3_storage_config)

# Initialize the BackupManager with your configuration and GoodData credentials
backup_manager = BackupManager.create(
    config, os.environ["GD_HOST"], os.environ["GD_TOKEN"]
)

# Optionally set up a logger and subscribe it to the logs from the BackupManager
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
backup_manager.logger.subscribe(logger)

# Run the backup
backup_manager.backup_workspaces(workspace_ids=["workspace_id_1", "workspace_id_2"])
```

Lines changed: 135 additions & 0 deletions

---
title: "Configuration"
linkTitle: "Configuration"
weight: 1
---

The backup algorithm is configured via the `BackupRestoreConfig` class.

## Usage

Import `BackupRestoreConfig` from GoodData Pipelines:

```python
from gooddata_pipelines import BackupRestoreConfig
```

If you plan on storing your backups on S3, you will also need to import the `StorageType` enum and the `S3StorageConfig` class. You can find more details about configuring S3 storage below in the [S3 Storage](#s3-storage) section.

```python
from gooddata_pipelines import BackupRestoreConfig, S3StorageConfig, StorageType
```

The `BackupRestoreConfig` class accepts the following parameters (an illustrative configuration follows the table):

| name                 | description                                                                                                              |
| -------------------- | ------------------------------------------------------------------------------------------------------------------------ |
| storage_type         | The type of storage to use, either `local` or `s3`. Defaults to `local`.                                                  |
| storage              | Configuration for the chosen storage type. Defaults to the local storage configuration.                                   |
| api_page_size        | Page size for fetching workspace relationships. Defaults to 100 when unspecified.                                         |
| batch_size           | How many workspaces are backed up in a single batch. Defaults to 100 when unspecified.                                    |
| api_calls_per_second | Limits the maximum number of API calls per second to your GoodData instance. Defaults to 1. Only applied during backup.   |
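
For instance, a configuration tuning these values might look like the sketch below; the numbers are illustrative, not recommendations:

```python
from gooddata_pipelines import BackupRestoreConfig

# Illustrative values; omit any argument to keep its default
# (storage_type="local", api_page_size=100, batch_size=100, api_calls_per_second=1).
config = BackupRestoreConfig(
    storage_type="local",
    api_page_size=200,
    batch_size=50,
    api_calls_per_second=2,
)
```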

## Storage

The configuration supports two types of storage: local and S3.

The backups are organized in a tree with the following nodes:

- Organization ID
- Workspace ID
- Timestamped folder

The timestamped folder contains a `gooddata_layouts.zip` file with the stored definitions, as illustrated below.
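
For example, with placeholder IDs, a stored backup ends up in a layout along these lines, rooted at your configured `backup_path`:

```
<backup_path>/
└── <organization_id>/
    └── <workspace_id>/
        └── <timestamp>/
            └── gooddata_layouts.zip
```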

### Local Storage

Local storage requires a single parameter, `backup_path`, which defines where the backup tree will be saved in your file system. If not defined, the script defaults to creating a `local_backups` folder in the current working directory and storing the backups there.
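
A custom location is set through the storage configuration object. The sketch below assumes the package exposes a local counterpart to `S3StorageConfig`; the class name `LocalStorageConfig` is an assumption here, so check the package's storage models for the exact name:

```python
# Hypothetical class name; verify against the gooddata_pipelines storage models.
from gooddata_pipelines import BackupRestoreConfig, LocalStorageConfig

# Store backups under ./my_backups instead of the default ./local_backups.
local_storage_config = LocalStorageConfig(backup_path="my_backups")
config = BackupRestoreConfig(storage_type="local", storage=local_storage_config)
```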

### S3 Storage

To configure upload of the backups to S3, use the `S3StorageConfig` object:

```python
from gooddata_pipelines.backup_and_restore.models.storage import S3StorageConfig
```

The class is also re-exported from the top-level `gooddata_pipelines` package, as used in the examples below.

The configuration is responsible for establishing a valid connection to S3, connecting to a bucket, and specifying the folder where the backups will be stored or read. You can create the object in three ways, depending on the type of AWS credentials you want to use. The common arguments for all three ways are:

| name        | description                                                    |
| ----------- | -------------------------------------------------------------- |
| bucket      | The name of the bucket to use                                   |
| backup_path | Path to the folder serving as the root for the backup storage   |

#### Config from IAM Role

Uses the default IAM role or environment credentials. You only need to specify the `bucket` and `backup_path` arguments.

```python
s3_storage_config = S3StorageConfig.from_iam_role(
    backup_path="backups_folder", bucket="backup_bucket"
)
```

#### Config from AWS Profile

Uses an existing AWS profile to authenticate.

```python
s3_storage_config = S3StorageConfig.from_aws_profile(
    backup_path="backups_folder", bucket="backup_bucket", profile="dev"
)
```

#### Config from AWS Credentials

Uses long-lived AWS access keys to authenticate with AWS.

```python
# The values below are placeholders; supply your real credentials,
# ideally from environment variables or a secrets manager.
s3_storage_config = S3StorageConfig.from_aws_credentials(
    backup_path="backups_folder",
    bucket="backup_bucket",
    aws_access_key_id="AWS_ACCESS_KEY_ID",
    aws_secret_access_key="AWS_SECRET_ACCESS_KEY",
    aws_default_region="us-east-1",
)
```

## Examples

Here are a couple of examples of different configuration cases.

### Simple Local Backups

If you want to store your backups locally and are okay with the default values, you can create the configuration object without having to specify any values:

```python
from gooddata_pipelines import BackupRestoreConfig

config = BackupRestoreConfig()
```

### Config with S3 and AWS Profile

If you plan to use S3, your config might look like this:

```python
from gooddata_pipelines import (
    BackupRestoreConfig,
    S3StorageConfig,
    StorageType,
)

s3_storage_config = S3StorageConfig.from_aws_profile(
    backup_path="backups_folder", bucket="backup_bucket", profile="dev"
)

config = BackupRestoreConfig(storage_type=StorageType.S3, storage=s3_storage_config)
```
