From 1bd2b98c93e09d5f17ae04db59a1a4c4a8e32130 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Mon, 12 Jan 2026 15:05:34 +0200 Subject: [PATCH 1/4] Added page for Backup and Recovery, and update the corresponding link in the lifecycle maintenance guide. --- docs.json | 1 + self-hosting/lifecycle-maintenance.mdx | 8 +- .../backup-and-recovery.mdx | 156 ++++++++++++++++++ 3 files changed, 159 insertions(+), 6 deletions(-) create mode 100644 self-hosting/lifecycle-maintenance/backup-and-recovery.mdx diff --git a/docs.json b/docs.json index 4751b6d6..12fe0b11 100644 --- a/docs.json +++ b/docs.json @@ -394,6 +394,7 @@ "self-hosting/lifecycle-maintenance/telemetry", "self-hosting/lifecycle-maintenance/metrics", "self-hosting/lifecycle-maintenance/diagnostics", + "self-hosting/lifecycle-maintenance/backup-and-recovery", "self-hosting/lifecycle-maintenance/migrating", "self-hosting/lifecycle-maintenance/multiple-instances" ] diff --git a/self-hosting/lifecycle-maintenance.mdx b/self-hosting/lifecycle-maintenance.mdx index 600ce1ce..11822ae3 100644 --- a/self-hosting/lifecycle-maintenance.mdx +++ b/self-hosting/lifecycle-maintenance.mdx @@ -126,10 +126,6 @@ migrations: Note that if you disable automatic migrations, and do not run the migration job manually, the service may run with an outdated storage schema version. This may lead to unexpected and potentially difficult-to-debug errors in the service. -## Backups +## Backup and Recovery -We recommend using Git to backup your configuration files. - -None of the containers use any local storage, so no backups are required there. - -The sync bucket storage database may be backed up using the recommendations for the storage database system. This is not a strong requirement, since this data can be recovered by re-replicating from the source database. 
+For detailed information on backup strategies and recovery procedures, including how PowerSync handles database restoration and re-sync scenarios, see the [Backup and Recovery](/self-hosting/lifecycle-maintenance/backup-and-recovery) guide. diff --git a/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx b/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx new file mode 100644 index 00000000..2c0b1abf --- /dev/null +++ b/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx @@ -0,0 +1,156 @@ +--- +title: "Backup and Recovery" +description: "Backup strategies and recovery procedures for PowerSync self-hosted deployments" +--- + +## Overview + +PowerSync self-hosted deployments require backup strategies for configuration files and source databases. Understanding how [PowerSync replicates data](/architecture/powersync-service#replication) and handles database recovery is critical for planning maintenance windows and disaster recovery procedures. + +## Configuration Files + +We recommend using Git to version control and backup your PowerSync configuration files: +- `powersync.yaml` - Service configuration +- [`sync-rules.yaml`](/usage/sync-rules) - Your data synchronization logic + +None of the PowerSync containers use local storage, so no container-level backups are required. + +## Bucket Storage Database + +The [sync bucket storage](/usage/sync-rules/organize-data-into-buckets) database (MongoDB or Postgres used for bucket storage) may be backed up using standard database backup procedures. However, this is not a strict requirement since bucket data can be fully recovered by re-replicating from the source database. + + +Bucket storage contains derived data from your source database. If lost, PowerSync can rebuild it automatically through [replication](/architecture/powersync-service#replication). 
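+
+If you do choose to back up the bucket storage database, the procedure is a standard dump of that database. A minimal sketch (connection strings, database names, and file paths here are illustrative, not PowerSync-specific):
+
+```bash
+# Postgres bucket storage: logical dump in custom format
+pg_dump "$BUCKET_STORAGE_URL" --format=custom --file=bucket-storage.dump
+
+# MongoDB bucket storage: archive dump (database name is an example)
+mongodump --uri="$BUCKET_STORAGE_URI" --db=powersync --archive=bucket-storage.archive
+```
+
+Since this data is derived, an equally valid recovery path is to discard it and let PowerSync re-replicate from the source database.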
+ + +## Source Database Backup and Recovery + +When backing up and restoring your source database, PowerSync's behavior depends on whether the sync state is preserved. PowerSync uses [checkpoints](/architecture/consistency#how-it-works-checkpoints) to maintain consistency, and the replication process must be able to continue from these checkpoints after a restore. + + + +**Sync State Tracking:** Logical replication slots + +**Restore with Sync State Preserved:** +- Sync continues from the replication slot position +- No re-sync required + +**Restore without Sync State:** +- PowerSync creates a new replication slot +- Triggers a full re-sync from the current database state +- Re-sync duration depends on data volume and complexity + +**Best Practices:** +- Standard backup tools like `pg_basebackup` do not preserve replication slots - plan for full re-sync after restore +- To avoid re-sync, restore using point-in-time recovery (PITR) that maintains WAL continuity +- Monitor replication slot disk usage, especially during deployments + + + +**Sync State Tracking:** Change stream resume tokens + +**Restore to a Point After Resume Token:** +- Sync continues normally from the last known position +- No re-sync required + +**Restore to a Point Before Resume Token:** +- PowerSync detects the invalidated change stream (error `PSYNC_S1344`) +- Automatically starts a fresh change stream +- Triggers a full re-sync from the source database + +**Best Practices:** +- When possible, restore to a point after the stored resume token +- For more details on the `PSYNC_S1344` error, see the [Error Codes Reference](/resources/troubleshooting/error-codes#psync_s13xx-mongodb-replication-issues) + + + +**Sync State Tracking:** Binary log position + +**Restore with Binary Logs Available:** +- Sync continues from the last known position if binary logs are still available +- No re-sync required + +**Restore without Binary Logs:** +- PowerSync detects that the binary log position is no longer 
valid +- Starts replication from the current position +- Triggers a full re-sync + +**Best Practices:** +- Retain binary logs for your backup retention period +- Use point-in-time recovery (PITR) for consistent restores +- Test restore procedures in staging + + + +**Sync State Tracking:** Change Tracking metadata + +**Restore with Change Tracking Preserved:** +- Sync continues from the last tracked position if retention covers the restore point +- No re-sync required + +**Restore without Change Tracking or Expired Retention:** +- PowerSync detects insufficient change tracking history +- Triggers a full re-sync from the current database state + +**Best Practices:** +- Configure change tracking retention longer than your backup retention period +- Monitor change tracking table sizes to balance retention and storage +- Verify retention covers your maximum recovery time objective + + + + +A full re-sync can take significant time for large datasets. Plan maintenance windows accordingly when sync state cannot be preserved. + + +## Multi-Tenant Backup and Recovery + +For multi-tenant architectures using a shared database with tenant isolation through [sync rules](/usage/sync-rules), you can perform tenant-specific recovery without affecting other tenants. + +### Recovery Approach + +When using tenant isolation through sync rules in a single database: + +**1. 
Backup and restore tenant-specific data** using your database's native tools with filtering: + - **Postgres:** [`COPY`](https://www.postgresql.org/docs/current/sql-copy.html) with a filtered query, e.g. `COPY (SELECT * FROM table WHERE tenant_id = X) TO ...` ([`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html) can filter by table, but not by row) + - **MongoDB:** [`mongodump`](https://www.mongodb.com/docs/database-tools/mongodump/) and [`mongorestore`](https://www.mongodb.com/docs/database-tools/mongorestore/) with query filters + - **MySQL:** [`mysqldump`](https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html) with `--where="tenant_id = X"` + - **MSSQL:** [SQL Server backup/restore](https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/back-up-and-restore-of-sql-server-databases) with filtered queries + +**2. PowerSync detects restored data as incremental changes** and [replicates them](/architecture/powersync-service#replication) automatically + +**3. Other tenants continue operating** without interruption (assuming replication keeps pace) + + +Implement soft deletes for accidental deletions to avoid database-level recovery in many scenarios. + + +### Performance Considerations + +Large tenant restores increase replication volume. Monitor the following factors: + +- **Replication lag** during restore operations using the [Diagnostics API](/self-hosting/lifecycle-maintenance/diagnostics) +- **Data volume impact** - high volumes can affect all tenants if replication falls behind +- **Database resources** during restore operations + +## Recovery Time Planning + +### Factors Affecting Re-sync Time + +When planning your backup and recovery strategy, consider: + +1. **Data Volume**: Total size of data in synced tables +2. **Sync Rules Complexity**: Number of tables, joins, and transformations +3. **Database Performance**: CPU, memory, and I/O capacity +4. **Network Bandwidth**: Connection speed between PowerSync and source database +5. 
**Concurrent Load**: Other operations running during re-sync + +### Minimizing Recovery Time + +To minimize downtime during recovery: + +1. **Test Recovery Procedures**: Regularly test backup restoration in a staging environment +2. **Monitor Re-sync Progress**: Use the [Diagnostics API](/self-hosting/lifecycle-maintenance/diagnostics) to track replication progress +3. **Optimize During Re-sync**: Temporarily increase database resources during re-sync +4. **Plan Maintenance Windows**: Schedule restorations during low-traffic periods +5. **Preserve Sync State**: When possible, preserve sync state (MongoDB resume tokens, Postgres replication slots, MySQL binary log positions, or MSSQL change tracking data) From 56ebacc8de1812fad7e986802275772d9a5893c6 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Wed, 14 Jan 2026 12:18:31 +0200 Subject: [PATCH 2/4] Update backup and recovery documentation: rename page to "Backup Considerations" and clarify backup requirements for PowerSync self-hosted deployments. Add link to "Restoring Your Source Database." 
--- docs.json | 3 +- .../backup-and-recovery.mdx | 143 +----------------- .../restoring-source-database.mdx | 80 ++++++++++ 3 files changed, 90 insertions(+), 136 deletions(-) create mode 100644 usage/lifecycle-maintenance/restoring-source-database.mdx diff --git a/docs.json b/docs.json index 12fe0b11..3fff46d5 100644 --- a/docs.json +++ b/docs.json @@ -172,7 +172,8 @@ "usage/lifecycle-maintenance/handling-write-validation-errors", "usage/lifecycle-maintenance/upgrading-the-client-sdk", "usage/lifecycle-maintenance/postgres-maintenance", - "usage/lifecycle-maintenance/compacting-buckets" + "usage/lifecycle-maintenance/compacting-buckets", + "usage/lifecycle-maintenance/restoring-source-database" ] }, { diff --git a/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx b/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx index 2c0b1abf..31acad81 100644 --- a/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx +++ b/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx @@ -1,156 +1,29 @@ --- -title: "Backup and Recovery" -description: "Backup strategies and recovery procedures for PowerSync self-hosted deployments" +title: "Backup Considerations" +description: "What to back up for PowerSync self-hosted deployments" --- ## Overview -PowerSync self-hosted deployments require backup strategies for configuration files and source databases. Understanding how [PowerSync replicates data](/architecture/powersync-service#replication) and handles database recovery is critical for planning maintenance windows and disaster recovery procedures. +PowerSync self-hosted deployments have minimal backup requirements. This page covers what you need to consider. 
## Configuration Files We recommend using Git to version control and backup your PowerSync configuration files: + - `powersync.yaml` - Service configuration - [`sync-rules.yaml`](/usage/sync-rules) - Your data synchronization logic -None of the PowerSync containers use local storage, so no container-level backups are required. +PowerSync containers use no persistent storage - configuration is mounted from the host and all data is stored in external databases, so no container-level backups are required. ## Bucket Storage Database -The [sync bucket storage](/usage/sync-rules/organize-data-into-buckets) database (MongoDB or Postgres used for bucket storage) may be backed up using standard database backup procedures. However, this is not a strict requirement since bucket data can be fully recovered by re-replicating from the source database. +The [sync bucket storage](/usage/sync-rules/organize-data-into-buckets) database (MongoDB or Postgres) may be backed up using standard database backup procedures. However, this is optional since bucket data can be fully recovered by re-replicating from the source database. Bucket storage contains derived data from your source database. If lost, PowerSync can rebuild it automatically through [replication](/architecture/powersync-service#replication). -## Source Database Backup and Recovery - -When backing up and restoring your source database, PowerSync's behavior depends on whether the sync state is preserved. PowerSync uses [checkpoints](/architecture/consistency#how-it-works-checkpoints) to maintain consistency, and the replication process must be able to continue from these checkpoints after a restore. 
- - - -**Sync State Tracking:** Logical replication slots - -**Restore with Sync State Preserved:** -- Sync continues from the replication slot position -- No re-sync required - -**Restore without Sync State:** -- PowerSync creates a new replication slot -- Triggers a full re-sync from the current database state -- Re-sync duration depends on data volume and complexity - -**Best Practices:** -- Standard backup tools like `pg_basebackup` do not preserve replication slots - plan for full re-sync after restore -- To avoid re-sync, restore using point-in-time recovery (PITR) that maintains WAL continuity -- Monitor replication slot disk usage, especially during deployments - - - -**Sync State Tracking:** Change stream resume tokens - -**Restore to a Point After Resume Token:** -- Sync continues normally from the last known position -- No re-sync required - -**Restore to a Point Before Resume Token:** -- PowerSync detects the invalidated change stream (error `PSYNC_S1344`) -- Automatically starts a fresh change stream -- Triggers a full re-sync from the source database - -**Best Practices:** -- When possible, restore to a point after the stored resume token -- For more details on the `PSYNC_S1344` error, see the [Error Codes Reference](/resources/troubleshooting/error-codes#psync_s13xx-mongodb-replication-issues) - - - -**Sync State Tracking:** Binary log position - -**Restore with Binary Logs Available:** -- Sync continues from the last known position if binary logs are still available -- No re-sync required - -**Restore without Binary Logs:** -- PowerSync detects that the binary log position is no longer valid -- Starts replication from the current position -- Triggers a full re-sync - -**Best Practices:** -- Retain binary logs for your backup retention period -- Use point-in-time recovery (PITR) for consistent restores -- Test restore procedures in staging - - - -**Sync State Tracking:** Change Tracking metadata - -**Restore with Change Tracking Preserved:** -- Sync 
continues from the last tracked position if retention covers the restore point -- No re-sync required - -**Restore without Change Tracking or Expired Retention:** -- PowerSync detects insufficient change tracking history -- Triggers a full re-sync from the current database state - -**Best Practices:** -- Configure change tracking retention longer than your backup retention period -- Monitor change tracking table sizes to balance retention and storage -- Verify retention covers your maximum recovery time objective - - - - -A full re-sync can take significant time for large datasets. Plan maintenance windows accordingly when sync state cannot be preserved. - - -## Multi-Tenant Backup and Recovery - -For multi-tenant architectures using a shared database with tenant isolation through [sync rules](/usage/sync-rules), you can perform tenant-specific recovery without affecting other tenants. - -### Recovery Approach - -When using tenant isolation through sync rules in a single database: - -**1. Backup and restore tenant-specific data** using your database's native tools with filtering: - - **Postgres:** [`COPY`](https://www.postgresql.org/docs/current/sql-copy.html) with a filtered query, e.g. `COPY (SELECT * FROM table WHERE tenant_id = X) TO ...` ([`pg_dump`](https://www.postgresql.org/docs/current/app-pgdump.html) can filter by table, but not by row) - - **MongoDB:** [`mongodump`](https://www.mongodb.com/docs/database-tools/mongodump/) and [`mongorestore`](https://www.mongodb.com/docs/database-tools/mongorestore/) with query filters - - **MySQL:** [`mysqldump`](https://dev.mysql.com/doc/refman/8.0/en/mysqldump.html) with `--where="tenant_id = X"` - - **MSSQL:** [SQL Server backup/restore](https://learn.microsoft.com/en-us/sql/relational-databases/backup-restore/back-up-and-restore-of-sql-server-databases) with filtered queries - -**2. PowerSync detects restored data as incremental changes** and [replicates them](/architecture/powersync-service#replication) automatically - -**3. 
Other tenants continue operating** without interruption (assuming replication keeps pace) - - -Implement soft deletes for accidental deletions to avoid database-level recovery in many scenarios. - - -### Performance Considerations - -Large tenant restores increase replication volume. Monitor the following factors: - -- **Replication lag** during restore operations using the [Diagnostics API](/self-hosting/lifecycle-maintenance/diagnostics) -- **Data volume impact** - high volumes can affect all tenants if replication falls behind -- **Database resources** during restore operations - -## Recovery Time Planning - -### Factors Affecting Re-sync Time - -When planning your backup and recovery strategy, consider: - -1. **Data Volume**: Total size of data in synced tables -2. **Sync Rules Complexity**: Number of tables, joins, and transformations -3. **Database Performance**: CPU, memory, and I/O capacity -4. **Network Bandwidth**: Connection speed between PowerSync and source database -5. **Concurrent Load**: Other operations running during re-sync - -### Minimizing Recovery Time - -To minimize downtime during recovery: +## Source Database -1. **Test Recovery Procedures**: Regularly test backup restoration in a staging environment -2. **Monitor Re-sync Progress**: Use the [Diagnostics API](/self-hosting/lifecycle-maintenance/diagnostics) to track replication progress -3. **Optimize During Re-sync**: Temporarily increase database resources during re-sync -4. **Plan Maintenance Windows**: Schedule restorations during low-traffic periods -5. **Preserve Sync State**: When possible, preserve sync state (MongoDB resume tokens, Postgres replication slots, MySQL binary log positions, or MSSQL change tracking data) +For guidance on what happens when you restore your source database, including multi-tenant recovery scenarios, see [Restoring Your Source Database](/usage/lifecycle-maintenance/restoring-source-database). 
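+
+As a concrete illustration of the Git-based approach recommended under Configuration Files above (repository location and file paths are illustrative):
+
+```bash
+# Keep the two configuration files under version control
+git init powersync-config && cd powersync-config
+cp /etc/powersync/powersync.yaml /etc/powersync/sync-rules.yaml .
+git add powersync.yaml sync-rules.yaml
+git commit -m "Snapshot PowerSync configuration"
+git push origin main   # push to a remote so the backup survives host loss
+```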
diff --git a/usage/lifecycle-maintenance/restoring-source-database.mdx b/usage/lifecycle-maintenance/restoring-source-database.mdx new file mode 100644 index 00000000..a4328ab1 --- /dev/null +++ b/usage/lifecycle-maintenance/restoring-source-database.mdx @@ -0,0 +1,80 @@ +--- +title: "Restoring your source database" +description: "How PowerSync handles source database restores and what to expect during reprocessing" +--- + +## Overview + +Backing up your source database does not affect PowerSync operation. However, restoring from a backup typically requires PowerSync to perform a **full reprocess** of the data. + +This page explains what happens when you restore your [source database](/installation/database-setup) and how to plan for it. + +## What happens after a database restore + +When you restore your source database from a backup: + +- PowerSync's replication state may become invalid +- The PowerSync Service will need to reprocess data from the restored database +- The reprocessing happens automatically +- Clients can continue syncing during reprocessing, though they may experience temporary delays + + +A full reprocess can take significant time for large datasets. Plan maintenance windows accordingly when restoring source database backups. + + +## Planning for database restores + +When planning source database backup and recovery: + +1. **Test restore procedures** in a staging environment to understand reprocessing duration +2. **Plan maintenance windows** during low-traffic periods for database restores +3. **Consider data volume** when estimating recovery time - larger datasets take longer to reprocess + +### Factors affecting reprocessing time + +- **Data volume**: Total size of data in synced tables +- **Sync rules complexity**: Number of tables, joins, and transformations +- **Database performance**: CPU, memory, and I/O capacity + +For more details on performance considerations, see [Performance and Limits](/resources/performance-and-limits). 
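+
+One practical way to track how reprocessing or ongoing replication is progressing is to watch replication lag on the source database. A hedged sketch for Postgres (the connection variable is illustrative; slot names vary per deployment):
+
+```bash
+# Show how far each logical replication slot lags behind the current WAL position
+psql "$SOURCE_DB_URL" -c "
+SELECT slot_name,
+       active,
+       pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), confirmed_flush_lsn)) AS lag
+FROM pg_replication_slots;"
+```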
+ +## Multi-tenant considerations + +For multi-tenant architectures using a shared database with tenant isolation through [sync rules](/usage/sync-rules), you can perform tenant-specific recovery without triggering a full reprocess. + +### Tenant-specific recovery + +Instead of restoring the entire database, restore only the affected tenant's data: + +1. **Backup and restore tenant-specific data** using your source database's native tools with filtering (e.g., `WHERE tenant_id = X`) +2. **PowerSync detects restored data as incremental changes** and [replicates them](/architecture/powersync-service#replication) automatically +3. **Other tenants continue operating** without interruption (assuming replication keeps pace) + + +Implement soft deletes for accidental deletions to avoid database-level recovery in many scenarios. See [Handling Update Conflicts](/usage/lifecycle-maintenance/handling-update-conflicts) for related patterns. + + +### Performance considerations + +Large tenant restores increase replication volume. Monitor: + +- **Replication lag** during restore operations using [Monitoring and Alerting](/usage/tools/monitoring-and-alerting) +- **Data volume impact** - high volumes can affect all tenants if replication falls behind + +## Database-specific notes + +Different [source databases](/installation/database-setup) use different replication mechanisms: + +- **Postgres:** Logical replication slots (see [Postgres Maintenance](/usage/lifecycle-maintenance/postgres-maintenance) for slot management) +- **MongoDB:** Change stream resume tokens +- **MySQL:** Binary log positions (GTID) +- **MSSQL:** Change Data Capture (CDC) + +The specific behavior after a restore may vary by database type. PowerSync uses [checkpoints](/architecture/consistency#how-it-works:-checkpoints) to maintain consistency during the recovery process. 
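+
+The tenant-specific recovery steps above can be sketched for Postgres (the `documents` table, `tenant_id` column, and connection variables are illustrative assumptions):
+
+```bash
+# 1. Export one tenant's rows from a restored staging copy of the backup
+psql "$STAGING_DB_URL" -c "\copy (SELECT * FROM documents WHERE tenant_id = 42) TO 'tenant_42.csv' CSV"
+
+# 2. Re-insert into the live source database; PowerSync replicates the rows
+#    as ordinary incremental changes, so other tenants keep syncing
+psql "$SOURCE_DB_URL" -c "\copy documents FROM 'tenant_42.csv' CSV"
+```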
+ +## Related resources + +- [Architecture Overview](/architecture/architecture-overview) +- [PowerSync Service](/architecture/powersync-service) +- [Sync Rules](/usage/sync-rules) +- [Troubleshooting](/resources/troubleshooting) From 2003f6cef6d4452cec283ebcf165178d6a993890 Mon Sep 17 00:00:00 2001 From: bean1352 Date: Wed, 14 Jan 2026 12:28:13 +0200 Subject: [PATCH 3/4] Clarify consistency reference in restoring source database documentation by linking to the consistency section. --- usage/lifecycle-maintenance/restoring-source-database.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/usage/lifecycle-maintenance/restoring-source-database.mdx b/usage/lifecycle-maintenance/restoring-source-database.mdx index a4328ab1..63c6a993 100644 --- a/usage/lifecycle-maintenance/restoring-source-database.mdx +++ b/usage/lifecycle-maintenance/restoring-source-database.mdx @@ -70,7 +70,7 @@ Different [source databases](/installation/database-setup) use different replica - **MySQL:** Binary log positions (GTID) - **MSSQL:** Change Data Capture (CDC) -The specific behavior after a restore may vary by database type. PowerSync uses [checkpoints](/architecture/consistency#how-it-works:-checkpoints) to maintain consistency during the recovery process. +The specific behavior after a restore may vary by database type. PowerSync uses checkpoints to maintain [consistency](/architecture/consistency#consistency) during the recovery process. ## Related resources From cf5f5046d7f3d0c43a8948fec0d965fbd66fd9cb Mon Sep 17 00:00:00 2001 From: bean1352 Date: Wed, 14 Jan 2026 12:38:53 +0200 Subject: [PATCH 4/4] Removed unnecessary link text for sync rules file. 
--- self-hosting/lifecycle-maintenance/backup-and-recovery.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx b/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx index 31acad81..3c7cc3da 100644 --- a/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx +++ b/self-hosting/lifecycle-maintenance/backup-and-recovery.mdx @@ -12,7 +12,7 @@ PowerSync self-hosted deployments have minimal backup requirements. This page co We recommend using Git to version control and backup your PowerSync configuration files: - `powersync.yaml` - Service configuration -- [`sync-rules.yaml`](/usage/sync-rules) - Your data synchronization logic +- [`sync-rules.yaml`](/usage/sync-rules) PowerSync containers use no persistent storage - configuration is mounted from the host and all data is stored in external databases, so no container-level backups are required.