From 4719d9298d818445ae75d16bd264f4faa7fa5ea3 Mon Sep 17 00:00:00 2001 From: Amine Date: Fri, 9 Jan 2026 14:11:28 +0800 Subject: [PATCH 1/5] Add guide for syncing data by time in documentation --- docs.json | 1 + usage/sync-rules/guide-sync-data-by-time.mdx | 152 +++++++++++++++++++ 2 files changed, 153 insertions(+) create mode 100644 usage/sync-rules/guide-sync-data-by-time.mdx diff --git a/docs.json b/docs.json index 4751b6d6..20a49c7f 100644 --- a/docs.json +++ b/docs.json @@ -130,6 +130,7 @@ "usage/sync-rules/case-sensitivity", "usage/sync-rules/glossary", "usage/sync-rules/guide-many-to-many-and-join-tables", + "usage/sync-rules/guide-sync-data-by-time", { "group": "Advanced Topics", "pages": [ diff --git a/usage/sync-rules/guide-sync-data-by-time.mdx b/usage/sync-rules/guide-sync-data-by-time.mdx new file mode 100644 index 00000000..59230101 --- /dev/null +++ b/usage/sync-rules/guide-sync-data-by-time.mdx @@ -0,0 +1,152 @@ +--- +title: "Guide: Syncing Data by Time" +--- + +A common need in offline-first apps is syncing data based on time—for example, only syncing issues updated in the last 7 days instead of the entire dataset. +You might expect to write something like: + +```yaml focus={4} lines +bucket_definitions + issues_after_start_date: + parameters: SELECT request.parameters() ->> 'start_at' as start_at + data: SELECT * FROM issues WHERE updated_at > bucket.start_date +``` + +However, This won't work. Here's why. + +# The Problem + +Sync rules only support a limited set of operators when filtering on parameters. You can use `=`, `IN`, and `IS NULL`—but not range operators like `>`, `<`, `>=`, or `<=`. + +Additionally, sync rule functions must be deterministic. Time-based functions like `now()` aren't allowed because the result changes depending on when the query runs. + +These constraints exist for good reason—they ensure buckets can be pre-computed and cached efficiently. But they make time-based filtering less obvious to implement. + +This guide covers a few practical workarounds. + +We are working on a more elegant solution for this problem. When ready this guide will be updated accordingly + +# Workarounds + +## 1: Boolean Columns + +Add a boolean column to your table that indicates whether a row falls within a specific time range. Keep this column updated in your source database using a scheduled job. + +For example, add an `updated_this_week` column: + +```sql +ALTER TABLE issues ADD COLUMN updated_this_week BOOLEAN DEFAULT false; +``` +Update it periodically using a cron job (e.g., with pg_cron): + +```sql +UPDATE issues SET updated_this_week = (updated_at > now() - interval '7 days'); +``` + +```yaml +bucket_definitions: + recent_issues: + data: + - SELECT * FROM issues WHERE updated_this_week = true +``` +For multiple time ranges, add multiple columns and let the client choose which bucket to sync: + +```yaml +bucket_definitions: + issues_1week: + parameters: SELECT WHERE request.parameters() ->> 'range' = '1week' + data: + - SELECT * FROM issues WHERE updated_this_week = true + + issues_1month: + parameters: SELECT WHERE request.parameters() ->> 'range' = '1month' + data: + - SELECT * FROM issues WHERE updated_this_month = true +``` + +This approach works well when you have a small, fixed set of time ranges. However, it requires schema changes and a scheduled job to keep the columns updated. + +If you need more flexibility like letting users pick arbitrary date ranges [see Workaround 2](). 
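+
+For completeness, here is one way the scheduled job above could be registered with pg_cron. This is only a sketch: the job name and cadence are illustrative assumptions, so adjust them to your environment:
+
+```sql
+-- Illustrative sketch only: refresh the flag hourly so rows age out of the 7-day window.
+-- The job name and cron expression below are assumptions, not part of the guide's setup.
+SELECT cron.schedule(
+  'refresh-updated-this-week',
+  '0 * * * *',
+  $$UPDATE issues SET updated_this_week = (updated_at > now() - interval '7 days')$$
+);
+```
+
+Any scheduler that can run the `UPDATE` periodically works just as well; pg_cron is simply one option.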
+ +## 2: Buckets Per Date + +Instead of pre-defined ranges, create a bucket for each date and let the client specify which dates to sync. + +Use `substring` to extract the date portion from a timestamp and match it with `=`: + +```sql +bucket_definitions: + issues_by_update_at: + parameters: SELECT value as date FROM json_each(request.parameters() ->> 'dates') + data: + - SELECT * FROM issues WHERE substring(updated_at, 1, 10) = bucket.date +``` +The client then passes the dates it wants as connection params: + +```javascript focus={2-4} lines +await db.connect(connector, { + params: { + dates: ["2026-01-07", "2026-01-08", "2026-01-09"], + }, +}) +``` + +This gives users full control over which dates to sync—no schema changes or scheduled jobs required. + +The trade-off is granularity. In this example we're using daily buckets. If you need finer precision (hourly), syncing a large range means many buckets. If you go coarser (monthly), you lose the ability to filter accurately. + +You have to pick a granularity and stick with it. If that's a problem—say, you want hourly precision for recent data but don't want hundreds of buckets when syncing a full month [see Workaround 3](). + +## 3: Multiple Granularities + +Combine multiple granularities in a single bucket definition. This lets you use larger buckets (days) for older data and smaller buckets (hours, minutes) for recent data. + +```yaml +bucket_definitions: + issues_by_time: + parameters: SELECT value as partition FROM json_each(request.parameters() ->> 'partitions') + data: + # By day (e.g., "2026-01-07") + - SELECT * FROM issues WHERE substring(updated_at, 1, 10) = bucket.partition + # By hour (e.g., "2026-01-07T14") + - SELECT * FROM issues WHERE substring(updated_at, 1, 13) = bucket.partition + # By 10 minutes (e.g., "2026-01-07T14:3") + - SELECT * FROM issues WHERE substring(updated_at, 1, 15) = bucket.partition +``` + +The client then mixes granularities as needed: + +```javascript focus={2-12} lines +await db.connect(connector, { + params: { + partitions: [ + "2026-01-05", + "2026-01-06", + "2026-01-07T10", + "2026-01-07T11", + "2026-01-07T12:0", + "2026-01-07T12:1", + "2026-01-07T12:2" + ] + }, +}) +``` + +This syncs January 5–6 by day, the morning of January 7 by hour, and the last 30 minutes in 10-minute chunks, without creating hundreds of buckets. + +The trade-off is complexity. The client must decide which granularity to use for each time segment, and each row belongs to multiple buckets, which increases replication overhead. + + +When using multiple time granularities (e.g., monthly, daily, hourly), rows move between buckets as time passes. Since each granularity creates a different bucket ID, the client must re-download the row from the new bucket even if it already has the data. This re-download overhead can nullify the benefits of granular filtering. For this reason, in some cases it may be better to sync entire months avoiding the re-sync overhead, even if you sync more data initially. + + +# Conclusion + +Time-based sync is a common need, but current sync rules don't support range operators or time-based functions directly. +To recap the workarounds: + +- **Boolean Columns** — Simplest option. Use when you have a fixed set of time ranges and don't mind schema changes. +- **Buckets Per Date** — More flexible. Use when you need arbitrary date ranges but can live with a single granularity. +- **Multiple Granularities** — Most flexible. Use when you need precision for recent data without syncing hundreds of buckets. 
Be mindful of the re-sync overhead. + +We're working on a more elegant solution. This guide will be updated when it's ready. \ No newline at end of file From ebfdf82118f5db03655dcb52a5f91374bf3afa41 Mon Sep 17 00:00:00 2001 From: Amine Date: Fri, 9 Jan 2026 14:45:35 +0800 Subject: [PATCH 2/5] Added cons section to each workaround in the "Guide: Syncing Data by Time" documentation --- usage/sync-rules/guide-sync-data-by-time.mdx | 28 ++++++++++++++------ 1 file changed, 20 insertions(+), 8 deletions(-) diff --git a/usage/sync-rules/guide-sync-data-by-time.mdx b/usage/sync-rules/guide-sync-data-by-time.mdx index 59230101..3fba5f9a 100644 --- a/usage/sync-rules/guide-sync-data-by-time.mdx +++ b/usage/sync-rules/guide-sync-data-by-time.mdx @@ -2,7 +2,7 @@ title: "Guide: Syncing Data by Time" --- -A common need in offline-first apps is syncing data based on time—for example, only syncing issues updated in the last 7 days instead of the entire dataset. +A common need in offline-first apps is syncing data based on time, for example, only syncing issues updated in the last 7 days instead of the entire dataset. You might expect to write something like: ```yaml focus={4} lines @@ -16,11 +16,11 @@ However, This won't work. Here's why. # The Problem -Sync rules only support a limited set of operators when filtering on parameters. You can use `=`, `IN`, and `IS NULL`—but not range operators like `>`, `<`, `>=`, or `<=`. +Sync rules only support a limited set of operators when filtering on parameters. You can use `=`, `IN`, and `IS NULL`, but not range operators like `>`, `<`, `>=`, or `<=`. Additionally, sync rule functions must be deterministic. Time-based functions like `now()` aren't allowed because the result changes depending on when the query runs. -These constraints exist for good reason—they ensure buckets can be pre-computed and cached efficiently. But they make time-based filtering less obvious to implement. +These constraints exist for good reason, they ensure buckets can be pre-computed and cached efficiently. But they make time-based filtering less obvious to implement. This guide covers a few practical workarounds. @@ -66,7 +66,11 @@ bucket_definitions: This approach works well when you have a small, fixed set of time ranges. However, it requires schema changes and a scheduled job to keep the columns updated. -If you need more flexibility like letting users pick arbitrary date ranges [see Workaround 2](). + +**Cons:** Requires schema changes and scheduled jobs (e.g., pg_cron). Limited to pre-defined time ranges. + + +If you need more flexibility like letting users pick arbitrary date ranges [see Workaround 2](#2:-buckets-per-date). ## 2: Buckets Per Date @@ -91,11 +95,15 @@ await db.connect(connector, { }) ``` -This gives users full control over which dates to sync—no schema changes or scheduled jobs required. +This gives users full control over which dates to sync, with no schema changes or scheduled jobs required. -The trade-off is granularity. In this example we're using daily buckets. If you need finer precision (hourly), syncing a large range means many buckets. If you go coarser (monthly), you lose the ability to filter accurately. +The trade-off is granularity. In this example we're using daily buckets. If you need finer precision (hourly), syncing a large range means many buckets. If you use larger buckets (monthly), you lose the ability to filter accurately. -You have to pick a granularity and stick with it. 
If that's a problem—say, you want hourly precision for recent data but don't want hundreds of buckets when syncing a full month [see Workaround 3](). + +**Cons:** Must commit to a single granularity. Daily = too many buckets for long ranges. Monthly = lose precision for recent data. + + +You have to pick a granularity and stick with it. If that's a problem, say, you want hourly precision for recent data but don't want hundreds of buckets when syncing a full month [see Workaround 3](#3:-multiple-granularities). ## 3: Multiple Granularities @@ -136,8 +144,12 @@ This syncs January 5–6 by day, the morning of January 7 by hour, and the last The trade-off is complexity. The client must decide which granularity to use for each time segment, and each row belongs to multiple buckets, which increases replication overhead. - + When using multiple time granularities (e.g., monthly, daily, hourly), rows move between buckets as time passes. Since each granularity creates a different bucket ID, the client must re-download the row from the new bucket even if it already has the data. This re-download overhead can nullify the benefits of granular filtering. For this reason, in some cases it may be better to sync entire months avoiding the re-sync overhead, even if you sync more data initially. + + + +**Cons:** Each row belongs to multiple buckets (replication overhead). Re-sync overhead when rows move between bucket granularities. Added complexity may not justify the gains over Workaround 2. # Conclusion From b7bf218ea9bbbf07cc72cddf1f5424ed6ea31dd0 Mon Sep 17 00:00:00 2001 From: Al-Khawarizmi Date: Mon, 12 Jan 2026 08:10:28 +0800 Subject: [PATCH 3/5] Update usage/sync-rules/guide-sync-data-by-time.mdx Co-authored-by: benitav --- usage/sync-rules/guide-sync-data-by-time.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/usage/sync-rules/guide-sync-data-by-time.mdx b/usage/sync-rules/guide-sync-data-by-time.mdx index 3fba5f9a..3c0a6c81 100644 --- a/usage/sync-rules/guide-sync-data-by-time.mdx +++ b/usage/sync-rules/guide-sync-data-by-time.mdx @@ -12,7 +12,7 @@ bucket_definitions data: SELECT * FROM issues WHERE updated_at > bucket.start_date ``` -However, This won't work. Here's why. +However, this won't work. Here's why. # The Problem From d052151fe2998c64ab8d731197a839fa64fb4e2b Mon Sep 17 00:00:00 2001 From: Al-Khawarizmi Date: Mon, 12 Jan 2026 08:10:37 +0800 Subject: [PATCH 4/5] Update usage/sync-rules/guide-sync-data-by-time.mdx Co-authored-by: benitav --- usage/sync-rules/guide-sync-data-by-time.mdx | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/usage/sync-rules/guide-sync-data-by-time.mdx b/usage/sync-rules/guide-sync-data-by-time.mdx index 3c0a6c81..84422fa1 100644 --- a/usage/sync-rules/guide-sync-data-by-time.mdx +++ b/usage/sync-rules/guide-sync-data-by-time.mdx @@ -24,7 +24,7 @@ These constraints exist for good reason, they ensure buckets can be pre-computed This guide covers a few practical workarounds. -We are working on a more elegant solution for this problem. When ready this guide will be updated accordingly +We are working on a more elegant solution for this problem. When ready, this guide will be updated accordingly. # Workarounds From d1ef8563d9aab761ff51f1670569ea5365d06526 Mon Sep 17 00:00:00 2001 From: Amine Date: Mon, 12 Jan 2026 08:39:08 +0800 Subject: [PATCH 5/5] Add additional context on sync rule operators and performance limits of bucket granularity. 
--- usage/sync-rules/guide-sync-data-by-time.mdx | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/usage/sync-rules/guide-sync-data-by-time.mdx b/usage/sync-rules/guide-sync-data-by-time.mdx index 84422fa1..f985c401 100644 --- a/usage/sync-rules/guide-sync-data-by-time.mdx +++ b/usage/sync-rules/guide-sync-data-by-time.mdx @@ -16,7 +16,7 @@ However, this won't work. Here's why. # The Problem -Sync rules only support a limited set of operators when filtering on parameters. You can use `=`, `IN`, and `IS NULL`, but not range operators like `>`, `<`, `>=`, or `<=`. +Sync rules only support a limited set of [operators](https://docs.powersync.com/usage/sync-rules/operators-and-functions) when filtering on parameters. You can use `=`, `IN`, and `IS NULL`, but not range operators like `>`, `<`, `>=`, or `<=`. Additionally, sync rule functions must be deterministic. Time-based functions like `now()` aren't allowed because the result changes depending on when the query runs. @@ -70,7 +70,7 @@ This approach works well when you have a small, fixed set of time ranges. Howeve **Cons:** Requires schema changes and scheduled jobs (e.g., pg_cron). Limited to pre-defined time ranges. -If you need more flexibility like letting users pick arbitrary date ranges [see Workaround 2](#2:-buckets-per-date). +If you need more flexibility like letting users pick arbitrary date ranges, see Workaround 2 below. ## 2: Buckets Per Date @@ -97,13 +97,13 @@ await db.connect(connector, { This gives users full control over which dates to sync, with no schema changes or scheduled jobs required. -The trade-off is granularity. In this example we're using daily buckets. If you need finer precision (hourly), syncing a large range means many buckets. If you use larger buckets (monthly), you lose the ability to filter accurately. +The trade-off is granularity. In this example we're using daily buckets. If you need finer precision (hourly), syncing a large range means many buckets, which can degrade sync performance and approach [PowerSync's limit of 1,000 buckets per user](https://docs.powersync.com/resources/performance-and-limits#performance-and-limits). If you use larger buckets (monthly), you lose the ability to filter accurately. **Cons:** Must commit to a single granularity. Daily = too many buckets for long ranges. Monthly = lose precision for recent data. -You have to pick a granularity and stick with it. If that's a problem, say, you want hourly precision for recent data but don't want hundreds of buckets when syncing a full month [see Workaround 3](#3:-multiple-granularities). +You have to pick a granularity and stick with it. If that's a problem—say, you want hourly precision for recent data but don't want hundreds of buckets when syncing a full month, see Workaround 3 below. ## 3: Multiple Granularities