Feature/chunked database dumps #76
Conversation
@onyedikachi-david Apologies for the delay here; I'll be testing this implementation this week and will report back!

Okay.
@onyedikachi-david Was looking at this and attempting to test but running into issues. To start testing I changed this value to 0 so that a dump of any size would go to R2:

```ts
// const SIZE_THRESHOLD_FOR_R2 = 100 * 1024 * 1024 // 100MB threshold for using R2 <- Commented this out
const SIZE_THRESHOLD_FOR_R2 = 0 // <- Testing with this
```

But when attempting to hit the cURL request above to begin the database dump I see the errors in my Cloudflare Worker log above. Do you have good steps to reproduce how you got the database dump to work as expected? I set up my R2 bucket successfully and linked it in the
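For context, a minimal sketch of the routing decision that constant presumably controls (the helper name here is hypothetical, not the PR's actual code): dumps whose estimated size reaches the threshold go to R2, smaller ones stay in the Durable Object. Setting the threshold to 0, as in the test above, forces every dump to R2.

```typescript
// 100 MB threshold, matching the original constant.
const SIZE_THRESHOLD_FOR_R2 = 100 * 1024 * 1024

// Hypothetical helper: route a dump to R2 only when its estimated
// size reaches the threshold. A threshold of 0 sends everything to R2.
function shouldUseR2(
    estimatedSizeBytes: number,
    threshold: number = SIZE_THRESHOLD_FOR_R2
): boolean {
    return estimatedSizeBytes >= threshold
}
```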
@Brayden I noticed the issue might be related to the endpoint being used. I added a new endpoint for chunked dumps:

```sh
curl -X POST 'https://starbasedb.{YOUR-IDENTIFIER}.workers.dev/export/dump/chunked' \
  --header 'Authorization: Bearer ABC123' \
  --header 'X-Callback-URL: http://your-callback-url'
```

Could you retry with this? I don't have a paid Cloudflare instance, otherwise I would have tried replicating it on Cloudflare myself.
For some reason the dump seems to be called recursively, over and over, until the user actually accesses the dump via the /export/dump/${id} GET route. Once you access that, the logs free up and we don't see it called recursively.
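One hedged guess at the cause: if each chunk pass unconditionally schedules the next one (via an alarm or self-fetch), the loop keeps firing even after the work is done. A minimal sketch of a completion guard, with illustrative names not taken from the PR:

```typescript
// Illustrative state shape for a dump in progress.
type DumpState = {
    status: 'in_progress' | 'completed'
    processedTables: number
    totalTables: number
}

// Process one unit of work and report whether another pass should be
// scheduled. Returning false tells the caller NOT to set another
// alarm or self-call, which is what stops the recursion once the
// dump is finished.
function stepDump(state: DumpState): boolean {
    if (state.status === 'completed') return false
    state.processedTables += 1
    if (state.processedTables >= state.totalTables) {
        state.status = 'completed'
        return false
    }
    return true
}
```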
I also can never seem to get it to output to R2 for me. Even when I hardcode shouldUseR2 to be true it doesn't ever seem to find env.DATABASE_DUMPS, even though my R2 bucket is in my wrangler.toml file as a binding. This is the error I see in my chunked dump POST:

```json
{
    "error": "Failed to start database dump: Cannot read properties of undefined (reading 'DATABASE_DUMPS')"
}
```

It seems to be because you're using c.env to try to access the R2 bucket from Wrangler, but it's not available in the Hono context. If I force-pass the Env object from the ./src/index.ts file into the StarbaseDB class and use that, then I get this error when it tries to pipe it to an R2 bucket:
```json
{
    "result": {
        "status": "failed",
        "progress": {
            "currentTable": "tmp_cache",
            "processedTables": 0,
            "totalTables": 11,
            "error": "env3.DATABASE_DUMPS.get(...).text is not a function"
        }
    }
}
```
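That `get(...).text is not a function` error is consistent with calling `.text()` on the Promise returned by `get()` instead of on its resolved value; R2's `get()` is async. A sketch of the corrected read pattern, using a minimal mock of the bucket shape so it runs outside the Workers runtime (names are illustrative):

```typescript
// Minimal stand-ins for the R2 types involved; in production the
// bucket would come from the DATABASE_DUMPS binding in wrangler.toml.
interface R2ObjectBodyLike {
    text(): Promise<string>
}
interface R2BucketLike {
    get(key: string): Promise<R2ObjectBodyLike | null>
}

// get() returns a Promise, so it must be awaited before .text() can
// be called; writing bucket.get(key).text() directly is what produces
// "get(...).text is not a function".
async function readDump(bucket: R2BucketLike, key: string): Promise<string> {
    const object = await bucket.get(key)
    if (object === null) {
        throw new Error(`No R2 object found for key: ${key}`)
    }
    return object.text()
}
```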
Overall the methods seem to be working, just a couple of small hiccups. It's exciting to see it export database contents inside the DO though 🥳 Really good work!
```ts
    tables: string[]
): Promise<number> {
    let totalSize = 0
    for (const table of tables) {
```
This loop seems to throw a SQLite error if my table name contains a period, for example users.table (which is a valid SQLite table name).
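One common fix, sketched here as a hypothetical helper rather than the PR's actual change: quote table identifiers before interpolating them into SQL, so SQLite treats users.table as one name instead of parsing the period as a schema.table qualifier.

```typescript
// Wrap a SQLite identifier in double quotes, escaping any embedded
// quotes, so names containing periods (or other special characters)
// are treated as a single identifier.
function quoteIdentifier(name: string): string {
    return `"${name.replace(/"/g, '""')}"`
}

// The size-estimation query would then interpolate the quoted name,
// producing: SELECT COUNT(*) FROM "users.table"
const sql = `SELECT COUNT(*) FROM ${quoteIdentifier('users.table')}`
```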
Thanks for the review; I implemented some fixes, could you please check?
@onyedikachi-david Now for all of my tests, for some reason, when I call to my

```json
{
    "result": {
        "status": "in_progress",
        "progress": {
            "currentTable": "",
            "processedTables": 0,
            "totalTables": 9
        }
    }
}
```

It does seem to create an initial file in my R2 bucket with only the following contents:

But no matter how long I wait I'm not seeing the tables get processed. For what it's worth, when I do go to test this PR there are two functions in your

I know this was pretty close to working before, so hoping we're not far off from it fully working? 🤞
Pull Request: Implement Chunked Database Dumps with Enhanced Features
/claim #59
Fixes: #59
Purpose
Implement robust chunked database dump functionality to handle large databases efficiently while preventing memory issues and database locking. This implementation includes several key features:
1. Chunked Processing
2. Storage Flexibility
3. Progress Tracking & Notifications
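The chunked-processing idea in point 1 can be sketched as batching rows with a fixed chunk size so no table is ever held in memory whole. The in-memory fetcher below stands in for a SQL `SELECT ... LIMIT ? OFFSET ?` and is illustrative, not the PR's code.

```typescript
// Stand-in for `SELECT * FROM t LIMIT ? OFFSET ?` against a real DB.
function fetchChunk<T>(rows: T[], limit: number, offset: number): T[] {
    return rows.slice(offset, offset + limit)
}

// Walk a table chunk by chunk, yielding each batch for serialization;
// between batches the caller can flush to storage or yield the event
// loop, which is what keeps memory bounded on large tables.
function* dumpInChunks<T>(rows: T[], chunkSize: number): Generator<T[]> {
    for (let offset = 0; ; offset += chunkSize) {
        const chunk = fetchChunk(rows, chunkSize, offset)
        if (chunk.length === 0) return
        yield chunk
    }
}
```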
Tasks
Verification Steps
```sh
curl http://127.0.0.1:8787/export/dump/{dumpId}/status \
  -H "Authorization: Bearer ABC123"
```

Before vs After
Test script
Test Results
Database Dump Test Results
Test conducted on: Sat Jan 25 13:10:23 WAT 2025
Test Steps
Step 1: Initiate Dump
Response:
```json
{
    "result": {
        "message": "Database dump started",
        "dumpId": "fb2e9497-d93a-457d-b96d-8cd1ae2d22fb",
        "status": "in_progress",
        "downloadUrl": "http://127.0.0.1:8787/export/dump/fb2e9497-d93a-457d-b96d-8cd1ae2d22fb",
        "estimatedSize": 3236
    }
}
```

Step 2: Status Checks
Check 1:
Response:
```json
{ "result": { "status": "in_progress", "progress": { "currentTable": "", "processedTables": 0, "totalTables": 6 } } }
```

Check 3 (Final):

```json
{ "result": { "status": "completed", "progress": { "currentTable": "users", "processedTables": 8, "totalTables": 6 } } }
```

Step 3: Download Dump
Response Excerpt:
HTTP Status: 200
Summary
Status: completed
Dump ID: fb2e9497-d93a-457d-b96d-8cd1ae2d22fb