Synchronous history backfilling #571
base: main
Conversation
OneDayOfLedgers = config.OneDayOfLedgers
SevenDayOfLedgers = config.OneDayOfLedgers * 7
// Number of ledgers to read/write at a time during backfill
ChunkSize uint32 = OneDayOfLedgers / 4 // 6 hours. Takes X minutes to process
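For a rough sense of scale, a minimal sketch of how this chunk size divides a retention window. The 5-second ledger close time (and hence 17,280 ledgers per day) and the `chunksFor` helper are assumptions for illustration, not values or names from the PR:

```go
package main

import "fmt"

// Assumption: one ledger closes roughly every 5 seconds,
// so one day is 24*60*60/5 = 17280 ledgers.
const oneDayOfLedgers uint32 = 17280

// Six hours of ledgers per chunk, matching the quoted diff.
const chunkSize uint32 = oneDayOfLedgers / 4

// chunksFor returns how many chunks a retention window needs,
// rounding up for a partial final chunk.
func chunksFor(window uint32) uint32 {
	return (window + chunkSize - 1) / chunkSize
}

func main() {
	fmt.Println(chunkSize, chunksFor(oneDayOfLedgers*7)) // 4320 28
}
```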
X minutes?
Pending; timing data has yet to be collected on machines other than mine.
// Checks to ensure state of local DB is acceptable for backfilling
func verifyDbGapless(callerCtx context.Context, reader db.LedgerReader, minLedgerSeq uint32, maxLedgerSeq uint32) error {
	ctx, cancelPrecheck := context.WithTimeout(callerCtx, 4*time.Minute)
What's the rationale for the 4-minute timeout? We should take note of how long this actually takes on a variety of systems and use roughly double the ceiling of that here.
This was an early testing value without much significance, beyond the fact that one minute timed out for the unoptimized version that existed before I wrote the new SQL method. Many of the timeouts here will require more thorough metrics to be collected.
}
// Create temporary backend for backwards-filling chunks
// Note monotonicity constraint of the ledger backend
tempBackend, err := makeBackend(metaInfo.dsInfo)
Seems like you can create/close this outside of the loop, yeah?
One would wish. Trying to fill multiple chunks with seqnos x > y > a > b in the order [x->y], then [a->b] breaks monotonicity, which is why each chunk gets its own separate backend when backfilling backwards.
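To make the ordering concrete, here is a sketch of the backwards chunk walk being described: each chunk is read forward internally, but the chunks themselves are visited newest-first, so a single long-lived backend would see a non-monotonic sequence. `chunkRangesDescending` is a hypothetical helper name, not code from the PR:

```go
package main

import "fmt"

// chunkRangesDescending splits [minSeq, maxSeq] into chunk boundaries,
// newest chunk first. In the PR's design each returned range would be
// served by its own temporary backend, since the backend only allows
// monotonically increasing sequence numbers within one session.
func chunkRangesDescending(minSeq, maxSeq, chunkSize uint32) [][2]uint32 {
	var out [][2]uint32
	end := maxSeq
	for {
		start := minSeq
		if end >= minSeq+chunkSize-1 {
			start = end - chunkSize + 1
		}
		out = append(out, [2]uint32{start, end})
		if start == minSeq {
			break
		}
		end = start - 1
	}
	return out
}

func main() {
	// Chunks arrive newest-first: [7,10], [3,6], [1,2].
	for _, r := range chunkRangesDescending(1, 10, 4) {
		fmt.Printf("chunk [%d, %d]\n", r[0], r[1])
	}
}
```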
…tart post-backfill
What
NOTE: This works for me for an empty and a partially filled database, but robust testing has yet to be added. Chunk-size optimizations will be made based on metrics collected from future iterative testing. This takes about 7 minutes on my machine for one day of ledgers.
This PR adds synchronous history backfilling of ledgers through the new CLI argument `--backfill`. This will fill the local SQL database with the most recent `HISTORY_RETENTION_WINDOW` ledgers, fetched from CDP. Usage: `./stellar-rpc --backfill`.

Notes:

A new `ledgerReaderTx` method, `CountLedgersInRange(context, start, end)`, was added. This enables one to count how many ledgers exist in a SQL database between a start and end sequence (used here for fragmentation/gap detection).

Why

See issues/discussions on this: #203, 1718
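For the gap detection mentioned in the notes, the counting idea can be sketched without a database: count the stored ledgers in a range and compare against the range's size. The map-backed store and the `countLedgersInRange` function below are stand-ins assumed for this sketch; the PR's real method runs a SQL count inside a read transaction:

```go
package main

import "fmt"

// countLedgersInRange counts how many stored ledgers fall inside
// [start, end]. A plain map stands in for the SQL table here.
func countLedgersInRange(stored map[uint32]bool, start, end uint32) uint32 {
	var n uint32
	for seq := start; seq <= end; seq++ {
		if stored[seq] {
			n++
		}
	}
	return n
}

func main() {
	// Ledgers 5..9 are present except 7: a gap.
	stored := map[uint32]bool{5: true, 6: true, 8: true, 9: true}

	count := countLedgersInRange(stored, 5, 9)
	expected := uint32(9 - 5 + 1)
	fmt.Println(count, expected, count == expected) // 4 5 false -> gap detected
}
```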
Known limitations
[TODO or N/A]