@cjonas9 cjonas9 commented Dec 15, 2025

What

NOTE: This works for me on both an empty and a partially filled database, but robust testing has yet to be added. Chunk-size optimizations will be made based on metrics collected from future iterative testing. Backfilling one day of ledgers takes about 7 minutes on my machine.

This PR adds synchronous history backfilling of ledgers through the new CLI argument --backfill. This will fill the local SQL database with the most recent HISTORY_RETENTION_WINDOW ledgers, fetched from CDP. Usage: ./stellar-rpc --backfill.

Notes:

  • This will backfill the specified number of ledgers from CDP backwards/"left" (i.e. from the most current ledger to the oldest), then backfill forwards/"right" up to the new most current ledger. Then, it starts RPC to begin filling in ledgers live through captive core. See my design document for details on how this is done.
  • The backfilled ledgers are ingested via CDP by default. Support for ledger backfilling through captive core has yet to be added, but will be added to this PR later.
  • A method on ledgerReaderTx, CountLedgersInRange(context, start, end), was added. It counts how many ledgers exist in the SQL database between a start and end sequence (used here for fragmentation/gap detection).

Why

See issues/discussions on this: #203, 1718

Known limitations

[TODO or N/A]

@cjonas9 cjonas9 linked an issue Dec 15, 2025 that may be closed by this pull request
```go
OneDayOfLedgers   = config.OneDayOfLedgers
SevenDayOfLedgers = config.OneDayOfLedgers * 7
// Number of ledgers to read/write at a time during backfill
ChunkSize uint32 = OneDayOfLedgers / 4 // 6 hours. Takes X minutes to process
```
Contributor
X minutes?

Contributor Author
@cjonas9 cjonas9 Dec 19, 2025
pending timing data collection outside of my machine


```go
// Checks to ensure state of local DB is acceptable for backfilling
func verifyDbGapless(callerCtx context.Context, reader db.LedgerReader, minLedgerSeq uint32, maxLedgerSeq uint32) error {
	ctx, cancelPrecheck := context.WithTimeout(callerCtx, 4*time.Minute)
```
Contributor
what's the rationale for the 4 minutes? we should take note of how long this actually takes on a variety of systems and use something like double the ceiling of that here

Contributor Author

this was an early testing value without much significance, other than that one minute timed out for the unoptimized version that existed before I wrote the new SQL method. many of the timeouts here will require more thorough metrics collection

```go
}
// Create temporary backend for backwards-filling chunks
// Note monotonicity constraint of the ledger backend
tempBackend, err := makeBackend(metaInfo.dsInfo)
```
Contributor
Seems like you can create/close this outside of the loop, yeah?

Contributor Author
@cjonas9 cjonas9 Dec 19, 2025
one would wish. trying to fill multiple chunks with seqnos x > y > a > b in the order [x->y], then [a->b] breaks monotonicity, hence each chunk gets its own backend when backfilling backwards

@cjonas9 cjonas9 requested a review from karthikiyer56 January 5, 2026 22:54


Development

Successfully merging this pull request may close these issues.

Implement synchronous history backfilling

3 participants