
BadgerDB v4: Encrypted database corruption after machine sleep/resume cycles #2241

@ssharish


Environment

  • Go version: 1.25.1
  • BadgerDB version: 4.8.0
  • OS: macOS 26.1
  • Hardware: MacBook Pro with frequent sleep/resume cycles

Configuration

opt.IndexCacheSize = 200 << 20
opt.BlockCacheSize = 256 << 20
opt.NumVersionsToKeep = 1
opt.SyncWrites = true
opt.CompactL0OnClose = true
opt.EncryptionKey = []byte("from-env-stable-across-restarts")
opt.EncryptionKeyRotationDuration = X * time.Hour
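Since the key comes from the environment, a small startup check can rule out key drift between restarts. The sketch below is a minimal illustration, not part of the reported setup: `BADGER_ENCRYPTION_KEY`, `parseKey`, and `fingerprint` are hypothetical names. It assumes the key is used as raw bytes and relies on the AES requirement that keys be 16, 24, or 32 bytes long. The placeholder string in the configuration above is 31 bytes, so it stands in for the real key rather than being one.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"os"
)

// parseKey validates raw key material. AES accepts only 16-, 24-, or
// 32-byte keys, so anything else is rejected before the database is opened.
func parseKey(raw string) ([]byte, error) {
	switch len(raw) {
	case 16, 24, 32:
		return []byte(raw), nil
	default:
		return nil, fmt.Errorf("encryption key is %d bytes, want 16, 24, or 32", len(raw))
	}
}

// fingerprint returns a short tag derived from the key. Logging it on every
// start makes an accidental key change visible without logging the key itself.
func fingerprint(key []byte) string {
	sum := sha256.Sum256(key)
	return hex.EncodeToString(sum[:8])
}

func main() {
	// BADGER_ENCRYPTION_KEY is a hypothetical variable name for this sketch.
	key, err := parseKey(os.Getenv("BADGER_ENCRYPTION_KEY"))
	if err != nil {
		fmt.Fprintln(os.Stderr, "refusing to open database:", err)
		return
	}
	fmt.Println("encryption key fingerprint:", fingerprint(key))
	// The validated key would then be assigned to opt.EncryptionKey.
}
```

If the fingerprint logged before and after a corrupting sleep cycle is identical, the master key itself can be excluded as a cause.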

Issue Description

The database becomes corrupted after machine sleep/resume cycles. Corruption occurs reliably when the laptop suspends and resumes multiple times over several days.

Error

panic: runtime error: slice bounds out of range [2916546937:160] [recovered]
panic: 
== Recovering from initIndex crash ==
File Info: [ID: 6, Size: 307, Zeros: 0]
isEnrypted: true checksumLen: 6 checksum: sum:4292200631 indexLen: 176
== Recovered ==
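The panic reports a start index of 2916546937 against a 160-byte slice. One possible reading, offered as an assumption and not a confirmed diagnosis: 160 is the 176-byte index minus a 16-byte AES IV, and the start index is an offset decoded from bytes that did not decrypt to a valid index. The sketch below only illustrates that general failure class and how a bounds check turns it into an error. `tailFrom` and its 4-byte little-endian offset layout are hypothetical and are not Badger's code.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

var errCorrupt = errors.New("index offset out of range")

// tailFrom reads a 4-byte little-endian offset from the start of buf and
// returns buf[offset:]. The offset is validated first, so garbage input
// yields an error instead of a "slice bounds out of range" panic.
func tailFrom(buf []byte) ([]byte, error) {
	if len(buf) < 4 {
		return nil, fmt.Errorf("%w: buffer is only %d bytes", errCorrupt, len(buf))
	}
	off := binary.LittleEndian.Uint32(buf[:4])
	if uint64(off) > uint64(len(buf)) {
		return nil, fmt.Errorf("%w: offset %d, length %d", errCorrupt, off, len(buf))
	}
	return buf[int(off):], nil
}

func main() {
	// Rebuild the shape of the reported failure: a 160-byte buffer whose
	// leading bytes decode to the offset seen in the panic message.
	buf := make([]byte, 160)
	binary.LittleEndian.PutUint32(buf, 2916546937)

	if _, err := tailFrom(buf); err != nil {
		fmt.Println("rejected:", err)
	}
}
```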

Steps to Reproduce

  1. Open BadgerDB with encryption enabled and SyncWrites = true
  2. Perform regular read/write operations
  3. Allow machine to sleep/suspend (lid close or automatic suspend)
  4. Resume machine
  5. Repeat steps 2-4 over several days
  6. Eventually the database fails to open with a slice bounds out of range panic

Expected Behavior

Database should remain intact and openable after machine sleep/resume cycles, especially with SyncWrites = true enabled.

Actual Behavior

A corrupted SST file (encrypted index data with invalid bounds) prevents the database from opening.

Questions

  1. Is this a known issue with encrypted BadgerDB on sleep/resume?
  2. Are there v4-specific options to prevent this? (v3 had ValueLogLoadingMode but v4 doesn't)
  3. Is reducing MemTableSize to force frequent flushes the recommended approach?
  4. Should we implement application-level periodic db.Sync() or db.Flatten() calls?
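For question 4, an application-level periodic sync could look roughly like the sketch below. It is written against a one-method interface so it runs without a database: `syncer`, `syncEvery`, `countingSyncer`, and `demo` are hypothetical names, and `*badger.DB` is assumed to be passed in as the `syncer`, since it exposes `Sync() error`. With SyncWrites = true each write is already synced, so whether an extra periodic sync changes anything here is an open question, not a recommendation.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// syncer is the single method this sketch needs.
type syncer interface {
	Sync() error
}

// syncEvery calls s.Sync once per interval until ctx is cancelled.
// Errors are passed to onErr (if non-nil) and the loop keeps running.
func syncEvery(ctx context.Context, s syncer, interval time.Duration, onErr func(error)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			if err := s.Sync(); err != nil && onErr != nil {
				onErr(err)
			}
		}
	}
}

// countingSyncer is a stand-in for the database, used only by the demo.
type countingSyncer struct{ calls int }

func (c *countingSyncer) Sync() error {
	c.calls++
	return nil
}

// demo runs the loop against the stand-in for a short time and returns
// how many times Sync was called.
func demo() int {
	ctx, cancel := context.WithTimeout(context.Background(), 200*time.Millisecond)
	defer cancel()
	c := &countingSyncer{}
	syncEvery(ctx, c, 10*time.Millisecond, nil)
	return c.calls
}

func main() {
	fmt.Println("Sync calls during demo:", demo())
}
```

In an application, `syncEvery` would run in its own goroutine with a context that is cancelled before the database is closed, so no sync is attempted on a closed handle.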

Temporary Workaround Attempted

None I could think of without affecting performance.

Additional Context

  • Encryption key is stable (from environment variable, never changes)
  • No improper shutdowns (SIGTERM, crashes) - only normal OS sleep cycles
  • Linux/macOS do not send signals to processes on sleep - they just freeze I/O
