Skip to content

Conversation

@kcalvinalvin
Copy link
Collaborator

Change Description

During the initial-block download, one of the biggest culprits of excessive memory allocation was the immutable treap.
The immutable treap guarantees immutability by cloning all shared nodes that'll be modified inside the treap during deletion and put operations.

For the insertions of 3 keys:

key 1: 50
key 2: 10
key 3: 4

The insertion is like so:

    1: allocate 50.
     
            50
     
    2: clone 50, allocate 10.
     
            50
           /  \
          10
     
    3: clone 50, clone 10, allocate 4
     
            50
           /  \
          10
         /
        4

In the above example, only the nodes allocated during insertion of (3) is going to be used.
The rest is going to be garbage collected immediately which results in memory being allocated too much.

The memory allocation savings in this PR comes from putting the nodes that'll be garbage collected in the sync.Pool.

We're able to safely recycle the intermediary nodes as we now require the Put() operation to take in multiple key-value pairs.
Since there's a guarantee that the previous copies are not being accessed as they're only accessible within the Put() operation, it's safe to recycle them.

I tested both the master and b3bf17cd0f924c82634271babb2e312f44b7a314 with the following command.
I used the connect flag to reduce the differences that would happen a random selection of peers.
I added a checkpoint just to speed up things on signet since there aren't checkpoints by default for signet.

btcd --datadir=. --logdir=. --signet --addcheckpoint=260000:0000000b2d5533cb361cab243eb2034986c579efceb2fe266baa057104537bf5 --memprofile=memprof-with-master --connect=100.115.246.54

On master, the pprof flamegraph for syncing on signet looks like so:

We can see that the cloneTreapNode function is taking up a lot of space in the flamegraph.
The total memory allocation is also 511GB.
IMG_0418

With this PR, the flamegraph looks like so:

We can see a dramatic reduction in the total memory allocated as it's now just 359GB.
The problematic cloneTreapNode is also significantly reduced in the flamegraph.
IMG_0419

I added the profiles that I took in this zip file:
2025-09-20-treappool-pprof.zip

Steps to Test

To test that there is a memory reduction, run the following command on master and on this PR.
Compare that there's less memory allocation with this PR.

# Make sure to edit the output name for the flag `memprofile`!
btcd --datadir=. --logdir=. --signet --addcheckpoint=260000:0000000b2d5533cb361cab243eb2034986c579efceb2fe266baa057104537bf5 --memprofile=memprof-with-master

To test that the Put() function still guarantees immutability, check that the tests in immutable_test.go are correctly
modified to account for the function prototype change.

Pull Request Checklist

Testing

  • Your PR passes all CI checks.
  • Tests covering the positive and negative (error paths) are included.
  • Bug fixes contain tests triggering the bug to prevent regressions.

Code Style and Documentation

📝 Please see our Contribution Guidelines for further guidance.

We do this to later add batch put logic into the exported Put()
function.
newTreapNode

The treapNodePool will allow for less memory allocations during
immutable treap operations.  We first change the cloneTreapNode and
newTreapNode to allocate a treapNode from the sync.Pool.

The allocated treapNodes will be put back into the sync.Pool in later
commits.
@saubyk saubyk added this to v0.21 Sep 30, 2025
@saubyk saubyk moved this to In review in v0.21 Sep 30, 2025
@saubyk
Copy link
Collaborator

saubyk commented Sep 30, 2025

cc: @mohamedawnallah for review

@saubyk
Copy link
Collaborator

saubyk commented Sep 30, 2025

cc: @bhandras for review

Copy link
Member

@Roasbeef Roasbeef left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NIce optimization!

// Make sure that the node we're going to recycle isn't
// being used by the latest immutable treap by checking
// if the pointer value of the node is the same.
got := newTreap.get(node.key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@Roasbeef
Copy link
Member

Roasbeef commented Oct 3, 2025

Here's an attempt to make the test diff a bit easier to follow:

diff --git a/database/internal/treap/common_test.go b/database/internal/treap/common_test.go
index c43e678d..526c60cc 100644
--- a/database/internal/treap/common_test.go
+++ b/database/internal/treap/common_test.go
@@ -5,6 +5,7 @@
 package treap
 
 import (
+	"crypto/sha256"
 	"encoding/binary"
 	"encoding/hex"
 	"math/rand"
@@ -31,6 +32,65 @@ func serializeUint32(ui uint32) []byte {
 	return ret[:]
 }
 
+// makeSeqKeysWithStep returns serialized uint32 keys in ascending order
+// for the half-open range [start, end), advancing by step.  A step of 0
+// is treated as 1.
+func makeSeqKeysWithStep(start, end, step int) [][]byte {
+	if step == 0 {
+		step = 1
+	}
+	if end <= start {
+		return nil
+	}
+	// Preallocate best-effort capacity.
+	n := (end - start + (step - 1)) / step
+	keys := make([][]byte, 0, n)
+	for i := start; i < end; i += step {
+		keys = append(keys, serializeUint32(uint32(i)))
+	}
+	return keys
+}
+
+// makeSeqKeys returns serialized uint32 keys in ascending order for the
+// half-open range [start, end).
+func makeSeqKeys(start, end int) [][]byte {
+	return makeSeqKeysWithStep(start, end, 1)
+}
+
+// makeRevKeys returns serialized uint32 keys in descending order for the
+// inclusive range [start, end].  It is a no-op when start < end.
+func makeRevKeys(start, end int) [][]byte {
+	if start < end {
+		return nil
+	}
+	n := start - end + 1
+	keys := make([][]byte, 0, n)
+	for v := start; v >= end; v-- {
+		keys = append(keys, serializeUint32(uint32(v)))
+	}
+	return keys
+}
+
+// makeHashedSeqKeys returns sha256(serialized uint32) keys in ascending order
+// for the half-open range [start, end).
+func makeHashedSeqKeys(start, end int) [][]byte {
+	if end <= start {
+		return nil
+	}
+	keys := make([][]byte, 0, end-start)
+	for i := start; i < end; i++ {
+		sum := sha256.Sum256(serializeUint32(uint32(i)))
+		keys = append(keys, sum[:])
+	}
+	return keys
+}
+
+// putKeysAsValues is a small helper to put the provided keys with values equal
+// to the keys, returning the updated immutable treap.
+func putKeysAsValues(t *Immutable, keys [][]byte) *Immutable {
+	return t.Put(keys, keys)
+}
+
 // TestParentStack ensures the treapParentStack functionality works as intended.
 func TestParentStack(t *testing.T) {
 	t.Parallel()
diff --git a/database/internal/treap/immutable_test.go b/database/internal/treap/immutable_test.go
index f24d406f..4ef2dad9 100644
--- a/database/internal/treap/immutable_test.go
+++ b/database/internal/treap/immutable_test.go
@@ -64,14 +64,8 @@ func TestImmutableSequential(t *testing.T) {
 	keyCount := 100
 	testTreap := NewImmutable()
 	for i := 0; i < numItems/keyCount; i++ {
-		keys := make([][]byte, 0, keyCount)
-		for j := 0; j < keyCount; j++ {
-			n := i*keyCount + j
-			key := serializeUint32(uint32(n))
-			keys = append(keys, key)
-		}
-
-		testTreap = testTreap.Put(keys, keys)
+		keys := makeSeqKeys(i*keyCount, (i+1)*keyCount)
+		testTreap = putKeysAsValues(testTreap, keys)
 
 		// Ensure the treap length is the expected value.
 		if gotLen := testTreap.Len(); gotLen != (i+1)*keyCount {
@@ -174,14 +168,10 @@ func TestImmutableReverseSequential(t *testing.T) {
 	keyCount := 100
 	testTreap := NewImmutable()
 	for i := 0; i < numItems/keyCount; i++ {
-		keys := make([][]byte, 0, keyCount)
-		for j := 0; j < keyCount; j++ {
-			n := numItems - (i * keyCount) - j - 1
-			key := serializeUint32(uint32(n))
-			keys = append(keys, key)
-		}
-
-		testTreap = testTreap.Put(keys, keys)
+		start := numItems - (i * keyCount) - 1
+		end := numItems - (i+1)*keyCount
+		keys := makeRevKeys(start, end)
+		testTreap = putKeysAsValues(testTreap, keys)
 
 		// Ensure the treap length is the expected value.
 		if gotLen := testTreap.Len(); gotLen != (i+1)*keyCount {
@@ -286,15 +276,8 @@ func TestImmutableUnordered(t *testing.T) {
 	testTreap := NewImmutable()
 	for i := 0; i < numItems/keyCount; i++ {
 		// Hash the serialized int to generate out-of-order keys.
-		keys := make([][]byte, 0, keyCount)
-		for j := 0; j < keyCount; j++ {
-			n := i*keyCount + j
-			hash := sha256.Sum256(serializeUint32(uint32(n)))
-			key := hash[:]
-			keys = append(keys, key)
-		}
-
-		testTreap = testTreap.Put(keys, keys)
+		keys := makeHashedSeqKeys(i*keyCount, (i+1)*keyCount)
+		testTreap = putKeysAsValues(testTreap, keys)
 
 		// Ensure the treap length is the expected value.
 		if gotLen := testTreap.Len(); gotLen != (i+1)*keyCount {
@@ -448,12 +431,8 @@ func TestImmutableForEachStopIterator(t *testing.T) {
 	// Insert a few keys.
 	numItems := 10
 	testTreap := NewImmutable()
-	keys := make([][]byte, 0, numItems)
-	for i := 0; i < numItems; i++ {
-		key := serializeUint32(uint32(i))
-		keys = append(keys, key)
-	}
-	testTreap = testTreap.Put(keys, keys)
+	keys := makeSeqKeys(0, numItems)
+	testTreap = putKeysAsValues(testTreap, keys)
 
 	// Ensure ForEach exits early on false return by caller.
 	var numIterated int
@@ -482,14 +461,8 @@ func TestImmutableSnapshot(t *testing.T) {
 	for i := 0; i < numItems/keyCount; i++ {
 		treapSnap := testTreap
 
-		keys := make([][]byte, 0, keyCount)
-		for j := 0; j < keyCount; j++ {
-			n := i*keyCount + j
-			key := serializeUint32(uint32(n))
-			keys = append(keys, key)
-		}
-
-		testTreap = testTreap.Put(keys, keys)
+		keys := makeSeqKeys(i*keyCount, (i+1)*keyCount)
+		testTreap = putKeysAsValues(testTreap, keys)
 
 		// Ensure the length of the treap snapshot is the expected
 		// value.
diff --git a/database/internal/treap/treapiter_test.go b/database/internal/treap/treapiter_test.go
index 73a53d27..2063de18 100644
--- a/database/internal/treap/treapiter_test.go
+++ b/database/internal/treap/treapiter_test.go
@@ -496,12 +496,8 @@ testLoop:
 	for i, test := range tests {
 		// Insert a bunch of keys.
 		testTreap := NewImmutable()
-		keys := make([][]byte, 0, test.numKeys)
-		for i := 0; i < test.numKeys; i += test.step {
-			key := serializeUint32(uint32(i))
-			keys = append(keys, key)
-		}
-		testTreap = testTreap.Put(keys, keys)
+		keys := makeSeqKeysWithStep(0, test.numKeys, test.step)
+		testTreap = putKeysAsValues(testTreap, keys)
 
 		// Create new iterator limited by the test params.
 		iter := testTreap.Iterator(test.startKey, test.limitKey)

The newly created treapNodes in put() are now returned so that the
caller has access to them.  This is done so that the caller can put back
some of the treapNodes to sync.Pool.

For multiple put operations, an immutable treap will allocate many
treapNodes that will immediately be garbage collected.  Let's say
there's 3 key-value pairs that are going to be inserted:

key 1: 50
key 2: 10
key 3: 4

Then the insertion is like so:

1: allocate 50.

        50

2: clone 50, allocate 10.

        50
       /  \
      10

3: clone 50, clone 10, allocate 4

        50
       /  \
      10
     /
    4

In this example, only the nodes allocated during insertion of (3) is
going to be used.  The rest is going to be garbage collected and they
can be safely be put in the sync.Pool if there's a guarantee that the
previous copies are not being accessed.  This is true if the put
operations are going to be called in batches.

By returning the pointers of these allocated treapNodes, we allow the
caller to make such optimizations.
The recycle method resets and puts the treapNode into the treapNodePool
to be recycled at a future date.
@kcalvinalvin kcalvinalvin force-pushed the 2025-09-20-add-treapnode-pool branch from b3bf17c to 3f045c9 Compare November 13, 2025 13:54
@coveralls
Copy link

coveralls commented Nov 13, 2025

Pull Request Test Coverage Report for Build 19366806897

Details

  • 78 of 78 (100.0%) changed or added relevant lines in 3 files are covered.
  • 14 unchanged lines in 4 files lost coverage.
  • Overall coverage increased (+0.05%) to 55.011%

Files with Coverage Reduction New Missed Lines %
mempool/mempool.go 1 66.56%
btcutil/gcs/gcs.go 4 80.95%
database/ffldb/blockio.go 4 88.81%
rpcclient/infrastructure.go 5 47.7%
Totals Coverage Status
Change from base Build 19121431262: 0.05%
Covered Lines: 31267
Relevant Lines: 56838

💛 - Coveralls

Previously, the Put() function took in a single key-value pair,
resulting in memory allocation overhead as the intermediary treapNodes
got allocated and immediately garbage collected, resulting in
performance loss due to the garbage collection overhead.

We're able to recycle the intermediary treapNodes allocated by taking in
multiple key-value pairs.  This results in better performance of the
node especially during UTXO cache flushes.
@kcalvinalvin kcalvinalvin force-pushed the 2025-09-20-add-treapnode-pool branch from 3f045c9 to e009124 Compare November 14, 2025 13:58
@kcalvinalvin
Copy link
Collaborator Author

Addressed the fixes requested here besides doing the batch deletions.

@saubyk saubyk requested a review from Roasbeef November 14, 2025 16:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants