Commit 50d7160

PoS: update snapshots.md

- Update for community snapshot as primary method
- Remove old Polygon snapshot platform

1 parent c2d0926

1 file changed: +43 −214 lines changed

docs/pos/how-to/snapshots.md

When setting up a new sentry, validator, or full node server, it is recommended to use a snapshot for faster syncing.

## Community snapshots

Polygon PoS has transitioned to a community-driven model for snapshots: active community members now contribute snapshots. Some of these members include:

| Name | Available snapshots | Note |
| -------------------------------------------------------------------- | --------------------- | -------------------------------------------- |
| [Stakecraft](https://all4nodes.io/Polygon) | Mainnet, Amoy, Erigon | Support for Erigon archive snapshot |
| [PublicNode (by Allnodes)](https://publicnode.com/snapshots#polygon) | Mainnet, Amoy | Support for PBSS + PebbleDB enabled snapshot |
| [Stakepool](https://all4nodes.io/Polygon) | Mainnet, Amoy | - |
| [Vaultstaking](https://all4nodes.io/Polygon) | Mainnet | - |
| [Girnaar Nodes](https://all4nodes.io/Polygon) | Amoy | - |

For the full list of snapshots, visit [All4nodes.io](https://all4nodes.io/Polygon), an aggregator for Polygon community snapshots.

## Downloading and using client snapshots

To begin, ensure that your node environment meets the **prerequisites** outlined [here](../how-to/full-node/full-node-binaries.md).

Most snapshot providers have also outlined the steps required to download and use their respective client snapshots; navigate to the All4nodes page for provider-specific instructions. In case the steps are unavailable or unclear, the following tips will come in handy:

- You can use the `wget` command to download and extract the `.tar` snapshot files (a variant for zstd-compressed archives is sketched after this list). For example:

```bash
wget -O - snapshot_url_here | tar -xvf - -C /target/directory
```

- Configure your client's `datadir` setting to match the directory where you downloaded and extracted the snapshot data (a quick sanity check is sketched after this list). This ensures the `systemd` services can correctly register the snapshot data when the client is spun up.

- To maintain your client's default configuration settings, consider using symbolic links (symlinks).
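
Some providers publish snapshots as zstd-compressed archives (`.tar.zst`) rather than plain `.tar` files. A minimal sketch for that case, assuming `zstd` is installed and reusing the placeholder URL and target directory from above:

```bash
# stream the download and let tar call zstd for decompression
wget -O - snapshot_url_here | tar -I zstd -xvf - -C /target/directory
```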
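Before starting any services, it can also help to sanity-check the extraction. A quick sketch, assuming the directory layout used in the example below:

```bash
# confirm each extracted snapshot directory exists and is non-empty
for dir in ~/snapshots/heimdall_extract ~/snapshots/bor_extract; do
  if [ -d "$dir" ] && [ -n "$(ls -A "$dir")" ]; then
    echo "$dir: OK"
  else
    echo "$dir: missing or empty"
  fi
done
```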

## Example

Let's say you have mounted your block device at `~/snapshots` and have downloaded and extracted the chain data into the `heimdall_extract` directory for Heimdall, and into the `bor_extract` directory for Bor. Use the following commands to register the extracted data for Heimdall and Bor `systemd` services:

```bash
# remove any existing datadirs for Heimdall and Bor
sudo rm -rf /var/lib/heimdall/data
sudo rm -rf /var/lib/bor/data

# point the default datadir locations at the extracted snapshot data
sudo ln -s ~/snapshots/heimdall_extract /var/lib/heimdall/data
sudo ln -s ~/snapshots/bor_extract /var/lib/bor/data

# start the services
sudo service heimdalld start
sudo service bor start
```

!!! tip "Appropriate user permissions"

    Ensure that the Bor and Heimdall user files have appropriate permissions to access the `datadir`. To set correct permissions for Bor, execute `sudo chown -R bor:nogroup /var/lib/bor/data/bor`. Similarly, for Heimdall, run `sudo chown -R heimdall:nogroup /var/lib/heimdall/data`.
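
To confirm the ownership changes took effect, you can probe each datadir as the corresponding service user. A small sketch using standard tools:

```bash
# verify each service user can read its own datadir
sudo -u bor test -r /var/lib/bor/data/bor && echo "bor: OK"
sudo -u heimdall test -r /var/lib/heimdall/data && echo "heimdall: OK"
```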
## Recommended disk size guidance

### Polygon Amoy testnet

| Metric                            | Calculation Breakdown               | Value   |
| --------------------------------- | ----------------------------------- | ------- |
| approx. compressed total          | 250 GB (Bor) + 35 GB (Heimdall)     | 285 GB  |
| approx. data growth daily         | 10 GB (Bor) + 0.5 GB (Heimdall)     | 10.5 GB |
| approx. total extracted size      | 350 GB (Bor) + 50 GB (Heimdall)     | 400 GB  |
| suggested disk size (2.5x buffer) | 400 GB * 2.5 (natural chain growth) | 1 TB    |

### Polygon mainnet

| Metric                            | Calculation Breakdown               | Value   |
| --------------------------------- | ----------------------------------- | ------- |
| approx. compressed total          | 1500 GB (Bor) + 225 GB (Heimdall)   | 1725 GB |
| approx. data growth daily         | 100 GB (Bor) + 5 GB (Heimdall)      | 105 GB  |
| approx. total extracted size      | 2.1 TB (Bor) + 300 GB (Heimdall)    | 2.4 TB  |
| suggested disk size (2.5x buffer) | 2.4 TB * 2.5 (natural chain growth) | 6 TB    |

### Polygon Amoy Erigon archive

| Metric                            | Calculation Breakdown               | Value  |
| --------------------------------- | ----------------------------------- | ------ |
| approx. compressed total          | 210 GB (Erigon) + 35 GB (Heimdall)  | 245 GB |
| approx. data growth daily         | 4.5 GB (Erigon) + 0.5 GB (Heimdall) | 5 GB   |
| approx. total extracted size      | 875 GB (Erigon) + 50 GB (Heimdall)  | 925 GB |
| suggested disk size (2.5x buffer) | 925 GB * 2.5 (natural chain growth) | 2.5 TB |

## Recommended disk type and IOPS guidance

- Disk IOPS will affect the speed of downloading/extracting snapshots, getting in sync, and performing LevelDB compaction.
- To minimize disk latency, direct-attached storage is ideal.
- In AWS, when using gp3 disk types, we recommend provisioning 16,000 IOPS and 1,000 MiB/s throughput. This minimizes costs while providing significant performance benefits; io2 EBS volumes with matching IOPS and throughput values offer similar performance. A provisioning sketch follows this list.
- For GCP, we recommend using performance (SSD) persistent disks (`pd-ssd`) or extreme persistent disks (`pd-extreme`) with similar IOPS and throughput values as mentioned above.
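
For illustration, a gp3 volume with those values could be provisioned via the AWS CLI as follows. This is a sketch only: the size follows the mainnet guidance above, and the availability zone is a placeholder assumption:

```bash
# provision a 6 TiB gp3 volume with 16,000 IOPS and 1,000 MiB/s throughput
aws ec2 create-volume \
  --volume-type gp3 \
  --size 6144 \
  --iops 16000 \
  --throughput 1000 \
  --availability-zone us-east-1a
```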
