From 4566810b1e7e771153e6f0bf54ad640134125101 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Michael=20Vorburger=20=E2=9B=91=EF=B8=8F?=
Date: Mon, 28 Aug 2023 21:26:26 +0200
Subject: [PATCH 1/3] docs(faq): Add initial details about penalties
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

As discussed on Slack, and related to https://github.com/31z4/saturn-moonlet/issues/3.

Signed-off-by: Michael Vorburger ⛑️
---
 docs/faq.md | 15 +++++++++++++++
 1 file changed, 15 insertions(+)

diff --git a/docs/faq.md b/docs/faq.md
index 1a319bf1..9dd987b0 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -60,6 +60,21 @@ These are the current penalties that affect both DNS weight and earned FIL:
 - Fraudulent logging (e.g. self-dealing)
 - Multi-noding (Running multiple nodes on the same host)
 
+These are visibile e.g. by hovering over the _Weight_ of a Node on https://dashboard.saturn.tech, or on the _Penalty_ graph of the [Moonlet](https://github.com/31z4/saturn-moonlet).
+
+Here are more details about each kind of penalty, with information what could cause it, and how to remedy if your Node encounters them:
+
+- `error_ratio` is caused by errors as shown in the log of L1 Node container, scuh as:
+  - Node refusing to connect with a client multiple times.
+- `dup_cache_miss_ratio` is caused if a computed error ratio for consecutive duplicate cache misses goes over a threshold.
+- `health_check_failures` are caused by an unexpectedly (not deregistered) unreachable node.
+  - This penalty starts with each health check failure event, and then gradually decreases over ~6h.
+  - Fix this by resolving the root cause of the node unavailability.
+  - Note the [deregister my node](#how-can-i-manually-deregister-my-node) section.
+
+None of these are "expected" under normal operation, and all are something that you want to keep down.
+
+
 ## Registration
 
 ### My Node fails to register with error ETIMEDOUT/EHOSTUNREACH

From b661d58002ab5a94a40cfca095e7600c559018eb Mon Sep 17 00:00:00 2001
From: Michael Vorburger
Date: Wed, 30 Aug 2023 20:00:18 +0200
Subject: [PATCH 2/3] Review feedback incorporated.

---
 docs/faq.md | 11 ++++-------
 1 file changed, 4 insertions(+), 7 deletions(-)

diff --git a/docs/faq.md b/docs/faq.md
index 9dd987b0..4a59bf7c 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -60,21 +60,18 @@ These are the current penalties that affect both DNS weight and earned FIL:
 - Fraudulent logging (e.g. self-dealing)
 - Multi-noding (Running multiple nodes on the same host)
 
-These are visibile e.g. by hovering over the _Weight_ of a Node on https://dashboard.saturn.tech, or on the _Penalty_ graph of the [Moonlet](https://github.com/31z4/saturn-moonlet).
+These are visible e.g. by hovering over the _Weight_ of a Node on https://dashboard.saturn.tech.
 
 Here are more details about each kind of penalty, with information what could cause it, and how to remedy if your Node encounters them:
 
-- `error_ratio` is caused by errors as shown in the log of L1 Node container, scuh as:
+- Error ratio is caused by network errors connecting to the node and HTTP 5xx errors
   - Node refusing to connect with a client multiple times.
-- `dup_cache_miss_ratio` is caused if a computed error ratio for consecutive duplicate cache misses goes over a threshold.
-- `health_check_failures` are caused by an unexpectedly (not deregistered) unreachable node.
+- Duplicate Cache Miss Ratio is caused when the same CID & file path is cached missed to upstream providers. This shouldn't happen if the node has enough disk and the right file permissions.
+- Health Check Failures are caused by an unexpectedly (not deregistered) unreachable node. This isn't necessarily the node's fault, but could also be a connectivity issue anywhere in the internet path from the orchestrator to the node.
   - This penalty starts with each health check failure event, and then gradually decreases over ~6h.
   - Fix this by resolving the root cause of the node unavailability.
   - Note the [deregister my node](#how-can-i-manually-deregister-my-node) section.
 
-None of these are "expected" under normal operation, and all are something that you want to keep down.
-
-
 ## Registration
 
 ### My Node fails to register with error ETIMEDOUT/EHOSTUNREACH

From b357381b1e9866d098ca133141f5fd9eb60fc911 Mon Sep 17 00:00:00 2001
From: Michael Vorburger
Date: Wed, 30 Aug 2023 23:12:39 +0200
Subject: [PATCH 3/3] docs: Add information about Old Version penalty to FAQ.

---
 docs/faq.md | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/docs/faq.md b/docs/faq.md
index 4a59bf7c..4c425710 100644
--- a/docs/faq.md
+++ b/docs/faq.md
@@ -72,6 +72,9 @@ Here are more details about each kind of penalty, with information what could ca
   - Fix this by resolving the root cause of the node unavailability.
   - Note the [deregister my node](#how-can-i-manually-deregister-my-node) section.
 
+- Old Version is caused by the container image version not having been upgraded after a new release
+  - This penalty is cleared immediately when registered with the current version.
+
 ## Registration
 
 ### My Node fails to register with error ETIMEDOUT/EHOSTUNREACH