Commit e4c0491

feat: add comprehensive training documentation for _Executable class
Signed-off-by: MonaaEid <monaa_eid@hotmail.com>
1 parent 5b91fe5 commit e4c0491

2 files changed

+347
-0
lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -24,6 +24,7 @@ This changelog is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.
2424
- Support selecting specific node account ID(s) for queries and transactions and added `Network._get_node()` with updated execution flow (#362)
2525
- Add TLS support with two-stage control (`set_transport_security()` and `set_verify_certificates()`) for encrypted connections to Hedera networks. TLS is enabled by default for hosted networks (mainnet, testnet, previewnet) and disabled for local networks (solo, localhost) (#855)
2626
- Add PR inactivity reminder bot for stale pull requests `.github/workflows/pr-inactivity-reminder-bot.yml`
27+
- Add comprehensive training documentation for _Executable class `docs/sdk_developers/training/executable.md`
2728

2829
### Changed
2930

Lines changed: 346 additions & 0 deletions
@@ -0,0 +1,346 @@
1+
# _Executable Class Training
2+
3+
## Table of Contents
4+
5+
- [Introduction to _Executable](#introduction-to-_executable)
6+
- [Execution Flow](#execution-flow)
7+
- [Retry Logic](#retry-logic)
8+
- [Exponential backoff](#exponential-backoff)
9+
- [Error Handling](#error-handling)
10+
- [Logging & Debugging](#logging--debugging)
11+
- [Practical Examples](#practical-examples)
12+
13+
## 📋Introduction to _Executable
14+
* The _Executable class is the backbone of the Hedera SDK execution engine. It handles sending transactions and queries, retry logic, error mapping, and logging, allowing child classes (like Transaction and Query) to focus on business logic.
15+
16+
```mermaid
17+
graph TD;
18+
_Executable-->Transaction;
19+
_Executable-->Query;
20+
Query-->TokenNftInfoQuery;
21+
Query-->TokenInfoQuery;
22+
Transaction-->TokenFreezeTransaction;
23+
Transaction-->TokenDissociateTransaction;
24+
```
25+
26+
27+
## ▶️⚙️Execution Flow
28+
29+
- How does `_execute(client)` work in the Hedera SDK?
30+
31+
The typical execution flow for transactions and queries using the Executable interface follows these steps:
32+
33+
1. **Build** → Create the transaction/query with required parameters
34+
2. **FreezeWith(client)** → Locks the transaction for signing
35+
3. **Sign(privateKey)** → Add required signatures
36+
4. **Execute(client)** → Submit to the network
37+
5. **GetReceipt(client)** → Confirm success/failure
38+
39+
40+
Here’s how child classes hook into the execution pipeline:
41+
42+
| Method | Description |
| --- | --- |
| `_make_request` | Builds the protobuf request for this operation. For example, a transaction class serializes its body into a Transaction proto, while a query class builds the appropriate query proto. |
| `_get_method(channel: _Channel) -> _Method` | Chooses which gRPC stub method to call. It obtains the service stubs from `channel`, then returns `_Method(transaction_func=...)` for transactions or `_Method(query_func=...)` for queries. The executor calls `_execute_method`, which uses the transaction function if present and the query function otherwise. |
| `_map_status_error(response)` | Inspects the network response status and converts it to an appropriate exception (precheck/receipt). The executor then raises or retries depending on the result of `_should_retry`. |
| `_should_retry(response) -> _ExecutionState` | Decides the execution state from the response status: `RETRY`, `FINISHED`, `ERROR`, or `EXPIRED`. This drives the retry loop and backoff. |
| `_map_response(response, node_id, proto_request)` | Converts the raw gRPC/proto response into the SDK's response type (e.g., `TransactionResponse` or a query result) that is returned to the caller. |
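
As a rough illustration of how these hooks fit together, the sketch below wires them into a single pass with no retry loop. `ToyExecutable`, `EchoQuery`, `run_once`, the dictionary-based request/response, and the string states are all hypothetical stand-ins chosen for this example; only the hook names mirror the table above, and the real SDK works with protobuf messages and `_ExecutionState` values instead.

```python
from abc import ABC, abstractmethod


class ToyExecutable(ABC):
    """Toy base class: owns the pipeline and delegates details to hooks."""

    @abstractmethod
    def _make_request(self): ...

    @abstractmethod
    def _get_method(self, channel): ...

    @abstractmethod
    def _should_retry(self, response): ...

    @abstractmethod
    def _map_status_error(self, response): ...

    @abstractmethod
    def _map_response(self, response, node_id, proto_request): ...

    def run_once(self, channel, node_id):
        """Hypothetical single pass through the hooks (no retries here)."""
        request = self._make_request()
        method = self._get_method(channel)
        response = method(request)
        # The real hook returns an _ExecutionState; plain strings keep this short.
        if self._should_retry(response) == "FINISHED":
            return self._map_response(response, node_id, request)
        raise self._map_status_error(response)


class EchoQuery(ToyExecutable):
    """Hypothetical child class that only supplies operation-specific logic."""

    def __init__(self, text):
        self.text = text

    def _make_request(self):
        return {"text": self.text}

    def _get_method(self, channel):
        return channel["echo"]

    def _should_retry(self, response):
        return "FINISHED" if response["status"] == "OK" else "ERROR"

    def _map_status_error(self, response):
        return RuntimeError(f"query failed with status {response['status']}")

    def _map_response(self, response, node_id, proto_request):
        return f"{node_id}: {response['body']}"


# A fake "channel" whose only method upper-cases the request text.
fake_channel = {"echo": lambda req: {"status": "OK", "body": req["text"].upper()}}
result = EchoQuery("hello").run_once(fake_channel, "0.0.3")
```

The point of the split is that the base class never needs to know what a request contains, so each child class stays small.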
49+
50+
51+
## 🔁Retry Logic
52+
- Core Logic:
53+
1. Loop up to max_attempts times — The outer for loop tries the operation multiple times
54+
2. Exponential backoff — Each retry waits longer than the previous one
55+
3. Execute and check response — After execution, determine if we should retry, fail, or succeed
56+
4. Smart error handling — Different errors trigger different actions
57+
58+
<img width="600" height="600" alt="image" src="https://github.com/user-attachments/assets/5d318db0-4f3d-45c8-98b8-e7b6d7bf0762" />
59+
60+
61+
**_Retry logic = Try the operation, wait progressively longer between attempts, pick a different node if needed, and give up after max attempts. This makes the system resilient to temporary network hiccups._**
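
The loop described above can be sketched in a few lines. This is a simplified model under stated assumptions, not the SDK's implementation: `TransientError`, `run_with_retries`, and the round-robin node rotation are hypothetical, and the `sleep` parameter defaults to a no-op so the example runs instantly.

```python
import itertools


class TransientError(Exception):
    """Hypothetical stand-in for a retryable failure such as a gRPC error."""


def run_with_retries(operation, nodes, max_attempts=4, sleep=lambda seconds: None):
    """Call operation(node) up to max_attempts times, rotating nodes on failure."""
    node_cycle = itertools.cycle(nodes)
    node = next(node_cycle)
    last_error = None
    for attempt in range(max_attempts):
        if attempt > 0:
            # Wait longer before each retry (assumed 0.25 s base, doubling).
            sleep(0.25 * 2 ** (attempt - 1))
        try:
            return operation(node)
        except TransientError as exc:
            last_error = exc
            node = next(node_cycle)  # pick a different node for the next attempt
    raise RuntimeError(f"exceeded {max_attempts} attempts; last error: {last_error}")


calls = []


def flaky(node):
    """Fails on the first two calls, then succeeds."""
    calls.append(node)
    if len(calls) < 3:
        raise TransientError(f"{node} unavailable")
    return f"ok via {node}"


outcome = run_with_retries(flaky, ["0.0.3", "0.0.4", "0.0.5"])
```

Here the third attempt succeeds on the third node; if every attempt failed, the caller would see a single error after `max_attempts` tries.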
62+
63+
64+
## ⏳Exponential backoff
65+
Key Steps:
66+
67+
* First retry: wait `_min_backoff` ms
68+
* Second retry: wait 2× that
69+
* Third retry: wait 4× that (doubling each time)
70+
* Stops growing at `_max_backoff`
71+
72+
_(Why? Gives the network time to recover between attempts without hammering it immediately.)_
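
To make the doubling-and-capping rule concrete, here is a minimal sketch. The function name `backoff_delay_ms` is hypothetical, and the 250 ms and 8000 ms defaults are taken from the delay progression shown in the debugging tips of this document; whether they match the SDK's actual `_min_backoff` and `_max_backoff` defaults is an assumption.

```python
def backoff_delay_ms(retry_number, min_backoff_ms=250, max_backoff_ms=8000):
    """Delay before the Nth retry (1-based): doubles each time, then caps."""
    return min(min_backoff_ms * 2 ** (retry_number - 1), max_backoff_ms)


# Delays for the first seven retries.
schedule = [backoff_delay_ms(n) for n in range(1, 8)]
```

With these assumed values the delay reaches the cap on the sixth retry and stays there.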
73+
74+
Handling gRPC errors:
75+
```python
# Excerpt from inside the retry loop, after the gRPC call is attempted
except grpc.RpcError as e:
    err_persistant = f"Status: {e.code()}, Details: {e.details()}"
    node = client.network._select_node()  # Switch nodes
    logger.trace("Switched to a different node...", "error", err_persistant)
    continue  # Retry with new node
```
82+
Retryable gRPC codes:
83+
84+
* `UNAVAILABLE`: Node down/unreachable
* `DEADLINE_EXCEEDED`: Request timeout
* `RESOURCE_EXHAUSTED`: Rate limited
* `INTERNAL`: Server error
88+
89+
_(If the [gRPC](https://en.wikipedia.org/wiki/GRPC) call itself fails, switch to a different network node and retry.)_
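
One way to express the list of retryable codes is a simple membership check. This sketch compares code names as plain strings so that it runs without the `grpc` package installed; with `grpcio`, the name of a status code is available as `e.code().name`. The helper `is_retryable_code` is hypothetical and only encodes the four codes listed above.

```python
# Code names copied from the list of retryable gRPC codes above.
RETRYABLE_GRPC_CODES = frozenset(
    {"UNAVAILABLE", "DEADLINE_EXCEEDED", "RESOURCE_EXHAUSTED", "INTERNAL"}
)


def is_retryable_code(code_name):
    """Return True if the given gRPC status code name is in the retryable set."""
    return code_name.upper() in RETRYABLE_GRPC_CODES


checks = {name: is_retryable_code(name) for name in ("UNAVAILABLE", "NOT_FOUND")}
```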
90+
91+
## 🚨Error Handling
92+
93+
* Mapping network errors to Python exceptions
94+
Abstract method that child classes implement:
95+
```python
96+
@abstractmethod
97+
def _map_status_error(self, response):
98+
"""Maps a response status code to an appropriate error object."""
99+
raise NotImplementedError(...)
100+
```
101+
102+
* Precheck errors --> PrecheckError (e.g., invalid account, insufficient balance)
103+
* Receipt errors --> ReceiptStatusError (e.g., transaction executed but failed)
104+
* Other statuses --> Appropriate exception types based on the status code
105+
106+
107+
* Retryable vs Fatal Errors
108+
Determined by `_should_retry(response) → _ExecutionState`:
109+
110+
```python
111+
@abstractmethod
112+
def _should_retry(self, response) -> _ExecutionState:
113+
"""Determine whether the operation should be retried based on the response."""
114+
raise NotImplementedError(...)
115+
```
116+
117+
The response is checked via `_should_retry()`, which returns one of four `_ExecutionState` values:
118+
119+
| State | Action |
| :--- | :--- |
| **RETRY** | Wait (backoff), then loop again |
| **FINISHED** | Success: return the response |
| **ERROR** | Permanent failure: raise an exception |
| **EXPIRED** | Request expired: raise an exception |
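
The table maps naturally onto an enum plus a small dispatch function. In this sketch, `ExecutionState` is a hypothetical stand-in for the SDK's `_ExecutionState`; the numeric values follow the state flow diagram in the debugging tips of this document, and the action strings are illustrative labels only.

```python
from enum import Enum


class ExecutionState(Enum):
    """Stand-in for _ExecutionState; values as shown in the state flow diagram."""
    RETRY = 0
    FINISHED = 1
    ERROR = 2
    EXPIRED = 3


def next_action(state):
    """Translate an execution state into what the retry loop does next."""
    if state is ExecutionState.RETRY:
        return "backoff_and_retry"
    if state is ExecutionState.FINISHED:
        return "return_response"
    # ERROR and EXPIRED both end the loop with an exception.
    return "raise_exception"


actions = {state.name: next_action(state) for state in ExecutionState}
```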
125+
126+
127+
128+
## ⚠️🐞Logging & Debugging
129+
130+
- Request ID tracking
131+
* Unique request identifier per operation:
132+
```python
133+
def _get_request_id(self):
134+
"""Format the request ID for the logger."""
135+
return f"{self.__class__.__name__}:{time.time_ns()}"
136+
```
137+
* Format: `ClassName:nanosecond_timestamp` (e.g., `TransferTransaction:1702057234567890123`)
138+
* Unique per execution, allowing you to trace a single operation through logs
139+
* Passed to every logger call for correlation
140+
141+
* Used throughout execution:
142+
```python
143+
logger.trace("Executing", "requestId", self._get_request_id(), "nodeAccountID", self.node_account_id, "attempt", attempt + 1, "maxAttempts", max_attempts)
144+
logger.trace("Executing gRPC call", "requestId", self._get_request_id())
145+
logger.trace("Retrying request attempt", "requestId", request_id, "delay", current_backoff, ...)
146+
```
147+
148+
* At each attempt start:
149+
```python
150+
logger.trace(
151+
"Executing",
152+
"requestId", self._get_request_id(),
153+
"nodeAccountID", self.node_account_id, # Which node this attempt uses
154+
"attempt", attempt + 1, # Current attempt (1-based)
155+
"maxAttempts", max_attempts # Total allowed attempts
156+
)
157+
```
158+
* During gRPC call:
159+
```python
logger.trace("Executing gRPC call", "requestId", self._get_request_id())
```
162+
* After response received:
163+
```python
logger.trace(
    f"{self.__class__.__name__} status received",
    "nodeAccountID", self.node_account_id,
    "network", client.network.network,  # Network name (testnet, mainnet)
    "state", execution_state.name,  # RETRY, FINISHED, ERROR, EXPIRED
    "txID", tx_id  # Transaction ID if available
)
```
171+
172+
* Before backoff/retry:
173+
```python
logger.trace(
    "Retrying request attempt",
    "requestId", request_id,
    "delay", current_backoff,  # Milliseconds to wait
    "attempt", attempt,
    "error", error  # The error that triggered retry
)
time.sleep(current_backoff * 0.001)  # Convert ms to seconds
```
182+
* Node switch on gRPC error:
183+
```python
logger.trace(
    "Switched to a different node for the next attempt",
    "error", err_persistant,
    "from node", self.node_account_id,  # Old node
    "to node", node._account_id  # New node
)
```
190+
* Final failure:
191+
```python
logger.error(
    "Exceeded maximum attempts for request",
    "requestId", self._get_request_id(),
    "last exception being", err_persistant
)
```
197+
198+
- Tips for debugging transaction/query failures
199+
1. Enable Trace Logging
200+
Capture detailed execution flow:
201+
```python
202+
client.logger.set_level("trace") # or DEBUG
203+
```
204+
* What you'll see:
205+
- _Every attempt number_
206+
- _Node account IDs being used_
207+
- _Backoff delays between retries_
208+
- _Status received at each stage_
209+
- _Node switches on errors_
210+
211+
2. Identify the Failure Type by Execution State
212+
Execution State Flow:
213+
```
┌─ RETRY (0)
│  ├─ Network hiccup (gRPC error)
│  ├─ Temporary node issue
│  └─ Rate limiting (try after backoff)
│
├─ FINISHED (1)
│  └─ Success ✓ (return response)
│
├─ ERROR (2)
│  ├─ Precheck error (bad input)
│  ├─ Invalid account/permissions
│  ├─ Insufficient balance
│  └─ Permanent failure
│
└─ EXPIRED (3)
   └─ Transaction ID expired (timing issue)
```
231+
232+
3. Track Backoff Progression
233+
Exponential backoff indicates retryable errors:
234+
```
Attempt 1: (no backoff, first try)
Attempt 2: delay 250ms
Attempt 3: delay 500ms
Attempt 4: delay 1000ms (1s)
Attempt 5: delay 2000ms (2s)
Attempt 6: delay 4000ms (4s)
Attempt 7: delay 8000ms (8s, capped)
Attempt 8+: delay 8000ms (stays capped)
```
243+
Interpretation:
244+
* => Growing delays: System is retrying a transient issue → healthy behavior
245+
* => Reaches cap (8s) multiple times: Network or node is struggling
246+
* => Fails immediately (no backoff): Permanent error (precheck/validation)
247+
4. Monitor Node Switches
248+
Watch for node switching patterns in logs:
249+
```
250+
Switched to a different node
251+
from node: 0.0.3
252+
to node: 0.0.4
253+
error: Status: UNAVAILABLE, Details: Node is offline
254+
```
255+
Healthy patterns:
256+
* => Few switches (1-2 per 3+ attempts)
257+
* => Changes due to network errors (gRPC failures)
258+
259+
Problem patterns:
260+
* => Rapid switches (multiple per attempt)
261+
* => All nodes fail → network-wide issue
262+
* => Always same node fails → that node may be down
263+
264+
5. Cross-Reference Transaction ID with Hedera Explorer
265+
For transactions, use the transaction ID to verify on the network:
266+
```
# From logs, capture txID
txID: "0.0.123@1702057234.567890123"

# Query Hedera mirror node
curl https://testnet.mirrornode.hedera.com/api/v1/transactions/0.0.123-1702057234-567890123
```
272+
What you'll find:
273+
* => Actual execution result on the network
274+
* => Receipt status
275+
* => Gas used (for contract calls)
276+
* => Confirms if transaction made it despite client-side errors
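
The transaction ID in the logs uses `@` and `.` as separators, while the mirror node URL in the example above uses dashes. A small helper can do the conversion. This is a sketch based only on the two formats shown in that example; `to_mirror_node_id` and `mirror_node_url` are hypothetical names, not SDK functions.

```python
def to_mirror_node_id(tx_id):
    """Convert '0.0.123@1702057234.567890123' to '0.0.123-1702057234-567890123'."""
    account, _, timestamp = tx_id.partition("@")
    seconds, _, nanos = timestamp.partition(".")
    if not (account and seconds and nanos):
        raise ValueError(f"unexpected transaction ID format: {tx_id!r}")
    return f"{account}-{seconds}-{nanos}"


def mirror_node_url(tx_id, base="https://testnet.mirrornode.hedera.com"):
    """Build the transactions endpoint URL used in the curl example."""
    return f"{base}/api/v1/transactions/{to_mirror_node_id(tx_id)}"


url = mirror_node_url("0.0.123@1702057234.567890123")
```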
277+
278+
6. Debug Specific Error Scenarios
279+
280+
| Error | Cause | Debug Steps |
| :--- | :---: | :---: |
| MaxAttemptsError | Failed after max retries | Check backoff log; all nodes failing? |
| PrecheckError | Bad request (immediate fail) | Validate: account ID, amount, permissions |
| ReceiptStatusError | Executed but failed | Check transaction details, balance, contract logic |
| gRPC RpcError | Network issue | Check node status, firewall, internet |
| EXPIRED state | Transaction ID too old | Use fresh transaction ID, check system clock |
287+
288+
7. Practical Debugging Workflow
289+
* Step 1: Capture the request ID
290+
`From error output: requestId = TransferTransaction:1702057234567890123`
291+
292+
* Step 2: Search logs for that request ID
293+
```grep "1702057234567890123" application.log```
294+
295+
* Step 3: Analyze the sequence
296+
```
[TRACE] Executing attempt=1 nodeAccountID=0.0.3 ...
[TRACE] Executing gRPC call ...
[TRACE] Retrying request delay=250ms ...
[TRACE] Executing attempt=2 nodeAccountID=0.0.4 ...
[TRACE] Switched to a different node error=Status: UNAVAILABLE ...
[ERROR] Exceeded maximum attempts ...
```
303+
304+
* Step 4: Determine root cause
305+
* => Multiple retries with node switches → Network issue
306+
* => Single attempt, immediate ERROR → Input validation issue
307+
* => EXPIRED after 1+ attempts → Timeout/clock issue
308+
309+
* Step 5: Take action
310+
* => Network issue: Retry after delay, check network status
311+
* => Input issue: Fix account ID, amount, permissions
312+
* => Timeout: Increase max_backoff or check system clock
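
Steps 2 to 4 can also be scripted. The sketch below filters log text by request ID and summarizes what happened. The `key=value` log line layout in `SAMPLE_LOG` is an assumption made for this example (the real log format may differ), and `summarize_request` is a hypothetical helper.

```python
import re

# Assumed log layout; adjust the patterns to match your actual log output.
SAMPLE_LOG = """\
[TRACE] Executing requestId=TransferTransaction:1702057234567890123 attempt=1 nodeAccountID=0.0.3
[TRACE] Executing requestId=TokenInfoQuery:1702057299000000000 attempt=1 nodeAccountID=0.0.5
[TRACE] Retrying request attempt requestId=TransferTransaction:1702057234567890123 delay=250
[TRACE] Executing requestId=TransferTransaction:1702057234567890123 attempt=2 nodeAccountID=0.0.4
[ERROR] Exceeded maximum attempts for request requestId=TransferTransaction:1702057234567890123
"""


def summarize_request(log_text, request_id):
    """Collect attempt count, nodes used, and failure flag for one request ID."""
    lines = [line for line in log_text.splitlines() if request_id in line]
    attempts = [
        int(match.group(1))
        for line in lines
        if (match := re.search(r"\battempt=(\d+)", line))
    ]
    nodes = re.findall(r"nodeAccountID=(\S+)", "\n".join(lines))
    return {
        "attempts": max(attempts, default=0),
        "nodes": nodes,
        "failed": any(line.startswith("[ERROR]") for line in lines),
    }


summary = summarize_request(SAMPLE_LOG, "1702057234567890123")
```

A summary with several attempts across different nodes and a final failure points toward a network issue, in line with Step 4.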
313+
314+
8. Enable Verbose Logging for Production Issues
315+
* For real-world debugging:
316+
```python
# Set a more verbose log level before executing
client.logger.set_level("debug")

# Or configure the output format of Python's standard logging
import logging
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
```
324+
Captures:
325+
* => Request ID, attempt number, node ID
326+
* => Backoff delays and progression
327+
* => gRPC errors with status codes
328+
* => Final error message with context
329+
330+
Debug Checklist:
331+
332+
* ✅ Confirm request ID appears in logs (means operation was attempted)
333+
* ✅ Count attempts (did it retry or fail immediately?)
334+
* ✅ Check execution states (RETRY → ERROR or RETRY → FINISHED?)
335+
* ✅ Note node switches (gRPC errors or single node?)
336+
* ✅ Verify backoff progression (exponential or capped?)
337+
* ✅ Match final error to exception type (Precheck, Receipt, MaxAttempts, etc.)
338+
* ✅ Cross-check transaction ID with Hedera explorer if available
339+
340+
This lets you quickly identify whether a failure is transient (network), permanent (bad input), or rate-related (backoff didn't help).
341+
342+
## Practical Examples
343+
* [Token Association Example](https://github.com/hiero-ledger/hiero-sdk-python/blob/main/examples/tokens/token_associate_transaction.py)
344+
* [Token Freeze Example](https://github.com/hiero-ledger/hiero-sdk-python/blob/main/examples/tokens/token_freeze_transaction.py)
345+
* [Account Info Query Example](https://github.com/hiero-ledger/hiero-sdk-python/blob/main/examples/query/account_info_query.py)
346+
