Commit 037ca62

bot: unify inactivity unassign logic (Phase 1 + Phase 2) — squashed & signed
1 parent d57a5db commit 037ca62

5 files changed

+510
-67
lines changed

CHANGELOG.md

Lines changed: 8 additions & 1 deletion
@@ -8,8 +8,14 @@ This changelog is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.
88

99
### Added
1010
- Unified the inactivity-unassign bot into a single script with `DRY_RUN` support, and fixed handling of cross-repo PR references for stale detection.
11+
12+
- Added unit tests for `SubscriptionHandle` class covering cancellation state, thread management, and join operations.
13+
14+
15+
- Refactored `account_create_transaction_create_with_alias.py` example by splitting monolithic function into modular functions: `generate_main_and_alias_keys()`, `create_account_with_ecdsa_alias()`, `fetch_account_info()`, `print_account_summary()` (#1016)
16+
-
1117
- Modularized `transfer_transaction_fungible` example by introducing `account_balance_query()` & `transfer_transaction()`. Renamed `transfer_tokens()` to `main()`
12-
- Phase 2 of the inactivity-unassign bot:Automatically detects stale open pull requests (no commit activity for 21+ days), comments with a helpful InactivityBot message, closes the stale PR, and unassigns the contributor from the linked issue.
18+
- Phase 2 of the inactivity-unassign bot: Automatically detects stale open pull requests (no commit activity for 21+ days), comments with a helpful InactivityBot message, closes the stale PR, and unassigns the contributor from the linked issue.
1319
- Added `__str__()` to CustomFixedFee and updated examples and tests accordingly.
1420
- Added unit tests for `crypto_utils` (#993)
1521
- Added a github template for good first issues
@@ -26,6 +32,7 @@ This changelog is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.
2632
- Support selecting specific node account ID(s) for queries and transactions and added `Network._get_node()` with updated execution flow (#362)
2733
- Add TLS support with two-stage control (`set_transport_security()` and `set_verify_certificates()`) for encrypted connections to Hedera networks. TLS is enabled by default for hosted networks (mainnet, testnet, previewnet) and disabled for local networks (solo, localhost) (#855)
2834
- Add PR inactivity reminder bot for stale pull requests `.github/workflows/pr-inactivity-reminder-bot.yml`
35+
- Add comprehensive training documentation for _Executable class `docs/sdk_developers/training/executable.md`
2936

3037
### Changed
3138

CONTRIBUTING.md

Lines changed: 2 additions & 1 deletion
@@ -166,4 +166,5 @@ Thank you for contributing to the Hiero Python SDK! 🎉
166166
- **Need help or want to connect?** Join our community on Discord! See the **[Discord Joining Guide](docs/discord.md)** for detailed steps on how to join the LFDT server
167167
- **Quick Links:**
168168
- Join the main [Linux Foundation Decentralized Trust (LFDT) Discord Server](https://discord.gg/hyperledger).
169-
- Go directly to the [#hiero-python-sdk channel](https://discord.com/channels/905194001349627914/1336494517544681563)
169+
- Go directly to the [#hiero-python-sdk channel](https://discord.com/channels/905194001349627914/1336494517544681563)
170+
Lines changed: 357 additions & 0 deletions
@@ -0,0 +1,357 @@
1+
# _Executable Class Training
2+
3+
## Table of Contents
4+
5+
- [Introduction to _Executable](#introduction-to-_executable)
6+
- [Execution Flow](#execution-flow)
7+
- [Retry Logic](#retry-logic)
8+
- [Exponential backoff](#exponential-backoff)
9+
- [Error Handling](#error-handling)
10+
- [Logging & Debugging](#logging--debugging)
11+
- [Practical Examples](#practical-examples)
12+
13+
## Introduction to _Executable
14+
* The _Executable class is the backbone of the Hedera SDK execution engine. It handles sending transactions and queries, retry logic, error mapping, and logging, allowing child classes (like Transaction and Query) to focus on business logic.
15+
16+
```mermaid
17+
graph TD;
18+
_Executable-->Transaction;
19+
_Executable-->Query;
20+
Query-->TokenNftInfoQuery;
21+
Query-->TokenInfoQuery;
22+
Transaction-->TokenFreezeTransaction;
23+
Transaction-->TokenDissociateTransaction;
24+
```
25+
26+
27+
## Execution Flow
28+
29+
- How `_execute(client)` works in the Hedera SDK
30+
31+
The typical execution flow for transactions and queries using the `_Executable` interface follows these steps:
32+
33+
1. **Build** → Create the transaction/query with required parameters
34+
2. **FreezeWith(client)** → Locks the transaction for signing
35+
3. **Sign(privateKey)** → Add required signatures
36+
4. **Execute(client)** → Submit to the network
37+
5. **GetReceipt(client)** → Confirm success/failure
38+
39+
40+
- Here’s how child classes hook into the execution pipeline:
41+
42+
| Method | Description |
43+
| --- | --- |
44+
| `_make_request` | Build the protobuf request for this operation. Example: a transaction class serializes its body into a Transaction proto; a query class builds the appropriate query proto. |
45+
| `_get_method(channel: _Channel) -> _Method` | Choose which gRPC stub method to call. You get service stubs from channel, then return _Method(transaction_func=...) for transactions or _Method(query_func=...) for queries. The executor calls _execute_method, which picks transaction if present, otherwise query. |
46+
| `_map_status_error(response)` | Inspect the network response status and convert it to an appropriate exception (precheck/receipt). This lets the executor decide whether to raise or retry based on _should_retry. |
47+
| `_should_retry(response) -> _ExecutionState` | Decide the execution state from the response/status: RETRY, FINISHED, ERROR, or EXPIRED. This drives the retry loop and backoff. |
48+
| `_map_response(response, node_id, proto_request)` | Convert the raw gRPC/Proto response into the SDK’s response type (e.g., TransactionResponse, Query result) that gets returned to the caller. |
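The hook table above can be illustrated with a toy pipeline. This is a minimal sketch, not the SDK's `_Executable`: `ToyExecutable`, `ToyBalanceQuery`, the `run()` method, and the dictionary-shaped responses are all hypothetical, the gRPC call is replaced by a plain callable, and there is no retry loop.

```python
from abc import ABC, abstractmethod


class ToyExecutable(ABC):
    """Simplified stand-in for _Executable (hypothetical, single attempt only)."""

    @abstractmethod
    def _make_request(self):
        """Build the request payload."""

    @abstractmethod
    def _should_retry(self, response) -> str:
        """Return a state name: RETRY, FINISHED, ERROR, or EXPIRED."""

    @abstractmethod
    def _map_status_error(self, response) -> Exception:
        """Turn a failed response into an exception."""

    @abstractmethod
    def _map_response(self, response):
        """Turn a successful response into the caller-facing value."""

    def run(self, send):
        """Invoke the hooks in order; `send` replaces the gRPC call."""
        response = send(self._make_request())
        if self._should_retry(response) == "FINISHED":
            return self._map_response(response)
        raise self._map_status_error(response)


class ToyBalanceQuery(ToyExecutable):
    def _make_request(self):
        return {"query": "balance"}

    def _should_retry(self, response):
        return "FINISHED" if response["status"] == "OK" else "ERROR"

    def _map_status_error(self, response):
        return RuntimeError(f"query failed with status {response['status']}")

    def _map_response(self, response):
        return response["balance"]


# Canned transport: returns a fixed response instead of contacting a node.
balance = ToyBalanceQuery().run(lambda request: {"status": "OK", "balance": 100})
```

The subclass only describes what to send and how to interpret the answer; the base class owns the order in which the hooks are called.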
49+
50+
51+
## Retry Logic
52+
- Core Logic:
53+
1. Loop up to max_attempts times — The outer for loop tries the operation multiple times
54+
2. Exponential backoff — Each retry waits longer than the previous one
55+
3. Execute and check response — After execution, determine if we should retry, fail, or succeed
56+
4. Smart error handling — Different errors trigger different actions
57+
58+
<img width="600" height="600" alt="image" src="https://github.com/user-attachments/assets/5d318db0-4f3d-45c8-98b8-e7b6d7bf0762" />
59+
60+
61+
**_Retry logic = Try the operation, wait progressively longer between attempts, pick a different node if needed, and give up after max attempts. This makes the system resilient to temporary network hiccups._**
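The loop described above can be sketched in a few lines. This is an illustration under stated assumptions, not the SDK's implementation: `run_with_retries`, its parameters, and the use of `ConnectionError` as the transient failure are hypothetical, and the sleep function is injectable so the example runs without waiting.

```python
import time


def run_with_retries(operation, max_attempts=3, min_backoff_ms=250,
                     max_backoff_ms=8000, sleep=time.sleep):
    """Call operation(attempt) until it succeeds or attempts run out."""
    backoff_ms = min_backoff_ms
    last_error = None
    for attempt in range(max_attempts):
        if attempt > 0:
            # Wait before every retry, doubling the delay up to the cap.
            sleep(backoff_ms / 1000)
            backoff_ms = min(backoff_ms * 2, max_backoff_ms)
        try:
            return operation(attempt)
        except ConnectionError as exc:  # stand-in for a transient failure
            last_error = exc
    raise RuntimeError(f"exceeded {max_attempts} attempts") from last_error


def flaky(attempt):
    # Hypothetical operation: fails twice, then succeeds.
    if attempt < 2:
        raise ConnectionError("node unreachable")
    return "ok"


waits = []  # records the delays instead of actually sleeping
result = run_with_retries(flaky, sleep=waits.append)
```

Here `result` is `"ok"` and `waits` holds the two delays (in seconds) taken before the second and third attempts.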
62+
63+
64+
## Exponential backoff
65+
Key Steps:
66+
67+
* First retry: wait `_min_backoff` ms
68+
* Second retry: wait 2× that
69+
* Third retry: wait 4× that (doubling each time)
70+
* Stops growing at `_max_backoff`
71+
72+
_(Why? Gives the network time to recover between attempts without hammering it immediately.)_
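The doubling rule can be checked with a small helper. This is a sketch: `backoff_delays` is a hypothetical function, and the defaults of 250 ms and 8000 ms are taken from the example progression in the debugging tips of this guide rather than read from the SDK.

```python
def backoff_delays(retries, min_backoff_ms=250, max_backoff_ms=8000):
    """Return the delay (ms) before each retry: doubling, then capped."""
    delays = []
    delay = min_backoff_ms
    for _ in range(retries):
        delays.append(delay)
        delay = min(delay * 2, max_backoff_ms)
    return delays


schedule = backoff_delays(7)
# schedule == [250, 500, 1000, 2000, 4000, 8000, 8000]
```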
73+
74+
Handling gRPC errors:
75+
```python
76+
except grpc.RpcError as e:
77+
err_persistant = f"Status: {e.code()}, Details: {e.details()}"
78+
node = client.network._select_node() # Switch nodes
79+
logger.trace("Switched to a different node...", "error", err_persistant)
80+
continue # Retry with new node
81+
```
82+
Retryable gRPC codes:
83+
84+
* `UNAVAILABLE` — Node down/unreachable
85+
* `DEADLINE_EXCEEDED` — Request timeout
86+
* `RESOURCE_EXHAUSTED` — Rate limited
87+
* `INTERNAL` — Server error
88+
89+
_(If the [gRPC](https://en.wikipedia.org/wiki/GRPC) call itself fails, switch to a different network node and retry.)_
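A minimal sketch of the "retryable code, then switch node" decision follows. To stay dependency-free it compares status names as strings; real code would compare `grpc.StatusCode` members. `is_retryable`, `next_node`, and the round-robin strategy are assumptions made for illustration, and the SDK's own node selection may differ.

```python
RETRYABLE_GRPC_CODES = frozenset(
    {"UNAVAILABLE", "DEADLINE_EXCEEDED", "RESOURCE_EXHAUSTED", "INTERNAL"}
)


def is_retryable(code_name: str) -> bool:
    """True when the gRPC status name is in the retryable list above."""
    return code_name.upper() in RETRYABLE_GRPC_CODES


def next_node(nodes, current):
    """Round-robin choice of the next node (hypothetical strategy)."""
    index = nodes.index(current)
    return nodes[(index + 1) % len(nodes)]


nodes = ["0.0.3", "0.0.4", "0.0.5"]
node = "0.0.3"
if is_retryable("UNAVAILABLE"):
    node = next_node(nodes, node)  # retry on a different node
```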
90+
91+
## Error Handling
92+
93+
* Mapping network errors to Python exceptions
94+
Abstract method that child classes implement:
95+
```python
96+
@abstractmethod
97+
def _map_status_error(self, response):
98+
"""Maps a response status code to an appropriate error object."""
99+
raise NotImplementedError(...)
100+
```
101+
102+
- Precheck errors --> PrecheckError (e.g., invalid account, insufficient balance)
103+
- Receipt errors --> ReceiptStatusError (e.g., transaction executed but failed)
104+
- Other statuses --> Appropriate exception types based on the status code
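The mapping above might look roughly like the following. This is only a sketch: `ToyPrecheckError` and `ToyReceiptStatusError` are stand-ins for the SDK's exception types, and the `stage` argument is a hypothetical way of telling the two cases apart, not how the SDK distinguishes them.

```python
class ToyPrecheckError(Exception):
    """Stand-in for a precheck failure (request rejected before execution)."""


class ToyReceiptStatusError(Exception):
    """Stand-in for a receipt failure (executed, but did not succeed)."""


def map_status_error(stage: str, status: str) -> Exception:
    """Pick an exception type based on where the failing status came from."""
    if stage == "precheck":
        return ToyPrecheckError(f"precheck failed with status {status}")
    if stage == "receipt":
        return ToyReceiptStatusError(f"receipt reported status {status}")
    return RuntimeError(f"unexpected status {status} at stage {stage}")


error = map_status_error("precheck", "INVALID_ACCOUNT_ID")
```

Returning the exception instead of raising it leaves the decision to raise or retry with the caller, which matches the role the hook table gives to `_map_status_error`.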
105+
106+
107+
* Retryable vs Fatal Errors
108+
Determined by `_should_retry(response) → _ExecutionState`:
109+
110+
```python
111+
@abstractmethod
112+
def _should_retry(self, response) -> _ExecutionState:
113+
"""Determine whether the operation should be retried based on the response."""
114+
raise NotImplementedError(...)
115+
```
116+
117+
The response is checked via `_should_retry()`, which returns one of four `_ExecutionState` values:
118+
119+
| State | Action |
120+
| :--------------| :---------------------------------------|
121+
| **RETRY** | `Wait (backoff), then loop again` |
122+
| **FINISHED** | `Success! Return the response` |
123+
| **ERROR** | `Permanent failure, raise exception` |
124+
| **EXPIRED** | `Request expired, raise exception` |
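The state table can be expressed as a small enum plus a dispatcher. This is a sketch: `ExecutionState` mirrors the four names above with the numeric values shown in the execution state flow diagram later in this guide, and `next_action` with its return strings is hypothetical.

```python
from enum import Enum


class ExecutionState(Enum):
    """Toy mirror of the four states; not the SDK's _ExecutionState."""
    RETRY = 0
    FINISHED = 1
    ERROR = 2
    EXPIRED = 3


def next_action(state: ExecutionState) -> str:
    """Translate a state into what the retry loop does next."""
    if state is ExecutionState.RETRY:
        return "backoff_and_retry"
    if state is ExecutionState.FINISHED:
        return "return_response"
    # ERROR and EXPIRED both end the loop with an exception.
    return "raise_exception"


actions = [next_action(state) for state in ExecutionState]
```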
125+
126+
127+
128+
## Logging & Debugging
129+
130+
- Request ID tracking
131+
* Unique request identifier per operation:
132+
```python
133+
def _get_request_id(self):
134+
"""Format the request ID for the logger."""
135+
return f"{self.__class__.__name__}:{time.time_ns()}"
136+
```
137+
* Format: `ClassName:nanosecond_timestamp` (e.g., `TransferTransaction:1702057234567890123`)
138+
* Unique per execution, allowing you to trace a single operation through logs
139+
* Passed to every logger call for correlation
140+
141+
* Used throughout execution:
142+
```python
143+
logger.trace("Executing", "requestId", self._get_request_id(), "nodeAccountID", self.node_account_id, "attempt", attempt + 1, "maxAttempts", max_attempts)
144+
logger.trace("Executing gRPC call", "requestId", self._get_request_id())
145+
logger.trace("Retrying request attempt", "requestId", request_id, "delay", current_backoff, ...)
146+
```
147+
148+
* At each attempt start:
149+
```python
150+
logger.trace(
151+
"Executing",
152+
"requestId", self._get_request_id(),
153+
"nodeAccountID", self.node_account_id, # Which node this attempt uses
154+
"attempt", attempt + 1, # Current attempt (1-based)
155+
"maxAttempts", max_attempts # Total allowed attempts
156+
)
157+
```
158+
* During gRPC call:
159+
```python
160+
logger.trace("Executing gRPC call", "requestId", self._get_request_id())
161+
```
162+
* After response received:
163+
```python
164+
logger.trace(
165+
f"{self.__class__.__name__} status received",
166+
"nodeAccountID", self.node_account_id,
167+
"network", client.network.network, # Network name (testnet, mainnet)
168+
"state", execution_state.name, # RETRY, FINISHED, ERROR, EXPIRED
169+
"txID", tx_id # Transaction ID if available
170+
)
171+
```
172+
173+
* Before backoff/retry:
174+
```python
175+
logger.trace(
176+
"Retrying request attempt",
177+
"requestId", request_id,
178+
"delay", current_backoff, # Milliseconds to wait
179+
"attempt", attempt,
180+
"error", error # The error that triggered retry
181+
)
182+
time.sleep(current_backoff * 0.001) # Convert ms to seconds
183+
```
184+
* Node switch on gRPC error:
185+
```python
186+
logger.trace(
187+
"Switched to a different node for the next attempt",
188+
"error", err_persistant,
189+
"from node", self.node_account_id, # Old node
190+
"to node", node._account_id # New node
191+
)
192+
```
193+
* Final failure:
194+
```python
195+
logger.error(
196+
"Exceeded maximum attempts for request",
197+
"requestId", self._get_request_id(),
198+
"last exception being", err_persistant
199+
)
200+
```
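To correlate log lines by request ID, it helps to split the ID back into its parts. The sketch below assumes the `ClassName:nanosecond_timestamp` format shown above; `make_request_id`, `split_request_id`, and the empty `TransferTransaction` class are hypothetical helpers for illustration.

```python
import time


def make_request_id(obj) -> str:
    """Build an ID in the ClassName:nanosecond_timestamp format."""
    return f"{obj.__class__.__name__}:{time.time_ns()}"


def split_request_id(request_id: str):
    """Return (class_name, timestamp_ns) parsed from a request ID."""
    class_name, _, timestamp = request_id.partition(":")
    return class_name, int(timestamp)


class TransferTransaction:
    """Empty stand-in used only to supply a class name."""


# Generate the ID once and reuse it, so every log line shares the same value.
request_id = make_request_id(TransferTransaction())
name, timestamp_ns = split_request_id(request_id)
```

Because the timestamp comes from `time.time_ns()`, calling the generator again would produce a different value; storing the ID once per execution keeps it stable across log lines in this sketch.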
201+
202+
- Tips for debugging transaction/query failures
203+
1. Enable Trace Logging
204+
* Capture detailed execution flow:
205+
```python
206+
client.logger.set_level("trace") # or DEBUG
207+
```
208+
* What you'll see:
209+
- _Every attempt number_
210+
- _Node account IDs being used_
211+
- _Backoff delays between retries_
212+
- _Status received at each stage_
213+
- _Node switches on errors_
214+
215+
2. Identify the Failure Type by Execution State
216+
Execution State Flow:
217+
```
218+
┌─ RETRY (0)
219+
│ ├─ Network hiccup (gRPC error)
220+
│ ├─ Temporary node issue
221+
│ └─ Rate limiting (try after backoff)
222+
223+
├─ FINISHED (1)
224+
│ └─ Success ✓ (return response)
225+
226+
├─ ERROR (2)
227+
│ ├─ Precheck error (bad input)
228+
│ ├─ Invalid account/permissions
229+
│ ├─ Insufficient balance
230+
│ └─ Permanent failure
231+
232+
└─ EXPIRED (3)
233+
└─ Transaction ID expired (timing issue)
234+
```
235+
236+
3. Track Backoff Progression
237+
Exponential backoff indicates retryable errors:
238+
```text
239+
Attempt 1: (no backoff, first try)
240+
Attempt 2: delay 250ms
241+
Attempt 3: delay 500ms
242+
Attempt 4: delay 1000ms (1s)
243+
Attempt 5: delay 2000ms (2s)
244+
Attempt 6: delay 4000ms (4s)
245+
Attempt 7: delay 8000ms (8s, capped)
246+
Attempt 8+: delay 8000ms (stays capped)
247+
```
248+
Interpretation:
249+
* => Growing delays: System is retrying a transient issue → healthy behavior
250+
* => Reaches cap (8s) multiple times: Network or node is struggling
251+
* => Fails immediately (no backoff): Permanent error (precheck/validation)
252+
4. Monitor Node Switches
253+
Watch for node switching patterns in logs:
254+
```text
255+
Switched to a different node
256+
from node: 0.0.3
257+
to node: 0.0.4
258+
error: Status: UNAVAILABLE, Details: Node is offline
259+
```
260+
Healthy patterns:
261+
* => Few switches (1-2 per 3+ attempts)
262+
* => Changes due to network errors (gRPC failures)
263+
264+
Problem patterns:
265+
* => Rapid switches (multiple per attempt)
266+
* => All nodes fail → network-wide issue
267+
* => Always same node fails → that node may be down
268+
269+
5. Cross-Reference Transaction ID with Hedera Explorer
270+
For transactions, use the transaction ID to verify on the network:
271+
```
272+
# From logs, capture txID
273+
txID: "0.0.123@1702057234.567890123"
274+
275+
# Query Hedera mirror node
276+
curl https://testnet.mirrornode.hedera.com/api/v1/transactions/0.0.123-1702057234-567890123
277+
```
278+
What you'll find:
279+
* => Actual execution result on the network
280+
* => Receipt status
281+
* => Gas used (for contract calls)
282+
* => Confirms if transaction made it despite client-side errors
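The transaction ID in the logs uses `@` and `.` separators, while the mirror node URL in the example above uses dashes. A small helper can do the conversion; this is a sketch, with `to_mirror_node_id` and `mirror_node_url` as hypothetical names and the base URL taken from the curl example above.

```python
def to_mirror_node_id(tx_id: str) -> str:
    """Convert 'account@seconds.nanos' into 'account-seconds-nanos'."""
    account, _, timestamp = tx_id.partition("@")
    seconds, _, nanos = timestamp.partition(".")
    if not (account and seconds and nanos):
        raise ValueError(f"unexpected transaction ID format: {tx_id!r}")
    return f"{account}-{seconds}-{nanos}"


def mirror_node_url(tx_id: str,
                    base: str = "https://testnet.mirrornode.hedera.com") -> str:
    """Build the transaction lookup URL for a mirror node."""
    return f"{base}/api/v1/transactions/{to_mirror_node_id(tx_id)}"


url = mirror_node_url("0.0.123@1702057234.567890123")
```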
283+
284+
6. Debug Specific Error Scenarios
285+
286+
| Error | Cause | Debug Steps |
287+
| :--- | :---: | :---: |
288+
| MaxAttemptsError | Failed after max retries | Check backoff log; all nodes failing? |
289+
| PrecheckError | Bad request (immediate fail)| Validate: account ID, amount, permissions |
290+
| ReceiptStatusError| Executed but failed | Check transaction details, balance, contract logic|
291+
| gRPC RpcError | Network issue | Check node status, firewall, internet |
292+
| EXPIRED state | Transaction ID too old | Use fresh transaction ID, check system clock |
293+
294+
7. Practical Debugging Workflow
295+
* Step 1: Capture the request ID
296+
```
297+
From error output: requestId = TransferTransaction:1702057234567890123
298+
```
299+
300+
* Step 2: Search logs for that request ID
301+
```text
302+
grep "1702057234567890123" application.log
303+
```
304+
305+
* Step 3: Analyze the sequence
306+
```
307+
[TRACE] Executing attempt=1 nodeAccountID=0.0.3 ...
308+
[TRACE] Executing gRPC call ...
309+
[TRACE] Retrying request delay=250ms ...
310+
[TRACE] Executing attempt=2 nodeAccountID=0.0.4 ...
311+
[TRACE] Switched to a different node error=Status: UNAVAILABLE ...
312+
[ERROR] Exceeded maximum attempts ...
313+
```
314+
315+
* Step 4: Determine root cause
316+
* => Multiple retries with node switches → Network issue
317+
* => Single attempt, immediate ERROR → Input validation issue
318+
* => EXPIRED after 1+ attempts → Timeout/clock issue
319+
320+
* Step 5: Take action
321+
* => Network issue: Retry after delay, check network status
322+
* => Input issue: Fix account ID, amount, permissions
323+
* => Timeout: Increase max_backoff or check system clock
324+
325+
8. Enable Verbose Logging for Production Issues
326+
* For real-world debugging:
327+
```python
328+
# Set a more verbose log level before executing
329+
client.logger.set_level("debug")
330+
331+
# Or configure structured logging
332+
import logging
333+
logging.basicConfig(level=logging.DEBUG, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
334+
```
335+
Captures:
336+
* => Request ID, attempt number, node ID
337+
* => Backoff delays and progression
338+
* => gRPC errors with status codes
339+
* => Final error message with context
340+
341+
Debug Checklist:
342+
343+
* ✅ Confirm request ID appears in logs (means operation was attempted)
344+
* ✅ Count attempts (did it retry or fail immediately?)
345+
* ✅ Check execution states (RETRY → ERROR or RETRY → FINISHED?)
346+
* ✅ Note node switches (gRPC errors or single node?)
347+
* ✅ Verify backoff progression (exponential or capped?)
348+
* ✅ Match final error to exception type (Precheck, Receipt, MaxAttempts, etc.)
349+
* ✅ Cross-check transaction ID with Hedera explorer if available
350+
351+
This lets you quickly identify whether a failure is transient (network), permanent (bad input), or rate-related (backoff didn't help).
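The checklist can be condensed into a rough triage helper. This is a sketch only: `classify_failure`, its input (the list of execution state names seen per attempt, in order), and the category labels are hypothetical, and real incidents may need more context than the state sequence alone.

```python
def classify_failure(states, max_attempts):
    """Classify an execution from the state names observed per attempt."""
    if not states:
        return "not attempted"
    last = states[-1]
    if last == "FINISHED":
        return "success"
    if last == "EXPIRED":
        return "expired (check timing and system clock)"
    if states == ["ERROR"]:
        return "permanent (bad input)"
    if len(states) >= max_attempts and all(s == "RETRY" for s in states):
        return "retries exhausted (backoff did not help)"
    return "transient (network)"


verdict = classify_failure(["RETRY", "RETRY", "RETRY"], max_attempts=3)
```

For example, a single `ERROR` with no retries points at input validation, while a run of `RETRY` states that reaches the attempt limit suggests the network or nodes never recovered.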
352+
353+
## Practical Examples
354+
* [Token Association Example](https://github.com/hiero-ledger/hiero-sdk-python/blob/main/examples/tokens/token_associate_transaction.py)
355+
* [Token Freeze Example](https://github.com/hiero-ledger/hiero-sdk-python/blob/main/examples/tokens/token_freeze_transaction.py)
356+
* [Account Info Query Example](https://github.com/hiero-ledger/hiero-sdk-python/blob/main/examples/query/account_info_query.py)
357+
