Summary
Demand placement workflows (MSD, TrafficMatrixPlacement) perform redundant SPF computations. Each SPF call returns shortest paths to all nodes, yet we discard every destination except one. When multiple demands share the same source, we recompute identical SPFs.
Problem
In a production scenario with 720 demands from 77 unique sources:
- Current: 720 SPF calls per alpha evaluation
- Optimal: ~77 SPF calls per alpha evaluation
- Measured impact: ~3x slowdown from redundant SPFs (MSD runtime drops from 106s to 36s with caching)
Proposed Optimizations
- SPF Caching by Source (High Impact)
Cache the full predecessor DAG per source node within each placement batch:
spf_cache: dict[int, PredDAG] = {}
for demand in demands:
    # unpack demand endpoints and volume (attribute names illustrative)
    src_id, dst_id, volume = demand.src_id, demand.dst_id, demand.volume
    if src_id not in spf_cache:
        spf_cache[src_id] = algorithms.spf(src=src_id, dst=None, ...)
    dag = spf_cache[src_id]
    placed = flow_graph.place(src_id, dst_id, dag, volume, ...)
    # Recompute if all shortest paths to this destination saturated
    if placed < volume:
        spf_cache[src_id] = algorithms.spf(src=src_id, dst=None, ...)
        # retry...
The existing APIs already support this: spf(dst=None) returns the full predecessor DAG, and flow_graph.place() accepts a pre-computed DAG and validates it against current residuals.
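To make the saving concrete, here is a self-contained toy version of the pattern. It is not the ngraph API: spf_all is a plain Dijkstra standing in for algorithms.spf(src, dst=None), and the graph/demand data are invented. The point is that the SPF call count tracks unique sources, not demands:

```python
from heapq import heappop, heappush


def spf_all(graph, src):
    """Toy stand-in for algorithms.spf(src, dst=None): Dijkstra to all nodes."""
    dist, pred = {src: 0}, {src: None}
    heap = [(0, src)]
    while heap:
        d, u = heappop(heap)
        if d > dist[u]:
            continue  # stale heap entry
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v], pred[v] = nd, u
                heappush(heap, (nd, v))
    return dist, pred


graph = {0: [(1, 1), (2, 4)], 1: [(2, 1)], 2: [(3, 1)]}
# Five demands but only two unique sources -> two SPF calls with caching.
demands = [(0, 3), (0, 2), (1, 3), (0, 1), (1, 2)]

spf_calls = 0
spf_cache = {}
for src, dst in demands:
    if src not in spf_cache:
        spf_cache[src] = spf_all(graph, src)
        spf_calls += 1
    dist, pred = spf_cache[src]
    # ... place the demand along the cached predecessor DAG here ...

print(spf_calls)  # 2 instead of 5
```

The fallback recomputation from the snippet above slots in unchanged: on partial placement, overwrite `spf_cache[src]` with a fresh SPF over current residuals.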
- Pre-resolve Node IDs (Low Impact)
Cache (src_id, dst_id) tuples in _MSDCache instead of calling node_mapper.to_id() per demand per alpha.
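A sketch of the idea, with a hypothetical counting NodeMapper standing in for the real node_mapper to show that lookups happen once at cache-build time rather than once per demand per alpha:

```python
class NodeMapper:
    """Hypothetical stand-in for node_mapper; counts to_id() calls."""

    def __init__(self, names):
        self._ids = {name: i for i, name in enumerate(names)}
        self.lookups = 0

    def to_id(self, name):
        self.lookups += 1
        return self._ids[name]


mapper = NodeMapper(["a", "b", "c"])
demands = [("a", "b"), ("a", "c"), ("b", "c")]

# Resolve once when the cache is built (the _MSDCache analogue) ...
resolved = [(mapper.to_id(s), mapper.to_id(d)) for s, d in demands]
after_build = mapper.lookups  # one lookup per endpoint, 6 total

# ... then every alpha evaluation reuses the integer tuples directly.
for _alpha in range(50):
    for src_id, dst_id in resolved:
        pass  # place demand using src_id / dst_id

assert mapper.lookups == after_build  # no per-alpha to_id() calls
```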
- Reuse FlowPolicy Objects (Low Impact)
Create one FlowPolicy per unique preset rather than per demand.
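Sketch of the reuse, with a stub FlowPolicy that only counts constructions (the real class would need any per-demand state reset before it is safe to share):

```python
class FlowPolicy:
    """Stub standing in for the real FlowPolicy; counts instantiations."""

    created = 0

    def __init__(self, preset):
        FlowPolicy.created += 1
        self.preset = preset


# One preset name per demand; only two are unique.
demand_presets = ["ecmp", "ecmp", "wcmp", "ecmp", "wcmp"]

policies = {}
for preset in demand_presets:
    if preset not in policies:                # build once per unique preset
        policies[preset] = FlowPolicy(preset)
    policy = policies[preset]                 # reuse for this demand

print(FlowPolicy.created)  # 2 instead of 5
```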
Affected Code
- ngraph/workflow/maximum_supported_demand_step.py - MSD binary search
- ngraph/exec/analysis/flow.py - demand_placement_analysis() used by TrafficMatrixPlacement
Design Notes
- Cache scope is per-alpha (MSD) or per-iteration (Monte Carlo), since residuals differ between evaluations
- Fallback recomputation is essential when all shortest paths to a destination saturate
- Thread safety: each parallel iteration needs its own cache (already isolated by separate FlowGraph instances)