Commit 4c6fd25

kv_transfer: Rename the shared storage connectors (#30201)
Signed-off-by: Or Ozeri <oro@il.ibm.com>
1 parent 03b91f7 commit 4c6fd25

27 files changed, +129 -129 lines changed

.buildkite/scripts/hardware_ci/run-xpu-test.sh

Lines changed: 1 addition & 1 deletion
@@ -47,6 +47,6 @@ docker run \
 pytest -v -s v1/worker --ignore=v1/worker/test_gpu_model_runner.py
 pytest -v -s v1/structured_output
 pytest -v -s v1/spec_decode --ignore=v1/spec_decode/test_max_len.py --ignore=v1/spec_decode/test_tree_attention.py --ignore=v1/spec_decode/test_speculators_eagle3.py
-pytest -v -s v1/kv_connector/unit --ignore=v1/kv_connector/unit/test_multi_connector.py --ignore=v1/kv_connector/unit/test_nixl_connector.py --ignore=v1/kv_connector/unit/test_shared_storage_connector.py --ignore=v1/kv_connector/unit/test_lmcache_integration.py
+pytest -v -s v1/kv_connector/unit --ignore=v1/kv_connector/unit/test_multi_connector.py --ignore=v1/kv_connector/unit/test_nixl_connector.py --ignore=v1/kv_connector/unit/test_example_connector.py --ignore=v1/kv_connector/unit/test_lmcache_integration.py
 pytest -v -s v1/test_serial_utils.py
 '

docs/features/disagg_encoder.md

Lines changed: 3 additions & 3 deletions
@@ -32,14 +32,14 @@ Design doc: <https://docs.google.com/document/d/1aed8KtC6XkXtdoV87pWT0a8OJlZ-Cpn

 ## 2 Usage Example

-The current reference pathway is **SharedStorageConnector**.
+The current reference pathway is **ExampleConnector**.
 Below ready-to-run scripts shows the workflow:

 1 Encoder instance + 1 PD instance:
-`examples/online_serving/disaggregated_encoder/shared_storage_connector/disagg_encoder_example.sh`
+`examples/online_serving/disaggregated_encoder/disagg_1e1pd_example.sh`

 1 Encoder instance + 1 Prefill instance + 1 Decode instance:
-`examples/online_serving/disaggregated_encoder/shared_storage_connector/disagg_epd_example.sh`
+`examples/online_serving/disaggregated_encoder/disagg_1e1p1d_example.sh`

 ---

docs/features/disagg_prefill.md

Lines changed: 2 additions & 2 deletions
@@ -21,14 +21,14 @@ Please refer to [examples/online_serving/disaggregated_prefill.sh](../../example

 Now supports 5 types of connectors:

-- **SharedStorageConnector**: refer to [examples/offline_inference/disaggregated-prefill-v1/run.sh](../../examples/offline_inference/disaggregated-prefill-v1/run.sh) for the example usage of SharedStorageConnector disaggregated prefilling.
+- **ExampleConnector**: refer to [examples/offline_inference/disaggregated-prefill-v1/run.sh](../../examples/offline_inference/disaggregated-prefill-v1/run.sh) for the example usage of ExampleConnector disaggregated prefilling.
 - **LMCacheConnectorV1**: refer to [examples/others/lmcache/disagg_prefill_lmcache_v1/disagg_example_nixl.sh](../../examples/others/lmcache/disagg_prefill_lmcache_v1/disagg_example_nixl.sh) for the example usage of LMCacheConnectorV1 disaggregated prefilling which uses NIXL as the underlying KV transmission.
 - **NixlConnector**: refer to [tests/v1/kv_connector/nixl_integration/run_accuracy_test.sh](../../tests/v1/kv_connector/nixl_integration/run_accuracy_test.sh) for the example usage of NixlConnector disaggregated prefilling which support fully async send/recv. For detailed usage guide, see [NixlConnector Usage Guide](nixl_connector_usage.md).
 - **P2pNcclConnector**: refer to [examples/online_serving/disaggregated_serving_p2p_nccl_xpyd/disagg_example_p2p_nccl_xpyd.sh](../../examples/online_serving/disaggregated_serving_p2p_nccl_xpyd/disagg_example_p2p_nccl_xpyd.sh) for the example usage of P2pNcclConnector disaggregated prefilling.
 - **MultiConnector**: take advantage of the kv_connector_extra_config: dict[str, Any] already present in KVTransferConfig to stash all the connectors we want in an ordered list of kwargs.such as:

 ```bash
---kv-transfer-config '{"kv_connector":"MultiConnector","kv_role":"kv_both","kv_connector_extra_config":{"connectors":[{"kv_connector":"NixlConnector","kv_role":"kv_both"},{"kv_connector":"SharedStorageConnector","kv_role":"kv_both","kv_connector_extra_config":{"shared_storage_path":"local_storage"}}]}}'
+--kv-transfer-config '{"kv_connector":"MultiConnector","kv_role":"kv_both","kv_connector_extra_config":{"connectors":[{"kv_connector":"NixlConnector","kv_role":"kv_both"},{"kv_connector":"ExampleConnector","kv_role":"kv_both","kv_connector_extra_config":{"shared_storage_path":"local_storage"}}]}}'
 ```

 For NixlConnector, you may also specify one or multiple NIXL_Backend. Such as:
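
Launch scripts that still pass the old connector name need the same substitution this hunk applies to the docs. The following is a minimal sketch, assuming the rename is a pure string change to the `kv_connector` value and that `MultiConnector` nests its children under `kv_connector_extra_config["connectors"]`, as the JSON above shows. `RENAMED_CONNECTORS` and `rename_connectors` are hypothetical helpers written for illustration; they are not part of vLLM.

```python
import json

# Old name -> new name, taken from this commit's diff.
RENAMED_CONNECTORS = {"SharedStorageConnector": "ExampleConnector"}


def rename_connectors(config: dict) -> dict:
    """Return a copy of a kv-transfer config with renamed connector names.

    Recurses into kv_connector_extra_config["connectors"], which is where
    the MultiConnector example keeps its nested connector configs.
    """
    updated = dict(config)
    name = updated.get("kv_connector")
    if name in RENAMED_CONNECTORS:
        updated["kv_connector"] = RENAMED_CONNECTORS[name]
    extra = updated.get("kv_connector_extra_config")
    if isinstance(extra, dict) and isinstance(extra.get("connectors"), list):
        new_extra = dict(extra)
        new_extra["connectors"] = [rename_connectors(c) for c in extra["connectors"]]
        updated["kv_connector_extra_config"] = new_extra
    return updated


# The value passed to --kv-transfer-config before this commit.
old_arg = '{"kv_connector":"MultiConnector","kv_role":"kv_both","kv_connector_extra_config":{"connectors":[{"kv_connector":"NixlConnector","kv_role":"kv_both"},{"kv_connector":"SharedStorageConnector","kv_role":"kv_both","kv_connector_extra_config":{"shared_storage_path":"local_storage"}}]}}'

# Compact separators keep the output in the same single-line form.
new_arg = json.dumps(rename_connectors(json.loads(old_arg)), separators=(",", ":"))
print(new_arg)
```

The helper copies each dictionary instead of mutating it, so the original configuration stays available for comparison.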

examples/offline_inference/disaggregated-prefill-v1/decode_example.py

Lines changed: 1 addition & 1 deletion
@@ -30,7 +30,7 @@ def main():
         max_num_batched_tokens=64,
         max_num_seqs=16,
         kv_transfer_config=KVTransferConfig(
-            kv_connector="SharedStorageConnector",
+            kv_connector="ExampleConnector",
             kv_role="kv_both",
             kv_connector_extra_config={"shared_storage_path": "local_storage"},
         ),

examples/offline_inference/disaggregated-prefill-v1/prefill_example.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ def main():
         enforce_eager=True,
         gpu_memory_utilization=0.8,
         kv_transfer_config=KVTransferConfig(
-            kv_connector="SharedStorageConnector",
+            kv_connector="ExampleConnector",
             kv_role="kv_both",
             kv_connector_extra_config={"shared_storage_path": "local_storage"},
         ),

examples/offline_inference/kv_load_failure_recovery/README.md

Lines changed: 2 additions & 2 deletions
@@ -10,7 +10,7 @@ It demonstrates vLLM's ability to recover from KV load failures in both synchron
 - `decode_example.py` – performs the decode stage. Accepts:
     - `--simulate-failure`: simulates KV load failure using a custom connector.
     - `--async-load`: enables asynchronous KV loading mode.
-- `rogue_shared_storage_connector.py` – defines `RogueSharedStorageConnector`, a subclass of `SharedStorageConnector`, that simulates missing or corrupted external KV blocks by failing to load blocks for the first decode request.
+- `load_recovery_example_connector.py` – defines `LoadRecoveryExampleConnector`, a subclass of `ExampleConnector`, that simulates missing or corrupted external KV blocks by failing to load blocks for the first decode request.
 - `run.sh` – orchestrates the test: runs the prefill stage, then three decode stages:
     1. Normal decode (baseline).
     2. Decode with simulated sync KV load failure.
@@ -20,7 +20,7 @@ It demonstrates vLLM's ability to recover from KV load failures in both synchron

 ## How It Works

-- The test dynamically loads `RogueSharedStorageConnector` via `KVTransferConfig.kv_connector_module_path`, enabling controlled simulation of load failures without modifying the original connector.
+- The test dynamically loads `LoadRecoveryExampleConnector` via `KVTransferConfig.kv_connector_module_path`, enabling controlled simulation of load failures without modifying the original connector.
 - The decode stages that simulate failure are expected to trigger recovery logic in vLLM, resulting in the same output as the baseline decode.
 - If recovery fails, the script prints a unified diff of the output mismatch and exits with error.
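
The last bullet describes a comparison between the baseline decode output and the recovered output. A minimal sketch of that kind of check using Python's standard `difflib` follows. It is an illustration only: the real `run.sh` may well use a shell `diff`, and the function name `outputs_match` and the sample strings are hypothetical.

```python
import difflib


def outputs_match(baseline: str, recovered: str) -> tuple[bool, str]:
    """Compare two decode outputs; return (match, unified diff text)."""
    diff = "".join(
        difflib.unified_diff(
            baseline.splitlines(keepends=True),
            recovered.splitlines(keepends=True),
            fromfile="baseline",
            tofile="recovered",
        )
    )
    # unified_diff yields nothing at all when the inputs are identical.
    return diff == "", diff


# Identical outputs: recovery produced the same text as the baseline.
same_ok, same_diff = outputs_match("Hello world\n", "Hello world\n")
print(same_ok)

# Differing outputs: the diff text shows what changed.
bad_ok, bad_diff = outputs_match("Hello world\n", "Hello wrold\n")
print(bad_ok)
print(bad_diff, end="")
```

A caller would typically print the diff and exit with a non-zero status when the match flag is false.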

examples/offline_inference/kv_load_failure_recovery/decode_example.py

Lines changed: 3 additions & 3 deletions
@@ -35,13 +35,13 @@ def main():

     if args.simulate_failure:
         ktc = KVTransferConfig(
-            kv_connector="RogueSharedStorageConnector",
+            kv_connector="LoadRecoveryExampleConnector",
             kv_role="kv_both",
             kv_connector_extra_config={
                 "shared_storage_path": "local_storage",
                 "async_load": args.async_load,
             },
-            kv_connector_module_path="rogue_shared_storage_connector",
+            kv_connector_module_path="load_recovery_example_connector",
         )
         out_file = (
             "async_decode_recovered_output.txt"
@@ -50,7 +50,7 @@ def main():
         )
     else:
         ktc = KVTransferConfig(
-            kv_connector="SharedStorageConnector",
+            kv_connector="ExampleConnector",
             kv_role="kv_both",
             kv_connector_extra_config={
                 "shared_storage_path": "local_storage",

examples/offline_inference/kv_load_failure_recovery/rogue_shared_storage_connector.py renamed to examples/offline_inference/kv_load_failure_recovery/load_recovery_example_connector.py

Lines changed: 10 additions & 10 deletions
@@ -10,9 +10,9 @@
     KVConnectorMetadata,
     KVConnectorRole,
 )
-from vllm.distributed.kv_transfer.kv_connector.v1.shared_storage_connector import (
-    SharedStorageConnector,
-    SharedStorageConnectorMetadata,
+from vllm.distributed.kv_transfer.kv_connector.v1.example_connector import (
+    ExampleConnector,
+    ExampleConnectorMetadata,
 )
 from vllm.forward_context import ForwardContext
 from vllm.v1.core.kv_cache_manager import KVCacheBlocks
@@ -26,15 +26,15 @@


 @dataclass
-class RogueSharedStorageConnectorMetadata(SharedStorageConnectorMetadata):
+class LoadRecoveryExampleConnectorMetadata(ExampleConnectorMetadata):
     req_to_block_ids: dict[str, set[int]] = field(default_factory=dict)

     @classmethod
-    def from_base(cls, base: SharedStorageConnectorMetadata):
+    def from_base(cls, base: ExampleConnectorMetadata):
         return cls(requests=base.requests)


-class RogueSharedStorageConnector(SharedStorageConnector):
+class LoadRecoveryExampleConnector(ExampleConnector):
     def __init__(self, vllm_config: "VllmConfig", role: KVConnectorRole):
         super().__init__(vllm_config=vllm_config, role=role)
         self._async_load = vllm_config.kv_transfer_config.get_from_extra_config(
@@ -45,7 +45,7 @@ def __init__(self, vllm_config: "VllmConfig", role: KVConnectorRole):
         self._req_to_block_ids: dict[str, list[int]] = dict()

     def bind_connector_metadata(self, connector_metadata: KVConnectorMetadata) -> None:
-        assert isinstance(connector_metadata, RogueSharedStorageConnectorMetadata)
+        assert isinstance(connector_metadata, LoadRecoveryExampleConnectorMetadata)
         index, failed_request = next(
             (
                 (i, x)
@@ -84,7 +84,7 @@ def get_finished(
     ) -> tuple[set[str] | None, set[str] | None]:
         if self._async_load:
             meta = self._get_connector_metadata()
-            assert isinstance(meta, RogueSharedStorageConnectorMetadata)
+            assert isinstance(meta, LoadRecoveryExampleConnectorMetadata)
             if meta.req_to_block_ids:
                 return None, set(meta.req_to_block_ids)

@@ -126,9 +126,9 @@ def build_connector_meta(
     ) -> KVConnectorMetadata:
         if not self._async_load:
             base = super().build_connector_meta(scheduler_output)
-            meta = RogueSharedStorageConnectorMetadata.from_base(base)
+            meta = LoadRecoveryExampleConnectorMetadata.from_base(base)
         else:
-            meta = LoadRecoveryExampleConnectorMetadata()
+            meta = LoadRecoveryExampleConnectorMetadata()
             if self._requests_need_load:
                 for req_id, request in self._requests_need_load.items():
                     meta.add_request(
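
The example never imports the connector class in this file directly. According to the README, it is loaded dynamically from the module named in `KVTransferConfig.kv_connector_module_path`. The general mechanism of resolving a class from a module path and a class name can be sketched with the standard `importlib`. This is an assumption about the pattern, not vLLM's actual loader, and `load_class` is a hypothetical helper; the demo resolves a standard-library class so that it runs without vLLM installed.

```python
import importlib


def load_class(module_path: str, class_name: str) -> type:
    """Import module_path and return the class named class_name from it."""
    module = importlib.import_module(module_path)
    cls = getattr(module, class_name, None)
    if not isinstance(cls, type):
        raise ImportError(f"{class_name!r} is not a class in module {module_path!r}")
    return cls


# With the example directory on sys.path, the analogous call would be
# load_class("load_recovery_example_connector", "LoadRecoveryExampleConnector").
ordered_dict_cls = load_class("collections", "OrderedDict")
print(ordered_dict_cls.__name__)
```

Because the lookup is by string, a rename like the one in this commit has to update the module file name, the class name, and every configuration that refers to either.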

examples/offline_inference/kv_load_failure_recovery/prefill_example.py

Lines changed: 1 addition & 1 deletion
@@ -26,7 +26,7 @@ def main():
         enforce_eager=True,
         gpu_memory_utilization=0.8,
         kv_transfer_config=KVTransferConfig(
-            kv_connector="SharedStorageConnector",
+            kv_connector="ExampleConnector",
             kv_role="kv_both",
             kv_connector_extra_config={"shared_storage_path": "local_storage"},
         ),

examples/online_serving/disaggregated_encoder/README.md

Lines changed: 3 additions & 3 deletions
@@ -50,12 +50,12 @@ The vllm instances and `disagg_encoder_proxy` supports local URIs with ```{"url"

 ## EC connector and KV transfer

-The `ECSharedStorageConnector` is used to store the encoder cache on local disk and facilitate transfer. To enable the encoder disaggregation feature, add the following configuration:
+The `ECExampleConnector` is used to store the encoder cache on local disk and facilitate transfer. To enable the encoder disaggregation feature, add the following configuration:

 ```bash
 # Add to encoder instance:
 --ec-transfer-config '{
-    "ec_connector": "ECSharedStorageConnector",
+    "ec_connector": "ECExampleConnector",
     "ec_role": "ec_producer",
     "ec_connector_extra_config": {
         "shared_storage_path": "'"$EC_SHARED_STORAGE_PATH"'"
@@ -64,7 +64,7 @@ The `ECSharedStorageConnector` is used to store the encoder cache on local disk

 # Add to prefill/prefill+decode instance:
 --ec-transfer-config '{
-    "ec_connector": "ECSharedStorageConnector",
+    "ec_connector": "ECExampleConnector",
     "ec_role": "ec_consumer",
     "ec_connector_extra_config": {
         "shared_storage_path": "'"$EC_SHARED_STORAGE_PATH"'"
