Related:
Summary
The provider and data set resolution logic in StorageContext has grown to accommodate many potential use cases. This flexibility has come at the cost of maintainability, testability, and user comprehension. This proposal extracts resolution logic into a focused ProviderResolver interface with simpler implementations that cover the use cases we need to support, plus basic options to support golden-path top-level APIs.
Problem
The logic inside createContext() and createContexts() has become difficult to maintain, test, and document. We introduced flexibility (and in fact had a lot of it from the beginning) to solve for imagined user needs, at the cost of significant code complexity.
Most users fall into one of three categories:
- "Just store my data": no opinions about providers or data sets
- "Use these specific providers": explicit provider selection
- "Use these specific data sets": Filecoin Pin Demo is an example of this
The current implementation tries to accommodate partial specifications, cascading fallbacks, and mixing of options in ways that are hard to reason about and harder to explain.
Current complexity
createContexts() implements a cascading three-tier resolution:
- If dataSetIds provided -> resolve each via resolveByDataSetId() (up to count)
- If still need more AND providerIds provided -> resolve remaining via resolveByProviderId() (filtering out already-resolved providers)
- If still need more -> fill remaining slots via smartSelectProvider() (excluding all previously resolved)
Each tier maintains its own exclusion tracking and conditionally hands off to the next.
Introducing the concept of endorsed SPs adds even more complexity to this, with two new tiers within smartSelectProvider().
resolveProviderAndDataSet() (for single context) has its own parallel logic tree:
- Checks dataSetId -> resolveByDataSetId()
- Else checks providerId -> resolveByProviderId()
- Else checks providerAddress -> resolveByProviderAddress()
- Else -> smartSelectProvider()
Additional complexity:
- forceCreateDataSet / forceCreateDataSets flags alter behaviour at multiple levels
- excludeProviderIds adds another dimension of filtering
- Singular options (providerId, providerAddress, dataSetId) vs plural (providerIds, dataSetIds) have different code paths
- Metadata matching, provider ping validation, and data set preference sorting are duplicated across methods
- dataSetId = -1 sentinel value indicates "create new" vs existing
Further, if we move to an API with separate interaction modes for a primary SP ("endorsed") and one or more secondary SPs (see multi-copy upload via SP-to-SP fetch), we would need to either introduce another selection tier or strictly enforce ordering in the selection process.
Proposal
Trim down the options and focus on three use cases that represent how users actually interact with the SDK:
- User has no opinions: using upload() or createContexts() with no options. We figure out what to do based on what we find on-chain for their wallet.
- User has opinions about providers: supply provider IDs, count must match, we find or create data sets for those providers.
- User has opinions about data sets: supply data set IDs, count must match, we validate ownership and use those data sets.
In all cases, we identify an "endorsed" provider from the resolved set and return it first. If no endorsed provider is available (e.g., user specified non-endorsed providers), the first result is treated as "primary" for upload purposes.
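The ordering rule above can be sketched as a stable sort. This is a minimal illustration only: the `ResolvedContext` shape, the `endorsed` flag, and the function name `orderPrimaryFirst` are assumptions, since the proposal does not say how endorsement is represented.

```typescript
// Hypothetical shape: the proposal does not define these fields.
interface ResolvedContext {
  providerId: number
  endorsed: boolean
}

// Put endorsed entries first. Array.prototype.sort is stable (ES2019+), so
// entries with the same endorsement status keep their original order. When
// nothing is endorsed the input order is unchanged and the first entry
// serves as "primary".
function orderPrimaryFirst(resolved: ResolvedContext[]): ResolvedContext[] {
  return [...resolved].sort((a, b) => Number(b.endorsed) - Number(a.endorsed))
}
```

Because the sort is stable, a caller who passes only non-endorsed providers gets them back in the order they were specified.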
Simplified options
Remove:
- providerId (singular): use providerIds: [id]
- providerAddress: can query registry by ID if needed
- dataSetId (singular): use dataSetIds: [id]
- excludeProviderIds: no longer needed with explicit selection model
- dev, withIpni: not needed
- forceCreateDataSet / forceCreateDataSets: can be achieved with a custom ProviderResolver (below)
Keep:
- count: number of contexts (default: 2)
- dataSetIds: explicit data set selection
- providerIds: explicit provider selection
- metadata: for data set matching and creation
- withCDN: sugar for metadata
Validation rules:
- dataSetIds and providerIds are mutually exclusive, error if both provided
- If dataSetIds provided: length must equal count
- If providerIds provided: length must equal count
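The validation rules above are small enough to check upfront in one place. The following is a sketch, not the SDK's implementation: the option names and the default count of 2 come from the "Keep" list, while the `validateOptions` name, the use of `number` for IDs, and the error messages are assumptions.

```typescript
// Option names follow the "Keep" list; ID types are assumed to be numbers.
interface CreateContextsOptions {
  count?: number
  dataSetIds?: number[]
  providerIds?: number[]
}

const DEFAULT_COUNT = 2

// Returns the effective count, or throws if the options violate a rule.
function validateOptions(options: CreateContextsOptions = {}): number {
  const count = options.count ?? DEFAULT_COUNT
  const { dataSetIds, providerIds } = options

  if (dataSetIds !== undefined && providerIds !== undefined) {
    throw new Error("dataSetIds and providerIds are mutually exclusive")
  }
  if (dataSetIds !== undefined && dataSetIds.length !== count) {
    throw new Error(`dataSetIds length (${dataSetIds.length}) must equal count (${count})`)
  }
  if (providerIds !== undefined && providerIds.length !== count) {
    throw new Error(`providerIds length (${providerIds.length}) must equal count (${count})`)
  }
  return count
}
```

Rejecting invalid combinations before any chain query means the resolvers themselves never have to handle mixed or partial input.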
Resolver interface
(Thanks to @hugomrdias for seeding this idea)
Extract resolution logic into a simple interface:
interface ProviderResolver {
  resolveNext(): Promise<ProviderSelectionResult | null>
}

Three focused implementations we will use internally:
| Resolver | Input | Behaviour |
|---|---|---|
| SmartResolver | nothing | Query chain for existing data sets and approved providers, prefer endorsed, ping validate |
| ProviderIdsResolver | provider IDs | Validate providers exist and are approved, find matching data sets or mark for creation, order by endorsement |
| DataSetIdsResolver | data set IDs | Validate ownership/live/managed, get providers from data sets, order by endorsement |
A factory function selects the appropriate implementation based on the options provided.
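One way the factory could look is sketched below. The three class names come from the table; everything else is an assumption made for illustration: the `createResolver` name, the `ResolverOptions` type, the placeholder `ProviderSelectionResult` shape, and the stub bodies, which return `null` instead of querying the chain.

```typescript
// Placeholder result shape; the real type is not defined in this proposal.
interface ProviderSelectionResult {
  providerId: number
}

interface ProviderResolver {
  resolveNext(): Promise<ProviderSelectionResult | null>
}

// Stubs only: real implementations would query the chain and validate.
class SmartResolver implements ProviderResolver {
  async resolveNext(): Promise<ProviderSelectionResult | null> {
    return null
  }
}

class ProviderIdsResolver implements ProviderResolver {
  readonly providerIds: number[]
  constructor(providerIds: number[]) {
    this.providerIds = providerIds
  }
  async resolveNext(): Promise<ProviderSelectionResult | null> {
    return null
  }
}

class DataSetIdsResolver implements ProviderResolver {
  readonly dataSetIds: number[]
  constructor(dataSetIds: number[]) {
    this.dataSetIds = dataSetIds
  }
  async resolveNext(): Promise<ProviderSelectionResult | null> {
    return null
  }
}

interface ResolverOptions {
  dataSetIds?: number[]
  providerIds?: number[]
}

// Assumes mutual exclusivity has already been validated by the caller.
function createResolver(options: ResolverOptions = {}): ProviderResolver {
  if (options.dataSetIds !== undefined) return new DataSetIdsResolver(options.dataSetIds)
  if (options.providerIds !== undefined) return new ProviderIdsResolver(options.providerIds)
  return new SmartResolver()
}
```

The factory contains no fallback between strategies: exactly one resolver is chosen, which is what removes the cascading handoffs.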
What we keep
Useful logic that remains, potentially shared across resolver implementations:
- Metadata matching for data set reuse
- Provider ping validation for health checking
- Data set preference ordering: with pieces > without pieces, older first
- Endorsement detection and ordering
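The data set preference ordering in the list above could be shared as a single comparator. This sketch assumes a hypothetical `DataSetCandidate` shape and treats a lower `dataSetId` as "older"; the proposal does not say how age is determined, so that proxy is an assumption.

```typescript
// Hypothetical candidate shape for illustration.
interface DataSetCandidate {
  dataSetId: number // assumed to increase over time, so lower means older
  pieceCount: number
}

// Data sets with pieces sort before empty ones; ties go to the older set.
function compareDataSets(a: DataSetCandidate, b: DataSetCandidate): number {
  const aEmpty = a.pieceCount > 0 ? 0 : 1
  const bEmpty = b.pieceCount > 0 ? 0 : 1
  if (aEmpty !== bEmpty) return aEmpty - bEmpty
  return a.dataSetId - b.dataSetId
}

function sortByPreference(candidates: DataSetCandidate[]): DataSetCandidate[] {
  return [...candidates].sort(compareDataSets)
}
```

Keeping this as one exported comparator would let each resolver reuse it rather than duplicating the sort.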
User-provided resolvers
The ProviderResolver interface is simple enough that advanced users could provide their own implementation if they have needs beyond the three standard cases (perhaps an external reputation service, doing per-country filtering yourself, or even just implementing forceCreateDataSets). This is an escape hatch for edge cases, not a primary API surface. We add a resolver option that overrides much of the default behaviour and lets you control it yourself.
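As an illustration of the escape hatch, here is a sketch of a user-provided resolver that covers the forceCreateDataSets case. Only the `ProviderResolver` interface comes from this proposal. The `ProviderSelectionResult` shape (with `dataSetId: null` meaning "create a new data set"), the `ForceCreateResolver` class, the `collect` helper, and the reading of a `null` return as "no more candidates" are all assumptions.

```typescript
// Hypothetical result shape: null dataSetId is assumed to mean "create new".
interface ProviderSelectionResult {
  providerId: number
  dataSetId: number | null
}

interface ProviderResolver {
  resolveNext(): Promise<ProviderSelectionResult | null>
}

// Yields each given provider once, always requesting a fresh data set.
class ForceCreateResolver implements ProviderResolver {
  private readonly queue: number[]

  constructor(providerIds: number[]) {
    this.queue = [...providerIds]
  }

  async resolveNext(): Promise<ProviderSelectionResult | null> {
    const providerId = this.queue.shift()
    if (providerId === undefined) return null // assumed: null means exhausted
    return { providerId, dataSetId: null }
  }
}

// Hypothetical consumer: pull up to `count` results from any resolver.
async function collect(
  resolver: ProviderResolver,
  count: number
): Promise<ProviderSelectionResult[]> {
  const results: ProviderSelectionResult[] = []
  while (results.length < count) {
    const next = await resolver.resolveNext()
    if (next === null) break
    results.push(next)
  }
  return results
}
```

A caller would pass an instance through the proposed resolver option; the consuming side only ever calls `resolveNext()`, so it does not need to know which strategy is behind it.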
Benefits
- Each resolver is small, focused, and independently testable
- No cascading tier logic or conditional handoffs between resolution strategies
- Clear validation rules that are easy to document and explain
- Mutual exclusivity enforced upfront rather than through complex interactions
- Easier to add endorsement ordering without further complicating existing logic
- Simpler mental model: users either specify what they want or let us figure it out