From eff6fa2139ddd7ece51f31204f9dcc6df9965432 Mon Sep 17 00:00:00 2001
From: policyengine-bot
Date: Tue, 9 Dec 2025 11:25:48 +0000
Subject: [PATCH] Add dataset selection delegation pattern to API skill
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Document the pattern of returning None from API to delegate dataset
selection to policyengine.py, rather than hardcoding dataset paths.

This pattern improves separation of concerns by keeping dataset logic
centralized in policyengine.py where it belongs.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude
---
 .../policyengine-api-skill/SKILL.md | 34 +++++++++++++++++++
 1 file changed, 34 insertions(+)

diff --git a/skills/tools-and-apis/policyengine-api-skill/SKILL.md b/skills/tools-and-apis/policyengine-api-skill/SKILL.md
index 0f01100..6475576 100644
--- a/skills/tools-and-apis/policyengine-api-skill/SKILL.md
+++ b/skills/tools-and-apis/policyengine-api-skill/SKILL.md
@@ -275,6 +275,40 @@ def calculate(data):
     return simulation.calculate(...)
 ```
 
+### Dataset Selection Pattern
+
+**When to delegate to policyengine.py:**
+
+The API should return `None` for dataset selection in most cases, allowing policyengine.py to choose the appropriate default dataset. This creates better separation of concerns.
+
+**Pattern:**
+```python
+# In economy_service.py _setup_data() method:
+
+# ❌ DON'T: Explicitly specify datasets the API shouldn't control
+if region == "ny":
+    return "gs://policyengine-us-data/some_dataset.h5"
+
+# ✅ DO: Return None to let policyengine.py choose the default
+if region in US_STATES:
+    return None  # policyengine.py handles state-specific datasets
+
+# ✅ DO: Only specify datasets for special cases the API needs to control
+if region == "nyc":
+    return "gs://policyengine-us-data/pooled_3_year_cps_2023.h5"  # NYC exception
+```
+
+**Why this matters:**
+- Keeps dataset logic centralized in policyengine.py where it belongs
+- The API doesn't need to know about state-specific dataset paths
+- Easier to update dataset selection without API changes
+- Only special cases (like NYC) should be explicitly specified in the API
+
+**Where to see this pattern:**
+- Look at `policyengine_api/services/economy_service.py`
+- Look for the `_setup_data()` method
+- Related to microsimulation and state-level calculations
+
 ### Testing
 
 **To see current test patterns:**
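
Review note: the fragment in the patch uses bare `return` statements, so it cannot be run on its own. The delegation idea it documents can be sketched as a standalone function, shown below. This is only an illustration under stated assumptions, not the actual `_setup_data()` implementation: the names `select_dataset` and `DATASET_OVERRIDES` are hypothetical, and the sketch assumes that every region without an explicit override (states included) delegates to policyengine.py by returning `None`. The NYC path is the one quoted in the patch.

```python
from typing import Optional

# Hypothetical override table: only regions the API must pin explicitly.
# The NYC path is copied from the patch; the table itself is an assumption.
DATASET_OVERRIDES = {
    "nyc": "gs://policyengine-us-data/pooled_3_year_cps_2023.h5",
}


def select_dataset(region: str) -> Optional[str]:
    """Return a dataset path for special cases, otherwise None.

    None is the delegation signal: the caller passes it on so that
    policyengine.py can choose its own default dataset for the region.
    """
    # dict.get returns None when the region has no override, which is
    # exactly the "delegate" value described in the patch.
    return DATASET_OVERRIDES.get(region)


if __name__ == "__main__":
    for region in ("ny", "nyc"):
        print(region, select_dataset(region))
```

Keeping the special cases in one table means a new exception is a one-line change, while every other region keeps following whatever default policyengine.py selects.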