feat: clustering of time series data #555
Conversation
🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
| Case | Layout |
|------------------|-------------------------------------------------------------------------|
| Simple (no dims) | Single heatmap |
| Periods only | Vertical stack (1 column) |
| Scenarios only | Horizontal layout |
| Both | Grid (periods as rows, scenarios as columns via facet_cols=n_scenarios) |
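This layout selection can be reproduced with plotly's faceting; a minimal sketch with `px.imshow` (illustrative only — `facet_cols=n_scenarios` in the table refers to the accessor's own parameter, while the shapes and names below are assumptions):

```python
import numpy as np
import plotly.express as px

# Illustrative array: one heatmap slice per (period, scenario) combination,
# flattened along the first axis.
n_periods, n_scenarios = 2, 3
data = np.random.rand(n_periods * n_scenarios, 24, 7)  # (slices, hours, days)

# facet_col slices the array along axis 0; facet_col_wrap=n_scenarios turns
# the strip into a grid with periods as rows and scenarios as columns.
fig = px.imshow(data, facet_col=0, facet_col_wrap=n_scenarios)
fig.show()
```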
| File | Issue | Fix |
|-------------------------------------------------|-----------------------------------------------------------|-----------------------------------------------------|
| docs/notebooks/03-investment-optimization.ipynb | Division by zero when solar_size = 0 | Added guard: if solar_size > 0 else float('nan') |
| flixopt/clustering/base.py:181-235 | ClusterStructure.plot() crashes for multi-period/scenario | Added NotImplementedError with helpful message |
| flixopt/components.py:1256-1265 | Obsolete linopy LP writer bug workaround | Removed + 0.0 workaround (fixed in linopy >= 0.5.1) |
| flixopt/dataset_plot_accessor.py:742-780 | to_duration_curve crashes on variables without time dim | Added guard to skip non-time variables |
| flixopt/features.py:234-236 | Critical: Startup count ignores cluster weighting | Now multiplies by cluster_weight before summing |
| flixopt/structure.py:268-281 | scenario_weights docstring misleading | Updated docstring to accurately describe behavior |
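The cluster-weighting fix in `flixopt/features.py` above comes down to a weighted reduction; a minimal xarray sketch (values and names are illustrative, not the actual implementation):

```python
import xarray as xr

# Startup counts per representative cluster (illustrative values)
startups = xr.DataArray([3, 1], dims=['cluster'])

# How many original periods each representative cluster stands for
cluster_weight = xr.DataArray([10, 42], dims=['cluster'])

# An unweighted sum undercounts (4 startups). Weighting by cluster occurrence
# restores the full-horizon total: 3*10 + 1*42 = 72.
total_startups = (startups * cluster_weight).sum('cluster')
assert int(total_startups) == 72
```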
Nitpick Fixes
| File | Fix |
|----------------------------------------------------|-------------------------------------------------------------------------------------------|
| docs/notebooks/data/generate_example_systems.py | Fixed type hint pd.DataFrame → pd.Series for _elec_prices, added timezone guard |
| flixopt/statistics_accessor.py:2103-2132 | Added detailed comment explaining secondary y-axis offset strategy |
| tests/test_cluster_reduce_expand.py | Moved import xarray as xr to top of file |
| docs/notebooks/data/generate_realistic_profiles.py | Clarified warnings.resetwarnings() comment |
| flixopt/transform_accessor.py | Removed unused cluster_coords and time_coords params from _combine_slices_to_dataarray_2d |
| flixopt/comparison.py | Added error handling to _concat_property for FlowSystems lacking optimization data |
📝 Walkthrough

Major release v6.0.0 restructures time-series clustering into a modular package with expanded parameters, introduces multi-system comparison functionality, adds the fxplot xarray plotting accessor, renames timing concepts throughout, updates documentation and notebooks to data-driven approaches, and optimizes documentation build workflows.
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Pre-merge checks and finishing touches

- ❌ Failed checks (1 warning, 1 inconclusive)
- ✅ Passed checks (1 passed)
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
flixopt/flow_system.py (1)
63-76: `scenario_weights` docs no longer match normalization semantics

The constructor docstring says:

> `scenario_weights`: … If None, all scenarios have the same weight (normalized to 1)… The final `weights` array … is computed as period_weights × normalized_scenario_weights, with normalization applied … by default.

But the current implementation does:

- The `FlowSystem.scenario_weights` setter always normalizes when a value is provided.
- When `scenario_weights` is left as `None`, `FlowSystemModel.scenario_weights` falls back to `FlowSystem._unit_weight('scenario')`, i.e. unnormalized weights of 1.0 per scenario. `FlowSystem.weights['scenario']` likewise returns unit weights when no explicit scenario weights are set.

So by default, multiple scenarios are summed, not averaged, which contradicts the existing docstring.

I'd recommend updating the docstring (and any related API docs) to state:

- If `scenario_weights` is `None`, scenarios are treated with equal unit weight (the objective sums over scenarios).
- When `scenario_weights` is provided, it is normalized to sum to 1 along the `scenario` dimension.

Example doc tweak for `__init__`:

```diff
-    scenario_weights: The weights of each scenario. If None, all scenarios have the same weight (normalized to 1).
+    scenario_weights: The weights of each scenario. If None, all scenarios
+        have the same unit weight (the objective sums over scenarios). When
+        provided, the weights are normalized to sum to 1 along the
+        ``scenario`` dimension.
@@
-    The final `weights` array (accessible via `flow_system.model.objective_weights`) is computed as period_weights × normalized_scenario_weights, with normalization applied to the scenario weights by default.
+    The final `weights` array (accessible via
+    `flow_system.model.objective_weights`) is computed as
+    ``period_weights × scenario_weights``, where scenario_weights are
+    normalized only when explicitly set.
```

Also applies to: 2053-2103
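To make the two conventions concrete, a minimal sketch (plain xarray, names assumed; not the flixopt API):

```python
import numpy as np
import xarray as xr

weights = xr.DataArray([2.0, 1.0, 1.0], dims=['scenario'])

# Unit weights (the current default when scenario_weights is None):
# the objective simply sums over scenarios.
unit = xr.ones_like(weights)

# Normalized weights (applied when scenario_weights is set explicitly):
# values are scaled to sum to 1 along the scenario dimension, turning
# the objective into a weighted average over scenarios.
normalized = weights / weights.sum('scenario')
assert np.isclose(float(normalized.sum()), 1.0)
```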
flixopt/statistics_accessor.py (1)
960-1092: Sankey methods don't aggregate cluster dimension, causing TypeError with clustered systems

The `flows()`, `peak_flow()`, and `sizes()` methods fail on clustered FlowSystems because they don't aggregate over the `cluster` dimension. This leaves non-scalar arrays that cause `float(ds[label].values)` and `float(ds[lbl].max())` to raise `TypeError`, and even if coerced, the results would ignore `cluster_weight`.

Add cluster aggregation (with proper weighting) to all three methods before building links.

Required fix for `flows()`, `peak_flow()`, and `sizes()`:

```diff
 @@ def flows(
-        ds = _apply_selection(ds, xr_select)
-
-        # Aggregate remaining dimensions
-        if 'time' in ds.dims:
-            ds = getattr(ds, aggregate)(dim='time')
-        for dim in ['period', 'scenario']:
-            if dim in ds.dims:
-                ds = ds.sum(dim=dim)
+        ds = _apply_selection(ds, xr_select)
+
+        # Aggregate temporal and scenario/period dimensions
+        if 'time' in ds.dims:
+            ds = getattr(ds, aggregate)(dim='time')
+        if 'cluster' in ds.dims:
+            cluster_w = self._fs.weights.get('cluster')
+            if cluster_w is not None:
+                ds = (ds * cluster_w).sum(dim='cluster')
+            else:
+                ds = ds.sum(dim='cluster')
+        for dim in ['period', 'scenario']:
+            if dim in ds.dims:
+                ds = ds.sum(dim=dim)
 @@ def sizes(
-        # Collapse remaining dimensions
-        for dim in ['period', 'scenario']:
-            if dim in ds.dims:
-                ds = ds.max(dim=dim)
+        # Collapse remaining dimensions
+        for dim in ['period', 'scenario']:
+            if dim in ds.dims:
+                ds = ds.max(dim=dim)
+        if 'cluster' in ds.dims:
+            cluster_w = self._fs.weights.get('cluster')
+            if cluster_w is not None:
+                ds = (ds * cluster_w).sum(dim='cluster')
+            else:
+                ds = ds.sum(dim='cluster')
 @@ def peak_flow(
-        # Take max over all dimensions
-        for dim in ['time', 'period', 'scenario']:
-            if dim in ds.dims:
-                ds = ds.max(dim=dim)
+        # Take max over relevant dimensions; clusters are always summed
+        if 'time' in ds.dims:
+            ds = ds.max(dim='time')
+        for dim in ['period', 'scenario']:
+            if dim in ds.dims:
+                ds = ds.max(dim=dim)
+        if 'cluster' in ds.dims:
+            cluster_w = self._fs.weights.get('cluster')
+            if cluster_w is not None:
+                ds = (ds * cluster_w).sum(dim='cluster')
+            else:
+                ds = ds.sum(dim='cluster')
```
🤖 Fix all issues with AI Agents
In @docs/notebooks/08d-clustering-multiperiod.ipynb:
- Around line 555-571: The API snippet calls
fs_clustered.clustering.plot.compare with the wrong keyword and an inconsistent
series label; update the call to use the correct parameter name "variables"
instead of "variable" and use the canonical time-series suffix used elsewhere
(e.g., "Demand(Flow)|fixed_relative_profile") so the call becomes
fs_clustered.clustering.plot.compare(variables='Demand(Flow)|fixed_relative_profile')
or another existing series label.
In @docs/notebooks/data/generate_example_systems.py:
- Around line 544-548: The code assumes _get_weather() returns exactly 8760
hourly rows and directly uses temp and ghi arrays which may misalign with
timesteps; update the block that sets temp and ghi to either (1) explicitly
reindex/resample the weather Series/DataFrame returned by _get_weather() to the
timesteps index (matching the approach used elsewhere in the repo) or (2) add a
defensive assertion and clear comment: assert len(temp)==len(timesteps) (and
same for ghi) with a note that tmy_dresden.csv must be 8760 hourly values for a
non‑leap year; reference the variables timesteps, _get_weather(), temp, and ghi
when making the change.
In @docs/notebooks/data/generate_realistic_profiles.py:
- Around line 28-31: Remove the global call to warnings.resetwarnings() in
generate_realistic_profiles.py and instead either apply targeted filters with
warnings.filterwarnings() for the specific noisy messages (referencing the
module names like demandlib and pvlib) or wrap the specific imports/calls that
trigger noise in a warnings.catch_warnings() context so the suppression is
local; replace the resetwarnings() usage with these focused patterns and ensure
any added filterwarnings() include module or message qualifiers to avoid
changing global warning state.
In @docs/user-guide/optimization/clustering.md:
- Around line 47-111: The docs mismatch the implementation: update the
Clustering Parameters table to include cluster_method examples 'k_maxoids' and
'averaging' and add 'distributionAndMinMaxRepresentation' to
representation_method; then fix the "Available clustering algorithms" table by
removing "(default)" from 'k_means' and marking 'hierarchical' as the default to
match the implementation (references: parameters n_clusters, cluster_duration,
time_series_for_high_peaks, cluster_method, representation_method,
rescale_cluster_periods, and the "Available clustering algorithms" section).
- Around line 24-45: The example references a nonexistent solution key
'Storage|charge_state'; update the snippet so the added element name matches the
solution lookup. Either instantiate a storage element in
flow_system.add_elements(...) with a concrete name (e.g., "Battery" or
"HotWaterTank") or change fs_expanded.solution['Storage|charge_state'] to the
actual element instance name used; ensure the lookup matches the instance naming
produced by flow_system.add_elements and keep the calls to
flow_system.transform.cluster(...) and fs_clustered.transform.expand_solution()
unchanged.
In @flixopt/__init__.py:
- Line 20: The Comparison class is being exported from the package without any
unit tests; add a new test module tests/test_comparison.py that exercises
Comparison: write tests for initialization (valid and invalid args), each public
method's expected behavior and return types, and edge cases (empty inputs,
boundary values, and error paths); use the Comparison class name in tests,
assert type hints/outputs where applicable, and ensure the new tests run in CI
before exposing Comparison in __init__.py.
In @flixopt/clustering/intercluster_helpers.py:
- Around line 74-111: The docstring for extract_capacity_bounds incorrectly
states the unbounded upper default as 1e9; update that sentence to match the
implementation constant DEFAULT_UNBOUNDED_CAPACITY (= 1e6) so the docs reflect
the actual fallback value used by the function (refer to
DEFAULT_UNBOUNDED_CAPACITY and extract_capacity_bounds).
In @flixopt/components.py:
- Around line 1243-1485: The decay factors (decay_n/decay_t) are left with a
'cluster' dimension and are being broadcast against soc_before/soc_d (which use
'original_cluster'), causing extra dimensions and over‑constraining; in
_add_linking_constraints and _add_combined_bound_constraints you must index the
decay arrays by the cluster assignment before using them (same pattern as
delta_soc_ordered): compute decay_n/decay_t with their 'cluster' dim then
convert them to original-period space via .isel(cluster=cluster_order) (or
otherwise align coords) so that the decay factors have the 'original_cluster'
coord and match soc_before/soc_after (and cs_t) prior to building lhs and
combined bound expressions.
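The indexing pattern that prompt asks for can be sketched in plain xarray (shapes and names assumed):

```python
import numpy as np
import xarray as xr

# Per-representative-cluster decay factor, e.g. (1 - loss)**timesteps_per_cluster
decay_n = xr.DataArray(np.array([0.99, 0.95, 0.90]), dims=['cluster'])

# cluster_order maps each original period to its representative cluster
cluster_order = xr.DataArray(np.array([0, 2, 1, 0]), dims=['original_cluster'])

# Vectorized indexing re-expresses the decay in original-period space, so it
# aligns with soc_before/soc_after instead of broadcasting a stray 'cluster' dim
decay_per_period = decay_n.isel(cluster=cluster_order)
assert decay_per_period.dims == ('original_cluster',)
```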
In @flixopt/structure.py:
- Around line 238-245: The docstring on FlowSystemModel.weights is inaccurate:
FlowSystem.weights['scenario'] currently returns unit (all-1.0) weights when no
scenario_weights are provided, so scenarios are summed by default; update the
documentation (both the FlowSystemModel.weights docstring and the
FlowSystem.__init__ docs) to state that scenario weights are unit weights by
default and only normalized when explicit scenario_weights are supplied
(referencing the scenario_weights parameter and FlowSystem.weights['scenario']),
or alternatively change the implementation to create normalized default scenario
weights instead — pick one approach and make the docstring and code consistent
(adjust FlowSystemModel.weights docstring and FlowSystem.__init__ accordingly).
In @tests/test_cluster_reduce_expand.py:
- Around line 490-523: The assertions use xarray DataArray `.all()` which yields
a 0-D DataArray (ambiguous truth value); update the two asserts in test_storage
expansion tests (references: cs_after, cs_expanded, Battery|charge_state,
test_storage_self_discharge_decay_in_expansion) to convert to a Python bool
before asserting — e.g. replace `(cs_after >= -0.01).all()` and `(cs_expanded >=
-0.01).all()` with a scalar check such as `(cs_after >= -0.01).all().item()` or
use a min-based scalar check like `float(cs_after.min()) >= -0.01` (and
similarly for cs_expanded) so the assert receives a boolean.
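For reference, the scalar forms the prompt suggests, as a standalone sketch (illustrative data, not the actual test file):

```python
import numpy as np
import xarray as xr

cs_after = xr.DataArray(np.array([0.0, 0.5, 1.2]), dims=['time'])

# .all() on a DataArray yields a 0-d DataArray; extract a plain Python bool
assert (cs_after >= -0.01).all().item()

# Equivalent min-based scalar check
assert float(cs_after.min()) >= -0.01
```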
🧹 Nitpick comments (20)
docs/notebooks/data/tutorial_data.py (1)
34-59: Consider using a local random generator for isolation.

The use of `np.random.seed(42)` modifies global state, which could cause subtle issues if functions are called in different orders or if tests run in parallel. Consider using `np.random.default_rng(42)` for better isolation.

🔎 Suggested refactor using local RNG:

```diff
 def get_heat_system_data() -> dict:
     """Data for 02-heat-system: one week with storage. ... """
     timesteps = pd.date_range('2024-01-15', periods=168, freq='h')
     hours = np.arange(168)
     hour_of_day = hours % 24
     day_of_week = (hours // 24) % 7

     # Office heat demand pattern
     base_demand = np.where((hour_of_day >= 7) & (hour_of_day <= 18), 80, 30)
     weekend_factor = np.where(day_of_week >= 5, 0.5, 1.0)
-    np.random.seed(42)
-    heat_demand = base_demand * weekend_factor + np.random.normal(0, 5, len(timesteps))
+    rng = np.random.default_rng(42)
+    heat_demand = base_demand * weekend_factor + rng.normal(0, 5, len(timesteps))
     heat_demand = np.clip(heat_demand, 20, 100)
```

docs/notebooks/data/__init__.py (1)

1-1: Consider adding explicit exports for discoverability.

The module only has a comment. Consider adding `__all__` to explicitly list available submodules or functions for better discoverability:

🔎 Optional: Add `__all__` for discoverability:

```diff
 # Data generation utilities for flixopt documentation examples
+
+__all__ = ['tutorial_data', 'generate_example_systems']
```

docs/notebooks/data/raw/README.md (1)

1-31: Well-documented data sources with proper attribution.

The README provides clear documentation of bundled datasets including source, license, and column descriptions. The attribution requirement for Open Power System Data is properly noted.

The bare URLs flagged by markdownlint (lines 10, 26, 28) are a minor style issue. Consider wrapping them in angle brackets for consistency:

🔎 Optional: Wrap bare URLs:

```diff
-**URL:** https://re.jrc.ec.europa.eu/pvg_tools/en/
+**URL:** <https://re.jrc.ec.europa.eu/pvg_tools/en/>
-**URL:** https://data.open-power-system-data.org/time_series/
+**URL:** <https://data.open-power-system-data.org/time_series/>
-**Attribution required:** "Data from Open Power System Data. https://open-power-system-data.org"
+**Attribution required:** "Data from Open Power System Data. <https://open-power-system-data.org>"
```

tests/test_io_conversion.py (1)

765-768: Consider moving the skip earlier for efficiency.

The skip occurs after the expensive `optimize()` call on line 759. If the goal is simply to skip objective comparison for this case, consider moving the skip before the re-optimization:

🔎 Suggested optimization:

```diff
+    # Skip comparison for scenarios test case - scenario weights are now always normalized,
+    # which changes the objective value when loading old results with non-normalized weights
+    if result_name == '04_scenarios':
+        pytest.skip('Scenario weights are now always normalized - old results have different weights')
+
     # Re-solve the FlowSystem
     fs.optimize(fx.solvers.HighsSolver(mip_gap=0))

     # Get new objective effect total (sum for multi-scenario)
     new_objective = float(fs.solution['objective'].item())
     new_effect_total = float(fs.solution[objective_effect_label].sum().item())

-    # Skip comparison for scenarios test case - scenario weights are now always normalized,
-    # which changes the objective value when loading old results with non-normalized weights
-    if result_name == '04_scenarios':
-        pytest.skip('Scenario weights are now always normalized - old results have different weights')
```

Alternatively, consider adding a dedicated test that verifies the normalized weights behavior is correct for `04_scenarios`, rather than just skipping verification.

.github/workflows/docs.yaml (1)

60-87: Good notebook caching strategy with parallel execution.

The caching approach is sound:

- Hash includes both notebooks and source files, ensuring cache invalidation on code changes
- Parallel execution (`-P 4`) speeds up notebook runs
- `set -eo pipefail` ensures failures are properly detected

Two minor observations:

1. `2>/dev/null` on tar (lines 65, 129): this silences all stderr, which could hide legitimate warnings. Consider logging stderr to a file or removing the suppression:

```diff
-HASH=$({ find docs/notebooks -name '*.ipynb' -print0; find flixopt -name '*.py' -print0; } | sort -z | xargs -0 tar -cf - 2>/dev/null | sha256sum | cut -d' ' -f1)
+HASH=$({ find docs/notebooks -name '*.ipynb' -print0; find flixopt -name '*.py' -print0; } | sort -z | xargs -0 tar -cf - | sha256sum | cut -d' ' -f1)
```

2. Code duplication: the notebook caching logic is duplicated between the `build` and `deploy` jobs. Consider extracting it to a composite action if this pattern grows.

docs/notebooks/08c2-clustering-storage-modes.ipynb (1)

240-246: Consider guarding against zero baseline cost.

Line 244 performs division by `baseline_cost`. While optimization costs are unlikely to be exactly zero, adding a guard would improve robustness:

```python
if baseline_cost != 0:
    comparison['Cost Gap [%]'] = (comparison['Cost [EUR]'] - baseline_cost) / abs(baseline_cost) * 100
else:
    comparison['Cost Gap [%]'] = 0.0
```

docs/notebooks/08b-rolling-horizon.ipynb (1)

276-284: Align plot comment with actually plotted variable.

The comment says "Plot CHP thermal flow across all segments", but the code plots `ds['Boiler(Q_th)|flow_rate']`. Either switch to the CHP variable or update the comment to "Boiler thermal flow" to avoid confusion when readers inspect the overlaps.

tests/deprecated/examples/03_Optimization_modes/example_optimization_modes.py (1)

36-39: Tighten `optimizations` typing and provide meaningful durations for the clustered run.

You now append a `ClusteredResult` instance to `optimizations`, but the annotation `list[fx.Optimization | fx.SegmentedOptimization]` no longer reflects the actual contents. Also, `ClusteredResult.durations` is always `{'total': 0}`, so the "Duration Comparison" plot under-reports the clustered case.

Consider either:

- Widening the type annotation (e.g. to `list[object]` or a small `Protocol` with `name`, `flow_system`, `durations`), and
- Populating `durations` with at least the solve time (and ideally clustering + solve), so the speed comparison stays informative.

Also applies to: 178-217

docs/notebooks/07-scenarios-and-periods.ipynb (1)

427-433: Clarify example variable name for `scenario_weights`.

In the "Multi-Dimensional FlowSystem" example you use `scenario_weights=weights`, while the notebook above introduces `scenario_weights` from `get_scenarios_data()`. Renaming `weights` to `scenario_weights` in the snippet would keep terminology consistent and reduce potential confusion for readers.

docs/notebooks/data/generate_example_systems.py (1)

323-331: Buses without explicit carriers may reduce consistency with other examples.

Here the new district heating system uses `fx.Bus('Electricity')`, `fx.Bus('Heat')`, etc., without `carrier=...`, whereas the earlier examples define explicit `Carrier` objects and attach them to buses.

Functionally this likely works (FlowSystem can infer or accept carrier-less buses), but for users reading notebooks and code, it's slightly inconsistent and makes unit/color semantics less obvious. Consider adding explicit carriers or a brief comment that these buses intentionally omit carriers for brevity.

flixopt/clustering/base.py (2)

631-647: Duration-curve implementation depends on the last variable's length; clarify or make explicit.

In `ClusteringPlotAccessor.compare(kind='duration_curve')` you compute:

```python
for var in ds.data_vars:
    for rep in ds.coords['representation'].values:
        values = np.sort(ds[var].sel(representation=rep).values.flatten())[::-1]
        sorted_vars[(var, rep)] = values
n = len(values)
...
```

`n` is taken from the final `(var, rep)` pair. As long as all variables/representations share the same number of points this works, but it's a bit implicit and brittle.

Consider using the first computed series (or checking that all lengths match) for clarity:

```python
first_values = next(iter(sorted_vars.values()))
n = len(first_values)
```

This keeps behavior the same but avoids relying on an outer-scope `values`.

382-407: `expand_data` with cluster dimension requires `cluster_structure`; document the constraint.

When `aggregated` has a `'cluster'` dimension, `expand_data` uses `self.cluster_structure.timesteps_per_cluster`. If a backend ever produces `(cluster, time)` data but leaves `cluster_structure=None`, this branch will raise.

Given the current `cluster()` implementation always sets `cluster_structure`, this is fine, but it's worth documenting (in the docstring of `ClusterResult` or `expand_data`) that:

- `(cluster, time)` aggregated data requires a non-None `cluster_structure`, or
- callers must avoid passing such data when no structure exists.

Otherwise, alternative/manual backends may get surprising errors.

flixopt/components.py (1)

1066-1093: Unbounded-capacity branch returns scalars; ensure `add_variables` accepts numeric bounds.

`_absolute_charge_state_bounds` now returns `(0, np.inf)` when `capacity_in_flow_hours is None`. This is cleaner than synthesizing a time-shaped array, but it assumes `ComponentModel.add_variables` accepts scalar bounds and broadcasts them over the coordinate grid.

If `add_variables` still expects xarray bounds everywhere, this could lead to a type mismatch. Otherwise this is fine; just worth double-checking.

flixopt/statistics_accessor.py (1)

1998-2137: `storage()`: secondary charge_state overlay relies on trace/axis ordering.

The new storage plot overlays `charge_state` lines on top of `stacked_bar` subplots by creating high-index secondary y-axes (`yaxis101`+) and assigning line traces to them based on trace order. This is clever but somewhat brittle: it assumes that the facet grid in `flow_ds.fxplot.stacked_bar(...)` and `charge_da.fxplot.line(...)` produce traces and y-axes in compatible, sequential order.

If you start seeing misaligned charge_state lines under more complex facet/animation setups, consider:

- deriving the target y-axis from each line trace's existing `yaxis` rather than the loop index, and
- mapping `yaxisN` → `yaxisN+100` directly, instead of assuming `i+1`.

For now, please test `statistics.plot.storage()` with a clustered system and multiple facets/animations to ensure overlays land on the correct subplots.

flixopt/comparison.py (3)

19-19: `__all__` is incomplete; consider exporting all public classes.

The module defines three public classes (`Comparison`, `ComparisonStatistics`, `ComparisonStatisticsPlot`) but only exports `Comparison`. While users access the others via `Comparison.statistics`, exporting them allows direct type annotations and improves discoverability.

Suggested fix:

```diff
-__all__ = ['Comparison']
+__all__ = ['Comparison', 'ComparisonStatistics', 'ComparisonStatisticsPlot']
```

142-146: Potential issue with coordinate comparison for non-numeric types.

The comparison `(ref_coords == fs_coords).all()` may raise or produce unexpected results when coordinates are object dtype (e.g., string labels) or when shapes mismatch despite passing the length check.

Suggested safer comparison:

```diff
-        elif not (ref_coords == fs_coords).all():
+        elif not np.array_equal(ref_coords, fs_coords):
```

This requires adding `import numpy as np` at the top of the file.

648-651: Potential silent data loss when heatmap receives multiple variables.

If `_combine_data` returns a dataset with multiple variables, only the first one is used without warning. Consider validating or documenting this behavior.

Add validation or warning:

```diff
 if not ds.data_vars:
     return self._finalize(ds, None, show)
+if len(ds.data_vars) > 1:
+    warnings.warn(
+        f"Heatmap received {len(ds.data_vars)} variables, using only the first one.",
+        stacklevel=2,
+    )
 da = ds[next(iter(ds.data_vars))]
```

flixopt/dataset_plot_accessor.py (2)

254-257: Consider simplifying the labels dictionary construction.

The nested conditional dict comprehension is hard to read. A more explicit approach would improve maintainability.

Suggested simplification:

```diff
-    color_labels = df[slots['color']].unique().tolist() if slots['color'] and slots['color'] in df.columns else []
+    color_col = slots['color']
+    color_labels = df[color_col].unique().tolist() if color_col and color_col in df.columns else []
     color_map = process_colors(colors, color_labels, CONFIG.Plotting.default_qualitative_colorscale)
-    labels = {**(({slots['x']: xlabel}) if xlabel and slots['x'] else {}), **({'value': ylabel} if ylabel else {})}
+    labels = {}
+    if xlabel and slots['x']:
+        labels[slots['x']] = xlabel
+    if ylabel:
+        labels['value'] = ylabel
```

938-1019: Consider extracting shared heatmap logic to reduce duplication.

The `DataArrayPlotAccessor.heatmap` method duplicates significant logic from `DatasetPlotAccessor.heatmap`. While the DataArray version works directly without variable selection, the slot resolution and imshow argument building could be shared.

This is acceptable for now since maintaining two slightly different code paths may be clearer than a complex abstraction, but consider refactoring if the logic diverges further.

tests/test_clustering_io.py (1)

10-24: Remove unused fixture `simple_system_24h`.

This fixture is defined but never referenced in any test in the codebase. Remove it to reduce code clutter.
📜 Review details
Configuration used: defaults
Review profile: CHILL
Plan: Pro
⛔ Files ignored due to path filters (2)
- `docs/notebooks/data/raw/electricity_prices_de_2020.csv` is excluded by `!**/*.csv`
- `docs/notebooks/data/raw/tmy_dresden.csv` is excluded by `!**/*.csv`
📒 Files selected for processing (74)
`.github/workflows/docs.yaml`, `.pre-commit-config.yaml`, `CHANGELOG.md`, `docs/notebooks/02-heat-system.ipynb`, `docs/notebooks/03-investment-optimization.ipynb`, `docs/notebooks/04-operational-constraints.ipynb`, `docs/notebooks/05-multi-carrier-system.ipynb`, `docs/notebooks/06a-time-varying-parameters.ipynb`, `docs/notebooks/07-scenarios-and-periods.ipynb`, `docs/notebooks/08a-aggregation.ipynb`, `docs/notebooks/08b-rolling-horizon.ipynb`, `docs/notebooks/08c-clustering.ipynb`, `docs/notebooks/08c2-clustering-storage-modes.ipynb`, `docs/notebooks/08d-clustering-multiperiod.ipynb`, `docs/notebooks/08e-clustering-internals.ipynb`, `docs/notebooks/09-plotting-and-data-access.ipynb`, `docs/notebooks/data/__init__.py`, `docs/notebooks/data/generate_example_systems.py`, `docs/notebooks/data/generate_realistic_profiles.py`, `docs/notebooks/data/raw/README.md`, `docs/notebooks/data/tutorial_data.py`, `docs/notebooks/fxplot_accessor_demo.ipynb`, `docs/user-guide/optimization/clustering.md`, `docs/user-guide/optimization/index.md`, `docs/user-guide/recipes/plotting-custom-data.md`, `docs/user-guide/results-plotting.md`, `docs/user-guide/results/index.md`, `flixopt/__init__.py`, `flixopt/clustering.py`, `flixopt/clustering/__init__.py`, `flixopt/clustering/base.py`, `flixopt/clustering/intercluster_helpers.py`, `flixopt/comparison.py`, `flixopt/components.py`, `flixopt/config.py`, `flixopt/core.py`, `flixopt/dataset_plot_accessor.py`, `flixopt/effects.py`, `flixopt/elements.py`, `flixopt/features.py`, `flixopt/flow_system.py`, `flixopt/io.py`, `flixopt/optimization.py`, `flixopt/optimize_accessor.py`, `flixopt/plot_result.py`, `flixopt/results.py`, `flixopt/statistics_accessor.py`, `flixopt/structure.py`, `flixopt/transform_accessor.py`, `mkdocs.yml`, `pyproject.toml`, `tests/deprecated/examples/03_Optimization_modes/example_optimization_modes.py`, `tests/deprecated/test_bus.py`, `tests/deprecated/test_effect.py`, `tests/deprecated/test_flow.py`, `tests/deprecated/test_flow_system_resample.py`, `tests/deprecated/test_integration.py`, `tests/deprecated/test_linear_converter.py`, `tests/deprecated/test_scenarios.py`, `tests/deprecated/test_storage.py`, `tests/test_bus.py`, `tests/test_cluster_reduce_expand.py`, `tests/test_clustering/__init__.py`, `tests/test_clustering/test_base.py`, `tests/test_clustering/test_integration.py`, `tests/test_clustering_io.py`, `tests/test_effect.py`, `tests/test_flow.py`, `tests/test_flow_system_resample.py`, `tests/test_io_conversion.py`, `tests/test_linear_converter.py`, `tests/test_scenarios.py`, `tests/test_sel_isel_single_selection.py`, `tests/test_storage.py`
💤 Files with no reviewable changes (2)
- flixopt/clustering.py
- flixopt/effects.py
🧰 Additional context used
🧬 Code graph analysis (20)
tests/deprecated/test_bus.py (2)
flixopt/structure.py (3)
variables(1719-1725)timestep_duration(220-222)timestep_duration(1751-1752)flixopt/results.py (2)
variables(335-339)variables(1260-1268)
tests/deprecated/test_effect.py (1)
flixopt/structure.py (2)
timestep_duration(220-222)timestep_duration(1751-1752)
tests/test_bus.py (1)
flixopt/structure.py (3)
variables(1719-1725)timestep_duration(220-222)timestep_duration(1751-1752)
tests/test_effect.py (1)
flixopt/structure.py (2)
timestep_duration(220-222)timestep_duration(1751-1752)
tests/test_clustering/test_base.py (1)
flixopt/clustering/base.py (11)
Clustering(980-1135)ClusterResult(248-544)ClusterStructure(35-244)create_cluster_structure_from_mapping(1138-1185)cluster_order(1052-1056)n_clusters(1066-1071)timesteps_per_cluster(1089-1093)timestep_mapping(1096-1098)n_original_timesteps(350-352)validate(488-544)get_expansion_mapping(354-363)
flixopt/clustering/__init__.py (1)
flixopt/clustering/base.py (4)
Clustering(980-1135)ClusterResult(248-544)ClusterStructure(35-244)create_cluster_structure_from_mapping(1138-1185)
tests/deprecated/test_flow_system_resample.py (1)
flixopt/structure.py (2)
timestep_duration(220-222)timestep_duration(1751-1752)
tests/test_flow_system_resample.py (1)
flixopt/structure.py (2)
timestep_duration(220-222)timestep_duration(1751-1752)
tests/deprecated/test_integration.py (4)
tests/conftest.py (1)
assert_almost_equal_numeric(696-725)tests/deprecated/conftest.py (1)
assert_almost_equal_numeric(695-724)flixopt/structure.py (2)
solution(169-217)solution(1154-1168)flixopt/flow_system.py (2)
solution(1449-1466)solution(1469-1472)
tests/deprecated/test_flow.py (3)
flixopt/structure.py (3)
timestep_duration(220-222)timestep_duration(1751-1752)constraints(1710-1716)tests/conftest.py (1)
assert_conequal(755-771)tests/deprecated/conftest.py (1)
assert_conequal(754-770)
tests/deprecated/test_scenarios.py (2)
flixopt/structure.py (4)
scenario_weights(268-282)flow_system(439-457)values(1529-1534)values(1816-1817)flixopt/flow_system.py (2)
scenario_weights(2053-2060)scenario_weights(2063-2102)
flixopt/comparison.py (2)
flixopt/plot_result.py (2)
PlotResult(20-143)update_traces(75-88)flixopt/statistics_accessor.py (13)
flow_rates(365-387)flow_hours(390-411)flow_sizes(414-423)storage_sizes(426-437)sizes(440-444)sizes(1011-1054)sizes(1661-1716)carrier_units(328-336)plot(350-362)balance(1296-1386)flows(960-1009)flows(1569-1659)storage(1998-2143)
flixopt/optimization.py (1)
flixopt/flow_system.py (2)
FlowSystem(55-2510)create_model(1333-1353)
tests/test_linear_converter.py (2)
flixopt/structure.py (3)
timestep_duration(220-222)timestep_duration(1751-1752)constraints(1710-1716)tests/conftest.py (1)
assert_conequal(755-771)
tests/test_storage.py (2)
flixopt/structure.py (2)
timestep_duration(220-222)timestep_duration(1751-1752)flixopt/flow_system.py (1)
isel(2371-2397)
flixopt/components.py (1)
flixopt/clustering/intercluster_helpers.py (2)
build_boundary_coords(157-201)extract_capacity_bounds(74-154)
flixopt/dataset_plot_accessor.py (2)
flixopt/color_processing.py (1)
process_colors(165-233)flixopt/config.py (2)
CONFIG(184-912)Plotting(550-592)
flixopt/features.py (2)
flixopt/structure.py (9)
temporal_weight(255-257)temporal_dims(247-252)sum_temporal(259-265)dims(229-231)weights(239-244)get(1658-1663)get(1823-1825)timestep_duration(220-222)timestep_duration(1751-1752)flixopt/flow_system.py (5)
temporal_weight(1990-1998)temporal_dims(1980-1987)sum_temporal(2137-2154)dims(1936-1956)weights(2115-2135)
flixopt/elements.py (3)
flixopt/structure.py (5)
sum_temporal(259-265)timestep_duration(220-222)timestep_duration(1751-1752)temporal_weight(255-257)temporal_dims(247-252)flixopt/flow_system.py (3)
sum_temporal(2137-2154)temporal_weight(1990-1998)temporal_dims(1980-1987)flixopt/results.py (2)
flow_rate(2014-2015)size(2022-2030)
flixopt/__init__.py (3)
flixopt/carrier.py (2)
Carrier(14-125)CarrierContainer(128-159)flixopt/comparison.py (1)
Comparison(27-191)flixopt/plot_result.py (1)
PlotResult(20-143)
🪛 LanguageTool
docs/user-guide/results/index.md
[style] ~333-~333: Using many exclamation marks might seem excessive (in this case: 9 exclamation marks for a text that’s 4730 characters long)
Context: ...Works! cluster_boundary is ignored ``` !!! note "Component Differences" System...
(EN_EXCESSIVE_EXCLAMATION)
🪛 markdownlint-cli2 (0.18.1)
docs/notebooks/data/raw/README.md
10-10: Bare URL used
(MD034, no-bare-urls)
26-26: Bare URL used
(MD034, no-bare-urls)
28-28: Bare URL used
(MD034, no-bare-urls)
CHANGELOG.md
74-74: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
121-121: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
149-149: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
189-189: Code block style
Expected: indented; Actual: fenced
(MD046, code-block-style)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (5)
- GitHub Check: test (3.14)
- GitHub Check: test (3.12)
- GitHub Check: test (3.11)
- GitHub Check: test (3.13)
- GitHub Check: Build documentation
| "## Create the FlowSystem\n", | ||
| "\n", | ||
| "We use real-world district heating data at 15-minute resolution (one month):" | ||
| "We use a district heating system with real-world time series data (one month at hourly resolution):" | ||
| ] |
Clarify base timestep resolution (hourly vs 15‑min) in the aggregation notebook
The intro states “one month at hourly resolution”, but the resampling section comments “Resample from 15‑min to 4h resolution”. Depending on what create_district_heating_system() actually returns, either the intro or the resampling comment (and the “First Week of Data” slice using timesteps[:168]) is misleading.
Please align the narrative with the actual timestep frequency so readers can correctly interpret the resampling factor and the plotted time window.
Also applies to: 111-118
| "```python\n", | ||
| "# Load multi-period system\n", | ||
| "fs = fx.FlowSystem.from_netcdf('multiperiod_system.nc4')\n", | ||
| "\n", | ||
| "# Select time subset (optional)\n", | ||
| "fs = fs.transform.isel(time=slice(0, 168)) # First 168 timesteps\n", | ||
| "\n", | ||
| "# Cluster (applies per period/scenario)\n", | ||
| "fs_clustered = fs.transform.cluster(\n", | ||
| " n_clusters=10,\n", | ||
| " cluster_duration='1D',\n", | ||
| " time_series_for_high_peaks=['Demand(Flow)|fixed_relative_profile'],\n", | ||
| ")\n", | ||
| "\n", | ||
| "# Visualize clustering quality\n", | ||
| "fs_clustered.clustering.plot.compare(variable='Demand(Flow)|profile')\n", | ||
| "fs_clustered.clustering.plot.heatmap()\n", |
API reference snippet uses incorrect argument name and series label.
In the final “API Reference” block:
- `fs_clustered.clustering.plot.compare(variable='Demand(Flow)|profile')`: `ClusteringPlotAccessor.compare` expects `variables=...`, not `variable=...`. Using `variable` will raise a `TypeError`.
- Elsewhere in the notebook you use `...|fixed_relative_profile` as the canonical suffix; `|profile` here is inconsistent and may not match any time series.

Recommend updating this snippet to:
```python
fs_clustered.clustering.plot.compare(
    variables='Demand(Flow)|fixed_relative_profile'
)
```

(or another label that actually exists in the example system).
🤖 Prompt for AI Agents
In @docs/notebooks/08d-clustering-multiperiod.ipynb around lines 555 - 571, The
API snippet calls fs_clustered.clustering.plot.compare with the wrong keyword
and an inconsistent series label; update the call to use the correct parameter
name "variables" instead of "variable" and use the canonical time-series suffix
used elsewhere (e.g., "Demand(Flow)|fixed_relative_profile") so the call becomes
fs_clustered.clustering.plot.compare(variables='Demand(Flow)|fixed_relative_profile')
or another existing series label.
```python
timesteps = pd.date_range('2019-01-01', periods=8760, freq='h')

# Map to 2020 weather data (TMY has 8760 hours, no Feb 29)
temp = _get_weather()['temperature_C'].values
ghi = _get_weather()['ghi_W_m2'].values
```
Assumption about 8760h weather length is implicit; consider defensive check or comment.
`_get_weather()['temperature_C'].values` and `['ghi_W_m2'].values` are used without aligning to `timesteps`. This assumes `tmy_dresden.csv` always has exactly 8760 hourly rows and matches the full-year index.
If the TMY file ever changes (e.g., leap year, different resolution), this will silently misalign weather vs timesteps. Either reindex to timesteps (as in other functions) or add a short comment/note that the file is expected to be 8760 hourly values for a non‑leap year.
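A defensive variant of that block might look as follows (a sketch: `_get_weather()` is the module's existing helper; the assertion is the addition):

```python
import pandas as pd

timesteps = pd.date_range('2019-01-01', periods=8760, freq='h')

weather = _get_weather()  # assumed to return a DataFrame of hourly TMY rows
# tmy_dresden.csv must contain 8760 hourly values for a non-leap year
assert len(weather) == len(timesteps), (
    f'expected {len(timesteps)} hourly rows, got {len(weather)}'
)
temp = weather['temperature_C'].values
ghi = weather['ghi_W_m2'].values
```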
🤖 Prompt for AI Agents
In @docs/notebooks/data/generate_example_systems.py around lines 544 - 548, The
code assumes _get_weather() returns exactly 8760 hourly rows and directly uses
temp and ghi arrays which may misalign with timesteps; update the block that
sets temp and ghi to either (1) explicitly reindex/resample the weather
Series/DataFrame returned by _get_weather() to the timesteps index (matching the
approach used elsewhere in the repo) or (2) add a defensive assertion and clear
comment: assert len(temp)==len(timesteps) (and same for ghi) with a note that
tmy_dresden.csv must be 8760 hourly values for a non‑leap year; reference the
variables timesteps, _get_weather(), temp, and ghi when making the change.
```python
# Reset warnings to default after imports. Some dependencies (demandlib, pvlib)
# may configure warnings during import. This ensures consistent warning behavior
# when this module is used in different contexts (scripts, notebooks, tests).
warnings.resetwarnings()
```
Resetting global warnings configuration in a library module is risky.
Calling warnings.resetwarnings() after importing dependencies affects the entire Python process, potentially undoing user- or application-level warning filters.
Given this module is imported from notebooks, scripts, and possibly tests, consider narrowing the scope instead:
- Use `warnings.filterwarnings` for the specific noisy warnings you want to silence, or
- Wrap the problematic calls (e.g., demandlib/pvlib functions) in `warnings.catch_warnings()` instead of resetting global state; see the sketch after this note.
This keeps the examples quiet without surprising callers that rely on custom warning configuration.
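Both alternatives in stdlib form (the module patterns are assumptions about where the noise originates):

```python
import warnings

# Option 1: targeted, module-qualified filters instead of resetwarnings()
warnings.filterwarnings('ignore', category=FutureWarning, module=r'demandlib(\..*)?')
warnings.filterwarnings('ignore', category=UserWarning, module=r'pvlib(\..*)?')

# Option 2: keep the suppression local to the noisy call
with warnings.catch_warnings():
    warnings.simplefilter('ignore')
    profile = build_profile()  # hypothetical noisy demandlib/pvlib call
```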
🤖 Prompt for AI Agents
In @docs/notebooks/data/generate_realistic_profiles.py around lines 28 - 31,
Remove the global call to warnings.resetwarnings() in
generate_realistic_profiles.py and instead either apply targeted filters with
warnings.filterwarnings() for the specific noisy messages (referencing the
module names like demandlib and pvlib) or wrap the specific imports/calls that
trigger noise in a warnings.catch_warnings() context so the suppression is
local; replace the resetwarnings() usage with these focused patterns and ensure
any added filterwarnings() include module or message qualifiers to avoid
changing global warning state.
```python
import flixopt as fx

# Load or create your FlowSystem
flow_system = fx.FlowSystem(timesteps)
flow_system.add_elements(...)

# Stage 1: Cluster and optimize (fast)
fs_clustered = flow_system.transform.cluster(
    n_clusters=12,
    cluster_duration='1D',
    time_series_for_high_peaks=['HeatDemand(Q)|fixed_relative_profile'],
)
fs_clustered.optimize(fx.solvers.HighsSolver())

# Stage 2: Expand back to full resolution
fs_expanded = fs_clustered.transform.expand_solution()

# Access full-resolution results
charge_state = fs_expanded.solution['Storage|charge_state']
flow_rates = fs_expanded.solution['Boiler(Q_th)|flow_rate']
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Search for cluster method definition and expand_solution method
echo "=== Searching for cluster method ==="
find . -type f -name "*.py" | head -20 | xargs grep -l "cluster" 2>/dev/null | head -5
echo "=== Searching for expand_solution method ==="
find . -type f -name "*.py" | xargs grep -l "expand_solution" 2>/dev/null | head -5
echo "=== Directory structure ==="
ls -la | head -20Repository: flixOpt/flixopt
Length of output: 1702
🏁 Script executed:
#!/bin/bash
# Check transform_accessor.py for cluster method
echo "=== transform_accessor.py cluster method ==="
rg -n "def cluster" flixopt/transform_accessor.py -A 10
# Check clustering/base.py for both methods
echo "=== clustering/base.py cluster method ==="
rg -n "def cluster" flixopt/clustering/base.py -A 10
echo "=== clustering/base.py expand_solution method ==="
rg -n "def expand_solution" flixopt/clustering/base.py -A 10
# Check test file for actual usage patterns
echo "=== test_cluster_reduce_expand.py usage ==="
head -100 tests/test_cluster_reduce_expand.pyRepository: flixOpt/flixopt
Length of output: 5804
🏁 Script executed:
#!/bin/bash
# Check for expand_solution in transform_accessor.py
echo "=== expand_solution in transform_accessor.py ==="
rg -n "def expand_solution" flixopt/transform_accessor.py -A 5
# Check solution variable naming patterns
echo "=== Solution variable examples in tests ==="
rg "solution\[" tests/test_cluster_reduce_expand.py | head -20
# Check if Storage elements are used in codebase
echo "=== Storage element references ==="
rg "Storage\|charge_state" . --type=py | head -10Repository: flixOpt/flixopt
Length of output: 2704
🏁 Script executed:
#!/bin/bash
# Check if Storage is a class or generic term
echo "=== Storage element class definition ==="
rg "class Storage" flixopt/ --type=py | head -5
# Check element naming in docs
echo "=== Storage element references in docs ==="
find docs -name "*.md" | xargs grep -l "Storage" | head -3
# Verify full example in actual code to ensure naming is consistent
echo "=== Check test for storage elements ==="
grep -A 5 "add_elements" tests/test_cluster_reduce_expand.py | head -20Repository: flixOpt/flixopt
Length of output: 1280
🏁 Script executed:
#!/bin/bash
# Get the full Storage element usage in tests
echo "=== Full Storage usage in test_cluster_reduce_expand.py ==="
grep -B2 -A10 "fx.Storage(" tests/test_cluster_reduce_expand.py | head -30
# Verify the documentation example is complete
echo "=== Check the actual documentation file ==="
sed -n '24,45p' docs/user-guide/optimization/clustering.mdRepository: flixOpt/flixopt
Length of output: 1358
🏁 Script executed:
#!/bin/bash
# Check the full context of the documentation file to see what elements might be added
echo "=== Full clustering.md file context ==="
wc -l docs/user-guide/optimization/clustering.md
# Get more context around lines 24-45
echo "=== Lines 1-50 of clustering.md ==="
sed -n '1,50p' docs/user-guide/optimization/clustering.md
# Look for any Storage element instantiation examples in docs
echo "=== Storage examples in docs ==="
find docs -name "*.md" -type f | xargs grep -l "Storage(" | head -5
rg "Storage\(" docs --type=md -B2 -A2 | head -30Repository: flixOpt/flixopt
Length of output: 3763
Fix the example to reference an actual Storage element instance.
The code example at lines 24-45 references fs_expanded.solution['Storage|charge_state'], but Storage is not instantiated in the code snippet—add_elements(...) is a placeholder. Solution variables require actual element instance names (e.g., 'Battery|charge_state' or 'HotWaterTank|charge_state'), not the class name 'Storage'. Either instantiate a Storage element in the add_elements() call or use a realistic element name that matches a created instance.
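Minimal illustration of the corrected lookup (the name 'Battery' is hypothetical):

```python
# Assuming the system defines a storage instance named 'Battery' via
# flow_system.add_elements(fx.Storage('Battery', ...)), the solution key
# uses the instance name, not the class name:
charge_state = fs_expanded.solution['Battery|charge_state']  # not 'Storage|...'
```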
🤖 Prompt for AI Agents
In @docs/user-guide/optimization/clustering.md around lines 24 - 45, The example
references a nonexistent solution key 'Storage|charge_state'; update the snippet
so the added element name matches the solution lookup. Either instantiate a
storage element in flow_system.add_elements(...) with a concrete name (e.g.,
"Battery" or "HotWaterTank") or change
fs_expanded.solution['Storage|charge_state'] to the actual element instance name
used; ensure the lookup matches the instance naming produced by
flow_system.add_elements and keep the calls to
flow_system.transform.cluster(...) and fs_clustered.transform.expand_solution()
unchanged.
```python
from . import dataset_plot_accessor as _  # noqa: F401
from .carrier import Carrier, CarrierContainer
from .clustering import ClusteringParameters
from .comparison import Comparison
```
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Verify Comparison class exists and has tests
# Check Comparison class definition
rg -nA10 '^class Comparison' flixopt/comparison.py
# Find tests for Comparison
rg -n 'Comparison' tests/ -g '*.py' | head -20Repository: flixOpt/flixopt
Length of output: 2369
🏁 Script executed:
#!/bin/bash
# Search for dedicated test files for Comparison class
find tests -name "*test*.py" -type f | head -20
# Search for Comparison class usage in non-deprecated tests
rg -l 'class.*Test.*Comparison|def.*test.*comparison' tests -g '*.py'
# Check for conftest or setup files
ls -la tests/Repository: flixOpt/flixopt
Length of output: 3308
🏁 Script executed:
#!/bin/bash
# Search for Comparison imports and tests across all non-deprecated test files
rg -n 'from.*Comparison|import.*Comparison' tests/*.py
# Search for any test functions that might test Comparison
rg -n 'def test.*' tests/*.py | grep -i comparison
# Check if Comparison is mentioned in conftest.py
grep -n 'Comparison' tests/conftest.pyRepository: flixOpt/flixopt
Length of output: 41
Add tests for the Comparison class before exposing it in the public API.
The Comparison class is well-implemented with comprehensive docstring and proper type hints, but there are no unit tests in the active test suite. Exporting it to the public API without test coverage creates maintenance risk and makes it harder to catch regressions. Add dedicated tests in tests/test_comparison.py covering initialization, method behavior, and edge cases.
🤖 Prompt for AI Agents
In @flixopt/__init__.py at line 20, The Comparison class is being exported from
the package without any unit tests; add a new test module
tests/test_comparison.py that exercises Comparison: write tests for
initialization (valid and invalid args), each public method's expected behavior
and return types, and edge cases (empty inputs, boundary values, and error
paths); use the Comparison class name in tests, assert type hints/outputs where
applicable, and ensure the new tests run in CI before exposing Comparison in
__init__.py.
```python
def extract_capacity_bounds(
    capacity_param: InvestParameters | int | float | None,
    boundary_coords: dict,
    boundary_dims: list[str],
) -> CapacityBounds:
    """Extract capacity bounds from storage parameters for SOC_boundary variable.

    This function determines the appropriate bounds for the SOC_boundary variable
    based on the storage's capacity parameter:

    - **Fixed capacity** (numeric): Upper bound is the fixed value.
    - **InvestParameters**: Upper bound is maximum_size (or fixed_size if set).
      The actual bound is enforced via separate constraints linked to investment.size.
    - **None/Unbounded**: Upper bound is set to a large value (1e9).

    The lower bound is always zero (SOC cannot be negative).

    Args:
        capacity_param: Storage capacity specification. Can be:
            - Numeric (int/float): Fixed capacity
            - InvestParameters: Investment-based sizing with min/max
            - None: Unbounded storage
        boundary_coords: Coordinate dictionary for SOC_boundary variable.
            Must contain 'cluster_boundary' key.
        boundary_dims: Dimension names for SOC_boundary variable.
            First dimension must be 'cluster_boundary'.

    Returns:
        CapacityBounds with lower/upper bounds and investment flag.

    Example:
        >>> coords, dims = build_boundary_coords(14, flow_system)
        >>> bounds = extract_capacity_bounds(InvestParameters(maximum_size=10000), coords, dims)
        >>> bounds.has_investment
        True
        >>> bounds.upper.max()
        10000.0
    """
```
Fix docstring inconsistency.
Line 87 in the docstring states "Upper bound is set to a large value (1e9)" but the actual constant DEFAULT_UNBOUNDED_CAPACITY is set to 1e6 (Line 53). Update the docstring to match the implementation.
🔎 Proposed fix
```diff
 - **InvestParameters**: Upper bound is maximum_size (or fixed_size if set).
   The actual bound is enforced via separate constraints linked to investment.size.
-- **None/Unbounded**: Upper bound is set to a large value (1e9).
+- **None/Unbounded**: Upper bound is set to a large value (1e6).
```

🤖 Prompt for AI Agents
In @flixopt/clustering/intercluster_helpers.py around lines 74 - 111, The
docstring for extract_capacity_bounds incorrectly states the unbounded upper
default as 1e9; update that sentence to match the implementation constant
DEFAULT_UNBOUNDED_CAPACITY (= 1e6) so the docs reflect the actual fallback value
used by the function (refer to DEFAULT_UNBOUNDED_CAPACITY and
extract_capacity_bounds).
| @property | ||
| def _absolute_charge_state_bounds(self) -> tuple[xr.DataArray, xr.DataArray]: | ||
| """Get symmetric bounds for charge_state (ΔE) variable. | ||
| For InterclusterStorageModel, charge_state represents ΔE (relative change | ||
| from cluster start), which can be negative. Therefore, we need symmetric | ||
| bounds: -capacity <= ΔE <= capacity. | ||
| Note that for investment-based sizing, additional constraints are added | ||
| in _add_investment_model to link bounds to the actual investment size. | ||
| """ | ||
| _, relative_upper_bound = self._relative_charge_state_bounds | ||
|
|
||
| if self.element.capacity_in_flow_hours is None: | ||
| return -np.inf, np.inf | ||
| elif isinstance(self.element.capacity_in_flow_hours, InvestParameters): | ||
| cap_max = self.element.capacity_in_flow_hours.maximum_or_fixed_size * relative_upper_bound | ||
| # Adding 0.0 converts -0.0 to 0.0 (linopy LP writer bug workaround) | ||
| return -cap_max + 0.0, cap_max + 0.0 | ||
| else: | ||
| cap = self.element.capacity_in_flow_hours * relative_upper_bound | ||
| # Adding 0.0 converts -0.0 to 0.0 (linopy LP writer bug workaround) | ||
| return -cap + 0.0, cap + 0.0 | ||
|
|
||
| def _do_modeling(self): | ||
| """Create storage model with inter-cluster linking constraints. | ||
| Uses template method pattern: calls parent's _do_modeling, then adds | ||
| inter-cluster linking. Overrides specific methods to customize behavior. | ||
| """ | ||
| super()._do_modeling() | ||
| self._add_intercluster_linking() | ||
|
|
||
| def _add_cluster_cyclic_constraint(self): | ||
| """Skip cluster cyclic constraint - handled by inter-cluster linking.""" | ||
| pass | ||
|
|
||
| def _add_investment_model(self): | ||
| """Create InvestmentModel with symmetric bounds for ΔE.""" | ||
| if isinstance(self.element.capacity_in_flow_hours, InvestParameters): | ||
| self.add_submodels( | ||
| InvestmentModel( | ||
| model=self._model, | ||
| label_of_element=self.label_of_element, | ||
| label_of_model=self.label_of_element, | ||
| parameters=self.element.capacity_in_flow_hours, | ||
| ), | ||
| short_name='investment', | ||
| ) | ||
| # Symmetric bounds: -size <= charge_state <= size | ||
| self.add_constraints( | ||
| self.charge_state >= -self.investment.size, | ||
| short_name='charge_state|lb', | ||
| ) | ||
| self.add_constraints( | ||
| self.charge_state <= self.investment.size, | ||
| short_name='charge_state|ub', | ||
| ) | ||
|
|
||
| def _add_initial_final_constraints(self): | ||
| """Skip initial/final constraints - handled by SOC_boundary in inter-cluster linking.""" | ||
| pass | ||
|
|
||
```python
def _add_intercluster_linking(self) -> None:
    """Add inter-cluster storage linking following the S-K model from Blanke et al. (2022).

    This method implements the core inter-cluster linking logic:

    1. Constrains charge_state (ΔE) at each cluster start to 0
    2. Creates SOC_boundary variables to track absolute SOC at period boundaries
    3. Links boundaries via Eq. 5: SOC_boundary[d+1] = SOC_boundary[d] * (1-loss)^N + delta_SOC
    4. Adds combined bounds per Eq. 9: 0 ≤ SOC_boundary * (1-loss)^t + ΔE ≤ capacity
    5. Enforces initial/cyclic constraint on SOC_boundary
    """
    from .clustering.intercluster_helpers import (
        build_boundary_coords,
        extract_capacity_bounds,
    )

    clustering = self._model.flow_system.clustering
    if clustering is None or clustering.result.cluster_structure is None:
        return

    cluster_structure = clustering.result.cluster_structure
    n_clusters = (
        int(cluster_structure.n_clusters)
        if isinstance(cluster_structure.n_clusters, (int, np.integer))
        else int(cluster_structure.n_clusters.values)
    )
    timesteps_per_cluster = cluster_structure.timesteps_per_cluster
    n_original_clusters = cluster_structure.n_original_clusters
    cluster_order = cluster_structure.cluster_order

    # 1. Constrain ΔE = 0 at cluster starts
    self._add_cluster_start_constraints(n_clusters, timesteps_per_cluster)

    # 2. Create SOC_boundary variable
    flow_system = self._model.flow_system
    boundary_coords, boundary_dims = build_boundary_coords(n_original_clusters, flow_system)
    capacity_bounds = extract_capacity_bounds(self.element.capacity_in_flow_hours, boundary_coords, boundary_dims)

    soc_boundary = self.add_variables(
        lower=capacity_bounds.lower,
        upper=capacity_bounds.upper,
        coords=boundary_coords,
        dims=boundary_dims,
        short_name='SOC_boundary',
    )

    # 3. Link SOC_boundary to investment size
    if capacity_bounds.has_investment and self.investment is not None:
        self.add_constraints(
            soc_boundary <= self.investment.size,
            short_name='SOC_boundary_ub',
        )

    # 4. Compute delta_SOC for each cluster
    delta_soc = self._compute_delta_soc(n_clusters, timesteps_per_cluster)

    # 5. Add linking constraints
    self._add_linking_constraints(
        soc_boundary, delta_soc, cluster_order, n_original_clusters, timesteps_per_cluster
    )

    # 6. Add cyclic or initial constraint
    if self.element.cluster_mode == 'intercluster_cyclic':
        self.add_constraints(
            soc_boundary.isel(cluster_boundary=0) == soc_boundary.isel(cluster_boundary=n_original_clusters),
            short_name='cyclic',
        )
    else:
        # Apply initial_charge_state to SOC_boundary[0]
        initial = self.element.initial_charge_state
        if initial is not None:
            if isinstance(initial, str):
                # 'equals_final' means cyclic
                self.add_constraints(
                    soc_boundary.isel(cluster_boundary=0)
                    == soc_boundary.isel(cluster_boundary=n_original_clusters),
                    short_name='initial_SOC_boundary',
                )
            else:
                self.add_constraints(
                    soc_boundary.isel(cluster_boundary=0) == initial,
                    short_name='initial_SOC_boundary',
                )

    # 7. Add combined bound constraints
    self._add_combined_bound_constraints(
        soc_boundary,
        cluster_order,
        capacity_bounds.has_investment,
        n_original_clusters,
        timesteps_per_cluster,
    )
```
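Spelled out, the two equations the docstring references, reconstructed here from the code and docstrings rather than quoted from Blanke et al. (2022). With $\sigma(d)$ as shorthand for `cluster_order[d]`, $H$ the hours per cluster period, $h_t$ the hours elapsed since period start, and $\ell$ the relative loss per hour:

$$\mathrm{SOC}^{\mathrm{bnd}}_{d+1} \;=\; \mathrm{SOC}^{\mathrm{bnd}}_{d}\,(1-\ell)^{H} \;+\; \Delta\mathrm{SOC}_{\sigma(d)} \qquad \text{(Eq. 5)}$$

$$0 \;\le\; \mathrm{SOC}^{\mathrm{bnd}}_{d}\,(1-\ell)^{h_t} \;+\; \Delta E_{\sigma(d),\,t} \;\le\; E^{\mathrm{cap}} \qquad \text{(Eq. 9)}$$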
```python
def _add_cluster_start_constraints(self, n_clusters: int, timesteps_per_cluster: int) -> None:
    """Constrain ΔE = 0 at the start of each representative cluster.

    This ensures that the relative charge state is measured from a known
    reference point (the cluster start).

    With 2D (cluster, time) structure, time=0 is the start of every cluster,
    so we simply select isel(time=0), which broadcasts across the cluster dimension.

    Args:
        n_clusters: Number of representative clusters (unused with 2D structure).
        timesteps_per_cluster: Timesteps in each cluster (unused with 2D structure).
    """
    # With 2D structure: time=0 is start of every cluster
    self.add_constraints(
        self.charge_state.isel(time=0) == 0,
        short_name='cluster_start',
    )

def _compute_delta_soc(self, n_clusters: int, timesteps_per_cluster: int) -> xr.DataArray:
    """Compute net SOC change (delta_SOC) for each representative cluster.

    The delta_SOC is the difference between the charge_state at the end
    and start of each cluster: delta_SOC[c] = ΔE(end_c) - ΔE(start_c).
    Since ΔE(start) = 0 by constraint, this simplifies to delta_SOC[c] = ΔE(end_c).

    With 2D (cluster, time) structure, we can simply select isel(time=-1) and
    isel(time=0), which already carry the 'cluster' dimension.

    Args:
        n_clusters: Number of representative clusters (unused with 2D structure).
        timesteps_per_cluster: Timesteps in each cluster (unused with 2D structure).

    Returns:
        DataArray with 'cluster' dimension containing delta_SOC for each cluster.
    """
    # With 2D structure: result already has cluster dimension
    return self.charge_state.isel(time=-1) - self.charge_state.isel(time=0)

def _add_linking_constraints(
    self,
    soc_boundary: xr.DataArray,
    delta_soc: xr.DataArray,
    cluster_order: xr.DataArray,
    n_original_clusters: int,
    timesteps_per_cluster: int,
) -> None:
    """Add constraints linking consecutive SOC_boundary values.

    Per Blanke et al. (2022) Eq. 5, implements:

        SOC_boundary[d+1] = SOC_boundary[d] * (1-loss)^N + delta_SOC[cluster_order[d]]

    where N is timesteps_per_cluster and loss is the self-discharge rate per
    timestep (the implementation works with relative_loss_per_hour and converts
    N timesteps to hours via the timestep duration).

    This connects the SOC at the end of original period d to the SOC at the
    start of period d+1, accounting for self-discharge decay over the period.

    Args:
        soc_boundary: SOC_boundary variable.
        delta_soc: Net SOC change per cluster.
        cluster_order: Mapping from original periods to representative clusters.
        n_original_clusters: Number of original (non-clustered) periods.
        timesteps_per_cluster: Number of timesteps in each cluster period.
    """
    soc_after = soc_boundary.isel(cluster_boundary=slice(1, None))
    soc_before = soc_boundary.isel(cluster_boundary=slice(None, -1))

    # Rename for alignment
    soc_after = soc_after.rename({'cluster_boundary': 'original_cluster'})
    soc_after = soc_after.assign_coords(original_cluster=np.arange(n_original_clusters))
    soc_before = soc_before.rename({'cluster_boundary': 'original_cluster'})
    soc_before = soc_before.assign_coords(original_cluster=np.arange(n_original_clusters))

    # Get delta_soc for each original period using cluster_order
    delta_soc_ordered = delta_soc.isel(cluster=cluster_order)

    # Apply self-discharge decay factor (1-loss)^hours to soc_before per Eq. 5
    # relative_loss_per_hour is per-hour, so we need hours = timesteps * duration
    # Use mean over time (linking operates at period level, not timestep)
    # Keep as DataArray to respect per-period/scenario values
    rel_loss = self.element.relative_loss_per_hour.mean('time')
    hours_per_cluster = timesteps_per_cluster * self._model.timestep_duration.mean('time')
    decay_n = (1 - rel_loss) ** hours_per_cluster

    lhs = soc_after - soc_before * decay_n - delta_soc_ordered
    self.add_constraints(lhs == 0, short_name='link')
```
Intercluster linking: decay factors currently ignore cluster assignments, causing a dimension mismatch.

In `InterclusterStorageModel._add_linking_constraints` and `_add_combined_bound_constraints`, the self-discharge decay is computed per cluster:

```python
rel_loss = self.element.relative_loss_per_hour.mean('time')
hours_per_cluster = timesteps_per_cluster * self._model.timestep_duration.mean('time')
decay_n = (1 - rel_loss) ** hours_per_cluster  # has 'cluster' dim
...
lhs = soc_after - soc_before * decay_n - delta_soc_ordered
```

and similarly in `_add_combined_bound_constraints` via `decay_t`.

Because `soc_before` / `soc_d` are indexed over `original_cluster`, but `decay_n` / `decay_t` over `cluster`, xarray broadcasts them, yielding an extra `cluster` dimension. The constraints then effectively enforce the linking equations for all clusters at every original period, not just the one assigned by `cluster_order`. That can over-constrain the problem and breaks the intended S-K linking semantics, especially when `relative_loss_per_hour > 0`.

To align with Eq. 5 / Eq. 9, the decay should be selected according to the cluster assigned to each original period, exactly as you already do for `delta_soc_ordered` and `cs_t`. For example (sketch):

```python
# After computing decay_n with a 'cluster' dim:
decay_n_ordered = decay_n.isel(cluster=cluster_order)
lhs = soc_after - soc_before * decay_n_ordered - delta_soc_ordered
```

and analogously in `_add_combined_bound_constraints`, applying `cluster_order` to any per-cluster decay term before combining with `soc_d`.

I recommend refactoring these two methods so that all decay factors are expressed in `original_cluster` (and period/scenario) space before being multiplied with `soc_*`, avoiding an extra `cluster` dimension in the constraints.
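To make the broadcasting issue concrete, a toy reproduction with plain xarray (hypothetical sizes and values, outside flixopt):

```python
import numpy as np
import xarray as xr

# 4 original periods mapped onto 2 representative clusters
cluster_order = xr.DataArray([0, 1, 1, 0], dims='original_cluster')

decay_n = xr.DataArray([0.9, 0.8], dims='cluster')              # one decay per cluster
soc_before = xr.DataArray(np.ones(4), dims='original_cluster')  # one SOC per period

bad = soc_before * decay_n
print(bad.dims)   # ('original_cluster', 'cluster') -- extra dim, one constraint per pair

good = soc_before * decay_n.isel(cluster=cluster_order)         # decay per assigned cluster
print(good.dims)  # ('original_cluster',) -- aligned with soc_before
```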
🤖 Prompt for AI Agents
In @flixopt/components.py around lines 1243 - 1485, the decay factors
(decay_n/decay_t) are left with a 'cluster' dimension and are being broadcast
against soc_before/soc_d (which use 'original_cluster'), causing extra
dimensions and over‑constraining; in _add_linking_constraints and
_add_combined_bound_constraints you must index the decay arrays by the cluster
assignment before using them (same pattern as delta_soc_ordered): compute
decay_n/decay_t with their 'cluster' dim then convert them to original-period
space via .isel(cluster=cluster_order) (or otherwise align coords) so that the
decay factors have the 'original_cluster' coord and match soc_before/soc_after
(and cs_t) prior to building lhs and combined bound expressions.
```diff
 @property
 def weights(self) -> dict[str, xr.DataArray]:
-    """Scenario weights of model. With optional normalization."""
+    """Weights for active dimensions (unit weights if not set).
+
+    Scenario weights are always normalized (handled by FlowSystem).
+    """
     return self.flow_system.weights
```
Scenario weight docstring is inconsistent with actual behavior

`FlowSystemModel.weights` claims “Scenario weights are always normalized”, but `FlowSystem.weights['scenario']` returns unit weights (all 1.0) when no explicit `scenario_weights` are set. That means multiple scenarios are summed, not averaged, by default.

Either adjust the docstring here (and in the `FlowSystem.__init__` docs) to describe the “unit weights unless explicitly set and normalized” behavior, or normalize the default scenario weights to keep the old documented semantics.

Possible doc tweak

```diff
-    """Weights for active dimensions (unit weights if not set).
-
-    Scenario weights are always normalized (handled by FlowSystem).
-    """
+    """Weights for active dimensions (unit weights if not set).
+
+    Scenario weights are normalized when explicitly set via
+    ``FlowSystem.scenario_weights``; otherwise unit weights
+    (all 1.0) are used.
+    """
```

🤖 Prompt for AI Agents
In @flixopt/structure.py around lines 238 - 245, the docstring on
FlowSystemModel.weights is inaccurate: FlowSystem.weights['scenario'] currently
returns unit (all-1.0) weights when no scenario_weights are provided, so
scenarios are summed by default; update the documentation (both the
FlowSystemModel.weights docstring and the FlowSystem.__init__ docs) to state
that scenario weights are unit weights by default and only normalized when
explicit scenario_weights are supplied (referencing the scenario_weights
parameter and FlowSystem.weights['scenario']), or alternatively change the
implementation to create normalized default scenario weights instead — pick one
approach and make the docstring and code consistent (adjust
FlowSystemModel.weights docstring and FlowSystem.__init__ accordingly).
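The practical difference between the two semantics, as a standalone sketch (hypothetical numbers, plain xarray rather than flixopt):

```python
import xarray as xr

scenario_costs = xr.DataArray([100.0, 300.0], dims='scenario')

unit = xr.DataArray([1.0, 1.0], dims='scenario')  # current default when unset
norm = unit / unit.sum()                          # what "always normalized" would imply

print(float((scenario_costs * unit).sum()))  # 400.0: scenarios are summed
print(float((scenario_costs * norm).sum()))  # 200.0: scenarios are averaged
```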
| """Expanded charge_state should be non-negative (combined with SOC_boundary).""" | ||
| fs = create_system_with_storage(timesteps_8_days, cluster_mode='intercluster_cyclic') | ||
| fs_clustered = fs.transform.cluster(n_clusters=2, cluster_duration='1D') | ||
| fs_clustered.optimize(solver_fixture) | ||
|
|
||
| # Note: Before expansion, charge_state represents ΔE (relative to period start) | ||
| # which can be negative. After expansion, it becomes absolute SOC. | ||
|
|
||
| # After expansion: charge_state should be non-negative (absolute SOC) | ||
| fs_expanded = fs_clustered.transform.expand_solution() | ||
| cs_after = fs_expanded.solution['Battery|charge_state'] | ||
|
|
||
| # All values should be >= 0 (with small tolerance for numerical issues) | ||
| assert (cs_after >= -0.01).all(), f'Negative charge_state found: min={float(cs_after.min())}' | ||
|
|
||
| def test_storage_self_discharge_decay_in_expansion(self, solver_fixture, timesteps_8_days): | ||
| """Verify self-discharge decay factor applied correctly during expansion.""" | ||
| # Use significant self-discharge to make decay visible | ||
| fs = create_system_with_storage( | ||
| timesteps_8_days, | ||
| cluster_mode='intercluster_cyclic', | ||
| relative_loss_per_hour=0.01, # 1% per hour | ||
| ) | ||
| fs_clustered = fs.transform.cluster(n_clusters=2, cluster_duration='1D') | ||
| fs_clustered.optimize(solver_fixture) | ||
|
|
||
| # Expand solution | ||
| fs_expanded = fs_clustered.transform.expand_solution() | ||
| cs_expanded = fs_expanded.solution['Battery|charge_state'] | ||
|
|
||
| # With self-discharge, SOC should decay over time within each period | ||
| # The expanded solution should still be non-negative | ||
| assert (cs_expanded >= -0.01).all() | ||
|
|
Fix DataArray truth-value usage in assertions

In the storage expansion tests you use:

```python
assert (cs_after >= -0.01).all(), ...
...
assert (cs_expanded >= -0.01).all()
```

On an xarray DataArray, `.all()` returns a 0-D DataArray rather than a plain Python bool. The assert does evaluate (only arrays with more than one element raise the “truth value of a DataArray is ambiguous” error), but it relies implicitly on the reduction producing a zero-dimensional result and obscures the intent.

Convert to a scalar boolean before asserting, e.g. via `.all().item()` or by checking the min value:

Proposed fix

```diff
-    # All values should be >= 0 (with small tolerance for numerical issues)
-    assert (cs_after >= -0.01).all(), f'Negative charge_state found: min={float(cs_after.min())}'
+    # All values should be >= 0 (with small tolerance for numerical issues)
+    assert (cs_after >= -0.01).all().item(), (
+        f'Negative charge_state found: min={float(cs_after.min())}'
+    )
@@
-    # With self-discharge, SOC should decay over time within each period
-    # The expanded solution should still be non-negative
-    assert (cs_expanded >= -0.01).all()
+    # With self-discharge, SOC should decay over time within each period
+    # The expanded solution should still be non-negative
+    assert (cs_expanded >= -0.01).all().item()
```

🤖 Prompt for AI Agents
In @tests/test_cluster_reduce_expand.py around lines 490 - 523, the assertions
use xarray DataArray `.all()`, which yields a 0-D DataArray rather than a plain
bool; update the two asserts in the test_storage expansion tests (references:
cs_after, cs_expanded, Battery|charge_state,
test_storage_self_discharge_decay_in_expansion) to convert to a Python bool
before asserting — e.g. replace `(cs_after >= -0.01).all()` and `(cs_expanded >=
-0.01).all()` with a scalar check such as `(cs_after >= -0.01).all().item()` or
use a min-based scalar check like `float(cs_after.min()) >= -0.01` (and
similarly for cs_expanded) so the assert receives a boolean.
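For reference, a quick standalone check of what `.all()` actually returns (plain xarray, no flixopt involved):

```python
import xarray as xr

da = xr.DataArray([0.5, 1.5])
ok = (da >= -0.01).all()

print(type(ok).__name__)  # 'DataArray': a 0-D array, not a Python bool
print(bool(ok))           # True: works only because the result is 0-D
print(ok.item())          # True: explicit Python scalar, clearer in an assert
```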
Summary
Major release preparation for v6.0.0 with complete clustering/aggregation rework.
New Features
Improvements
Breaking Changes
See CHANGELOG.md for the full list of breaking changes in v6.0.0.
Test Plan
Summary by CodeRabbit
New Features
Documentation
Chores