diff --git a/.github/workflows/deploy-docs.yaml b/.github/workflows/deploy-docs.yaml
new file mode 100644
index 000000000..e69de29bb
diff --git a/docs/release-notes/v2.2.0.md b/docs/release-notes/v2.2.0.md
new file mode 100644
index 000000000..3cf7eef8d
--- /dev/null
+++ b/docs/release-notes/v2.2.0.md
@@ -0,0 +1,55 @@
+# Release v2.2.0
+
+**Release Date:** YYYY-MM-DD
+
+## What's New
+
+### Scenarios
+Scenarios are a new feature of flixopt. They can be used to model uncertainties in the flow system, such as:
+* Different demand profiles
+* Different price forecasts
+* Different weather conditions
+* Different climate conditions
+
+They might also be used to model an evolving system with multiple investment periods. Each **scenario** might be a new year, a new month, or a new day, with a different set of investment decisions to take.
+
+The weighted sum of the total objective effect of each scenario is used as the objective of the optimization.
+
+#### Investments and scenarios
+Scenarios allow for more flexibility in investment decisions.
+You can decide to allow different investment decisions for each scenario, or to allow a single investment decision for a subset of all scenarios, while not allowing an investment in others.
+This enables the following use cases:
+* Find the best investment decision for each scenario individually
+* Find the best overall investment decision across all possible scenarios (robust decision-making)
+* Find the best overall investment decision for a subset of all scenarios
+
+The last one might be useful if you want to model a system with multiple investment periods, where one investment decision is made for more than one scenario.
+This might occur when scenarios represent years or months, while an investment decision influences the system for multiple years or months.
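+
+A minimal sketch of the new API, distilled from the scenarios example added in this release (`examples/04_Scenarios/scenario_example.py`); the weights and demand values are illustrative:
+
+```python
+import numpy as np
+import pandas as pd
+
+import flixopt as fx
+
+timesteps = pd.date_range('2020-01-01', periods=9, freq='h')
+scenarios = pd.Index(['Base Case', 'High Demand'])
+
+# One weight per scenario: the optimization objective is the weighted sum
+# of the total objective effect of each scenario.
+flow_system = fx.FlowSystem(
+    timesteps=timesteps,
+    scenarios=scenarios,
+    scenario_weights=np.array([0.5, 0.5]),
+)
+
+# Scenario-dependent time series are passed as a DataFrame with the
+# timesteps as index and one column per scenario.
+heat_demand = pd.DataFrame(
+    {
+        'Base Case': [30, 0, 90, 110, 110, 20, 20, 20, 20],
+        'High Demand': [30, 0, 100, 118, 125, 20, 20, 20, 20],
+    },
+    index=timesteps,
+)
+```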
+
+## Other new features
+* Balanced storage - Storage charging and discharging sizes can now be forced to be equal when optimizing their size (see the sketch after this list).
+* Feature 2 - Description
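+
+A hedged sketch of the new `balanced` option. Parameter values are illustrative; balancing requires both flow sizes to be optimized via `InvestParameters` (it is not available for fixed sizes):
+
+```python
+import flixopt as fx
+
+# `balanced=True` adds a constraint equating the optimized sizes of the
+# charging and discharging flows.
+storage = fx.Storage(
+    label='Storage',
+    charging=fx.Flow('Q_th_load', bus='Fernwärme', size=fx.InvestParameters(maximum_size=1000)),
+    discharging=fx.Flow('Q_th_unload', bus='Fernwärme', size=fx.InvestParameters(maximum_size=1000)),
+    capacity_in_flow_hours=fx.InvestParameters(maximum_size=5000),
+    balanced=True,
+)
+```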
+
+## Improvements
+
+* Improvement 1 - Description
+* Improvement 2 - Description
+
+## Bug Fixes
+
+* Fixed issue with X
+* Resolved problem with Y
+
+## Breaking Changes
+
+* Change 1 - Migration instructions
+* Change 2 - Migration instructions
+
+## Deprecations
+
+* Feature X will be removed in v{next_version}
+
+## Dependencies
+
+* Added dependency X v1.2.3
+* Updated dependency Y to v2.0.0
\ No newline at end of file
diff --git a/docs/user-guide/Mathematical Notation/Investment.md b/docs/user-guide/Mathematical Notation/Investment.md
new file mode 100644
index 000000000..64467261d
--- /dev/null
+++ b/docs/user-guide/Mathematical Notation/Investment.md
@@ -0,0 +1,115 @@
+# Investments
+
+## Current state
+$$
+\beta_{\text{invest}} \cdot \max(\epsilon, V^{\text{L}}) \leq V \leq \beta_{\text{invest}} \cdot V^{\text{U}}
+$$
+With:
+- $V$ = size
+- $V^{\text{L}}$ = minimum size
+- $V^{\text{U}}$ = maximum size
+- $\epsilon$ = epsilon, a small number (such as $10^{-5}$)
+- $\beta_{\text{invest}} \in \{0,1\}$ = whether the size is invested in or not
+
+_Please edit the use cases as needed_
+
+## Quickfix 1: Optimize the single best size overall
+### Single variable
+This is already possible and should stay possible, as this is a needed use case.
+An additional factor indicating when the size is actually available might be practical (it marks the (fixed) time of investment).
+### Math
+$$
+V(p) = V \cdot a(p)
+$$
+with:
+- $V$ = size
+- $a(p)$ = factor for availability per period
+
+Factor $a(p)$ is simply multiplied with the relative minimum or maximum per timestep. This is already possible by doing it yourself.
+Effectively, the relative minimum or maximum are altered before using the same constraints as before.
+This might lead to some issues regarding the minimum_load factor, or others, as the size is not 0 in a scenario where the component can't produce.
+**Therefore this might not be the best choice. See [Variable per Scenario](#variable-per-scenario).**
+
+## Variable per Scenario
+- **size** and **invest** as variables per scenario: $V(s)$ and $\beta_{\text{invest}}(s)$
+- with scenario $s \in S$
+
+### Usecase 1: Optimize the size for each Scenario independently
+Restrictions apply separately to each scenario.
+No changes needed. This could be the default behaviour.
+
+### Usecase 2: Optimize ONE size for ALL scenarios
+The size is the same across all scenarios, but modeled as a variable per scenario $V(s)$ rather than a scalar.
+#### 2a: The same size in all scenarios
+$$
+V(s) = V(s') \quad \forall s,s' \in S
+$$
+
+With:
+- $V(s)$ and $V(s')$ = size
+- $S$ = set of scenarios
+
+#### 2b: The same size, but can be 0 prior to the first increment
+- Find the optimal time of investment.
+- Force an investment in a certain scenario (parameter `optional` as a list/array of booleans)
+- Combine `optional` and minimum/maximum size to force an investment inside a range of scenarios
+
+$$
+\beta_{\text{invest}}(s) \leq \beta_{\text{invest}}(s+1) \quad \forall s \in \{1,2,\ldots,S-1\}
+$$
+
+With $M$ a sufficiently large constant (at least as large as the maximum size $V^{\text{U}}$):
+
+$$
+V(s') - V(s) \leq M \cdot (2 - \beta_{\text{invest}}(s) - \beta_{\text{invest}}(s')) \quad \forall s, s' \in S
+$$
+$$
+V(s') - V(s) \geq -M \cdot (2 - \beta_{\text{invest}}(s) - \beta_{\text{invest}}(s')) \quad \forall s, s' \in S
+$$
+
+This could be the default behaviour (which would be consistent with other variables).
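+
+A standalone sketch of use case 2b in linopy (the library flixopt builds its model with); the variable names and the big-M value are hypothetical, and flixopt would generate such constraints internally:
+
+```python
+import linopy
+import pandas as pd
+
+scenarios = pd.Index(['s1', 's2', 's3'], name='scenario')
+M = 1e4  # big-M, at least as large as the maximum size V_U
+
+m = linopy.Model()
+size = {s: m.add_variables(lower=0, upper=M, name=f'size[{s}]') for s in scenarios}
+invest = {s: m.add_variables(binary=True, name=f'invest[{s}]') for s in scenarios}
+
+# Once invested, stay invested: beta_invest(s) <= beta_invest(s+1)
+for s, s_next in zip(scenarios[:-1], scenarios[1:]):
+    m.add_constraints(invest[s] <= invest[s_next], name=f'monotone[{s}]')
+
+# Equal sizes in all scenarios that have invested. Looping over ordered
+# pairs covers both the <= and the >= direction of the big-M pair.
+for s in scenarios:
+    for s2 in scenarios:
+        if s != s2:
+            m.add_constraints(
+                size[s2] - size[s] + M * invest[s] + M * invest[s2] <= 2 * M,
+                name=f'equal_size[{s},{s2}]',
+            )
+```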
+
+### Switch
+
+An alternative formulation of use case 2b with an explicit switch variable:
+
+$$
+\begin{aligned}
+& \text{SWITCH}_s \in \{0,1\} \quad \forall s \in \{1,2,\ldots,S\} \\
+& \sum_{s=1}^{S} \text{SWITCH}_s = 1 \\
+& \beta_{\text{invest}}(s) = \sum_{s'=1}^{s} \text{SWITCH}_{s'} \quad \forall s \in \{1,2,\ldots,S\} \\
+\end{aligned}
+$$
+
+$$
+\begin{aligned}
+& V(s) \leq V_{\text{actual}} \quad \forall s \in \{1,2,\ldots,S\} \\
+& V(s) \geq V_{\text{actual}} - M \cdot (1 - \beta_{\text{invest}}(s)) \quad \forall s \in \{1,2,\ldots,S\}
+\end{aligned}
+$$
+
+### Usecase 3: Find the best scenario to increment the size (Timing of the investment)
+The size can only increment once (based on a starting point). This allows optimizing the timing of an investment.
+#### Math
+Treat $\beta_{\text{invest}}$ like an ON/OFF variable, and introduce a SwitchOn variable that can only be active once.
+
+*Thoughts:*
+- Treating $\beta_{\text{invest}}$ like an ON/OFF variable suggests using the already present constraints linked to the OnOffModel
+- The timing could be constrained to be at the earliest in scenario x, or at the latest in scenario y
+- Restrict the number of consecutive scenarios
+
+This might need the OnOffModel to be more generic (HOURS). Further, the span between scenarios needs to be weighted (like dt_in_hours), or the scenarios need to be measurable (integers).
+
+### Others
+
+#### Usecase 4: Only increase/decrease the size
+Start from a certain size. For each scenario, the size can increase, but never decrease (or the other way around).
+This would mean that a size expansion is possible, while a reduction is not (or vice versa).
+
+#### Usecase 5: Restrict the increment in size per scenario
+Restrict how much the size can increase/decrease in a scenario, based on the prior scenario.
+
+Many more use cases are possible.
diff --git a/examples/03_Calculation_types/example_calculation_types.py b/examples/03_Calculation_types/example_calculation_types.py
index 97b18e3c0..e9f5604f9 100644
--- a/examples/03_Calculation_types/example_calculation_types.py
+++ b/examples/03_Calculation_types/example_calculation_types.py
@@ -194,34 +194,34 @@ def get_solutions(calcs: List, variable: str) -> xr.Dataset:
 # --- Plotting for comparison ---
 fx.plotting.with_plotly(
     get_solutions(calculations, 'Speicher|charge_state').to_dataframe(),
-    mode='line',
+    style='line',
     title='Charge State Comparison',
     ylabel='Charge state',
 ).write_html('results/Charge State.html')
 
 fx.plotting.with_plotly(
     get_solutions(calculations, 'BHKW2(Q_th)|flow_rate').to_dataframe(),
-    mode='line',
+    style='line',
     title='BHKW2(Q_th) Flow Rate Comparison',
     ylabel='Flow rate',
 ).write_html('results/BHKW2 Thermal Power.html')
 
 fx.plotting.with_plotly(
     get_solutions(calculations, 'costs(operation)|total_per_timestep').to_dataframe(),
-    mode='line',
+    style='line',
     title='Operation Cost Comparison',
     ylabel='Costs [€]',
 ).write_html('results/Operation Costs.html')
 
 fx.plotting.with_plotly(
     pd.DataFrame(get_solutions(calculations, 'costs(operation)|total_per_timestep').to_dataframe().sum()).T,
-    mode='bar',
+    style='stacked_bar',
     title='Total Cost Comparison',
     ylabel='Costs [€]',
 ).update_layout(barmode='group').write_html('results/Total Costs.html')
 
 fx.plotting.with_plotly(
-    pd.DataFrame([calc.durations for calc in calculations], index=[calc.name for calc in calculations]), 'bar'
+    pd.DataFrame([calc.durations for calc in calculations], index=[calc.name for calc in calculations]), 'stacked_bar'
 ).update_layout(title='Duration Comparison', xaxis_title='Calculation type', yaxis_title='Time (s)').write_html(
     'results/Speed Comparison.html'
 )
diff --git a/examples/04_Scenarios/scenario_example.py b/examples/04_Scenarios/scenario_example.py
new file mode 100644
index 000000000..b9932a016
--- /dev/null
+++ b/examples/04_Scenarios/scenario_example.py
@@ -0,0 +1,125 @@
+"""
+This script shows how to use scenarios in the flixopt framework to model a simple energy system.
+"""
+
+import numpy as np
+import pandas as pd
+from rich.pretty import pprint  # Used for pretty printing
+
+import flixopt as fx
+
+if __name__ == '__main__':
+    # Create datetime array starting from '2020-01-01' for the given time period
+    timesteps = pd.date_range('2020-01-01', periods=9, freq='h')
+    scenarios = pd.Index(['Base Case', 'High Demand'])
+
+    # --- Create Time Series Data ---
+    # Heat demand profile (e.g., kW) over time and corresponding power prices
+    heat_demand_per_h = pd.DataFrame(
+        {
+            'Base Case': [30, 0, 90, 110, 110, 20, 20, 20, 20],
+            'High Demand': [30, 0, 100, 118, 125, 20, 20, 20, 20],
+        },
+        index=timesteps,
+    )
+    power_prices = np.array([0.08, 0.09])
+
+    flow_system = fx.FlowSystem(timesteps=timesteps, scenarios=scenarios, scenario_weights=np.array([0.5, 0.6]))
+
+    # --- Define Energy Buses ---
+    # These represent nodes where the used media are balanced (electricity, heat, and gas)
+    flow_system.add_elements(fx.Bus(label='Strom'), fx.Bus(label='Fernwärme'), fx.Bus(label='Gas'))
+
+    # --- Define Effects (Objective and CO2 Emissions) ---
+    # Cost effect: used as the optimization objective --> minimizing costs
+    costs = fx.Effect(
+        label='costs',
+        unit='€',
+        description='Kosten',
+        is_standard=True,  # standard effect: no explicit value needed for costs
+        is_objective=True,  # Minimizing costs as the optimization objective
+    )
+
+    # CO2 emissions effect with an associated cost impact
+    CO2 = fx.Effect(
+        label='CO2',
+        unit='kg',
+        description='CO2_e-Emissionen',
+        specific_share_to_other_effects_operation={costs.label: 0.2},
+        maximum_operation_per_hour=1000,  # Max CO2 emissions per hour
+    )
+
+    # --- Define Flow System Components ---
+    # Boiler: Converts fuel (gas) into thermal energy (heat)
+    boiler = fx.linear_converters.Boiler(
+        label='Boiler',
+        eta=0.5,
+        Q_th=fx.Flow(label='Q_th', bus='Fernwärme', size=50, relative_minimum=0.1, relative_maximum=1, on_off_parameters=fx.OnOffParameters()),
+        Q_fu=fx.Flow(label='Q_fu', bus='Gas'),
+    )
+
+    # Combined Heat and Power (CHP): Generates both electricity and heat from fuel
+    chp = fx.linear_converters.CHP(
+        label='CHP',
+        eta_th=0.5,
+        eta_el=0.4,
+        P_el=fx.Flow('P_el', bus='Strom', size=60, relative_minimum=5 / 60, on_off_parameters=fx.OnOffParameters()),
+        Q_th=fx.Flow('Q_th', bus='Fernwärme'),
+        Q_fu=fx.Flow('Q_fu', bus='Gas'),
+    )
+
+    # Storage: Energy storage system with charging and discharging capabilities
+    storage = fx.Storage(
+        label='Storage',
+        charging=fx.Flow('Q_th_load', bus='Fernwärme', size=1000),
+        discharging=fx.Flow('Q_th_unload', bus='Fernwärme', size=1000),
+        capacity_in_flow_hours=fx.InvestParameters(fix_effects=20, fixed_size=30, optional=False),
+        initial_charge_state=0,  # Initial storage state: empty
+        relative_maximum_charge_state=np.array([80, 70, 80, 80, 80, 80, 80, 80, 80, 80]) * 0.01,
+        eta_charge=0.9,
+        eta_discharge=1,  # Efficiency factors for charging/discharging
+        relative_loss_per_hour=0.08,  # 8% loss per hour.
Absolute loss depends on current charge state + prevent_simultaneous_charge_and_discharge=True, # Prevent charging and discharging at the same time + ) + + # Heat Demand Sink: Represents a fixed heat demand profile + heat_sink = fx.Sink( + label='Heat Demand', + sink=fx.Flow(label='Q_th_Last', bus='Fernwärme', size=1, fixed_relative_profile=heat_demand_per_h), + ) + + # Gas Source: Gas tariff source with associated costs and CO2 emissions + gas_source = fx.Source( + label='Gastarif', + source=fx.Flow(label='Q_Gas', bus='Gas', size=1000, effects_per_flow_hour={costs.label: 0.04, CO2.label: 0.3}), + ) + + # Power Sink: Represents the export of electricity to the grid + power_sink = fx.Sink( + label='Einspeisung', sink=fx.Flow(label='P_el', bus='Strom', effects_per_flow_hour=-1 * power_prices) + ) + + # --- Build the Flow System --- + # Add all defined components and effects to the flow system + flow_system.add_elements(costs, CO2, boiler, storage, chp, heat_sink, gas_source, power_sink) + + # Visualize the flow system for validation purposes + flow_system.plot_network(show=True) + + # --- Define and Run Calculation --- + # Create a calculation object to model the Flow System + calculation = fx.FullCalculation(name='Sim1', flow_system=flow_system) + calculation.do_modeling() # Translate the model to a solvable form, creating equations and Variables + + # --- Solve the Calculation and Save Results --- + calculation.solve(fx.solvers.HighsSolver(mip_gap=0, time_limit_seconds=30)) + + # --- Analyze Results --- + calculation.results['Fernwärme'].plot_node_balance_pie() + calculation.results['Fernwärme'].plot_node_balance(style='stacked_bar') + calculation.results['Storage'].plot_node_balance() + calculation.results.plot_heatmap('CHP(Q_th)|flow_rate') + + # Convert the results for the storage component to a dataframe and display + df = calculation.results['Storage'].node_balance_with_charge_state() + print(df) + calculation.results['Storage'].plot_charge_state(engine='matplotlib') + + # Save results to file for later usage + calculation.results.to_file() + fig, ax = calculation.results['Storage'].plot_charge_state(engine='matplotlib') diff --git a/flixopt/calculation.py b/flixopt/calculation.py index c7367cad2..942b63a81 100644 --- a/flixopt/calculation.py +++ b/flixopt/calculation.py @@ -12,6 +12,7 @@ import math import pathlib import timeit +import warnings from typing import Any, Dict, List, Optional, Union import numpy as np @@ -43,20 +44,32 @@ def __init__( self, name: str, flow_system: FlowSystem, - active_timesteps: Optional[pd.DatetimeIndex] = None, + selected_timesteps: Optional[pd.DatetimeIndex] = None, + selected_scenarios: Optional[pd.Index] = None, folder: Optional[pathlib.Path] = None, + active_timesteps: Optional[pd.DatetimeIndex] = None, ): """ Args: name: name of calculation flow_system: flow_system which should be calculated - active_timesteps: list with indices, which should be used for calculation. If None, then all timesteps are used. + selected_timesteps: timesteps which should be used for calculation. If None, then all timesteps are used. + selected_scenarios: scenarios which should be used for calculation. If None, then all scenarios are used. folder: folder where results should be saved. If None, then the current working directory is used. + active_timesteps: Deprecated. Use selected_timesteps instead. """ + if active_timesteps is not None: + warnings.warn( + 'active_timesteps is deprecated. 
Use selected_timesteps instead.', + DeprecationWarning, + stacklevel=2, + ) + selected_timesteps = active_timesteps self.name = name self.flow_system = flow_system self.model: Optional[SystemModel] = None - self.active_timesteps = active_timesteps + self.selected_timesteps = selected_timesteps + self.selected_scenarios = selected_scenarios self.durations = {'modeling': 0.0, 'solving': 0.0, 'saving': 0.0} self.folder = pathlib.Path.cwd() / 'results' if folder is None else pathlib.Path(folder) @@ -74,47 +87,49 @@ def __init__( def main_results(self) -> Dict[str, Union[Scalar, Dict]]: from flixopt.features import InvestmentModel - return { + main_results = { 'Objective': self.model.objective.value, - 'Penalty': float(self.model.effects.penalty.total.solution.values), + 'Penalty': self.model.effects.penalty.total.solution.values, 'Effects': { f'{effect.label} [{effect.unit}]': { - 'operation': float(effect.model.operation.total.solution.values), - 'invest': float(effect.model.invest.total.solution.values), - 'total': float(effect.model.total.solution.values), + 'operation': effect.model.operation.total.solution.values, + 'invest': effect.model.invest.total.solution.values, + 'total': effect.model.total.solution.values, } for effect in self.flow_system.effects }, 'Invest-Decisions': { 'Invested': { - model.label_of_element: float(model.size.solution) + model.label_of_element: model.size.solution for component in self.flow_system.components.values() for model in component.model.all_sub_models - if isinstance(model, InvestmentModel) and float(model.size.solution) >= CONFIG.modeling.EPSILON + if isinstance(model, InvestmentModel) and model.size.solution.max() >= CONFIG.modeling.EPSILON }, 'Not invested': { - model.label_of_element: float(model.size.solution) + model.label_of_element: model.size.solution for component in self.flow_system.components.values() for model in component.model.all_sub_models - if isinstance(model, InvestmentModel) and float(model.size.solution) < CONFIG.modeling.EPSILON + if isinstance(model, InvestmentModel) and model.size.solution.max() < CONFIG.modeling.EPSILON }, }, 'Buses with excess': [ { bus.label_full: { - 'input': float(np.sum(bus.model.excess_input.solution.values)), - 'output': float(np.sum(bus.model.excess_output.solution.values)), + 'input': bus.model.excess_input.solution.sum('time'), + 'output': bus.model.excess_output.solution.sum('time'), } } for bus in self.flow_system.buses.values() if bus.with_excess and ( - float(np.sum(bus.model.excess_input.solution.values)) > 1e-3 - or float(np.sum(bus.model.excess_output.solution.values)) > 1e-3 + bus.model.excess_input.solution.sum() > 1e-3 + or bus.model.excess_output.solution.sum() > 1e-3 ) ], } + return utils.round_floats(main_results) + @property def summary(self): return { @@ -128,6 +143,15 @@ def summary(self): 'Config': CONFIG.to_dict(), } + @property + def active_timesteps(self) -> pd.DatetimeIndex: + warnings.warn( + 'active_timesteps is deprecated. 
Use selected_timesteps instead.', + DeprecationWarning, + stacklevel=2, + ) + return self.selected_timesteps + class FullCalculation(Calculation): """ @@ -183,8 +207,8 @@ def solve(self, solver: _Solver, log_file: Optional[pathlib.Path] = None, log_ma def _activate_time_series(self): self.flow_system.transform_data() - self.flow_system.time_series_collection.activate_timesteps( - active_timesteps=self.active_timesteps, + self.flow_system.time_series_collection.set_selection( + timesteps=self.selected_timesteps, scenarios=self.selected_scenarios ) @@ -199,7 +223,7 @@ def __init__( flow_system: FlowSystem, aggregation_parameters: AggregationParameters, components_to_clusterize: Optional[List[Component]] = None, - active_timesteps: Optional[pd.DatetimeIndex] = None, + selected_timesteps: Optional[pd.DatetimeIndex] = None, folder: Optional[pathlib.Path] = None, ): """ @@ -213,11 +237,13 @@ def __init__( components_to_clusterize: List of Components to perform aggregation on. If None, then all components are aggregated. This means, teh variables in the components are equalized to each other, according to the typical periods computed in the DataAggregation - active_timesteps: pd.DatetimeIndex or None + selected_timesteps: pd.DatetimeIndex or None list with indices, which should be used for calculation. If None, then all timesteps are used. folder: folder where results should be saved. If None, then the current working directory is used. """ - super().__init__(name, flow_system, active_timesteps, folder=folder) + if flow_system.time_series_collection.scenarios is not None: + raise ValueError('Aggregation is not supported for scenarios yet. Please use FullCalculation instead.') + super().__init__(name, flow_system, selected_timesteps, folder=folder) self.aggregation_parameters = aggregation_parameters self.components_to_clusterize = components_to_clusterize self.aggregation = None @@ -272,9 +298,9 @@ def _perform_aggregation(self): # Aggregation - creation of aggregated timeseries: self.aggregation = Aggregation( - original_data=self.flow_system.time_series_collection.to_dataframe( - include_extra_timestep=False - ), # Exclude last row (NaN) + original_data=self.flow_system.time_series_collection.as_dataset( + with_extra_timestep=False, with_constants=False + ).to_dataframe(), hours_per_time_step=float(dt_min), hours_per_period=self.aggregation_parameters.hours_per_period, nr_of_periods=self.aggregation_parameters.nr_of_periods, @@ -286,9 +312,11 @@ def _perform_aggregation(self): self.aggregation.cluster() self.aggregation.plot(show=True, save=self.folder / 'aggregation.html') if self.aggregation_parameters.aggregate_data_and_fix_non_binary_vars: - self.flow_system.time_series_collection.insert_new_data( - self.aggregation.aggregated_data, include_extra_timestep=False - ) + for col in self.aggregation.aggregated_data.columns: + data = self.aggregation.aggregated_data[col].values + if col in self.flow_system.time_series_collection._has_extra_timestep: + data = np.append(data, data[-1]) + self.flow_system.time_series_collection.update_time_series(col, data) self.durations['aggregation'] = round(timeit.default_timer() - t_start_agg, 2) @@ -327,13 +355,13 @@ def __init__( self.nr_of_previous_values = nr_of_previous_values self.sub_calculations: List[FullCalculation] = [] - self.all_timesteps = self.flow_system.time_series_collection.all_timesteps - self.all_timesteps_extra = self.flow_system.time_series_collection.all_timesteps_extra + self.all_timesteps = 
self.flow_system.time_series_collection._full_timesteps + self.all_timesteps_extra = self.flow_system.time_series_collection._full_timesteps_extra self.segment_names = [ f'Segment_{i + 1}' for i in range(math.ceil(len(self.all_timesteps) / self.timesteps_per_segment)) ] - self.active_timesteps_per_segment = self._calculate_timesteps_of_segment() + self.selected_timesteps_per_segment = self._calculate_timesteps_of_segment() assert timesteps_per_segment > 2, 'The Segment length must be greater 2, due to unwanted internal side effects' assert self.timesteps_per_segment_with_overlap <= len(self.all_timesteps), ( @@ -359,7 +387,7 @@ def do_modeling_and_solve( logger.info(f'{" Segmented Solving ":#^80}') for i, (segment_name, timesteps_of_segment) in enumerate( - zip(self.segment_names, self.active_timesteps_per_segment, strict=False) + zip(self.segment_names, self.selected_timesteps_per_segment, strict=False) ): if self.sub_calculations: self._transfer_start_values(i) @@ -370,7 +398,7 @@ def do_modeling_and_solve( ) calculation = FullCalculation( - f'{self.name}-{segment_name}', self.flow_system, active_timesteps=timesteps_of_segment + f'{self.name}-{segment_name}', self.flow_system, selected_timesteps=timesteps_of_segment ) self.sub_calculations.append(calculation) calculation.do_modeling() @@ -404,9 +432,9 @@ def _transfer_start_values(self, segment_index: int): This function gets the last values of the previous solved segment and inserts them as start values for the next segment """ - timesteps_of_prior_segment = self.active_timesteps_per_segment[segment_index - 1] + timesteps_of_prior_segment = self.selected_timesteps_per_segment[segment_index - 1] - start = self.active_timesteps_per_segment[segment_index][0] + start = self.selected_timesteps_per_segment[segment_index][0] start_previous_values = timesteps_of_prior_segment[self.timesteps_per_segment - self.nr_of_previous_values] end_previous_values = timesteps_of_prior_segment[self.timesteps_per_segment - 1] @@ -435,12 +463,12 @@ def _reset_start_values(self): comp.initial_charge_state = self._original_start_values[comp.label_full] def _calculate_timesteps_of_segment(self) -> List[pd.DatetimeIndex]: - active_timesteps_per_segment = [] + selected_timesteps_per_segment = [] for i, _ in enumerate(self.segment_names): start = self.timesteps_per_segment * i end = min(start + self.timesteps_per_segment_with_overlap, len(self.all_timesteps)) - active_timesteps_per_segment.append(self.all_timesteps[start:end]) - return active_timesteps_per_segment + selected_timesteps_per_segment.append(self.all_timesteps[start:end]) + return selected_timesteps_per_segment @property def timesteps_per_segment_with_overlap(self): diff --git a/flixopt/components.py b/flixopt/components.py index 1f5fe5ece..47bb282f5 100644 --- a/flixopt/components.py +++ b/flixopt/components.py @@ -9,7 +9,7 @@ import numpy as np from . 
import utils
-from .core import NumericData, NumericDataTS, PlausibilityError, Scalar, TimeSeries
+from .core import PlausibilityError, Scalar, ScenarioData, TimeSeries, TimestepData
 from .elements import Component, ComponentModel, Flow
 from .features import InvestmentModel, OnOffModel, PiecewiseModel
 from .interface import InvestParameters, OnOffParameters, PiecewiseConversion
@@ -34,7 +34,7 @@ def __init__(
         inputs: List[Flow],
         outputs: List[Flow],
         on_off_parameters: OnOffParameters = None,
-        conversion_factors: List[Dict[str, NumericDataTS]] = None,
+        conversion_factors: List[Dict[str, TimestepData]] = None,
         piecewise_conversion: Optional[PiecewiseConversion] = None,
         meta_data: Optional[Dict] = None,
     ):
@@ -86,9 +86,9 @@ def _plausibility_checks(self) -> None:
         if self.piecewise_conversion:
             for flow in self.flows.values():
                 if isinstance(flow.size, InvestParameters) and flow.size.fixed_size is None:
-                    raise PlausibilityError(
-                        f'piecewise_conversion (in {self.label_full}) and variable size '
-                        f'(in flow {flow.label_full}) do not make sense together!'
+                    logger.warning(
+                        f'Using a Flow with a variable size ({flow.label_full}) AND a piecewise_conversion '
+                        f'(in {self.label_full}) is uncommon. Please check if this is intended!'
                     )
 
     def transform_data(self, flow_system: 'FlowSystem'):
         if self.conversion_factors:
             self.conversion_factors = self._transform_conversion_factors(flow_system)
         if self.piecewise_conversion:
+            self.piecewise_conversion.has_time_dim = True
             self.piecewise_conversion.transform_data(flow_system, f'{self.label_full}|PiecewiseConversion')
 
     def _transform_conversion_factors(self, flow_system: 'FlowSystem') -> List[Dict[str, TimeSeries]]:
@@ -127,16 +128,17 @@ def __init__(
         label: str,
         charging: Flow,
         discharging: Flow,
-        capacity_in_flow_hours: Union[Scalar, InvestParameters],
-        relative_minimum_charge_state: NumericData = 0,
-        relative_maximum_charge_state: NumericData = 1,
-        initial_charge_state: Union[Scalar, Literal['lastValueOfSim']] = 0,
-        minimal_final_charge_state: Optional[Scalar] = None,
-        maximal_final_charge_state: Optional[Scalar] = None,
-        eta_charge: NumericData = 1,
-        eta_discharge: NumericData = 1,
-        relative_loss_per_hour: NumericData = 0,
+        capacity_in_flow_hours: Union[ScenarioData, InvestParameters],
+        relative_minimum_charge_state: TimestepData = 0,
+        relative_maximum_charge_state: TimestepData = 1,
+        initial_charge_state: Union[ScenarioData, Literal['lastValueOfSim']] = 0,
+        minimal_final_charge_state: Optional[ScenarioData] = None,
+        maximal_final_charge_state: Optional[ScenarioData] = None,
+        eta_charge: TimestepData = 1,
+        eta_discharge: TimestepData = 1,
+        relative_loss_per_hour: TimestepData = 0,
         prevent_simultaneous_charge_and_discharge: bool = True,
+        balanced: bool = False,
         meta_data: Optional[Dict] = None,
     ):
         """
@@ -162,6 +164,7 @@
             relative_loss_per_hour: loss per chargeState-Unit per hour. The default is 0.
             prevent_simultaneous_charge_and_discharge: If True, loading and unloading at the same time is not possible. Increases the number of binary variables, but is recommended for easier evaluation. The default is True.
+            balanced: Whether to equate the sizes of the charging and discharging flows. Only applies if the sizes are not fixed.
             meta_data: used to store more information about the Element. Is not used internally, but saved in the results. Only use python native types.
""" # TODO: fixed_relative_chargeState implementieren @@ -176,17 +179,18 @@ def __init__( self.charging = charging self.discharging = discharging self.capacity_in_flow_hours = capacity_in_flow_hours - self.relative_minimum_charge_state: NumericDataTS = relative_minimum_charge_state - self.relative_maximum_charge_state: NumericDataTS = relative_maximum_charge_state + self.relative_minimum_charge_state: TimestepData = relative_minimum_charge_state + self.relative_maximum_charge_state: TimestepData = relative_maximum_charge_state self.initial_charge_state = initial_charge_state self.minimal_final_charge_state = minimal_final_charge_state self.maximal_final_charge_state = maximal_final_charge_state - self.eta_charge: NumericDataTS = eta_charge - self.eta_discharge: NumericDataTS = eta_discharge - self.relative_loss_per_hour: NumericDataTS = relative_loss_per_hour + self.eta_charge: TimestepData = eta_charge + self.eta_discharge: TimestepData = eta_discharge + self.relative_loss_per_hour: TimestepData = relative_loss_per_hour self.prevent_simultaneous_charge_and_discharge = prevent_simultaneous_charge_and_discharge + self.balanced = balanced def create_model(self, model: SystemModel) -> 'StorageModel': self._plausibility_checks() @@ -198,55 +202,83 @@ def transform_data(self, flow_system: 'FlowSystem') -> None: self.relative_minimum_charge_state = flow_system.create_time_series( f'{self.label_full}|relative_minimum_charge_state', self.relative_minimum_charge_state, - needs_extra_timestep=True, + has_extra_timestep=True, ) self.relative_maximum_charge_state = flow_system.create_time_series( f'{self.label_full}|relative_maximum_charge_state', self.relative_maximum_charge_state, - needs_extra_timestep=True, + has_extra_timestep=True, ) self.eta_charge = flow_system.create_time_series(f'{self.label_full}|eta_charge', self.eta_charge) self.eta_discharge = flow_system.create_time_series(f'{self.label_full}|eta_discharge', self.eta_discharge) self.relative_loss_per_hour = flow_system.create_time_series( f'{self.label_full}|relative_loss_per_hour', self.relative_loss_per_hour ) + if not isinstance(self.initial_charge_state, str): + self.initial_charge_state = flow_system.create_time_series( + f'{self.label_full}|initial_charge_state', self.initial_charge_state, has_time_dim=False + ) + self.minimal_final_charge_state = flow_system.create_time_series( + f'{self.label_full}|minimal_final_charge_state', self.minimal_final_charge_state, has_time_dim=False + ) + self.maximal_final_charge_state = flow_system.create_time_series( + f'{self.label_full}|maximal_final_charge_state', self.maximal_final_charge_state, has_time_dim=False + ) if isinstance(self.capacity_in_flow_hours, InvestParameters): - self.capacity_in_flow_hours.transform_data(flow_system) + self.capacity_in_flow_hours.transform_data(flow_system, f'{self.label_full}|InvestParameters') + else: + self.capacity_in_flow_hours = flow_system.create_time_series( + f'{self.label_full}|capacity_in_flow_hours', self.capacity_in_flow_hours, has_time_dim=False + ) def _plausibility_checks(self) -> None: """ Check for infeasible or uncommon combinations of parameters """ super()._plausibility_checks() - if utils.is_number(self.initial_charge_state): - if isinstance(self.capacity_in_flow_hours, InvestParameters): - if self.capacity_in_flow_hours.fixed_size is None: - maximum_capacity = self.capacity_in_flow_hours.maximum_size - minimum_capacity = self.capacity_in_flow_hours.minimum_size - else: - maximum_capacity = self.capacity_in_flow_hours.fixed_size - 
minimum_capacity = self.capacity_in_flow_hours.fixed_size
+        if isinstance(self.initial_charge_state, str):
+            if self.initial_charge_state != 'lastValueOfSim':
+                raise PlausibilityError(f'initial_charge_state has undefined value: {self.initial_charge_state}')
+            return
+        if isinstance(self.capacity_in_flow_hours, InvestParameters):
+            if self.capacity_in_flow_hours.fixed_size is None:
+                maximum_capacity = self.capacity_in_flow_hours.maximum_size
+                minimum_capacity = self.capacity_in_flow_hours.minimum_size
             else:
-                maximum_capacity = self.capacity_in_flow_hours
-                minimum_capacity = self.capacity_in_flow_hours
-
-            # initial capacity >= allowed min for maximum_size:
-            minimum_inital_capacity = maximum_capacity * self.relative_minimum_charge_state.isel(time=1)
-            # initial capacity <= allowed max for minimum_size:
-            maximum_inital_capacity = minimum_capacity * self.relative_maximum_charge_state.isel(time=1)
+                maximum_capacity = self.capacity_in_flow_hours.fixed_size
+                minimum_capacity = self.capacity_in_flow_hours.fixed_size
+        else:
+            maximum_capacity = self.capacity_in_flow_hours
+            minimum_capacity = self.capacity_in_flow_hours
+
+        # initial capacity >= allowed min for maximum_size:
+        minimum_initial_capacity = maximum_capacity * self.relative_minimum_charge_state.isel(time=0)
+        # initial capacity <= allowed max for minimum_size:
+        maximum_initial_capacity = minimum_capacity * self.relative_maximum_charge_state.isel(time=0)
+
+        if (self.initial_charge_state > maximum_initial_capacity).any():
+            raise ValueError(
+                f'{self.label_full}: {self.initial_charge_state=} '
+                f'is above allowed maximum charge_state {maximum_initial_capacity}'
+            )
+        if (self.initial_charge_state < minimum_initial_capacity).any():
+            raise ValueError(
+                f'{self.label_full}: {self.initial_charge_state=} '
+                f'is below allowed minimum charge_state {minimum_initial_capacity}'
+            )
-            if self.initial_charge_state > maximum_inital_capacity:
-                raise ValueError(
-                    f'{self.label_full}: {self.initial_charge_state=} '
-                    f'is above allowed maximum charge_state {maximum_inital_capacity}'
-                )
-            if self.initial_charge_state < minimum_inital_capacity:
-                raise ValueError(
-                    f'{self.label_full}: {self.initial_charge_state=} '
-                    f'is below allowed minimum charge_state {minimum_inital_capacity}'
-                )
-        elif self.initial_charge_state != 'lastValueOfSim':
-            raise ValueError(f'{self.label_full}: {self.initial_charge_state=} has an invalid value')
 
+        if self.balanced:
+            if not isinstance(self.charging.size, InvestParameters) or not isinstance(self.discharging.size, InvestParameters):
+                raise PlausibilityError(
+                    f'Balancing charging and discharging Flows in {self.label_full} '
+                    f'is only possible with Investments.')
+            if (self.charging.size.minimum_size > self.discharging.size.maximum_size or
+                self.charging.size.maximum_size < self.discharging.size.minimum_size):
+                raise PlausibilityError(
+                    f'Balancing charging and discharging Flows in {self.label_full} need compatible minimum and maximum sizes. '
+                    f'Got: {self.charging.size.minimum_size=}, {self.charging.size.maximum_size=} and '
+                    f'{self.discharging.size.minimum_size=}, {self.discharging.size.maximum_size=}.')
 
 
 @register_class_for_io
@@ -264,8 +296,8 @@ def __init__(
         out1: Flow,
         in2: Optional[Flow] = None,
         out2: Optional[Flow] = None,
-        relative_losses: Optional[NumericDataTS] = None,
-        absolute_losses: Optional[NumericDataTS] = None,
+        relative_losses: Optional[TimestepData] = None,
+        absolute_losses: Optional[TimestepData] = None,
         on_off_parameters: OnOffParameters = None,
         prevent_simultaneous_flows_in_both_directions: bool = True,
         meta_data: Optional[Dict] = None,
@@ -348,7 +380,7 @@ def __init__(self, model: SystemModel, element: Transmission):
     def do_modeling(self):
         """Initiates all FlowModels"""
         # Force On Variable if absolute losses are present
-        if (self.element.absolute_losses is not None) and np.any(self.element.absolute_losses.active_data != 0):
+        if (self.element.absolute_losses is not None) and np.any(self.element.absolute_losses.selected_data != 0):
             for flow in self.element.inputs + self.element.outputs:
                 if flow.on_off_parameters is None:
                     flow.on_off_parameters = OnOffParameters()
@@ -385,14 +417,14 @@ def create_transmission_equation(self, name: str, in_flow: Flow, out_flow: Flow)
         # eq: out(t) + on(t)*loss_abs(t) = in(t)*(1 - loss_rel(t))
         con_transmission = self.add(
             self._model.add_constraints(
-                out_flow.model.flow_rate == -in_flow.model.flow_rate * (self.element.relative_losses.active_data - 1),
+                out_flow.model.flow_rate == -in_flow.model.flow_rate * (self.element.relative_losses.selected_data - 1),
                 name=f'{self.label_full}|{name}',
             ),
             name,
         )
 
         if self.element.absolute_losses is not None:
-            con_transmission.lhs += in_flow.model.on_off.on * self.element.absolute_losses.active_data
+            con_transmission.lhs += in_flow.model.on_off.on * self.element.absolute_losses.selected_data
 
         return con_transmission
@@ -420,8 +452,10 @@ def do_modeling(self):
             self.add(
                 self._model.add_constraints(
-                    sum([flow.model.flow_rate * conv_factors[flow.label].active_data for flow in used_inputs])
-                    == sum([flow.model.flow_rate * conv_factors[flow.label].active_data for flow in used_outputs]),
+                    sum([flow.model.flow_rate * conv_factors[flow.label].selected_data for flow in used_inputs])
+                    == sum(
+                        [flow.model.flow_rate * conv_factors[flow.label].selected_data for flow in used_outputs]
+                    ),
                     name=f'{self.label_full}|conversion_{i}',
                 )
             )
@@ -461,12 +495,15 @@ def do_modeling(self):
         lb, ub = self.absolute_charge_state_bounds
         self.charge_state = self.add(
             self._model.add_variables(
-                lower=lb, upper=ub, coords=self._model.coords_extra, name=f'{self.label_full}|charge_state'
+                lower=lb,
+                upper=ub,
+                coords=self._model.get_coords(extra_timestep=True),
+                name=f'{self.label_full}|charge_state',
             ),
             'charge_state',
         )
         self.netto_discharge = self.add(
-            self._model.add_variables(coords=self._model.coords, name=f'{self.label_full}|netto_discharge'),
+            self._model.add_variables(coords=self._model.get_coords(), name=f'{self.label_full}|netto_discharge'),
             'netto_discharge',
        )
         # netto_discharge:
@@ -481,12 +518,12 @@
         )
 
         charge_state = self.charge_state
-        rel_loss = self.element.relative_loss_per_hour.active_data
+        rel_loss = self.element.relative_loss_per_hour.selected_data
         hours_per_step = self._model.hours_per_step
         charge_rate = self.element.charging.model.flow_rate
         discharge_rate = self.element.discharging.model.flow_rate
-        eff_charge = self.element.eta_charge.active_data
-        eff_discharge = self.element.eta_discharge.active_data
+
eff_charge = self.element.eta_charge.selected_data + eff_discharge = self.element.eta_discharge.selected_data self.add( self._model.add_constraints( @@ -513,29 +550,34 @@ def do_modeling(self): # Initial charge state self._initial_and_final_charge_state() + if self.element.balanced: + self.add( + self._model.add_constraints( + self.element.charging.model._investment.size * 1 == self.element.discharging.model._investment.size * 1, + name=f'{self.label_full}|balanced_sizes', + ), + 'balanced_sizes' + ) + def _initial_and_final_charge_state(self): if self.element.initial_charge_state is not None: name_short = 'initial_charge_state' name = f'{self.label_full}|{name_short}' - if utils.is_number(self.element.initial_charge_state): + if isinstance(self.element.initial_charge_state, str): self.add( self._model.add_constraints( - self.charge_state.isel(time=0) == self.element.initial_charge_state, name=name + self.charge_state.isel(time=0) == self.charge_state.isel(time=-1), name=name ), name_short, ) - elif self.element.initial_charge_state == 'lastValueOfSim': + else: self.add( self._model.add_constraints( - self.charge_state.isel(time=0) == self.charge_state.isel(time=-1), name=name + self.charge_state.isel(time=0) == self.element.initial_charge_state, name=name ), name_short, ) - else: # TODO: Validation in Storage Class, not in Model - raise PlausibilityError( - f'initial_charge_state has undefined value: {self.element.initial_charge_state}' - ) if self.element.maximal_final_charge_state is not None: self.add( @@ -556,7 +598,7 @@ def _initial_and_final_charge_state(self): ) @property - def absolute_charge_state_bounds(self) -> Tuple[NumericData, NumericData]: + def absolute_charge_state_bounds(self) -> Tuple[TimestepData, TimestepData]: relative_lower_bound, relative_upper_bound = self.relative_charge_state_bounds if not isinstance(self.element.capacity_in_flow_hours, InvestParameters): return ( @@ -570,10 +612,10 @@ def absolute_charge_state_bounds(self) -> Tuple[NumericData, NumericData]: ) @property - def relative_charge_state_bounds(self) -> Tuple[NumericData, NumericData]: + def relative_charge_state_bounds(self) -> Tuple[TimestepData, TimestepData]: return ( - self.element.relative_minimum_charge_state.active_data, - self.element.relative_maximum_charge_state.active_data, + self.element.relative_minimum_charge_state, + self.element.relative_maximum_charge_state, ) diff --git a/flixopt/core.py b/flixopt/core.py index 08be18f1d..ea447e652 100644 --- a/flixopt/core.py +++ b/flixopt/core.py @@ -7,6 +7,7 @@ import json import logging import pathlib +import textwrap from collections import Counter from typing import Any, Dict, Iterator, List, Literal, Optional, Tuple, Union @@ -25,6 +26,12 @@ NumericDataTS = Union[NumericData, 'TimeSeriesData'] """Represents either standard numeric data or TimeSeriesData.""" +TimestepData = NumericData +"""Represents any form of numeric data that corresponds to timesteps.""" + +ScenarioData = NumericData +"""Represents any form of numeric data that corresponds to scenarios.""" + class PlausibilityError(Exception): """Error for a failing Plausibility check.""" @@ -40,61 +47,446 @@ class ConversionError(Exception): class DataConverter: """ - Converts various data types into xarray.DataArray with a timesteps index. + Converts various data types into xarray.DataArray with optional time and scenario dimension. - Supports: scalars, arrays, Series, DataFrames, and DataArrays. 
+    Current implementation handles:
+    - Scalar values
+    - NumPy arrays
+    - pandas Series
+    - pandas DataFrames
+    - xarray.DataArray
     """
 
     @staticmethod
-    def as_dataarray(data: NumericData, timesteps: pd.DatetimeIndex) -> xr.DataArray:
-        """Convert data to xarray.DataArray with specified timesteps index."""
+    def as_dataarray(
+        data: TimestepData, timesteps: Optional[pd.DatetimeIndex] = None, scenarios: Optional[pd.Index] = None
+    ) -> xr.DataArray:
+        """
+        Convert data to xarray.DataArray with specified dimensions.
+
+        Args:
+            data: The data to convert (scalar, array, Series, DataFrame, or DataArray)
+            timesteps: Optional DatetimeIndex for time dimension
+            scenarios: Optional Index for scenario dimension
+
+        Returns:
+            DataArray with the converted data
+        """
+        # Prepare dimensions and coordinates
+        coords, dims = DataConverter._prepare_dimensions(timesteps, scenarios)
+
+        # Select appropriate converter based on data type
+        if isinstance(data, (int, float, np.integer, np.floating)):
+            return DataConverter._convert_scalar(data, coords, dims)
+
+        elif isinstance(data, xr.DataArray):
+            return DataConverter._convert_dataarray(data, coords, dims)
+
+        elif isinstance(data, np.ndarray):
+            return DataConverter._convert_ndarray(data, coords, dims)
+
+        elif isinstance(data, pd.Series):
+            return DataConverter._convert_series(data, coords, dims)
+
+        elif isinstance(data, pd.DataFrame):
+            return DataConverter._convert_dataframe(data, coords, dims)
+
+        else:
+            raise ConversionError(f'Unsupported data type: {type(data).__name__}')
+
+    @staticmethod
+    def _validate_timesteps(timesteps: pd.DatetimeIndex) -> pd.DatetimeIndex:
+        """
+        Validate and prepare time index.
+
+        Args:
+            timesteps: The time index to validate
+
+        Returns:
+            Validated time index
+        """
         if not isinstance(timesteps, pd.DatetimeIndex) or len(timesteps) == 0:
-            raise ValueError(f'Timesteps must be a non-empty DatetimeIndex, got {type(timesteps).__name__}')
+            raise ConversionError('Timesteps must be a non-empty DatetimeIndex')
+
         if not timesteps.name == 'time':
-            raise ConversionError(f'DatetimeIndex is not named correctly. Must be named "time", got {timesteps.name=}')
-
-        coords = [timesteps]
-        dims = ['time']
-        expected_shape = (len(timesteps),)
-
-        try:
-            if isinstance(data, (int, float, np.integer, np.floating)):
-                return xr.DataArray(data, coords=coords, dims=dims)
-            elif isinstance(data, pd.DataFrame):
-                if not data.index.equals(timesteps):
-                    raise ConversionError("DataFrame index doesn't match timesteps index")
-                if not len(data.columns) == 1:
-                    raise ConversionError('DataFrame must have exactly one column')
-                return xr.DataArray(data.values.flatten(), coords=coords, dims=dims)
-            elif isinstance(data, pd.Series):
-                if not data.index.equals(timesteps):
-                    raise ConversionError("Series index doesn't match timesteps index")
-                return xr.DataArray(data.values, coords=coords, dims=dims)
-            elif isinstance(data, np.ndarray):
-                if data.ndim != 1:
-                    raise ConversionError(f'Array must be 1-dimensional, got {data.ndim}')
-                elif data.shape[0] != expected_shape[0]:
-                    raise ConversionError(f"Array shape {data.shape} doesn't match expected {expected_shape}")
-                return xr.DataArray(data, coords=coords, dims=dims)
-            elif isinstance(data, xr.DataArray):
-                if data.dims != tuple(dims):
-                    raise ConversionError(f"DataArray dimensions {data.dims} don't match expected {dims}")
-                if data.sizes[dims[0]] != len(coords[0]):
-                    raise ConversionError(
-                        f"DataArray length {data.sizes[dims[0]]} doesn't match expected {len(coords[0])}"
-                    )
-                return data.copy(deep=True)
+            raise ConversionError(f'Timesteps must be named "time", got "{timesteps.name}"')
+
+        return timesteps
+
+    @staticmethod
+    def _validate_scenarios(scenarios: pd.Index) -> pd.Index:
+        """
+        Validate and prepare scenario index.
+
+        Args:
+            scenarios: The scenario index to validate
+
+        Returns:
+            Validated scenario index
+        """
+        if not isinstance(scenarios, pd.Index) or len(scenarios) == 0:
+            raise ConversionError('Scenarios must be a non-empty Index')
+
+        if not scenarios.name == 'scenario':
+            raise ConversionError(f'Scenarios must be named "scenario", got "{scenarios.name}"')
+
+        return scenarios
+
+    @staticmethod
+    def _prepare_dimensions(
+        timesteps: Optional[pd.DatetimeIndex], scenarios: Optional[pd.Index]
+    ) -> Tuple[Dict[str, pd.Index], Tuple[str, ...]]:
+        """
+        Prepare coordinates and dimensions for the DataArray.
+
+        Args:
+            timesteps: Optional time index
+            scenarios: Optional scenario index
+
+        Returns:
+            Tuple of (coordinates dict, dimensions tuple)
+        """
+        # Validate inputs if provided
+        if timesteps is not None:
+            timesteps = DataConverter._validate_timesteps(timesteps)
+
+        if scenarios is not None:
+            scenarios = DataConverter._validate_scenarios(scenarios)
+
+        # Build coordinates and dimensions
+        coords = {}
+        dims = []
+
+        if timesteps is not None:
+            coords['time'] = timesteps
+            dims.append('time')
+
+        if scenarios is not None:
+            coords['scenario'] = scenarios
+            dims.append('scenario')
+
+        return coords, tuple(dims)
+
+    @staticmethod
+    def _convert_scalar(
+        data: Union[int, float, np.integer, np.floating], coords: Dict[str, pd.Index], dims: Tuple[str, ...]
+    ) -> xr.DataArray:
+        """
+        Convert a scalar value to a DataArray.
+
+        Args:
+            data: The scalar value
+            coords: Coordinate dictionary
+            dims: Dimension names
+
+        Returns:
+            DataArray with the scalar value
+        """
+        if isinstance(data, (np.integer, np.floating)):
+            data = data.item()
+        return xr.DataArray(data, coords=coords, dims=dims)
+
+    @staticmethod
+    def _convert_dataarray(data: xr.DataArray, coords: Dict[str, pd.Index], dims: Tuple[str, ...]) -> xr.DataArray:
+        """
+        Convert an existing DataArray to desired dimensions.
+ + Args: + data: The source DataArray + coords: Target coordinates + dims: Target dimensions + + Returns: + DataArray with the target dimensions + """ + # No dimensions case + if len(dims) == 0: + if data.size != 1: + raise ConversionError('When converting to dimensionless DataArray, source must be scalar') + return xr.DataArray(data.values.item()) + + # Check if data already has matching dimensions and coordinates + if set(data.dims) == set(dims): + # Check if coordinates match + is_compatible = True + for dim in dims: + if dim in data.dims and not np.array_equal(data.coords[dim].values, coords[dim].values): + is_compatible = False + break + + if is_compatible: + # Ensure dimensions are in the correct order + if data.dims != dims: + # Transpose to get dimensions in the right order + return data.transpose(*dims).copy(deep=True) + else: + # Return existing DataArray if compatible and order is correct + return data.copy(deep=True) + + # Handle dimension broadcasting + if len(data.dims) == 1 and len(dims) == 2: + # Single dimension to two dimensions + if data.dims[0] == 'time' and 'scenario' in dims: + # Broadcast time dimension to include scenarios + return DataConverter._broadcast_time_to_scenarios(data, coords, dims) + + elif data.dims[0] == 'scenario' and 'time' in dims: + # Broadcast scenario dimension to include time + return DataConverter._broadcast_scenario_to_time(data, coords, dims) + + raise ConversionError( + f'Cannot convert {data.dims} to {dims}. Source coordinates: {data.coords}, Target coordinates: {coords}' + ) + @staticmethod + def _broadcast_time_to_scenarios( + data: xr.DataArray, coords: Dict[str, pd.Index], dims: Tuple[str, ...] + ) -> xr.DataArray: + """ + Broadcast a time-only DataArray to include scenarios. + + Args: + data: The time-indexed DataArray + coords: Target coordinates + dims: Target dimensions + + Returns: + DataArray with time and scenario dimensions + """ + # Check compatibility + if not np.array_equal(data.coords['time'].values, coords['time'].values): + raise ConversionError("Source time coordinates don't match target time coordinates") + + if len(coords['scenario']) <= 1: + return data.copy(deep=True) + + # Broadcast values + values = np.repeat(data.values[:, np.newaxis], len(coords['scenario']), axis=1) + return xr.DataArray(values.copy(), coords=coords, dims=dims) + + @staticmethod + def _broadcast_scenario_to_time( + data: xr.DataArray, coords: Dict[str, pd.Index], dims: Tuple[str, ...] + ) -> xr.DataArray: + """ + Broadcast a scenario-only DataArray to include time. + + Args: + data: The scenario-indexed DataArray + coords: Target coordinates + dims: Target dimensions + + Returns: + DataArray with time and scenario dimensions + """ + # Check compatibility + if not np.array_equal(data.coords['scenario'].values, coords['scenario'].values): + raise ConversionError("Source scenario coordinates don't match target scenario coordinates") + + # Broadcast values + values = np.repeat(data.values[:, np.newaxis], len(coords['time']), axis=1).T + return xr.DataArray(values.copy(), coords=coords, dims=dims) + + @staticmethod + def _convert_ndarray(data: np.ndarray, coords: Dict[str, pd.Index], dims: Tuple[str, ...]) -> xr.DataArray: + """ + Convert a NumPy array to a DataArray. 
+ + Args: + data: The NumPy array + coords: Target coordinates + dims: Target dimensions + + Returns: + DataArray from the NumPy array + """ + # Handle dimensionless case + if len(dims) == 0: + if data.size != 1: + raise ConversionError('Without dimensions, can only convert scalar arrays') + return xr.DataArray(data.item()) + + # Handle single dimension + elif len(dims) == 1: + return DataConverter._convert_ndarray_single_dim(data, coords, dims) + + # Handle two dimensions + elif len(dims) == 2: + return DataConverter._convert_ndarray_two_dims(data, coords, dims) + + else: + raise ConversionError('Maximum 2 dimensions supported') + + @staticmethod + def _convert_ndarray_single_dim( + data: np.ndarray, coords: Dict[str, pd.Index], dims: Tuple[str, ...] + ) -> xr.DataArray: + """ + Convert a NumPy array to a single-dimension DataArray. + + Args: + data: The NumPy array + coords: Target coordinates + dims: Target dimensions (length 1) + + Returns: + DataArray with single dimension + """ + dim_name = dims[0] + dim_length = len(coords[dim_name]) + + if data.ndim == 1: + # 1D array must match dimension length + if data.shape[0] != dim_length: + raise ConversionError(f"Array length {data.shape[0]} doesn't match {dim_name} length {dim_length}") + return xr.DataArray(data, coords=coords, dims=dims) + else: + raise ConversionError(f'Expected 1D array for single dimension, got {data.ndim}D') + + @staticmethod + def _convert_ndarray_two_dims(data: np.ndarray, coords: Dict[str, pd.Index], dims: Tuple[str, ...]) -> xr.DataArray: + """ + Convert a NumPy array to a two-dimension DataArray. + + Args: + data: The NumPy array + coords: Target coordinates + dims: Target dimensions (length 2) + + Returns: + DataArray with two dimensions + """ + scenario_length = len(coords['scenario']) + time_length = len(coords['time']) + + if data.ndim == 1: + # For 1D array, create 2D array based on which dimension it matches + if data.shape[0] == time_length: + # Broadcast across scenarios + values = np.repeat(data[:, np.newaxis], scenario_length, axis=1) + return xr.DataArray(values, coords=coords, dims=dims) + elif data.shape[0] == scenario_length: + # Broadcast across time + values = np.repeat(data[np.newaxis, :], time_length, axis=0) + return xr.DataArray(values, coords=coords, dims=dims) + else: + raise ConversionError(f"1D array length {data.shape[0]} doesn't match either dimension") + + elif data.ndim == 2: + # For 2D array, shape must match dimensions + expected_shape = (time_length, scenario_length) + if data.shape != expected_shape: + raise ConversionError(f"2D array shape {data.shape} doesn't match expected shape {expected_shape}") + return xr.DataArray(data, coords=coords, dims=dims) + + else: + raise ConversionError(f'Expected 1D or 2D array for two dimensions, got {data.ndim}D') + + @staticmethod + def _convert_series(data: pd.Series, coords: Dict[str, pd.Index], dims: Tuple[str, ...]) -> xr.DataArray: + """ + Convert pandas Series to xarray DataArray. 
+ + Args: + data: pandas Series to convert + coords: Target coordinates + dims: Target dimensions + + Returns: + DataArray from the pandas Series + """ + # Handle single dimension case + if len(dims) == 1: + dim_name = dims[0] + + # Check if series index matches the dimension + if data.index.equals(coords[dim_name]): + return xr.DataArray(data.values.copy(), coords=coords, dims=dims) else: - raise ConversionError(f'Unsupported type: {type(data).__name__}') - except Exception as e: - if isinstance(e, ConversionError): - raise - raise ConversionError(f'Converting data {type(data)} to xarray.Dataset raised an error: {str(e)}') from e + raise ConversionError( + f"Series index doesn't match {dim_name} coordinates.\n" + f'Series index: {data.index}\n' + f'Target {dim_name} coordinates: {coords[dim_name]}' + ) + + # Handle two dimensions case + elif len(dims) == 2: + # Check if dimensions are time and scenario + if dims != ('time', 'scenario'): + raise ConversionError( + f'Two-dimensional conversion only supports time and scenario dimensions, got {dims}' + ) + + # Case 1: Series is indexed by time + if data.index.equals(coords['time']): + # Broadcast across scenarios + values = np.repeat(data.values[:, np.newaxis], len(coords['scenario']), axis=1) + return xr.DataArray(values.copy(), coords=coords, dims=dims) + + # Case 2: Series is indexed by scenario + elif data.index.equals(coords['scenario']): + # Broadcast across time + values = np.repeat(data.values[np.newaxis, :], len(coords['time']), axis=0) + return xr.DataArray(values.copy(), coords=coords, dims=dims) + + else: + raise ConversionError( + "Series index must match either 'time' or 'scenario' coordinates.\n" + f'Series index: {data.index}\n' + f'Target time coordinates: {coords["time"]}\n' + f'Target scenario coordinates: {coords["scenario"]}' + ) + + else: + raise ConversionError(f'Maximum 2 dimensions supported, got {len(dims)}') + + @staticmethod + def _convert_dataframe(data: pd.DataFrame, coords: Dict[str, pd.Index], dims: Tuple[str, ...]) -> xr.DataArray: + """ + Convert pandas DataFrame to xarray DataArray. + Only allows time as index and scenarios as columns. 
+ + Args: + data: pandas DataFrame to convert + coords: Target coordinates + dims: Target dimensions + + Returns: + DataArray from the pandas DataFrame + """ + # Single dimension case + if len(dims) == 1: + # If DataFrame has one column, treat it like a Series + if len(data.columns) == 1: + series = data.iloc[:, 0] + return DataConverter._convert_series(series, coords, dims) + + raise ConversionError( + f'When converting DataFrame to single-dimension DataArray, DataFrame must have exactly one column, got {len(data.columns)}' + ) + + # Two dimensions case + elif len(dims) == 2: + # Check if dimensions are time and scenario + if dims != ('time', 'scenario'): + raise ConversionError( + f'Two-dimensional conversion only supports time and scenario dimensions, got {dims}' + ) + + # DataFrame must have time as index and scenarios as columns + if data.index.equals(coords['time']) and data.columns.equals(coords['scenario']): + # Create DataArray with proper dimension order + return xr.DataArray(data.values.copy(), coords=coords, dims=dims) + else: + raise ConversionError( + 'DataFrame must have time as index and scenarios as columns.\n' + f'DataFrame index: {data.index}\n' + f'DataFrame columns: {data.columns}\n' + f'Target time coordinates: {coords["time"]}\n' + f'Target scenario coordinates: {coords["scenario"]}' + ) + + else: + raise ConversionError(f'Maximum 2 dimensions supported, got {len(dims)}') class TimeSeriesData: # TODO: Move to Interface.py - def __init__(self, data: NumericData, agg_group: Optional[str] = None, agg_weight: Optional[float] = None): + def __init__(self, data: TimestepData, agg_group: Optional[str] = None, agg_weight: Optional[float] = None): """ timeseries class for transmit timeseries AND special characteristics of timeseries, i.g. to define weights needed in calculation_type 'aggregated' @@ -146,18 +538,19 @@ class TimeSeries: name (str): The name of the time series aggregation_weight (Optional[float]): Weight used for aggregation aggregation_group (Optional[str]): Group name for shared aggregation weighting - needs_extra_timestep (bool): Whether this series needs an extra timestep + has_extra_timestep (bool): Whether this series needs an extra timestep """ @classmethod def from_datasource( cls, - data: NumericData, + data: NumericDataTS, name: str, timesteps: pd.DatetimeIndex, + scenarios: Optional[pd.Index] = None, aggregation_weight: Optional[float] = None, aggregation_group: Optional[str] = None, - needs_extra_timestep: bool = False, + has_extra_timestep: bool = False, ) -> 'TimeSeries': """ Initialize the TimeSeries from multiple data sources. 
@@ -166,19 +559,20 @@ def from_datasource( data: The time series data name: The name of the TimeSeries timesteps: The timesteps of the TimeSeries + scenarios: The scenarios of the TimeSeries aggregation_weight: The weight in aggregation calculations aggregation_group: Group this TimeSeries belongs to for aggregation weight sharing - needs_extra_timestep: Whether this series requires an extra timestep + has_extra_timestep: Whether this series requires an extra timestep Returns: A new TimeSeries instance """ return cls( - DataConverter.as_dataarray(data, timesteps), + DataConverter.as_dataarray(data, timesteps, scenarios), name, aggregation_weight, aggregation_group, - needs_extra_timestep, + has_extra_timestep, ) @classmethod @@ -212,7 +606,7 @@ def from_json(cls, data: Optional[Dict[str, Any]] = None, path: Optional[str] = name=data['name'], aggregation_weight=data['aggregation_weight'], aggregation_group=data['aggregation_group'], - needs_extra_timestep=data['needs_extra_timestep'], + has_extra_timestep=data['has_extra_timestep'], ) def __init__( @@ -221,7 +615,7 @@ def __init__( name: str, aggregation_weight: Optional[float] = None, aggregation_group: Optional[str] = None, - needs_extra_timestep: bool = False, + has_extra_timestep: bool = False, ): """ Initialize a TimeSeries with a DataArray. @@ -231,35 +625,40 @@ def __init__( name: The name of the TimeSeries aggregation_weight: The weight in aggregation calculations aggregation_group: Group this TimeSeries belongs to for weight sharing - needs_extra_timestep: Whether this series requires an extra timestep + has_extra_timestep: Whether this series requires an extra timestep Raises: - ValueError: If data doesn't have a 'time' index or has more than 1 dimension + ValueError: If data has unsupported dimensions """ - if 'time' not in data.indexes: - raise ValueError(f'DataArray must have a "time" index. Got {data.indexes}') - if data.ndim > 1: - raise ValueError(f'Number of dimensions of DataArray must be 1. Got {data.ndim}') + allowed_dims = {'time', 'scenario'} + if not set(data.dims).issubset(allowed_dims): + raise ValueError(f'DataArray dimensions must be subset of {allowed_dims}. Got {data.dims}') self.name = name self.aggregation_weight = aggregation_weight self.aggregation_group = aggregation_group - self.needs_extra_timestep = needs_extra_timestep + self.has_extra_timestep = has_extra_timestep # Data management self._stored_data = data.copy(deep=True) self._backup = self._stored_data.copy(deep=True) - self._active_timesteps = self._stored_data.indexes['time'] - self._active_data = None - self._update_active_data() - def reset(self): + # Selection state + self._selected_timesteps: Optional[pd.DatetimeIndex] = None + self._selected_scenarios: Optional[pd.Index] = None + + # Flag for whether this series has various dimensions + self.has_time_dim = 'time' in data.dims + self.has_scenario_dim = 'scenario' in data.dims + + def reset(self) -> None: """ - Reset active timesteps to the full set of stored timesteps. + Reset selections to include all timesteps and scenarios. + This is equivalent to clearing all selections. """ - self.active_timesteps = None + self.set_selection(None, None) - def restore_data(self): + def restore_data(self) -> None: """ Restore stored_data from the backup and reset active timesteps. 
""" @@ -280,8 +679,8 @@ def to_json(self, path: Optional[pathlib.Path] = None) -> Dict[str, Any]: 'name': self.name, 'aggregation_weight': self.aggregation_weight, 'aggregation_group': self.aggregation_group, - 'needs_extra_timestep': self.needs_extra_timestep, - 'data': self.active_data.to_dict(), + 'has_extra_timestep': self.has_extra_timestep, + 'data': self.selected_data.to_dict(), } # Convert datetime objects to ISO strings @@ -289,7 +688,7 @@ def to_json(self, path: Optional[pathlib.Path] = None) -> Dict[str, Any]: # Save to file if path is provided if path is not None: - indent = 4 if len(self.active_timesteps) <= 480 else None + indent = 4 if len(self.selected_timesteps) <= 480 else None with open(path, 'w', encoding='utf-8') as f: json.dump(data, f, indent=indent, ensure_ascii=False) @@ -303,84 +702,116 @@ def stats(self) -> str: Returns: String representation of data statistics """ - return get_numeric_stats(self.active_data, padd=0) - - def _update_active_data(self): - """ - Update the active data based on active_timesteps. - """ - self._active_data = self._stored_data.sel(time=self.active_timesteps) + return get_numeric_stats(self.selected_data, padd=0, by_scenario=True) @property def all_equal(self) -> bool: """Check if all values in the series are equal.""" - return np.unique(self.active_data.values).size == 1 + return np.unique(self.selected_data.values).size == 1 @property - def active_timesteps(self) -> pd.DatetimeIndex: - """Get the current active timesteps.""" - return self._active_timesteps - - @active_timesteps.setter - def active_timesteps(self, timesteps: Optional[pd.DatetimeIndex]): + def selected_data(self) -> xr.DataArray: """ - Set active_timesteps and refresh active_data. - - Args: - timesteps: New timesteps to activate, or None to use all stored timesteps - - Raises: - TypeError: If timesteps is not a pandas DatetimeIndex or None + Get a view of stored_data based on current selections. + This computes the view dynamically based on the current selection state. """ - if timesteps is None: - self._active_timesteps = self.stored_data.indexes['time'] - elif isinstance(timesteps, pd.DatetimeIndex): - self._active_timesteps = timesteps - else: - raise TypeError('active_timesteps must be a pandas DatetimeIndex or None') + return self._stored_data.sel(**self._valid_selector) - self._update_active_data() + @property + def selected_timesteps(self) -> Optional[pd.DatetimeIndex]: + """Get the current active timesteps, or None if no time dimension.""" + if not self.has_time_dim: + return None + if self._selected_timesteps is None: + return self._stored_data.indexes['time'] + return self._selected_timesteps @property - def active_data(self) -> xr.DataArray: - """Get a view of stored_data based on active_timesteps.""" - return self._active_data + def active_scenarios(self) -> Optional[pd.Index]: + """Get the current active scenarios, or None if no scenario dimension.""" + if not self.has_scenario_dim: + return None + if self._selected_scenarios is None: + return self._stored_data.indexes['scenario'] + return self._selected_scenarios @property def stored_data(self) -> xr.DataArray: """Get a copy of the full stored data.""" return self._stored_data.copy() - @stored_data.setter - def stored_data(self, value: NumericData): + def update_stored_data(self, value: xr.DataArray) -> None: """ - Update stored_data and refresh active_data. + Update stored_data and refresh selected_data. 
Args: value: New data to store """ - new_data = DataConverter.as_dataarray(value, timesteps=self.active_timesteps) + new_data = DataConverter.as_dataarray( + value, + timesteps=self.selected_timesteps if self.has_time_dim else None, + scenarios=self.active_scenarios if self.has_scenario_dim else None, + ) # Skip if data is unchanged to avoid overwriting backup if new_data.equals(self._stored_data): return self._stored_data = new_data - self.active_timesteps = None # Reset to full timeline + self.set_selection(None, None) # Reset selections to full dataset + + def set_selection(self, timesteps: Optional[pd.DatetimeIndex] = None, scenarios: Optional[pd.Index] = None) -> None: + """ + Set active subset for timesteps and scenarios. + + Args: + timesteps: Timesteps to activate, or None to clear. Ignored if series has no time dimension. + scenarios: Scenarios to activate, or None to clear. Ignored if series has no scenario dimension. + """ + # Only update timesteps if the series has time dimension + if self.has_time_dim: + if timesteps is None or timesteps.equals(self._stored_data.indexes['time']): + self._selected_timesteps = None + else: + self._selected_timesteps = timesteps + + # Only update scenarios if the series has scenario dimension + if self.has_scenario_dim: + if scenarios is None or scenarios.equals(self._stored_data.indexes['scenario']): + self._selected_scenarios = None + else: + self._selected_scenarios = scenarios @property def sel(self): - return self.active_data.sel + """Direct access to the selected_data's sel method for convenience.""" + return self.selected_data.sel @property def isel(self): - return self.active_data.isel + """Direct access to the selected_data's isel method for convenience.""" + return self.selected_data.isel + + @property + def _valid_selector(self) -> Dict[str, pd.Index]: + """Get the current selection as a dictionary.""" + selector = {} + + # Only include time in selector if series has time dimension + if self.has_time_dim and self._selected_timesteps is not None: + selector['time'] = self._selected_timesteps + + # Only include scenario in selector if series has scenario dimension + if self.has_scenario_dim and self._selected_scenarios is not None: + selector['scenario'] = self._selected_scenarios + + return selector def _apply_operation(self, other, op): """Apply an operation between this TimeSeries and another object.""" if isinstance(other, TimeSeries): - other = other.active_data - return op(self.active_data, other) + other = other.selected_data + return op(self.selected_data, other) def __add__(self, other): return self._apply_operation(other, lambda x, y: x + y) @@ -395,25 +826,25 @@ def __truediv__(self, other): return self._apply_operation(other, lambda x, y: x / y) def __radd__(self, other): - return other + self.active_data + return other + self.selected_data def __rsub__(self, other): - return other - self.active_data + return other - self.selected_data def __rmul__(self, other): - return other * self.active_data + return other * self.selected_data def __rtruediv__(self, other): - return other / self.active_data + return other / self.selected_data def __neg__(self) -> xr.DataArray: - return -self.active_data + return -self.selected_data def __pos__(self) -> xr.DataArray: - return +self.active_data + return +self.selected_data def __abs__(self) -> xr.DataArray: - return abs(self.active_data) + return abs(self.selected_data) def __gt__(self, other): """ @@ -426,8 +857,8 @@ def __gt__(self, other): True if all values in this TimeSeries are 
greater than other """ if isinstance(other, TimeSeries): - return self.active_data > other.active_data - return self.active_data > other + return self.selected_data > other.selected_data + return self.selected_data > other def __ge__(self, other): """ @@ -440,8 +871,8 @@ def __ge__(self, other): True if all values in this TimeSeries are greater than or equal to other """ if isinstance(other, TimeSeries): - return self.active_data >= other.active_data - return self.active_data >= other + return self.selected_data >= other.selected_data + return self.selected_data >= other def __lt__(self, other): """ @@ -454,8 +885,8 @@ def __lt__(self, other): True if all values in this TimeSeries are less than other """ if isinstance(other, TimeSeries): - return self.active_data < other.active_data - return self.active_data < other + return self.selected_data < other.selected_data + return self.selected_data < other def __le__(self, other): """ @@ -468,8 +899,8 @@ def __le__(self, other): True if all values in this TimeSeries are less than or equal to other """ if isinstance(other, TimeSeries): - return self.active_data <= other.active_data - return self.active_data <= other + return self.selected_data <= other.selected_data + return self.selected_data <= other def __eq__(self, other): """ @@ -482,8 +913,8 @@ def __eq__(self, other): True if all values in this TimeSeries are equal to other """ if isinstance(other, TimeSeries): - return self.active_data == other.active_data - return self.active_data == other + return self.selected_data == other.selected_data + return self.selected_data == other def __array_ufunc__(self, ufunc, method, *inputs, **kwargs): """ @@ -491,8 +922,8 @@ def __array_ufunc__(self, ufunc, method, *inputs, **kwargs): This allows NumPy functions to work with TimeSeries objects. """ - # Convert any TimeSeries inputs to their active_data - inputs = [x.active_data if isinstance(x, TimeSeries) else x for x in inputs] + # Convert any TimeSeries inputs to their selected_data + inputs = [x.selected_data if isinstance(x, TimeSeries) else x for x in inputs] return getattr(ufunc, method)(*inputs, **kwargs) def __repr__(self): @@ -506,10 +937,10 @@ def __repr__(self): 'name': self.name, 'aggregation_weight': self.aggregation_weight, 'aggregation_group': self.aggregation_group, - 'needs_extra_timestep': self.needs_extra_timestep, - 'shape': self.active_data.shape, - 'time_range': f'{self.active_timesteps[0]} to {self.active_timesteps[-1]}', + 'has_extra_timestep': self.has_extra_timestep, + 'shape': self.selected_data.shape, } + attr_str = ', '.join(f'{k}={repr(v)}' for k, v in attrs.items()) return f'TimeSeries({attr_str})' @@ -520,281 +951,333 @@ def __str__(self): Returns: Descriptive string with statistics """ - return f"TimeSeries '{self.name}': {self.stats}" + return f'TimeSeries "{self.name}":\n{textwrap.indent(self.stats, " ")}' class TimeSeriesCollection: """ - Collection of TimeSeries objects with shared timestep management. + Simplified central manager for time series data with reference tracking. - TimeSeriesCollection handles multiple TimeSeries objects with synchronized - timesteps, provides operations on collections, and manages extra timesteps. + Provides a way to store time series data and work with subsets of dimensions + that automatically update all references when changed. 
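+
+    Example:
+        A sketch of the intended workflow (illustrative values; assumes numpy as np
+        and pandas as pd are imported):
+
+        >>> collection = TimeSeriesCollection(
+        ...     pd.date_range('2024-01-01', periods=24, freq='h', name='time'),
+        ...     scenarios=pd.Index(['low', 'high'], name='scenario'),
+        ... )
+        >>> demand = collection.add_time_series('demand', pd.Series(1.0, index=collection.timesteps))
+        >>> collection.set_selection(timesteps=collection.timesteps[:6])
+        >>> demand.selected_data.sizes['time']
+        6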
""" def __init__( self, timesteps: pd.DatetimeIndex, + scenarios: Optional[pd.Index] = None, hours_of_last_timestep: Optional[float] = None, hours_of_previous_timesteps: Optional[Union[float, np.ndarray]] = None, ): - """ - Args: - timesteps: The timesteps of the Collection. - hours_of_last_timestep: The duration of the last time step. Uses the last time interval if not specified - hours_of_previous_timesteps: The duration of previous timesteps. - If None, the first time increment of time_series is used. - This is needed to calculate previous durations (for example consecutive_on_hours). - If you use an array, take care that its long enough to cover all previous values! - """ - # Prepare and validate timesteps - self._validate_timesteps(timesteps) - self.hours_of_previous_timesteps = self._calculate_hours_of_previous_timesteps( - timesteps, hours_of_previous_timesteps + """Initialize a TimeSeriesCollection.""" + self._full_timesteps = self._validate_timesteps(timesteps) + self._full_scenarios = self._validate_scenarios(scenarios) + + self._full_timesteps_extra = self._create_timesteps_with_extra( + self._full_timesteps, + self._calculate_hours_of_final_timestep( + self._full_timesteps, hours_of_final_timestep=hours_of_last_timestep + ), + ) + self._full_hours_per_timestep = self.calculate_hours_per_timestep( + self._full_timesteps_extra, self._full_scenarios ) - # Set up timesteps and hours - self.all_timesteps = timesteps - self.all_timesteps_extra = self._create_timesteps_with_extra(timesteps, hours_of_last_timestep) - self.all_hours_per_timestep = self.calculate_hours_per_timestep(self.all_timesteps_extra) + self.hours_of_previous_timesteps = self._calculate_hours_of_previous_timesteps( + timesteps, hours_of_previous_timesteps + ) # TODO: Make dynamic - # Active timestep tracking - self._active_timesteps = None - self._active_timesteps_extra = None - self._active_hours_per_timestep = None + # Series that need extra timestep + self._has_extra_timestep: set = set() - # Dictionary of time series by name - self.time_series_data: Dict[str, TimeSeries] = {} + # Storage for TimeSeries objects + self._time_series: Dict[str, TimeSeries] = {} - # Aggregation - self.group_weights: Dict[str, float] = {} - self.weights: Dict[str, float] = {} + # Active subset selectors + self._selected_timesteps: Optional[pd.DatetimeIndex] = None + self._selected_scenarios: Optional[pd.Index] = None + self._selected_timesteps_extra: Optional[pd.DatetimeIndex] = None + self._selected_hours_per_timestep: Optional[xr.DataArray] = None - @classmethod - def with_uniform_timesteps( - cls, start_time: pd.Timestamp, periods: int, freq: str, hours_per_step: Optional[float] = None - ) -> 'TimeSeriesCollection': - """Create a collection with uniform timesteps.""" - timesteps = pd.date_range(start_time, periods=periods, freq=freq, name='time') - return cls(timesteps, hours_of_previous_timesteps=hours_per_step) - - def create_time_series( - self, data: Union[NumericData, TimeSeriesData], name: str, needs_extra_timestep: bool = False + def add_time_series( + self, + name: str, + data: Union[NumericDataTS, TimeSeries], + has_time_dim: bool = True, + has_scenario_dim: bool = True, + aggregation_weight: Optional[float] = None, + aggregation_group: Optional[str] = None, + has_extra_timestep: bool = False, ) -> TimeSeries: """ - Creates a TimeSeries from the given data and adds it to the collection. + Add a new TimeSeries to the allocator. Args: - data: The data to create the TimeSeries from. - name: The name of the TimeSeries. 
-            needs_extra_timestep: Whether to create an additional timestep at the end of the timesteps.
-                The data to create the TimeSeries from.
+            name: Name of the time series
+            data: Data for the time series (can be raw data or an existing TimeSeries)
+            has_time_dim: Whether the TimeSeries has a time dimension
+            has_scenario_dim: Whether the TimeSeries has a scenario dimension
+            aggregation_weight: Weight used for aggregation
+            aggregation_group: Group name for shared aggregation weighting
+            has_extra_timestep: Whether this series needs an extra timestep

         Returns:
-            The created TimeSeries.
-
+            The created TimeSeries object
         """
-        # Check for duplicate name
-        if name in self.time_series_data:
-            raise ValueError(f"TimeSeries '{name}' already exists in this collection")
+        if name in self._time_series:
+            raise KeyError(f"TimeSeries '{name}' already exists in collection")
+        if not has_time_dim and has_extra_timestep:
+            raise ValueError('A TimeSeries without a time dimension cannot have an extra timestep')
+
+        # Choose which timesteps to use
+        if has_time_dim:
+            target_timesteps = self.timesteps_extra if has_extra_timestep else self.timesteps
+        else:
+            target_timesteps = None

-        # Determine which timesteps to use
-        timesteps_to_use = self.timesteps_extra if needs_extra_timestep else self.timesteps
+        target_scenarios = self.scenarios if has_scenario_dim else None

-        # Create the time series
-        if isinstance(data, TimeSeriesData):
-            time_series = TimeSeries.from_datasource(
+        # Create or adapt the TimeSeries object
+        if isinstance(data, TimeSeries):
+            # Use the existing TimeSeries but update its parameters
+            time_series = data
+            # Update the stored data to use our timesteps and scenarios
+            data_array = DataConverter.as_dataarray(
+                time_series.stored_data, timesteps=target_timesteps, scenarios=target_scenarios
+            )
+            time_series = TimeSeries(
+                data=data_array,
                 name=name,
-                data=data.data,
-                timesteps=timesteps_to_use,
-                aggregation_weight=data.agg_weight,
-                aggregation_group=data.agg_group,
-                needs_extra_timestep=needs_extra_timestep,
+                aggregation_weight=aggregation_weight or time_series.aggregation_weight,
+                aggregation_group=aggregation_group or time_series.aggregation_group,
+                has_extra_timestep=has_extra_timestep or time_series.has_extra_timestep,
             )
-            # Connect the user time series to the created TimeSeries
-            data.label = name
         else:
+            # Create a new TimeSeries from raw data
             time_series = TimeSeries.from_datasource(
-                name=name, data=data, timesteps=timesteps_to_use, needs_extra_timestep=needs_extra_timestep
+                data=data,
+                name=name,
+                timesteps=target_timesteps,
+                scenarios=target_scenarios,
+                aggregation_weight=aggregation_weight,
+                aggregation_group=aggregation_group,
+                has_extra_timestep=has_extra_timestep,
             )

-        # Add to the collection
-        self.add_time_series(time_series)
+        # Add to storage
+        self._time_series[name] = time_series

-        return time_series
+        # Track if it needs extra timestep
+        if has_extra_timestep:
+            self._has_extra_timestep.add(name)

-    def calculate_aggregation_weights(self) -> Dict[str, float]:
-        """Calculate and return aggregation weights for all time series."""
-        self.group_weights = self._calculate_group_weights()
-        self.weights = self._calculate_weights()
-
-        if np.all(np.isclose(list(self.weights.values()), 1, atol=1e-6)):
-            logger.info('All Aggregation weights were set to 1')
-
-        return self.weights
+        # Return the TimeSeries object
+        return time_series

-    def activate_timesteps(self, active_timesteps: Optional[pd.DatetimeIndex] = None):
+    def set_selection(self, timesteps: 
Optional[pd.DatetimeIndex] = None, scenarios: Optional[pd.Index] = None) -> None:
         """
-        Update active timesteps for the collection and all time series.
-        If no arguments are provided, the active timesteps are reset.
+        Set active subset for timesteps and scenarios.

         Args:
-            active_timesteps: The active timesteps of the model.
-                If None, the all timesteps of the TimeSeriesCollection are taken.
+            timesteps: Timesteps to activate, or None to clear
+            scenarios: Scenarios to activate, or None to clear
         """
-        if active_timesteps is None:
-            return self.reset()
+        if timesteps is None:
+            self._selected_timesteps = None
+            self._selected_timesteps_extra = None
+        else:
+            self._selected_timesteps = self._validate_timesteps(timesteps, self._full_timesteps)
+            self._selected_timesteps_extra = self._create_timesteps_with_extra(
+                timesteps, self._calculate_hours_of_final_timestep(timesteps, self._full_timesteps)
+            )

-        if not np.all(np.isin(active_timesteps, self.all_timesteps)):
-            raise ValueError('active_timesteps must be a subset of the timesteps of the TimeSeriesCollection')
+        if scenarios is None:
+            self._selected_scenarios = None
+        else:
+            self._selected_scenarios = self._validate_scenarios(scenarios, self._full_scenarios)

-        # Calculate derived timesteps
-        self._active_timesteps = active_timesteps
-        first_ts_index = np.where(self.all_timesteps == active_timesteps[0])[0][0]
-        last_ts_idx = np.where(self.all_timesteps == active_timesteps[-1])[0][0]
-        self._active_timesteps_extra = self.all_timesteps_extra[first_ts_index : last_ts_idx + 2]
-        self._active_hours_per_timestep = self.all_hours_per_timestep.isel(time=slice(first_ts_index, last_ts_idx + 1))
+        self._selected_hours_per_timestep = self.calculate_hours_per_timestep(self.timesteps_extra, self.scenarios)

-        # Update all time series
-        self._update_time_series_timesteps()
+        # Apply the selection to all TimeSeries objects
+        self._propagate_selection_to_time_series()

-    def reset(self):
-        """Reset active timesteps to defaults for all time series."""
-        self._active_timesteps = None
-        self._active_timesteps_extra = None
-        self._active_hours_per_timestep = None
-
-        for time_series in self.time_series_data.values():
-            time_series.reset()
+    def as_dataset(self, with_extra_timestep: bool = True, with_constants: bool = True) -> xr.Dataset:
+        """
+        Convert the TimeSeriesCollection to an xarray Dataset, containing the data of each TimeSeries.

-    def restore_data(self):
-        """Restore original data for all time series."""
-        for time_series in self.time_series_data.values():
-            time_series.restore_data()
+        Args:
+            with_extra_timestep: Whether to include the extra timestep in the dataset.
+                Dropping it removes the final (extra) timestep from the TimeSeries that carry one,
+                but avoids the NaNs it would introduce in all others.
+            with_constants: Whether to include TimeSeries with a constant value in the dataset.
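+
+        Example:
+            A sketch, reusing the collection from the class-level example above:
+
+            >>> ds = collection.as_dataset(with_extra_timestep=False)
+            >>> ds['demand'].sizes['time'] == len(collection.timesteps)
+            True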
+ """ + if self.scenarios is None: + ds = xr.Dataset(coords={'time': self.timesteps_extra}) + else: + ds = xr.Dataset(coords={'scenario': self.scenarios, 'time': self.timesteps_extra}) - def add_time_series(self, time_series: TimeSeries): - """Add an existing TimeSeries to the collection.""" - if time_series.name in self.time_series_data: - raise ValueError(f"TimeSeries '{time_series.name}' already exists in this collection") + for ts in self._time_series.values(): + if not with_constants and ts.all_equal: + continue + ds[ts.name] = ts.selected_data - self.time_series_data[time_series.name] = time_series + if not with_extra_timestep: + return ds.sel(time=self.timesteps) - def insert_new_data(self, data: pd.DataFrame, include_extra_timestep: bool = False): - """ - Update time series with new data from a DataFrame. + return ds - Args: - data: DataFrame containing new data with timestamps as index - include_extra_timestep: Whether the provided data already includes the extra timestep, by default False - """ - if not isinstance(data, pd.DataFrame): - raise TypeError(f'data must be a pandas DataFrame, got {type(data).__name__}') + @property + def timesteps(self) -> pd.DatetimeIndex: + """Get the current active timesteps.""" + if self._selected_timesteps is None: + return self._full_timesteps + return self._selected_timesteps - # Check if the DataFrame index matches the expected timesteps - expected_timesteps = self.timesteps_extra if include_extra_timestep else self.timesteps - if not data.index.equals(expected_timesteps): - raise ValueError( - f'DataFrame index must match {"collection timesteps with extra timestep" if include_extra_timestep else "collection timesteps"}' - ) + @property + def timesteps_extra(self) -> pd.DatetimeIndex: + """Get the current active timesteps with extra timestep.""" + if self._selected_timesteps_extra is None: + return self._full_timesteps_extra + return self._selected_timesteps_extra - for name, ts in self.time_series_data.items(): - if name in data.columns: - if not ts.needs_extra_timestep: - # For time series without extra timestep - if include_extra_timestep: - # If data includes extra timestep but series doesn't need it, exclude the last point - ts.stored_data = data[name].iloc[:-1] - else: - # Use data as is - ts.stored_data = data[name] - else: - # For time series with extra timestep - if include_extra_timestep: - # Data already includes extra timestep - ts.stored_data = data[name] - else: - # Need to add extra timestep - extrapolate from the last value - extra_step_value = data[name].iloc[-1] - extra_step_index = pd.DatetimeIndex([self.timesteps_extra[-1]], name='time') - extra_step_series = pd.Series([extra_step_value], index=extra_step_index) + @property + def hours_per_timestep(self) -> xr.DataArray: + """Get the current active hours per timestep.""" + if self._selected_hours_per_timestep is None: + return self._full_hours_per_timestep + return self._selected_hours_per_timestep - # Combine the regular data with the extra timestep - ts.stored_data = pd.concat([data[name], extra_step_series]) + @property + def scenarios(self) -> Optional[pd.Index]: + """Get the current active scenarios.""" + if self._selected_scenarios is None: + return self._full_scenarios + return self._selected_scenarios + + def _propagate_selection_to_time_series(self) -> None: + """Apply the current selection to all TimeSeries objects.""" + for ts_name, ts in self._time_series.items(): + if ts.has_time_dim: + timesteps = self.timesteps_extra if ts_name in self._has_extra_timestep else 
self.timesteps
+            else:
+                timesteps = None

-                logger.debug(f'Updated data for {name}')
+            ts.set_selection(timesteps=timesteps, scenarios=self.scenarios if ts.has_scenario_dim else None)

-    def to_dataframe(
-        self, filtered: Literal['all', 'constant', 'non_constant'] = 'non_constant', include_extra_timestep: bool = True
-    ) -> pd.DataFrame:
+    def __getitem__(self, name: str) -> TimeSeries:
         """
-        Convert collection to DataFrame with optional filtering and timestep control.
+        Get a TimeSeries by name.

         Args:
-            filtered: Filter time series by variability, by default 'non_constant'
-            include_extra_timestep: Whether to include the extra timestep in the result, by default True
+            name: Name of the TimeSeries

         Returns:
-            DataFrame representation of the collection
+            The TimeSeries registered under the given name
+
+        Raises:
+            ValueError: If no TimeSeries with the given name exists
         """
-        include_constants = filtered != 'non_constant'
-        ds = self.to_dataset(include_constants=include_constants)
+        if name in self._time_series:
+            # The TimeSeries itself applies the current selection
+            return self._time_series[name]
+        raise ValueError(f'No TimeSeries named "{name}" found')
+
+    def __contains__(self, value) -> bool:
+        if isinstance(value, str):
+            return value in self._time_series
+        elif isinstance(value, TimeSeries):
+            return value.name in self._time_series
+        raise TypeError(f'Invalid type for __contains__ of {self.__class__.__name__}: {type(value)}')

-        if not include_extra_timestep:
-            ds = ds.isel(time=slice(None, -1))
-
-        df = ds.to_dataframe()
-
-        # Apply filtering
-        if filtered == 'all':
-            return df
-        elif filtered == 'constant':
-            return df.loc[:, df.nunique() == 1]
-        elif filtered == 'non_constant':
-            return df.loc[:, df.nunique() > 1]
-        else:
-            raise ValueError("filtered must be one of: 'all', 'constant', 'non_constant'")
+    def __iter__(self) -> Iterator[TimeSeries]:
+        """Iterate over TimeSeries objects."""
+        return iter(self._time_series.values())

-    def to_dataset(self, include_constants: bool = True) -> xr.Dataset:
+    def update_time_series(self, name: str, data: TimestepData) -> TimeSeries:
         """
-        Combine all time series into a single Dataset with all timesteps.
+        Update an existing TimeSeries with new data.

         Args:
-            include_constants: Whether to include time series with constant values, by default True
+            name: Name of the TimeSeries to update
+            data: New data to assign

         Returns:
-            Dataset containing all selected time series with all timesteps
+            The updated TimeSeries
+
+        Raises:
+            KeyError: If no TimeSeries with the given name exists
         """
-        # Determine which series to include
-        if include_constants:
-            series_to_include = self.time_series_data.values()
-        else:
-            series_to_include = self.non_constants
+        if name not in self._time_series:
+            raise KeyError(f"No TimeSeries named '{name}' found")

-        # Create individual datasets and merge them
-        ds = xr.merge([ts.active_data.to_dataset(name=ts.name) for ts in series_to_include])
+        # Get the TimeSeries
+        ts = self._time_series[name]

-        # Ensure the correct time coordinates
-        ds = ds.reindex(time=self.timesteps_extra)
+        # Determine which timesteps to use if the series has a time dimension
+        if ts.has_time_dim:
+            target_timesteps = self.timesteps_extra if name in self._has_extra_timestep else self.timesteps
+        else:
+            target_timesteps = None

-        ds.attrs.update(
-            {
-                'timesteps_extra': f'{self.timesteps_extra[0]} ... 
{self.timesteps_extra[-1]} | len={len(self.timesteps_extra)}', - 'hours_per_timestep': self._format_stats(self.hours_per_timestep), - } + # Convert data to proper format + data_array = DataConverter.as_dataarray( + data, timesteps=target_timesteps, scenarios=self.scenarios if ts.has_scenario_dim else None ) - return ds + # Update the TimeSeries + ts.update_stored_data(data_array) + + return ts + + def calculate_aggregation_weights(self) -> Dict[str, float]: + """Calculate and return aggregation weights for all time series.""" + group_weights = self._calculate_group_weights() - def _update_time_series_timesteps(self): - """Update active timesteps for all time series.""" - for ts in self.time_series_data.values(): - if ts.needs_extra_timestep: - ts.active_timesteps = self.timesteps_extra + weights = {} + for name, ts in self._time_series.items(): + if ts.aggregation_group is not None: + # Use group weight + weights[name] = group_weights.get(ts.aggregation_group, 1) else: - ts.active_timesteps = self.timesteps + # Use individual weight or default to 1 + weights[name] = ts.aggregation_weight or 1 + + if np.all(np.isclose(list(weights.values()), 1, atol=1e-6)): + logger.info('All Aggregation weights were set to 1') + + return weights + + def _calculate_group_weights(self) -> Dict[str, float]: + """Calculate weights for aggregation groups.""" + # Count series in each group + groups = [ts.aggregation_group for ts in self._time_series.values() if ts.aggregation_group is not None] + group_counts = Counter(groups) + + # Calculate weight for each group (1/count) + return {group: 1 / count for group, count in group_counts.items()} @staticmethod - def _validate_timesteps(timesteps: pd.DatetimeIndex): - """Validate timesteps format and rename if needed.""" + def _validate_timesteps( + timesteps: pd.DatetimeIndex, present_timesteps: Optional[pd.DatetimeIndex] = None + ) -> pd.DatetimeIndex: + """ + Validate timesteps format and rename if needed. 
+
+        Args:
+            timesteps: The timesteps to validate
+            present_timesteps: The timesteps that are present in the dataset
+
+        Raises:
+            TypeError: If timesteps is not a pandas DatetimeIndex
+            ValueError: If timesteps contains fewer than 2 timestamps
+            ValueError: If timesteps is not sorted
+            ValueError: If timesteps contains duplicates
+            ValueError: If timesteps is not a subset of present_timesteps
+        """
         if not isinstance(timesteps, pd.DatetimeIndex):
             raise TypeError('timesteps must be a pandas DatetimeIndex')
@@ -803,22 +1286,61 @@ def _validate_timesteps(timesteps: pd.DatetimeIndex):

         # Ensure timesteps has the required name
         if timesteps.name != 'time':
-            logger.warning('Renamed timesteps to "time" (was "%s")', timesteps.name)
+            logger.debug('Renamed timesteps to "time" (was "%s")', timesteps.name)
             timesteps.name = 'time'

+        # Ensure timesteps is sorted
+        if not timesteps.is_monotonic_increasing:
+            raise ValueError('timesteps must be sorted')
+
+        # Ensure timesteps has no duplicates
+        if len(timesteps) != len(timesteps.drop_duplicates()):
+            raise ValueError('timesteps must not contain duplicates')
+
+        # Ensure timesteps is a subset of present_timesteps
+        if present_timesteps is not None and not set(timesteps).issubset(set(present_timesteps)):
+            raise ValueError('timesteps must be a subset of present_timesteps')
+
+        return timesteps
+
     @staticmethod
-    def _create_timesteps_with_extra(
-        timesteps: pd.DatetimeIndex, hours_of_last_timestep: Optional[float]
-    ) -> pd.DatetimeIndex:
-        """Create timesteps with an extra step at the end."""
-        if hours_of_last_timestep is not None:
-            # Create the extra timestep using the specified duration
-            last_date = pd.DatetimeIndex([timesteps[-1] + pd.Timedelta(hours=hours_of_last_timestep)], name='time')
-        else:
-            # Use the last interval as the extra timestep duration
-            last_date = pd.DatetimeIndex([timesteps[-1] + (timesteps[-1] - timesteps[-2])], name='time')
+    def _validate_scenarios(scenarios: pd.Index, present_scenarios: Optional[pd.Index] = None) -> Optional[pd.Index]:
+        """
+        Validate scenario format and rename if needed.
+
+        Args:
+            scenarios: The scenarios to validate
+            present_scenarios: The scenarios that are present in the dataset
+
+        Raises:
+            ValueError: If scenarios is not a subset of present_scenarios
+        """
+        if scenarios is None:
+            return None
+
+        if not isinstance(scenarios, pd.Index):
+            logger.warning('Converting scenarios to pandas.Index')
+            scenarios = pd.Index(scenarios, name='scenario')
+
+        # Ensure scenarios has the required name
+        if scenarios.name != 'scenario':
+            logger.debug('Renamed scenarios to "scenario" (was "%s")', scenarios.name)
+            scenarios.name = 'scenario'

-        # Combine with original timesteps
+        # Ensure scenarios is a subset of present_scenarios
+        if present_scenarios is not None and not set(scenarios).issubset(set(present_scenarios)):
+            raise ValueError('scenarios must be a subset of present_scenarios')
+
+        return scenarios
+
+    @staticmethod
+    def _create_timesteps_with_extra(timesteps: pd.DatetimeIndex, hours_of_last_timestep: float) -> pd.DatetimeIndex:
+        """Create timesteps with an extra step at the end."""
+        last_date = pd.DatetimeIndex([timesteps[-1] + pd.Timedelta(hours=hours_of_last_timestep)], name='time')
         return pd.DatetimeIndex(timesteps.append(last_date), name='time')

     @staticmethod
@@ -834,137 +1356,130 @@ def _calculate_hours_of_previous_timesteps(
         return first_interval.total_seconds() / 3600  # Convert to hours

     @staticmethod
-    def calculate_hours_per_timestep(timesteps_extra: pd.DatetimeIndex) -> xr.DataArray:
-        """Calculate duration of each timestep."""
-        # Calculate differences between consecutive timestamps
-        hours_per_step = np.diff(timesteps_extra) / pd.Timedelta(hours=1)
+    def _calculate_hours_of_final_timestep(
+        timesteps: pd.DatetimeIndex,
+        timesteps_superset: Optional[pd.DatetimeIndex] = None,
+        hours_of_final_timestep: Optional[float] = None,
+    ) -> float:
+        """
+        Calculate duration of the final timestep.
+        If timesteps_superset is provided, the duration is derived from the superset.
+        The hours_of_final_timestep is only used if the duration can't be determined from the timesteps themselves.

-        return xr.DataArray(
-            data=hours_per_step, coords={'time': timesteps_extra[:-1]}, dims=('time',), name='hours_per_step'
-        )
+        Args:
+            timesteps: The full timesteps
+            timesteps_superset: The superset of timesteps that `timesteps` was taken from
+            hours_of_final_timestep: The duration of the final timestep, if already known

-    def _calculate_group_weights(self) -> Dict[str, float]:
-        """Calculate weights for aggregation groups."""
-        # Count series in each group
-        groups = [ts.aggregation_group for ts in self.time_series_data.values() if ts.aggregation_group is not None]
-        group_counts = Counter(groups)
+        Returns:
+            The duration of the final timestep in hours

-        # Calculate weight for each group (1/count)
-        return {group: 1 / count for group, count in group_counts.items()}
+        Raises:
+            ValueError: If the provided timesteps end after the timesteps_superset
+        """
+        if timesteps_superset is None:
+            if hours_of_final_timestep is not None:
+                return hours_of_final_timestep
+            return (timesteps[-1] - timesteps[-2]) / pd.Timedelta(hours=1)

-    def _calculate_weights(self) -> Dict[str, float]:
-        """Calculate weights for all time series."""
-        # Calculate weight for each time series
-        weights = {}
-        for name, ts in self.time_series_data.items():
-            if ts.aggregation_group is not None:
-                # Use group weight
-                weights[name] = self.group_weights.get(ts.aggregation_group, 1)
-            else:
-                # Use individual weight or default to 1
-                weights[name] = ts.aggregation_weight or 1
+        final_timestep = timesteps[-1]

-        return weights
+        if timesteps_superset[-1] == final_timestep:
+            if hours_of_final_timestep is not None:
+                return hours_of_final_timestep
+            return (timesteps_superset[-1] - timesteps_superset[-2]) / pd.Timedelta(hours=1)

-    def _format_stats(self, data) -> str:
-        """Format statistics for a data array."""
-        if hasattr(data, 'values'):
-            values = data.values
+        elif timesteps_superset[-1] <= final_timestep:
+            raise ValueError(
+                f'The provided timesteps ({timesteps}) end after the provided timesteps_superset ({timesteps_superset})'
+            )
         else:
-            values = np.asarray(data)
-
-        mean_val = np.mean(values)
-        min_val = np.min(values)
-        max_val = np.max(values)
-
-        return f'mean: {mean_val:.2f}, min: {min_val:.2f}, max: {max_val:.2f}'
-
-    def __getitem__(self, name: str) -> TimeSeries:
-        """Get a TimeSeries by name."""
-        try:
-            return self.time_series_data[name]
-        except KeyError as e:
-            raise KeyError(f'TimeSeries "{name}" not found in the TimeSeriesCollection') from e
-
-    def __iter__(self) -> Iterator[TimeSeries]:
-        """Iterate through all TimeSeries in the collection."""
-        return iter(self.time_series_data.values())
-
-    def __len__(self) -> int:
-        """Get the number of TimeSeries in the collection."""
-        return len(self.time_series_data)
-
-    def __contains__(self, item: Union[str, TimeSeries]) -> bool:
-        """Check if a TimeSeries exists in the collection."""
-        if isinstance(item, str):
-            return item in self.time_series_data
-        elif isinstance(item, TimeSeries):
-            return any([item is ts for ts in self.time_series_data.values()])
-        return False
-
-    @property
-    def non_constants(self) -> List[TimeSeries]:
-        """Get time series with varying values."""
-        return [ts for ts in self.time_series_data.values() if not ts.all_equal]
-
-    @property
-    def constants(self) -> List[TimeSeries]:
-        """Get time series with constant values."""
-        return [ts for ts in self.time_series_data.values() if ts.all_equal]
-
-    @property
-    def timesteps(self) -> pd.DatetimeIndex:
-        """Get the active timesteps."""
-        return self.all_timesteps if self._active_timesteps is None else 
self._active_timesteps - - @property - def timesteps_extra(self) -> pd.DatetimeIndex: - """Get the active timesteps with extra step.""" - return self.all_timesteps_extra if self._active_timesteps_extra is None else self._active_timesteps_extra + # Get the first timestep in the superset that is after the final timestep of the subset + extra_timestep = timesteps_superset[timesteps_superset > final_timestep].min() + return (extra_timestep - final_timestep) / pd.Timedelta(hours=1) - @property - def hours_per_timestep(self) -> xr.DataArray: - """Get the duration of each active timestep.""" - return ( - self.all_hours_per_timestep if self._active_hours_per_timestep is None else self._active_hours_per_timestep - ) - - @property - def hours_of_last_timestep(self) -> float: - """Get the duration of the last timestep.""" - return float(self.hours_per_timestep[-1].item()) - - def __repr__(self): - return f'TimeSeriesCollection:\n{self.to_dataset()}' + @staticmethod + def calculate_hours_per_timestep( + timesteps_extra: pd.DatetimeIndex, scenarios: Optional[pd.Index] = None + ) -> xr.DataArray: + """Calculate duration of each timestep.""" + # Calculate differences between consecutive timestamps + hours_per_step = np.diff(timesteps_extra) / pd.Timedelta(hours=1) - def __str__(self): - longest_name = max([time_series.name for time_series in self.time_series_data], key=len) + return DataConverter.as_dataarray( + hours_per_step, + timesteps=timesteps_extra[:-1], + scenarios=scenarios, + ).rename('hours_per_step') - stats_summary = '\n'.join( - [ - f' - {time_series.name:<{len(longest_name)}}: {get_numeric_stats(time_series.active_data)}' - for time_series in self.time_series_data - ] - ) - return ( - f'TimeSeriesCollection with {len(self.time_series_data)} series\n' - f' Time Range: {self.timesteps[0]} → {self.timesteps[-1]}\n' - f' No. of timesteps: {len(self.timesteps)} + 1 extra\n' - f' Hours per timestep: {get_numeric_stats(self.hours_per_timestep)}\n' - f' Time Series Data:\n' - f'{stats_summary}' - ) +def get_numeric_stats(data: xr.DataArray, decimals: int = 2, padd: int = 10, by_scenario: bool = False) -> str: + """ + Calculates the mean, median, min, max, and standard deviation of a numeric DataArray. 
+ Args: + data: The DataArray to analyze + decimals: Number of decimal places to show + padd: Padding for alignment + by_scenario: Whether to break down stats by scenario -def get_numeric_stats(data: xr.DataArray, decimals: int = 2, padd: int = 10) -> str: - """Calculates the mean, median, min, max, and standard deviation of a numeric DataArray.""" + Returns: + String representation of data statistics + """ format_spec = f'>{padd}.{decimals}f' if padd else f'.{decimals}f' + + # If by_scenario is True and there's a scenario dimension with multiple values + if by_scenario and 'scenario' in data.dims and data.sizes['scenario'] > 1: + results = [] + for scenario in data.coords['scenario'].values: + scenario_data = data.sel(scenario=scenario) + if np.unique(scenario_data).size == 1: + results.append(f' {scenario}: {scenario_data.max().item():{format_spec}} (constant)') + else: + mean = scenario_data.mean().item() + median = scenario_data.median().item() + min_val = scenario_data.min().item() + max_val = scenario_data.max().item() + std = scenario_data.std().item() + results.append( + f' {scenario}: {mean:{format_spec}} (mean), {median:{format_spec}} (median), ' + f'{min_val:{format_spec}} (min), {max_val:{format_spec}} (max), {std:{format_spec}} (std)' + ) + return '\n'.join(['By scenario:'] + results) + + # Standard logic for non-scenario data or aggregated stats if np.unique(data).size == 1: return f'{data.max().item():{format_spec}} (constant)' + mean = data.mean().item() median = data.median().item() min_val = data.min().item() max_val = data.max().item() std = data.std().item() + return f'{mean:{format_spec}} (mean), {median:{format_spec}} (median), {min_val:{format_spec}} (min), {max_val:{format_spec}} (max), {std:{format_spec}} (std)' + + +def extract_data( + data: Optional[Union[int, float, xr.DataArray, TimeSeries]], + if_none: Any = None +) -> Any: + """ + Convert data to xr.DataArray. 
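+    Scalars and DataArrays are passed through unchanged; a TimeSeries yields its
+    currently selected data.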
+ + Args: + data: The data to convert (scalar, array, or DataArray) + if_none: The value to return if data is None + + Returns: + DataArray with the converted data, or the value specified by if_none + """ + if data is None: + return if_none + if isinstance(data, TimeSeries): + return data.selected_data + if isinstance(data, xr.DataArray): + return data + if isinstance(data, (int, float, np.integer, np.floating)): + return data + raise TypeError(f'Unsupported data type: {type(data).__name__}') diff --git a/flixopt/effects.py b/flixopt/effects.py index 82aa63a43..8affc933b 100644 --- a/flixopt/effects.py +++ b/flixopt/effects.py @@ -7,13 +7,13 @@ import logging import warnings -from typing import TYPE_CHECKING, Dict, Iterator, List, Literal, Optional, Union +from typing import TYPE_CHECKING, Dict, Iterator, List, Literal, Optional, Set, Tuple, Union import linopy import numpy as np -import pandas as pd +import xarray as xr -from .core import NumericData, NumericDataTS, Scalar, TimeSeries, TimeSeriesCollection +from .core import NumericDataTS, ScenarioData, TimeSeries, TimeSeriesCollection, TimestepData, extract_data from .features import ShareAllocationModel from .structure import Element, ElementModel, Interface, Model, SystemModel, register_class_for_io @@ -38,16 +38,16 @@ def __init__( meta_data: Optional[Dict] = None, is_standard: bool = False, is_objective: bool = False, - specific_share_to_other_effects_operation: Optional['EffectValuesUser'] = None, - specific_share_to_other_effects_invest: Optional['EffectValuesUser'] = None, - minimum_operation: Optional[Scalar] = None, - maximum_operation: Optional[Scalar] = None, - minimum_invest: Optional[Scalar] = None, - maximum_invest: Optional[Scalar] = None, + specific_share_to_other_effects_operation: Optional['EffectValuesUserTimestep'] = None, + specific_share_to_other_effects_invest: Optional['EffectValuesUserScenario'] = None, + minimum_operation: Optional[ScenarioData] = None, + maximum_operation: Optional[ScenarioData] = None, + minimum_invest: Optional[ScenarioData] = None, + maximum_invest: Optional[ScenarioData] = None, minimum_operation_per_hour: Optional[NumericDataTS] = None, maximum_operation_per_hour: Optional[NumericDataTS] = None, - minimum_total: Optional[Scalar] = None, - maximum_total: Optional[Scalar] = None, + minimum_total: Optional[ScenarioData] = None, + maximum_total: Optional[ScenarioData] = None, ): """ Args: @@ -76,10 +76,12 @@ def __init__( self.description = description self.is_standard = is_standard self.is_objective = is_objective - self.specific_share_to_other_effects_operation: EffectValuesUser = ( + self.specific_share_to_other_effects_operation: EffectValuesUserTimestep = ( specific_share_to_other_effects_operation or {} ) - self.specific_share_to_other_effects_invest: EffectValuesUser = specific_share_to_other_effects_invest or {} + self.specific_share_to_other_effects_invest: EffectValuesUserTimestep = ( + specific_share_to_other_effects_invest or {} + ) self.minimum_operation = minimum_operation self.maximum_operation = maximum_operation self.minimum_operation_per_hour: NumericDataTS = minimum_operation_per_hour @@ -96,11 +98,33 @@ def transform_data(self, flow_system: 'FlowSystem'): self.maximum_operation_per_hour = flow_system.create_time_series( f'{self.label_full}|maximum_operation_per_hour', self.maximum_operation_per_hour, flow_system ) - self.specific_share_to_other_effects_operation = flow_system.create_effect_time_series( f'{self.label_full}|operation->', 
self.specific_share_to_other_effects_operation, 'operation' ) + self.minimum_operation = flow_system.create_time_series( + f'{self.label_full}|minimum_operation', self.minimum_operation, has_time_dim=False + ) + self.maximum_operation = flow_system.create_time_series( + f'{self.label_full}|maximum_operation', self.maximum_operation, has_time_dim=False + ) + self.minimum_invest = flow_system.create_time_series( + f'{self.label_full}|minimum_invest', self.minimum_invest, has_time_dim=False + ) + self.maximum_invest = flow_system.create_time_series( + f'{self.label_full}|maximum_invest', self.maximum_invest, has_time_dim=False + ) + self.minimum_total = flow_system.create_time_series( + f'{self.label_full}|minimum_total', self.minimum_total, has_time_dim=False, + ) + self.maximum_total = flow_system.create_time_series( + f'{self.label_full}|maximum_total', self.maximum_total, has_time_dim=False + ) + self.specific_share_to_other_effects_invest = flow_system.create_effect_time_series( + f'{self.label_full}|invest->', self.specific_share_to_other_effects_invest, 'invest', + has_time_dim=False + ) + def create_model(self, model: SystemModel) -> 'EffectModel': self._plausibility_checks() self.model = EffectModel(model, self) @@ -118,31 +142,29 @@ def __init__(self, model: SystemModel, element: Effect): self.total: Optional[linopy.Variable] = None self.invest: ShareAllocationModel = self.add( ShareAllocationModel( - self._model, - False, - self.label_of_element, - 'invest', + model=self._model, + has_time_dim=False, + has_scenario_dim=True, + label_of_element=self.label_of_element, + label='invest', label_full=f'{self.label_full}(invest)', - total_max=self.element.maximum_invest, - total_min=self.element.minimum_invest, + total_max=extract_data(self.element.maximum_invest), + total_min=extract_data(self.element.minimum_invest), ) ) self.operation: ShareAllocationModel = self.add( ShareAllocationModel( - self._model, - True, - self.label_of_element, - 'operation', + model=self._model, + has_time_dim=True, + has_scenario_dim=True, + label_of_element=self.label_of_element, + label='operation', label_full=f'{self.label_full}(operation)', - total_max=self.element.maximum_operation, - total_min=self.element.minimum_operation, - min_per_hour=self.element.minimum_operation_per_hour.active_data - if self.element.minimum_operation_per_hour is not None - else None, - max_per_hour=self.element.maximum_operation_per_hour.active_data - if self.element.maximum_operation_per_hour is not None - else None, + total_max=extract_data(self.element.maximum_operation), + total_min=extract_data(self.element.minimum_operation), + min_per_hour=extract_data(self.element.minimum_operation_per_hour), + max_per_hour=extract_data(self.element.maximum_operation_per_hour), ) ) @@ -152,9 +174,9 @@ def do_modeling(self): self.total = self.add( self._model.add_variables( - lower=self.element.minimum_total if self.element.minimum_total is not None else -np.inf, - upper=self.element.maximum_total if self.element.maximum_total is not None else np.inf, - coords=None, + lower=extract_data(self.element.minimum_total, if_none=-np.inf), + upper=extract_data(self.element.maximum_total, if_none=np.inf), + coords=self._model.get_coords(time_dim=False), name=f'{self.label_full}|total', ), 'total', @@ -162,7 +184,7 @@ def do_modeling(self): self.add( self._model.add_constraints( - self.total == self.operation.total.sum() + self.invest.total.sum(), name=f'{self.label_full}|total' + self.total == self.operation.total + self.invest.total, 
name=f'{self.label_full}|total' ), 'total', ) @@ -171,11 +193,12 @@ def do_modeling(self): EffectValuesExpr = Dict[str, linopy.LinearExpression] # Used to create Shares EffectTimeSeries = Dict[str, TimeSeries] # Used internally to index values EffectValuesDict = Dict[str, NumericDataTS] # How effect values are stored -EffectValuesUser = Union[NumericDataTS, Dict[str, NumericDataTS]] # User-specified Shares to Effects -""" This datatype is used to define the share to an effect by a certain attribute. """ -EffectValuesUserScalar = Union[Scalar, Dict[str, Scalar]] # User-specified Shares to Effects -""" This datatype is used to define the share to an effect by a certain attribute. Only scalars are allowed. """ +EffectValuesUserScenario = Union[ScenarioData, Dict[str, ScenarioData]] +""" This datatype is used to define the share to an effect for every scenario. """ + +EffectValuesUserTimestep = Union[TimestepData, Dict[str, TimestepData]] +""" This datatype is used to define the share to an effect for every timestep. """ class EffectCollection: @@ -207,7 +230,9 @@ def add_effects(self, *effects: Effect) -> None: self._effects[effect.label] = effect logger.info(f'Registered new Effect: {effect.label}') - def create_effect_values_dict(self, effect_values_user: EffectValuesUser) -> Optional[EffectValuesDict]: + def create_effect_values_dict( + self, effect_values_user: Union[EffectValuesUserScenario, EffectValuesUserTimestep] + ) -> Optional[EffectValuesDict]: """ Converts effect values into a dictionary. If a scalar is provided, it is associated with a default effect type. @@ -244,26 +269,18 @@ def get_effect_label(eff: Union[Effect, str]) -> str: def _plausibility_checks(self) -> None: # Check circular loops in effects: - # TODO: Improve checks!! Only most basic case covered... 
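+        # Validate the share definitions as a whole: build the (origin -> target)
+        # share graph and reject any directed cycle, which also catches indirect
+        # loops such as A -> B -> C -> A, not just mutual shares between two effects.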
+ operation, invest = self.calculate_effect_share_factors() - def error_str(effect_label: str, share_ffect_label: str): - return ( - f' {effect_label} -> has share in: {share_ffect_label}\n' - f' {share_ffect_label} -> has share in: {effect_label}' - ) + operation_cycles = detect_cycles(tuples_to_adjacency_list([key for key in operation])) + invest_cycles = detect_cycles(tuples_to_adjacency_list([key for key in invest])) - for effect in self.effects.values(): - # Effekt darf nicht selber als Share in seinen ShareEffekten auftauchen: - # operation: - for target_effect in effect.specific_share_to_other_effects_operation.keys(): - assert effect not in self[target_effect].specific_share_to_other_effects_operation.keys(), ( - f'Error: circular operation-shares \n{error_str(target_effect.label, target_effect.label)}' - ) - # invest: - for target_effect in effect.specific_share_to_other_effects_invest.keys(): - assert effect not in self[target_effect].specific_share_to_other_effects_invest.keys(), ( - f'Error: circular invest-shares \n{error_str(target_effect.label, target_effect.label)}' - ) + if operation_cycles: + cycle_str = "\n".join([" -> ".join(cycle) for cycle in operation_cycles]) + raise ValueError(f'Error: circular operation-shares detected:\n{cycle_str}') + + if invest_cycles: + cycle_str = "\n".join([" -> ".join(cycle) for cycle in invest_cycles]) + raise ValueError(f'Error: circular invest-shares detected:\n{cycle_str}') def __getitem__(self, effect: Union[str, Effect]) -> 'Effect': """ @@ -296,7 +313,10 @@ def __contains__(self, item: Union[str, 'Effect']) -> bool: if isinstance(item, str): return item in self.effects # Check if the label exists elif isinstance(item, Effect): - return item in self.effects.values() # Check if the object exists + if item.label_full in self.effects: + return True + if item in self.effects.values(): # Check if the object exists + return True return False @property @@ -327,6 +347,30 @@ def objective_effect(self, value: Effect) -> None: raise ValueError(f'An objective-effect already exists! 
({self._objective_effect.label=})') self._objective_effect = value + def calculate_effect_share_factors(self) -> Tuple[ + Dict[Tuple[str, str], xr.DataArray], + Dict[Tuple[str, str], xr.DataArray], + ]: + shares_invest = {} + for name, effect in self.effects.items(): + if effect.specific_share_to_other_effects_invest: + shares_invest[name] = { + target: extract_data(data) + for target, data in effect.specific_share_to_other_effects_invest.items() + } + shares_invest = calculate_all_conversion_paths(shares_invest) + + shares_operation = {} + for name, effect in self.effects.items(): + if effect.specific_share_to_other_effects_operation: + shares_operation[name] = { + target: extract_data(data) + for target, data in effect.specific_share_to_other_effects_operation.items() + } + shares_operation = calculate_all_conversion_paths(shares_operation) + + return shares_operation, shares_invest + class EffectCollectionModel(Model): """ @@ -346,29 +390,42 @@ def add_share_to_effects( ) -> None: for effect, expression in expressions.items(): if target == 'operation': - self.effects[effect].model.operation.add_share(name, expression) + self.effects[effect].model.operation.add_share( + name, + expression, + has_time_dim=True, + has_scenario_dim=True, + ) elif target == 'invest': - self.effects[effect].model.invest.add_share(name, expression) + self.effects[effect].model.invest.add_share( + name, + expression, + has_time_dim=False, + has_scenario_dim=True, + ) else: raise ValueError(f'Target {target} not supported!') def add_share_to_penalty(self, name: str, expression: linopy.LinearExpression) -> None: if expression.ndim != 0: raise TypeError(f'Penalty shares must be scalar expressions! ({expression.ndim=})') - self.penalty.add_share(name, expression) + self.penalty.add_share(name, expression, has_time_dim=False, has_scenario_dim=False) def do_modeling(self): for effect in self.effects: effect.create_model(self._model) self.penalty = self.add( - ShareAllocationModel(self._model, shares_are_time_series=False, label_of_element='Penalty') + ShareAllocationModel(self._model, has_time_dim=False, has_scenario_dim=False, label_of_element='Penalty') ) for model in [effect.model for effect in self.effects] + [self.penalty]: model.do_modeling() self._add_share_between_effects() - self._model.add_objective(self.effects.objective_effect.model.total + self.penalty.total) + self._model.add_objective( + (self.effects.objective_effect.model.total * self._model.scenario_weights).sum() + + self.penalty.total.sum() + ) def _add_share_between_effects(self): for origin_effect in self.effects: @@ -376,11 +433,161 @@ def _add_share_between_effects(self): for target_effect, time_series in origin_effect.specific_share_to_other_effects_operation.items(): self.effects[target_effect].model.operation.add_share( origin_effect.model.operation.label_full, - origin_effect.model.operation.total_per_timestep * time_series.active_data, + origin_effect.model.operation.total_per_timestep * time_series.selected_data, + has_time_dim=True, + has_scenario_dim=True, ) # 2. 
invest: -> here it is a scalar (share)
            for target_effect, factor in origin_effect.specific_share_to_other_effects_invest.items():
                self.effects[target_effect].model.invest.add_share(
                    origin_effect.model.invest.label_full,
                    origin_effect.model.invest.total * factor,
+                    has_time_dim=False,
+                    has_scenario_dim=True,
                )
+
+
+def calculate_all_conversion_paths(
+    conversion_dict: Dict[str, Dict[str, xr.DataArray]],
+) -> Dict[Tuple[str, str], xr.DataArray]:
+    """
+    Calculates all possible direct and indirect conversion factors between units/domains.
+
+    This function uses Breadth-First Search (BFS) to find all possible conversion paths
+    between different units or domains in a conversion graph. It computes both direct
+    conversions (explicitly provided in the input) and indirect conversions (derived
+    through intermediate units).
+
+    Args:
+        conversion_dict: A nested dictionary where:
+            - Outer keys represent origin units/domains
+            - Inner dictionaries map target units/domains to their conversion factors
+            - Conversion factors can be scalars, numpy arrays, or xr.DataArrays
+
+    Returns:
+        A dictionary mapping (origin, target) tuples to their respective conversion factors.
+        Each key is a tuple of strings representing the origin and target units/domains.
+        Each value is the combined conversion factor from origin to target as an xr.DataArray
+        (scalar and array inputs are converted).
+    """
+    # Initialize the result dictionary to accumulate all paths
+    result = {}
+
+    # Add direct connections to the result first
+    for origin, targets in conversion_dict.items():
+        for target, factor in targets.items():
+            result[(origin, target)] = factor
+
+    # Track all paths by keeping path history to avoid cycles
+    # Iterate over each domain in the dictionary
+    for origin in conversion_dict:
+        # Keep track of visited paths to avoid repeating calculations
+        processed_paths = set()
+        # Use a queue with (current_domain, factor, path_history)
+        queue = [(origin, 1, [origin])]
+
+        while queue:
+            current_domain, factor, path = queue.pop(0)
+
+            # Skip if we've processed this exact path before
+            path_key = tuple(path)
+            if path_key in processed_paths:
+                continue
+            processed_paths.add(path_key)
+
+            # Iterate over the neighbors of the current domain
+            for target, conversion_factor in conversion_dict.get(current_domain, {}).items():
+                # Skip if target would create a cycle
+                if target in path:
+                    continue
+
+                # Calculate the indirect conversion factor
+                indirect_factor = factor * conversion_factor
+                new_path = path + [target]
+
+                # Only consider paths starting at origin and ending at some target
+                if len(new_path) > 2 and new_path[0] == origin:
+                    # Update the result dictionary - accumulate factors from different paths
+                    if (origin, target) in result:
+                        result[(origin, target)] = result[(origin, target)] + indirect_factor
+                    else:
+                        result[(origin, target)] = indirect_factor
+
+                # Add new path to queue for further exploration
+                queue.append((target, indirect_factor, new_path))
+
+    # Convert all values to DataArrays
+    result = {key: value if isinstance(value, xr.DataArray) else xr.DataArray(value)
+              for key, value in result.items()}
+
+    return result
+
+
+def detect_cycles(graph: Dict[str, List[str]]) -> List[List[str]]:
+    """
+    Detects cycles in a directed graph using DFS. 
+ + Args: + graph: Adjacency list representation of the graph + + Returns: + List of cycles found, where each cycle is a list of nodes + """ + # Track nodes in current recursion stack + visiting = set() + # Track nodes that have been fully explored + visited = set() + # Store all found cycles + cycles = [] + + def dfs_find_cycles(node, path=None): + if path is None: + path = [] + + # Current path to this node + current_path = path + [node] + # Add node to current recursion stack + visiting.add(node) + + # Check all neighbors + for neighbor in graph.get(node, []): + # If neighbor is in current path, we found a cycle + if neighbor in visiting: + # Get the cycle by extracting the relevant portion of the path + cycle_start = current_path.index(neighbor) + cycle = current_path[cycle_start:] + [neighbor] + cycles.append(cycle) + # If neighbor hasn't been fully explored, check it + elif neighbor not in visited: + dfs_find_cycles(neighbor, current_path) + + # Remove node from current path and mark as fully explored + visiting.remove(node) + visited.add(node) + + # Check each unvisited node + for node in graph: + if node not in visited: + dfs_find_cycles(node) + + return cycles + + +def tuples_to_adjacency_list(edges: List[Tuple[str, str]]) -> Dict[str, List[str]]: + """ + Converts a list of edge tuples (source, target) to an adjacency list representation. + + Args: + edges: List of (source, target) tuples representing directed edges + + Returns: + Dictionary mapping each source node to a list of its target nodes + """ + graph = {} + + for source, target in edges: + if source not in graph: + graph[source] = [] + graph[source].append(target) + + # Ensure target nodes with no outgoing edges are in the graph + if target not in graph: + graph[target] = [] + + return graph diff --git a/flixopt/elements.py b/flixopt/elements.py index a0bd8c91f..d216c2ca7 100644 --- a/flixopt/elements.py +++ b/flixopt/elements.py @@ -10,8 +10,8 @@ import numpy as np from .config import CONFIG -from .core import NumericData, NumericDataTS, PlausibilityError, Scalar, TimeSeriesCollection -from .effects import EffectValuesUser +from .core import PlausibilityError, Scalar, ScenarioData, TimestepData, extract_data +from .effects import EffectValuesUserTimestep from .features import InvestmentModel, OnOffModel, PreventSimultaneousUsageModel from .interface import InvestParameters, OnOffParameters from .structure import Element, ElementModel, SystemModel, register_class_for_io @@ -96,7 +96,7 @@ class Bus(Element): """ def __init__( - self, label: str, excess_penalty_per_flow_hour: Optional[NumericDataTS] = 1e5, meta_data: Optional[Dict] = None + self, label: str, excess_penalty_per_flow_hour: Optional[TimestepData] = 1e5, meta_data: Optional[Dict] = None ): """ Args: @@ -154,17 +154,17 @@ def __init__( self, label: str, bus: str, - size: Union[Scalar, InvestParameters] = None, - fixed_relative_profile: Optional[NumericDataTS] = None, - relative_minimum: NumericDataTS = 0, - relative_maximum: NumericDataTS = 1, - effects_per_flow_hour: Optional[EffectValuesUser] = None, + size: Union[ScenarioData, InvestParameters] = None, + fixed_relative_profile: Optional[TimestepData] = None, + relative_minimum: TimestepData = 0, + relative_maximum: TimestepData = 1, + effects_per_flow_hour: Optional[EffectValuesUserTimestep] = None, on_off_parameters: Optional[OnOffParameters] = None, - flow_hours_total_max: Optional[Scalar] = None, - flow_hours_total_min: Optional[Scalar] = None, - load_factor_min: Optional[Scalar] = None, - load_factor_max: 
Optional[Scalar] = None,
-        previous_flow_rate: Optional[NumericData] = None,
+        flow_hours_total_max: Optional[ScenarioData] = None,
+        flow_hours_total_min: Optional[ScenarioData] = None,
+        load_factor_min: Optional[ScenarioData] = None,
+        load_factor_max: Optional[ScenarioData] = None,
+        previous_flow_rate: Optional[ScenarioData] = None,
         meta_data: Optional[Dict] = None,
     ):
         r"""
@@ -248,10 +248,25 @@ def transform_data(self, flow_system: 'FlowSystem'):
         self.effects_per_flow_hour = flow_system.create_effect_time_series(
             self.label_full, self.effects_per_flow_hour, 'per_flow_hour'
         )
+        self.flow_hours_total_max = flow_system.create_time_series(
+            f'{self.label_full}|flow_hours_total_max', self.flow_hours_total_max, has_time_dim=False
+        )
+        self.flow_hours_total_min = flow_system.create_time_series(
+            f'{self.label_full}|flow_hours_total_min', self.flow_hours_total_min, has_time_dim=False
+        )
+        self.load_factor_max = flow_system.create_time_series(
+            f'{self.label_full}|load_factor_max', self.load_factor_max, has_time_dim=False
+        )
+        self.load_factor_min = flow_system.create_time_series(
+            f'{self.label_full}|load_factor_min', self.load_factor_min, has_time_dim=False
+        )
+
         if self.on_off_parameters is not None:
             self.on_off_parameters.transform_data(flow_system, self.label_full)
         if isinstance(self.size, InvestParameters):
-            self.size.transform_data(flow_system)
+            self.size.transform_data(flow_system, self.label_full)
+        else:
+            self.size = flow_system.create_time_series(f'{self.label_full}|size', self.size, has_time_dim=False)

     def infos(self, use_numpy: bool = True, use_element_label: bool = False) -> Dict:
         infos = super().infos(use_numpy, use_element_label)
@@ -269,8 +284,8 @@ def _plausibility_checks(self) -> None:
         if np.any(self.relative_minimum > self.relative_maximum):
             raise PlausibilityError(self.label_full + ': Take care, that relative_minimum <= relative_maximum!')

-        if (
-            self.size == CONFIG.modeling.BIG and self.fixed_relative_profile is not None
+        if not isinstance(self.size, InvestParameters) and (
+            np.any(self.size == CONFIG.modeling.BIG) and self.fixed_relative_profile is not None
         ):  # Default Size --> Most likely by accident
             logger.warning(
                 f'Flow "{self.label}" has no size assigned, but a "fixed_relative_profile". '
@@ -279,15 +294,14 @@ def _plausibility_checks(self) -> None:
         if self.fixed_relative_profile is not None and self.on_off_parameters is not None:
-            raise ValueError(
-                f'Flow {self.label} has both a fixed_relative_profile and an on_off_parameters. This is not supported. '
-                f'Use relative_minimum and relative_maximum instead, '
-                f'if you want to allow flows to be switched on and off.'
+            logger.warning(
+                f'Flow {self.label} has both a fixed_relative_profile and on_off_parameters. '
+                f'This will allow the flow to be switched on and off, effectively deviating from the fixed_relative_profile.'
             )

         if (self.relative_minimum > 0).any() and self.on_off_parameters is None:
             logger.warning(
-                f'Flow {self.label} has a relative_minimum of {self.relative_minimum.active_data} and no on_off_parameters. '
+                f'Flow {self.label} has a relative_minimum of {self.relative_minimum.selected_data} and no on_off_parameters. '
                 f'This prevents the flow_rate from switching off (flow_rate = 0). '
                 f'Consider using on_off_parameters to allow the flow to be switched on and off.' 
)
@@ -323,7 +337,7 @@ def do_modeling(self):
             self._model.add_variables(
                 lower=self.flow_rate_lower_bound,
                 upper=self.flow_rate_upper_bound,
-                coords=self._model.coords,
+                coords=self._model.get_coords(),
                 name=f'{self.label_full}|flow_rate',
             ),
             'flow_rate',
@@ -362,9 +376,9 @@ def do_modeling(self):

         self.total_flow_hours = self.add(
             self._model.add_variables(
-                lower=self.element.flow_hours_total_min if self.element.flow_hours_total_min is not None else 0,
-                upper=self.element.flow_hours_total_max if self.element.flow_hours_total_max is not None else np.inf,
-                coords=None,
+                lower=extract_data(self.element.flow_hours_total_min, 0),
+                upper=extract_data(self.element.flow_hours_total_max, np.inf),
+                coords=self._model.get_coords(time_dim=False),
                 name=f'{self.label_full}|total_flow_hours',
             ),
             'total_flow_hours',
@@ -372,7 +386,7 @@ def do_modeling(self):

         self.add(
             self._model.add_constraints(
-                self.total_flow_hours == (self.flow_rate * self._model.hours_per_step).sum(),
+                self.total_flow_hours == (self.flow_rate * self._model.hours_per_step).sum('time'),
                 name=f'{self.label_full}|total_flow_hours',
             ),
             'total_flow_hours',
@@ -384,13 +398,21 @@ def do_modeling(self):
         # Shares
         self._create_shares()

+    def results_structure(self):
+        return {
+            **super().results_structure(),
+            'start': self.element.bus if self.element.is_input_in_component else self.element.component,
+            'end': self.element.component if self.element.is_input_in_component else self.element.bus,
+            'component': self.element.component,
+        }
+
     def _create_shares(self):
         # Costs per flow hour:
         if self.element.effects_per_flow_hour != {}:
             self._model.effects.add_share_to_effects(
                 name=self.label_full,  # Use the full label of the element
                 expressions={
-                    effect: self.flow_rate * self._model.hours_per_step * factor.active_data
+                    effect: self.flow_rate * self._model.hours_per_step * factor.selected_data
                     for effect, factor in self.element.effects_per_flow_hour.items()
                 },
                 target='operation',
@@ -402,7 +424,7 @@ def _create_bounds_for_load_factor(self):
         # eq: var_sumFlowHours <= size * dt_tot * load_factor_max
         if self.element.load_factor_max is not None:
             name_short = 'load_factor_max'
-            flow_hours_per_size_max = self._model.hours_per_step.sum() * self.element.load_factor_max
+            flow_hours_per_size_max = self._model.hours_per_step.sum('time') * self.element.load_factor_max
             size = self.element.size if self._investment is None else self._investment.size

             self.add(
@@ -416,7 +438,7 @@ def _create_bounds_for_load_factor(self):
         # eq: size * sum(dt)* load_factor_min <= var_sumFlowHours
         if self.element.load_factor_min is not None:
             name_short = 'load_factor_min'
-            flow_hours_per_size_min = self._model.hours_per_step.sum() * self.element.load_factor_min
+            flow_hours_per_size_min = self._model.hours_per_step.sum('time') * self.element.load_factor_min
             size = self.element.size if self._investment is None else self._investment.size

             self.add(
@@ -428,34 +450,36 @@ def _create_bounds_for_load_factor(self):
         )

     @property
-    def flow_rate_bounds_on(self) -> Tuple[NumericData, NumericData]:
+    def flow_rate_bounds_on(self) -> Tuple[TimestepData, TimestepData]:
        """Returns absolute flow rate bounds. 
Important for OnOffModel""" relative_minimum, relative_maximum = self.flow_rate_lower_bound_relative, self.flow_rate_upper_bound_relative size = self.element.size if not isinstance(size, InvestParameters): - return relative_minimum * size, relative_maximum * size + return relative_minimum * extract_data(size), relative_maximum * extract_data(size) if size.fixed_size is not None: - return relative_minimum * size.fixed_size, relative_maximum * size.fixed_size - return relative_minimum * size.minimum_size, relative_maximum * size.maximum_size + return (relative_minimum * extract_data(size.fixed_size), + relative_maximum * extract_data(size.fixed_size)) + return (relative_minimum * extract_data(size.minimum_size), + relative_maximum * extract_data(size.maximum_size)) @property - def flow_rate_lower_bound_relative(self) -> NumericData: + def flow_rate_lower_bound_relative(self) -> TimestepData: """Returns the lower bound of the flow_rate relative to its size""" fixed_profile = self.element.fixed_relative_profile if fixed_profile is None: - return self.element.relative_minimum.active_data - return fixed_profile.active_data + return extract_data(self.element.relative_minimum) + return extract_data(fixed_profile) @property - def flow_rate_upper_bound_relative(self) -> NumericData: + def flow_rate_upper_bound_relative(self) -> TimestepData: """ Returns the upper bound of the flow_rate relative to its size""" fixed_profile = self.element.fixed_relative_profile if fixed_profile is None: - return self.element.relative_maximum.active_data - return fixed_profile.active_data + return extract_data(self.element.relative_maximum) + return extract_data(fixed_profile) @property - def flow_rate_lower_bound(self) -> NumericData: + def flow_rate_lower_bound(self) -> TimestepData: """ Returns the minimum bound the flow_rate can reach. Further constraining might be done in OnOffModel and InvestmentModel @@ -465,18 +489,18 @@ def flow_rate_lower_bound(self) -> NumericData: if isinstance(self.element.size, InvestParameters): if self.element.size.optional: return 0 - return self.flow_rate_lower_bound_relative * self.element.size.minimum_size - return self.flow_rate_lower_bound_relative * self.element.size + return self.flow_rate_lower_bound_relative * extract_data(self.element.size.minimum_size) + return self.flow_rate_lower_bound_relative * extract_data(self.element.size) @property - def flow_rate_upper_bound(self) -> NumericData: + def flow_rate_upper_bound(self) -> TimestepData: """ Returns the maximum bound the flow_rate can reach. 
Further constraining might be done in OnOffModel and InvestmentModel
         """
         if isinstance(self.element.size, InvestParameters):
-            return self.flow_rate_upper_bound_relative * self.element.size.maximum_size
-        return self.flow_rate_upper_bound_relative * self.element.size
+            return self.flow_rate_upper_bound_relative * extract_data(self.element.size.maximum_size)
+        return self.flow_rate_upper_bound_relative * extract_data(self.element.size)


 class BusModel(ElementModel):
@@ -497,14 +521,18 @@ def do_modeling(self) -> None:
         # Excess plus/minus (slack):
         if self.element.with_excess:
             excess_penalty = np.multiply(
-                self._model.hours_per_step, self.element.excess_penalty_per_flow_hour.active_data
+                self._model.hours_per_step, self.element.excess_penalty_per_flow_hour.selected_data
             )
             self.excess_input = self.add(
-                self._model.add_variables(lower=0, coords=self._model.coords, name=f'{self.label_full}|excess_input'),
+                self._model.add_variables(
+                    lower=0, coords=self._model.get_coords(), name=f'{self.label_full}|excess_input'
+                ),
                 'excess_input',
             )
             self.excess_output = self.add(
-                self._model.add_variables(lower=0, coords=self._model.coords, name=f'{self.label_full}|excess_output'),
+                self._model.add_variables(
+                    lower=0, coords=self._model.get_coords(), name=f'{self.label_full}|excess_output'
+                ),
                 'excess_output',
             )
             eq_bus_balance.lhs -= -self.excess_input + self.excess_output
@@ -519,7 +547,8 @@ def results_structure(self):
             inputs.append(self.excess_input.name)
         if self.excess_output is not None:
             outputs.append(self.excess_output.name)
-        return {**super().results_structure(), 'inputs': inputs, 'outputs': outputs}
+        return {**super().results_structure(), 'inputs': inputs, 'outputs': outputs,
+                'flows': [flow.label_full for flow in self.element.inputs + self.element.outputs]}


 class ComponentModel(ElementModel):
@@ -572,4 +601,5 @@ def results_structure(self):
         **super().results_structure(),
         'inputs': [flow.model.flow_rate.name for flow in self.element.inputs],
         'outputs': [flow.model.flow_rate.name for flow in self.element.outputs],
+        'flows': [flow.label_full for flow in self.element.inputs + self.element.outputs],
     }
diff --git a/flixopt/features.py b/flixopt/features.py
index c2a62adb1..94231ccec 100644
--- a/flixopt/features.py
+++ b/flixopt/features.py
@@ -9,9 +9,8 @@
 import linopy
 import numpy as np

-from . 
import utils
 from .config import CONFIG
-from .core import NumericData, Scalar, TimeSeries
+from .core import Scalar, ScenarioData, TimeSeries, TimestepData, extract_data
 from .interface import InvestParameters, OnOffParameters, Piecewise
 from .structure import Model, SystemModel
@@ -27,13 +26,14 @@ def __init__(
         label_of_element: str,
         parameters: InvestParameters,
         defining_variable: [linopy.Variable],
-        relative_bounds_of_defining_variable: Tuple[NumericData, NumericData],
+        relative_bounds_of_defining_variable: Tuple[TimestepData, TimestepData],
         label: Optional[str] = None,
         on_variable: Optional[linopy.Variable] = None,
     ):
         super().__init__(model, label_of_element, label)
         self.size: Optional[Union[Scalar, linopy.Variable]] = None
         self.is_invested: Optional[linopy.Variable] = None
+        self.scenario_of_investment: Optional[linopy.Variable] = None

         self.piecewise_effects: Optional[PiecewiseEffectsModel] = None
@@ -43,31 +43,32 @@ def __init__(
         self.parameters = parameters

     def do_modeling(self):
-        if self.parameters.fixed_size and not self.parameters.optional:
-            self.size = self.add(
-                self._model.add_variables(
-                    lower=self.parameters.fixed_size, upper=self.parameters.fixed_size, name=f'{self.label_full}|size'
-                ),
-                'size',
-            )
-        else:
-            self.size = self.add(
-                self._model.add_variables(
-                    lower=0 if self.parameters.optional else self.parameters.minimum_size,
-                    upper=self.parameters.maximum_size,
-                    name=f'{self.label_full}|size',
-                ),
-                'size',
-            )
+        self.size = self.add(
+            self._model.add_variables(
+                lower=0 if self.parameters.optional else extract_data(self.parameters.minimum_size),
+                upper=extract_data(self.parameters.maximum_size),
+                name=f'{self.label_full}|size',
+                coords=self._model.get_coords(time_dim=False),
+            ),
+            'size',
+        )

         # Optional
         if self.parameters.optional:
             self.is_invested = self.add(
-                self._model.add_variables(binary=True, name=f'{self.label_full}|is_invested'), 'is_invested'
+                self._model.add_variables(
+                    binary=True,
+                    name=f'{self.label_full}|is_invested',
+                    coords=self._model.get_coords(time_dim=False),
+                ),
+                'is_invested',
             )
             self._create_bounds_for_optional_investment()

+        if self._model.time_series_collection.scenarios is not None:
+            self._create_bounds_for_scenarios()
+
         # Bounds for defining variable
         self._create_bounds_for_defining_variable()
@@ -153,11 +154,6 @@ def _create_bounds_for_defining_variable(self):
                 ),
                 f'fix_{variable.name}',
             )
-            if self._on_variable is not None:
-                raise ValueError(
-                    f'Flow {self.label_full} has a fixed relative flow rate and an on_variable.'
-                    f'This combination is currently not supported.'
-                )
             return

         # eq: defining_variable(t) <= size * upper_bound(t)
@@ -182,7 +178,7 @@ def _create_bounds_for_defining_variable(self):
             #  ... with mega = relative_maximum * maximum_size
             #  equivalent to:
             #  eq: - defining_variable(t) + mega * On(t) + size * relative_minimum(t) <= + mega
-            mega = lb_relative * self.parameters.maximum_size
+            mega = self.parameters.maximum_size * lb_relative
             on = self._on_variable
             self.add(
                 self._model.add_constraints(
@@ -192,6 +188,50 @@ def _create_bounds_for_defining_variable(self):
             )
             # note: in the special case relative_minimum = 0, this equation may be redundant to OnOff ?? 
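# --- Illustrative sketch (editor's addition, not part of the patch) ------------
# _create_bounds_for_scenarios (added below) ties the investment size together
# across scenarios via pairwise constraints size(s) == size(s+1). A minimal
# xarray-only prototype of the slicing pattern, assuming a DataArray `size`
# with a 'scenario' dimension:
#
#     import xarray as xr
#     size = xr.DataArray([20.0, 20.0, 20.0], dims='scenario')
#     lhs = size.isel(scenario=slice(None, -1))  # size(s),   s = 1..S-1
#     rhs = size.isel(scenario=slice(1, None))   # size(s+1), s = 1..S-1
#     assert (lhs.values == rhs.values).all()    # chained equality -> one global size
# --------------------------------------------------------------------------------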
+ def _create_bounds_for_scenarios(self): + if isinstance(self.parameters.investment_scenarios, str): + if self.parameters.investment_scenarios == 'individual': + return + raise ValueError(f'Invalid value for investment_scenarios: {self.parameters.investment_scenarios}') + + if self.parameters.investment_scenarios is None: + self.add( + self._model.add_constraints( + self.size.isel(scenario=slice(None, -1)) == self.size.isel(scenario=slice(1, None)), + name=f'{self.label_full}|equalize_size_per_scenario', + ), + 'equalize_size_per_scenario', + ) + return + if not isinstance(self.parameters.investment_scenarios, list): + raise ValueError(f'Invalid value for investment_scenarios: {self.parameters.investment_scenarios}') + if not all(scenario in self._model.time_series_collection.scenarios for scenario in self.parameters.investment_scenarios): + raise ValueError(f'Some scenarios in investment_scenarios are not present in the time_series_collection: ' + f'{self.parameters.investment_scenarios}. This might be due to selecting a subset of ' + f'all scenarios, which is not yet supported.') + + investment_scenarios = self._model.time_series_collection.scenarios.intersection(self.parameters.investment_scenarios) + no_investment_scenarios = self._model.time_series_collection.scenarios.difference(self.parameters.investment_scenarios) + + # eq: size(s) = size(s') for s, s' in investment_scenarios + if len(investment_scenarios) > 1: + self.add( + self._model.add_constraints( + self.size.sel(scenario=investment_scenarios[:-1]) == self.size.sel(scenario=investment_scenarios[1:]), + name=f'{self.label_full}|investment_scenarios', + ), + 'investment_scenarios', + ) + + if len(no_investment_scenarios) >= 1: + self.add( + self._model.add_constraints( + self.size.sel(scenario=no_investment_scenarios) == 0, + name=f'{self.label_full}|no_investment_scenarios', + ), + 'no_investment_scenarios', + ) + class StateModel(Model): """ @@ -203,12 +243,12 @@ def __init__( model: SystemModel, label_of_element: str, defining_variables: List[linopy.Variable], - defining_bounds: List[Tuple[NumericData, NumericData]], - previous_values: List[Optional[NumericData]] = None, + defining_bounds: List[Tuple[TimestepData, TimestepData]], + previous_values: List[Optional[TimestepData]] = None, use_off: bool = True, - on_hours_total_min: Optional[NumericData] = 0, - on_hours_total_max: Optional[NumericData] = None, - effects_per_running_hour: Dict[str, NumericData] = None, + on_hours_total_min: Optional[ScenarioData] = 0, + on_hours_total_max: Optional[ScenarioData] = None, + effects_per_running_hour: Dict[str, TimestepData] = None, label: Optional[str] = None, ): """ @@ -245,16 +285,16 @@ def do_modeling(self): self._model.add_variables( name=f'{self.label_full}|on', binary=True, - coords=self._model.coords, + coords=self._model.get_coords(), ), 'on', ) self.total_on_hours = self.add( self._model.add_variables( - lower=self._on_hours_total_min, - upper=self._on_hours_total_max, - coords=None, + lower=extract_data(self._on_hours_total_min), + upper=extract_data(self._on_hours_total_max), + coords=self._model.get_coords(time_dim=False), name=f'{self.label_full}|on_hours_total', ), 'on_hours_total', @@ -262,7 +302,7 @@ def do_modeling(self): self.add( self._model.add_constraints( - self.total_on_hours == (self.on * self._model.hours_per_step).sum(), + self.total_on_hours == (self.on * self._model.hours_per_step).sum('time'), name=f'{self.label_full}|on_hours_total', ), 'on_hours_total', @@ -276,7 +316,7 @@ def do_modeling(self): 
self._model.add_variables( name=f'{self.label_full}|off', binary=True, - coords=self._model.coords, + coords=self._model.get_coords(), ), 'off', ) @@ -344,7 +384,7 @@ def previous_off_states(self): return 1 - self.previous_states @staticmethod - def compute_previous_states(previous_values: List[NumericData], epsilon: float = 1e-5) -> np.ndarray: + def compute_previous_states(previous_values: List[TimestepData], epsilon: float = 1e-5) -> np.ndarray: """Computes the previous states {0, 1} of defining variables as a binary array from their previous values.""" if not previous_values or all([val is None for val in previous_values]): return np.array([0]) @@ -385,19 +425,19 @@ def do_modeling(self): # Create switch variables self.switch_on = self.add( - self._model.add_variables(binary=True, name=f'{self.label_full}|switch_on', coords=self._model.coords), + self._model.add_variables(binary=True, name=f'{self.label_full}|switch_on', coords=self._model.get_coords()), 'switch_on', ) self.switch_off = self.add( - self._model.add_variables(binary=True, name=f'{self.label_full}|switch_off', coords=self._model.coords), + self._model.add_variables(binary=True, name=f'{self.label_full}|switch_off', coords=self._model.get_coords()), 'switch_off', ) # Create count variable for number of switches self.switch_on_nr = self.add( self._model.add_variables( - upper=self._switch_on_max, + upper=extract_data(self._switch_on_max), lower=0, name=f'{self.label_full}|switch_on_nr', ), @@ -451,9 +491,9 @@ def __init__( model: SystemModel, label_of_element: str, state_variable: linopy.Variable, - minimum_duration: Optional[NumericData] = None, - maximum_duration: Optional[NumericData] = None, - previous_states: Optional[NumericData] = None, + minimum_duration: Optional[TimestepData] = None, + maximum_duration: Optional[TimestepData] = None, + previous_states: Optional[TimestepData] = None, label: Optional[str] = None, ): """ @@ -475,9 +515,9 @@ def __init__( self._maximum_duration = maximum_duration if isinstance(self._minimum_duration, TimeSeries): - self._minimum_duration = self._minimum_duration.active_data + self._minimum_duration = self._minimum_duration.selected_data if isinstance(self._maximum_duration, TimeSeries): - self._maximum_duration = self._maximum_duration.active_data + self._maximum_duration = self._maximum_duration.selected_data self.duration = None @@ -491,8 +531,8 @@ def do_modeling(self): self.duration = self.add( self._model.add_variables( lower=0, - upper=self._maximum_duration if self._maximum_duration is not None else mega, - coords=self._model.coords, + upper=extract_data(self._maximum_duration, mega), + coords=self._model.get_coords(), name=f'{self.label_full}|hours', ), 'hours', @@ -545,7 +585,7 @@ def do_modeling(self): ) # Handle initial condition - if 0 < self.previous_duration < self._minimum_duration.isel(time=0): + if 0 < self.previous_duration < self._minimum_duration.isel(time=0).max(): self.add( self._model.add_constraints( self._state_variable.isel(time=0) == 1, name=f'{self.label_full}|initial_minimum' @@ -570,12 +610,12 @@ def previous_duration(self) -> Scalar: """Computes the previous duration of the state variable""" #TODO: Allow for other/dynamic timestep resolutions return ConsecutiveStateModel.compute_consecutive_hours_in_state( - self._previous_states, self._model.hours_per_step.isel(time=0).item() + self._previous_states, self._model.hours_per_step.isel(time=0).values.flatten()[0] ) @staticmethod def compute_consecutive_hours_in_state( - binary_values: NumericData, 
hours_per_timestep: Union[int, float, np.ndarray] + binary_values: TimestepData, hours_per_timestep: Union[int, float, np.ndarray] ) -> Scalar: """ Computes the final consecutive duration in state 'on' (=1) in hours, from a binary array. @@ -634,8 +674,8 @@ def __init__( on_off_parameters: OnOffParameters, label_of_element: str, defining_variables: List[linopy.Variable], - defining_bounds: List[Tuple[NumericData, NumericData]], - previous_values: List[Optional[NumericData]], + defining_bounds: List[Tuple[TimestepData, TimestepData]], + previous_values: List[Optional[TimestepData]], label: Optional[str] = None, ): """ @@ -672,8 +712,8 @@ def do_modeling(self): defining_bounds=self._defining_bounds, previous_values=self._previous_values, use_off=self.parameters.use_off, - on_hours_total_min=self.parameters.on_hours_total_min, - on_hours_total_max=self.parameters.on_hours_total_max, + on_hours_total_min=extract_data(self.parameters.on_hours_total_min), + on_hours_total_max=extract_data(self.parameters.on_hours_total_max), effects_per_running_hour=self.parameters.effects_per_running_hour, ) self.add(self.state_model) @@ -792,7 +832,7 @@ def do_modeling(self): self._model.add_variables( binary=True, name=f'{self.label_full}|inside_piece', - coords=self._model.coords if self._as_time_series else None, + coords=self._model.get_coords(time_dim=self._as_time_series), ), 'inside_piece', ) @@ -802,7 +842,7 @@ def do_modeling(self): lower=0, upper=1, name=f'{self.label_full}|lambda0', - coords=self._model.coords if self._as_time_series else None, + coords=self._model.get_coords(time_dim=self._as_time_series), ), 'lambda0', ) @@ -812,7 +852,7 @@ def do_modeling(self): lower=0, upper=1, name=f'{self.label_full}|lambda1', - coords=self._model.coords if self._as_time_series else None, + coords=self._model.get_coords(time_dim=self._as_time_series), ), 'lambda1', ) @@ -896,7 +936,7 @@ def do_modeling(self): elif self._zero_point is True: self.zero_point = self.add( self._model.add_variables( - coords=self._model.coords, binary=True, name=f'{self.label_full}|zero_point' + coords=self._model.get_coords(), binary=True, name=f'{self.label_full}|zero_point' ), 'zero_point', ) @@ -917,19 +957,20 @@ class ShareAllocationModel(Model): def __init__( self, model: SystemModel, - shares_are_time_series: bool, + has_time_dim: bool, + has_scenario_dim: bool, label_of_element: Optional[str] = None, label: Optional[str] = None, label_full: Optional[str] = None, - total_max: Optional[Scalar] = None, - total_min: Optional[Scalar] = None, - max_per_hour: Optional[NumericData] = None, - min_per_hour: Optional[NumericData] = None, + total_max: Optional[ScenarioData] = None, + total_min: Optional[ScenarioData] = None, + max_per_hour: Optional[TimestepData] = None, + min_per_hour: Optional[TimestepData] = None, ): super().__init__(model, label_of_element=label_of_element, label=label, label_full=label_full) - if not shares_are_time_series: # If the condition is True + if not has_time_dim: # If the condition is True assert max_per_hour is None and min_per_hour is None, ( - 'Both max_per_hour and min_per_hour cannot be used when shares_are_time_series is False' + 'Both max_per_hour and min_per_hour cannot be used when has_time_dim is False' ) self.total_per_timestep: Optional[linopy.Variable] = None self.total: Optional[linopy.Variable] = None @@ -940,8 +981,9 @@ def __init__( self._eq_total: Optional[linopy.Constraint] = None # Parameters - self._shares_are_time_series = shares_are_time_series - self._total_max = total_max if 
total_min is not None else np.inf + self._has_time_dim = has_time_dim + self._has_scenario_dim = has_scenario_dim + self._total_max = total_max if total_max is not None else np.inf self._total_min = total_min if total_min is not None else -np.inf self._max_per_hour = max_per_hour if max_per_hour is not None else np.inf self._min_per_hour = min_per_hour if min_per_hour is not None else -np.inf @@ -949,7 +991,10 @@ def __init__( def do_modeling(self): self.total = self.add( self._model.add_variables( - lower=self._total_min, upper=self._total_max, coords=None, name=f'{self.label_full}|total' + lower=self._total_min, + upper=self._total_max, + coords=self._model.get_coords(time_dim=False, scenario_dim=self._has_scenario_dim), + name=f'{self.label_full}|total', ), 'total', ) @@ -958,16 +1003,12 @@ def do_modeling(self): self._model.add_constraints(self.total == 0, name=f'{self.label_full}|total'), 'total' ) - if self._shares_are_time_series: + if self._has_time_dim: self.total_per_timestep = self.add( self._model.add_variables( - lower=-np.inf - if (self._min_per_hour is None) - else np.multiply(self._min_per_hour, self._model.hours_per_step), - upper=np.inf - if (self._max_per_hour is None) - else np.multiply(self._max_per_hour, self._model.hours_per_step), - coords=self._model.coords, + lower=-np.inf if (self._min_per_hour is None) else extract_data(self._min_per_hour) * self._model.hours_per_step, + upper=np.inf if (self._max_per_hour is None) else extract_data(self._max_per_hour) * self._model.hours_per_step, + coords=self._model.get_coords(time_dim=True, scenario_dim=self._has_scenario_dim), name=f'{self.label_full}|total_per_timestep', ), 'total_per_timestep', @@ -979,12 +1020,14 @@ def do_modeling(self): ) # Add it to the total - self._eq_total.lhs -= self.total_per_timestep.sum() + self._eq_total.lhs -= self.total_per_timestep.sum(dim='time') def add_share( self, name: str, expression: linopy.LinearExpression, + has_time_dim: bool, + has_scenario_dim: bool, ): """ Add a share to the share allocation model. If the share already exists, the expression is added to the existing share. @@ -996,16 +1039,17 @@ def add_share( name: The name of the share. expression: The expression of the share. Added to the right hand side of the constraint. 
""" + if has_time_dim and not self._has_time_dim: + raise ValueError('Cannot add share with time_dim=True to a model without time_dim') + if has_scenario_dim and not self._has_scenario_dim: + raise ValueError('Cannot add share with scenario_dim=True to a model without scenario_dim') + if name in self.shares: self.share_constraints[name].lhs -= expression else: self.shares[name] = self.add( self._model.add_variables( - coords=None - if isinstance(expression, linopy.LinearExpression) - and expression.ndim == 0 - or not isinstance(expression, linopy.LinearExpression) - else self._model.coords, + coords=self._model.get_coords(time_dim=has_time_dim, scenario_dim=has_scenario_dim), name=f'{name}->{self.label_full}', ), name, @@ -1013,7 +1057,7 @@ def add_share( self.share_constraints[name] = self.add( self._model.add_constraints(self.shares[name] == expression, name=f'{name}->{self.label_full}'), name ) - if self.shares[name].ndim == 0: + if not has_time_dim: self._eq_total.lhs -= self.shares[name] else: self._eq_total_per_timestep.lhs -= self.shares[name] @@ -1042,7 +1086,12 @@ def __init__( def do_modeling(self): self.shares = { - effect: self.add(self._model.add_variables(coords=None, name=f'{self.label_full}|{effect}'), f'{effect}') + effect: self.add( + self._model.add_variables( + coords=self._model.get_coords(time_dim=False), name=f'{self.label_full}|{effect}' + ), + f'{effect}', + ) for effect in self._piecewise_shares } diff --git a/flixopt/flow_system.py b/flixopt/flow_system.py index 93720de60..591b55e06 100644 --- a/flixopt/flow_system.py +++ b/flixopt/flow_system.py @@ -16,10 +16,17 @@ from rich.pretty import Pretty from . import io as fx_io -from .core import NumericData, NumericDataTS, TimeSeries, TimeSeriesCollection, TimeSeriesData -from .effects import Effect, EffectCollection, EffectTimeSeries, EffectValuesDict, EffectValuesUser +from .core import Scalar, ScenarioData, TimeSeries, TimeSeriesCollection, TimeSeriesData, TimestepData +from .effects import ( + Effect, + EffectCollection, + EffectTimeSeries, + EffectValuesDict, + EffectValuesUserScenario, + EffectValuesUserTimestep, +) from .elements import Bus, Component, Flow -from .structure import CLASS_REGISTRY, Element, SystemModel, get_compact_representation, get_str_representation +from .structure import CLASS_REGISTRY, Element, SystemModel if TYPE_CHECKING: import pyvis @@ -35,23 +42,31 @@ class FlowSystem: def __init__( self, timesteps: pd.DatetimeIndex, + scenarios: Optional[pd.Index] = None, hours_of_last_timestep: Optional[float] = None, hours_of_previous_timesteps: Optional[Union[int, float, np.ndarray]] = None, + scenario_weights: Optional[ScenarioData] = None, ): """ Args: timesteps: The timesteps of the model. + scenarios: The scenarios of the model. hours_of_last_timestep: The duration of the last time step. Uses the last time interval if not specified hours_of_previous_timesteps: The duration of previous timesteps. If None, the first time increment of time_series is used. This is needed to calculate previous durations (for example consecutive_on_hours). If you use an array, take care that its long enough to cover all previous values! + scenario_weights: The weights of the scenarios. If None, all scenarios have the same weight. All weights are normalized to 1. 
""" self.time_series_collection = TimeSeriesCollection( timesteps=timesteps, + scenarios=scenarios, hours_of_last_timestep=hours_of_last_timestep, hours_of_previous_timesteps=hours_of_previous_timesteps, ) + self.scenario_weights = self.create_time_series( + 'scenario_weights', scenario_weights, has_time_dim=False, has_scenario_dim=True + ) # defaults: self.components: Dict[str, Component] = {} @@ -66,10 +81,15 @@ def from_dataset(cls, ds: xr.Dataset): timesteps_extra = pd.DatetimeIndex(ds.attrs['timesteps_extra'], name='time') hours_of_last_timestep = TimeSeriesCollection.calculate_hours_per_timestep(timesteps_extra).isel(time=-1).item() + scenarios = pd.Index(ds.attrs['scenarios'], name='scenario') if ds.attrs.get('scenarios') is not None else None + scenario_weights = fx_io.insert_dataarray(ds.attrs['scenario_weights'], ds) + flow_system = FlowSystem( timesteps=timesteps_extra[:-1], hours_of_last_timestep=hours_of_last_timestep, hours_of_previous_timesteps=ds.attrs['hours_of_previous_timesteps'], + scenarios=scenarios, + scenario_weights=scenario_weights, ) structure = fx_io.insert_dataarray({key: ds.attrs[key] for key in ['components', 'buses', 'effects']}, ds) @@ -90,11 +110,15 @@ def from_dict(cls, data: Dict) -> 'FlowSystem': """ timesteps_extra = pd.DatetimeIndex(data['timesteps_extra'], name='time') hours_of_last_timestep = TimeSeriesCollection.calculate_hours_per_timestep(timesteps_extra).isel(time=-1).item() + scenarios = pd.Index(data['scenarios'], name='scenario') if data.get('scenarios') is not None else None + scenario_weights = data.get('scenario_weights').selected_data if data.get('scenario_weights') is not None else None flow_system = FlowSystem( timesteps=timesteps_extra[:-1], hours_of_last_timestep=hours_of_last_timestep, hours_of_previous_timesteps=data['hours_of_previous_timesteps'], + scenarios=scenarios, + scenario_weights=scenario_weights, ) flow_system.add_elements(*[Bus.from_dict(bus) for bus in data['buses'].values()]) @@ -170,6 +194,8 @@ def as_dict(self, data_mode: Literal['data', 'name', 'stats'] = 'data') -> Dict: }, 'timesteps_extra': [date.isoformat() for date in self.time_series_collection.timesteps_extra], 'hours_of_previous_timesteps': self.time_series_collection.hours_of_previous_timesteps, + 'scenarios': self.time_series_collection.scenarios.tolist() if self.time_series_collection.scenarios is not None else None, + 'scenario_weights': self.scenario_weights, } if data_mode == 'data': return fx_io.replace_timeseries(data, 'data') @@ -184,7 +210,7 @@ def as_dataset(self, constants_in_dataset: bool = False) -> xr.Dataset: Args: constants_in_dataset: If True, constants are included as Dataset variables. 
""" - ds = self.time_series_collection.to_dataset(include_constants=constants_in_dataset) + ds = self.time_series_collection.as_dataset() ds.attrs = self.as_dict(data_mode='name') return ds @@ -268,41 +294,80 @@ def network_infos(self) -> Tuple[Dict[str, Dict[str, str]], Dict[str, Dict[str, def transform_data(self): if not self._connected: self._connect_network() + self.scenario_weights = self.create_time_series( + 'scenario_weights', self.scenario_weights, has_time_dim=False, has_scenario_dim=True + ) for element in self.all_elements.values(): element.transform_data(self) def create_time_series( self, name: str, - data: Optional[Union[NumericData, TimeSeriesData, TimeSeries]], - needs_extra_timestep: bool = False, - ) -> Optional[TimeSeries]: + data: Optional[Union[TimestepData, TimeSeriesData, TimeSeries]], + has_time_dim: bool = True, + has_scenario_dim: bool = True, + has_extra_timestep: bool = False, + ) -> Optional[Union[Scalar, TimeSeries]]: """ - Tries to create a TimeSeries from NumericData Data and adds it to the time_series_collection + Tries to create a TimeSeries from TimestepData and adds it to the time_series_collection If the data already is a TimeSeries, nothing happens and the TimeSeries gets reset and returned If the data is a TimeSeriesData, it is converted to a TimeSeries, and the aggregation weights are applied. If the data is None, nothing happens. + + Args: + name: The name of the TimeSeries + data: The data to create a TimeSeries from + has_time_dim: Whether the data has a time dimension + has_scenario_dim: Whether the data has a scenario dimension + has_extra_timestep: Whether the data has an extra timestep """ + if not has_time_dim and not has_scenario_dim: + raise ValueError('At least one of the dimensions must be present') if data is None: return None - elif isinstance(data, TimeSeries): + + if not has_time_dim and self.time_series_collection.scenarios is None: + return data + + if isinstance(data, TimeSeries): data.restore_data() if data in self.time_series_collection: return data - return self.time_series_collection.create_time_series( - data=data.active_data, name=name, needs_extra_timestep=needs_extra_timestep + return self.time_series_collection.add_time_series( + data=data.selected_data, + name=name, + has_time_dim=has_time_dim, + has_scenario_dim=has_scenario_dim, + has_extra_timestep=has_extra_timestep, ) - return self.time_series_collection.create_time_series( - data=data, name=name, needs_extra_timestep=needs_extra_timestep + elif isinstance(data, TimeSeriesData): + data.label = name + return self.time_series_collection.add_time_series( + data=data.data, + name=name, + has_time_dim=has_time_dim, + has_scenario_dim=has_scenario_dim, + has_extra_timestep=has_extra_timestep, + aggregation_weight=data.agg_weight, + aggregation_group=data.agg_group, + ) + return self.time_series_collection.add_time_series( + data=data, + name=name, + has_time_dim=has_time_dim, + has_scenario_dim=has_scenario_dim, + has_extra_timestep=has_extra_timestep, ) def create_effect_time_series( self, label_prefix: Optional[str], - effect_values: EffectValuesUser, + effect_values: Union[EffectValuesUserScenario, EffectValuesUserTimestep], label_suffix: Optional[str] = None, - ) -> Optional[EffectTimeSeries]: + has_time_dim: bool = True, + has_scenario_dim: bool = True, + ) -> Optional[Union[EffectTimeSeries, EffectValuesDict]]: """ Transform EffectValues to EffectTimeSeries. Creates a TimeSeries for each key in the nested_values dictionary, using the value as the data. 
@@ -310,13 +375,31 @@ def create_effect_time_series( The resulting label of the TimeSeries is the label of the parent_element, followed by the label of the Effect in the nested_values and the label_suffix. If the key in the EffectValues is None, the alias 'Standard_Effect' is used + + Args: + label_prefix: Prefix for the TimeSeries name + effect_values: Dictionary of EffectValues + label_suffix: Suffix for the TimeSeries name + has_time_dim: Whether the data has a time dimension + has_scenario_dim: Whether the data has a scenario dimension """ + if not has_time_dim and not has_scenario_dim: + raise ValueError('At least one of the dimensions must be present') + effect_values: Optional[EffectValuesDict] = self.effects.create_effect_values_dict(effect_values) if effect_values is None: return None + if not has_time_dim and self.time_series_collection.scenarios is None: + return effect_values + return { - effect: self.create_time_series('|'.join(filter(None, [label_prefix, effect, label_suffix])), value) + effect: self.create_time_series( + name='|'.join(filter(None, [label_prefix, effect, label_suffix])), + data=value, + has_time_dim=has_time_dim, + has_scenario_dim=has_scenario_dim, + ) for effect, value in effect_values.items() } diff --git a/flixopt/interface.py b/flixopt/interface.py index c38d6c619..4b7634e96 100644 --- a/flixopt/interface.py +++ b/flixopt/interface.py @@ -4,14 +4,14 @@ """ import logging -from typing import TYPE_CHECKING, Dict, Iterator, List, Optional, Union +from typing import TYPE_CHECKING, Dict, Iterator, List, Literal, Optional, Union from .config import CONFIG -from .core import NumericData, NumericDataTS, Scalar +from .core import NumericDataTS, Scalar, ScenarioData, TimestepData from .structure import Interface, register_class_for_io if TYPE_CHECKING: # for type checking and preventing circular imports - from .effects import EffectValuesUser, EffectValuesUserScalar + from .effects import EffectValuesUserScenario, EffectValuesUserTimestep from .flow_system import FlowSystem @@ -20,7 +20,7 @@ @register_class_for_io class Piece(Interface): - def __init__(self, start: NumericData, end: NumericData): + def __init__(self, start: TimestepData, end: TimestepData): """ Define a Piece, which is part of a Piecewise object. @@ -30,10 +30,15 @@ def __init__(self, start: NumericData, end: NumericData): """ self.start = start self.end = end + self.has_time_dim = False def transform_data(self, flow_system: 'FlowSystem', name_prefix: str): - self.start = flow_system.create_time_series(f'{name_prefix}|start', self.start) - self.end = flow_system.create_time_series(f'{name_prefix}|end', self.end) + self.start = flow_system.create_time_series( + name=f'{name_prefix}|start', data=self.start, has_time_dim=self.has_time_dim, has_scenario_dim=True + ) + self.end = flow_system.create_time_series( + name=f'{name_prefix}|end', data=self.end, has_time_dim=self.has_time_dim, has_scenario_dim=True + ) @register_class_for_io @@ -46,6 +51,17 @@ def __init__(self, pieces: List[Piece]): pieces: The pieces of the piecewise. """ self.pieces = pieces + self._has_time_dim = False + + @property + def has_time_dim(self): + return self._has_time_dim + + @has_time_dim.setter + def has_time_dim(self, value): + self._has_time_dim = value + for piece in self.pieces: + piece.has_time_dim = value def __len__(self): return len(self.pieces) @@ -73,6 +89,18 @@ def __init__(self, piecewises: Dict[str, Piecewise]): piecewises: Dict of Piecewises defining the conversion factors. 
flow labels as keys, piecewise as values
         """
         self.piecewises = piecewises
+        self._has_time_dim = True
+        self.has_time_dim = True  # Initial propagation
+
+    @property
+    def has_time_dim(self):
+        return self._has_time_dim
+
+    @has_time_dim.setter
+    def has_time_dim(self, value):
+        self._has_time_dim = value
+        for piecewise in self.piecewises.values():
+            piecewise.has_time_dim = value

     def items(self):
         return self.piecewises.items()
@@ -94,12 +122,24 @@ def __init__(self, piecewise_origin: Piecewise, piecewise_shares: Dict[str, Piec
         """
         self.piecewise_origin = piecewise_origin
         self.piecewise_shares = piecewise_shares
+        self._has_time_dim = False
+        self.has_time_dim = False  # Initial propagation
+
+    @property
+    def has_time_dim(self):
+        return self._has_time_dim
+
+    @has_time_dim.setter
+    def has_time_dim(self, value):
+        self._has_time_dim = value
+        self.piecewise_origin.has_time_dim = value
+        for piecewise in self.piecewise_shares.values():
+            piecewise.has_time_dim = value

     def transform_data(self, flow_system: 'FlowSystem', name_prefix: str):
-        raise NotImplementedError('PiecewiseEffects is not yet implemented for non scalar shares')
-        # self.piecewise_origin.transform_data(flow_system, f'{name_prefix}|PiecewiseEffects|origin')
-        # for name, piecewise in self.piecewise_shares.items():
-        #     piecewise.transform_data(flow_system, f'{name_prefix}|PiecewiseEffects|{name}')
+        self.piecewise_origin.transform_data(flow_system, f'{name_prefix}|PiecewiseEffects|origin')
+        for effect, piecewise in self.piecewise_shares.items():
+            piecewise.transform_data(flow_system, f'{name_prefix}|PiecewiseEffects|{effect}')


 @register_class_for_io
@@ -110,14 +150,15 @@ class InvestParameters(Interface):
     def __init__(
         self,
-        fixed_size: Optional[Union[int, float]] = None,
-        minimum_size: Optional[Union[int, float]] = None,
-        maximum_size: Optional[Union[int, float]] = None,
+        fixed_size: Optional[ScenarioData] = None,
+        minimum_size: Optional[ScenarioData] = None,
+        maximum_size: Optional[ScenarioData] = None,
         optional: bool = True,  # investment can be omitted
-        fix_effects: Optional['EffectValuesUserScalar'] = None,
-        specific_effects: Optional['EffectValuesUserScalar'] = None,  # costs per Flow-Unit/Storage-Size/...
+        fix_effects: Optional['EffectValuesUserScenario'] = None,
+        specific_effects: Optional['EffectValuesUserScenario'] = None,  # costs per Flow-Unit/Storage-Size/...
         piecewise_effects: Optional[PiecewiseEffects] = None,
-        divest_effects: Optional['EffectValuesUserScalar'] = None,
+        divest_effects: Optional['EffectValuesUserScenario'] = None,
+        investment_scenarios: Optional[Union[Literal['individual'], List[Union[int, str]]]] = None,
     ):
         """
         Args:
@@ -128,58 +169,99 @@ def __init__(
             specific_effects: Specific costs, e.g., in €/kW_nominal or €/m²_nominal.
                 Example: {costs: 3, CO2: 0.3} with costs and CO2 representing an Object of class Effect
                 (Attention: Annualize costs to chosen period!)
-            piecewise_effects: Linear piecewise relation [invest_pieces, cost_pieces].
-                Example 1:
-                    [ [5, 25, 25, 100],  # size in kW
-                      {costs: [50,250,250,800],  # €
-                       PE: [5, 25, 25, 100]  # kWh_PrimaryEnergy
-                      }
-                    ]
-                Example 2 (if only standard-effect):
-                    [ [5, 25, 25, 100],  # kW  # size in kW
-                      [50,250,250,800]  # value for standart effect, typically €
-                    ]  # €
-                (Attention: Annualize costs to chosen period!)
-                (Args 'specific_effects' and 'fix_effects' can be used in parallel to Investsizepieces)
-            minimum_size: Min nominal value (only if: size_is_fixed = False). Defaults to CONFIG.modeling.EPSILON. 
-            maximum_size: Max nominal value (only if: size_is_fixed = False). Defaults to CONFIG.modeling.BIG.
+            piecewise_effects: Define the effects of the investment as a piecewise function of the size of the investment.
+            minimum_size: Minimum possible size of the investment.
+            maximum_size: Maximum possible size of the investment.
+            investment_scenarios: The scenarios for which to optimize the size.
+                - 'individual': Optimize the size for each scenario individually
+                - List of scenario names: Optimize one size for the passed scenario names (equal size in all of them). All other scenarios will have size 0.
+                - None: Equivalent to a list of all scenarios (default)
         """
-        self.fix_effects: EffectValuesUser = fix_effects or {}
-        self.divest_effects: EffectValuesUser = divest_effects or {}
+        self.fix_effects: EffectValuesUserScenario = fix_effects if fix_effects is not None else {}
+        self.divest_effects: EffectValuesUserScenario = divest_effects if divest_effects is not None else {}
         self.fixed_size = fixed_size
         self.optional = optional
-        self.specific_effects: EffectValuesUser = specific_effects or {}
+        self.specific_effects: EffectValuesUserScenario = specific_effects if specific_effects is not None else {}
         self.piecewise_effects = piecewise_effects
         self._minimum_size = minimum_size if minimum_size is not None else CONFIG.modeling.EPSILON
         self._maximum_size = maximum_size if maximum_size is not None else CONFIG.modeling.BIG  # default maximum
+        self.investment_scenarios = investment_scenarios

-    def transform_data(self, flow_system: 'FlowSystem'):
-        self.fix_effects = flow_system.effects.create_effect_values_dict(self.fix_effects)
-        self.divest_effects = flow_system.effects.create_effect_values_dict(self.divest_effects)
-        self.specific_effects = flow_system.effects.create_effect_values_dict(self.specific_effects)
+    def transform_data(self, flow_system: 'FlowSystem', name_prefix: str):
+        self._plausibility_checks(flow_system)
+        self.fix_effects = flow_system.create_effect_time_series(
+            label_prefix=name_prefix,
+            effect_values=self.fix_effects,
+            label_suffix='fix_effects',
+            has_time_dim=False,
+            has_scenario_dim=True,
+        )
+        self.divest_effects = flow_system.create_effect_time_series(
+            label_prefix=name_prefix,
+            effect_values=self.divest_effects,
+            label_suffix='divest_effects',
+            has_time_dim=False,
+            has_scenario_dim=True,
+        )
+        self.specific_effects = flow_system.create_effect_time_series(
+            label_prefix=name_prefix,
+            effect_values=self.specific_effects,
+            label_suffix='specific_effects',
+            has_time_dim=False,
+            has_scenario_dim=True,
+        )
+        if self.piecewise_effects is not None:
+            self.piecewise_effects.has_time_dim = False
+            self.piecewise_effects.transform_data(flow_system, f'{name_prefix}|PiecewiseEffects')
+
+        self._minimum_size = flow_system.create_time_series(
+            f'{name_prefix}|minimum_size', self.minimum_size, has_time_dim=False, has_scenario_dim=True
+        )
+        self._maximum_size = flow_system.create_time_series(
+            f'{name_prefix}|maximum_size', self.maximum_size, has_time_dim=False, has_scenario_dim=True
+        )
+        if self.fixed_size is not None:
+            self.fixed_size = flow_system.create_time_series(
+                f'{name_prefix}|fixed_size', self.fixed_size, has_time_dim=False, has_scenario_dim=True
+            )
+
+    def _plausibility_checks(self, flow_system):
+        if isinstance(self.investment_scenarios, list):
+            if not set(self.investment_scenarios).issubset(flow_system.time_series_collection.scenarios):
+                raise ValueError(
+                    f'Some scenarios in investment_scenarios are not present in the time_series_collection: '
+
f'{set(self.investment_scenarios) - set(flow_system.time_series_collection.scenarios)}'
+                )
+        if self.investment_scenarios is not None:
+            if not self.optional:
+                if self.minimum_size is not None or self.fixed_size is not None:
+                    logger.warning(
+                        'When using investment_scenarios, minimum_size and fixed_size should only be used if optional is True. '
+                        'Otherwise the investment cannot be 0 in certain scenarios while being non-zero in others.'
+                    )

     @property
     def minimum_size(self):
-        return self.fixed_size or self._minimum_size
+        return self.fixed_size if self.fixed_size is not None else self._minimum_size

     @property
     def maximum_size(self):
-        return self.fixed_size or self._maximum_size
+        return self.fixed_size if self.fixed_size is not None else self._maximum_size


 @register_class_for_io
 class OnOffParameters(Interface):
     def __init__(
         self,
-        effects_per_switch_on: Optional['EffectValuesUser'] = None,
-        effects_per_running_hour: Optional['EffectValuesUser'] = None,
-        on_hours_total_min: Optional[int] = None,
-        on_hours_total_max: Optional[int] = None,
-        consecutive_on_hours_min: Optional[NumericData] = None,
-        consecutive_on_hours_max: Optional[NumericData] = None,
-        consecutive_off_hours_min: Optional[NumericData] = None,
-        consecutive_off_hours_max: Optional[NumericData] = None,
-        switch_on_total_max: Optional[int] = None,
+        effects_per_switch_on: Optional['EffectValuesUserTimestep'] = None,
+        effects_per_running_hour: Optional['EffectValuesUserTimestep'] = None,
+        on_hours_total_min: Optional[ScenarioData] = None,
+        on_hours_total_max: Optional[ScenarioData] = None,
+        consecutive_on_hours_min: Optional[TimestepData] = None,
+        consecutive_on_hours_max: Optional[TimestepData] = None,
+        consecutive_off_hours_min: Optional[TimestepData] = None,
+        consecutive_off_hours_max: Optional[TimestepData] = None,
+        switch_on_total_max: Optional[ScenarioData] = None,
         force_switch_on: bool = False,
     ):
         """
@@ -202,8 +284,8 @@ def __init__(
         switch_on_total_max: max nr of switchOn operations
         force_switch_on: force creation of switch on variable, even if there is no switch_on_total_max
         """
-        self.effects_per_switch_on: EffectValuesUser = effects_per_switch_on or {}
-        self.effects_per_running_hour: EffectValuesUser = effects_per_running_hour or {}
+        self.effects_per_switch_on: EffectValuesUserTimestep = effects_per_switch_on or {}
+        self.effects_per_running_hour: EffectValuesUserTimestep = effects_per_running_hour or {}
         self.on_hours_total_min: Scalar = on_hours_total_min
         self.on_hours_total_max: Scalar = on_hours_total_max
         self.consecutive_on_hours_min: NumericDataTS = consecutive_on_hours_min
@@ -232,6 +314,15 @@ def transform_data(self, flow_system: 'FlowSystem', name_prefix: str):
         self.consecutive_off_hours_max = flow_system.create_time_series(
             f'{name_prefix}|consecutive_off_hours_max', self.consecutive_off_hours_max
         )
+        self.on_hours_total_max = flow_system.create_time_series(
+            f'{name_prefix}|on_hours_total_max', self.on_hours_total_max, has_time_dim=False
+        )
+        self.on_hours_total_min = flow_system.create_time_series(
+            f'{name_prefix}|on_hours_total_min', self.on_hours_total_min, has_time_dim=False
+        )
+        self.switch_on_total_max = flow_system.create_time_series(
+            f'{name_prefix}|switch_on_total_max', self.switch_on_total_max, has_time_dim=False
+        )

     @property
     def use_off(self) -> bool:
diff --git a/flixopt/io.py b/flixopt/io.py
index 35d927136..05ae6741d 100644
--- a/flixopt/io.py
+++ b/flixopt/io.py
@@ -23,7 +23,7 @@ def replace_timeseries(obj, mode: Literal['name', 'stats', 'data'] = 'name'):
         return 
[replace_timeseries(v, mode) for v in obj] elif isinstance(obj, TimeSeries): # Adjust this based on the actual class if obj.all_equal: - return obj.active_data.values[0].item() + return obj.selected_data.values.max().item() elif mode == 'name': return f'::::{obj.name}' elif mode == 'stats': @@ -44,7 +44,7 @@ def insert_dataarray(obj, ds: xr.Dataset): return [insert_dataarray(v, ds) for v in obj] elif isinstance(obj, str) and obj.startswith('::::'): da = ds[obj[4:]] - if da.isel(time=-1).isnull(): + if 'time' in da.dims and da.isel(time=-1).isnull().any().item(): return da.isel(time=slice(0, -1)) return da else: @@ -79,15 +79,17 @@ def _save_to_yaml(data, output_file='formatted_output.yaml'): output_file (str): Path to output YAML file """ # Process strings to normalize all newlines and handle special patterns - processed_data = _process_complex_strings(data) + processed_data = _normalize_complex_data(data) # Define a custom representer for strings def represent_str(dumper, data): - # Use literal block style (|) for any string with newlines + # Use literal block style (|) for multi-line strings if '\n' in data: + # Clean up formatting for literal block style + data = data.strip() # Remove leading/trailing whitespace return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='|') - # Use quoted style for strings with special characters to ensure proper parsing + # Use quoted style for strings with special characters elif any(char in data for char in ':`{}[]#,&*!|>%@'): return dumper.represent_scalar('tag:yaml.org,2002:str', data, style='"') @@ -97,53 +99,80 @@ def represent_str(dumper, data): # Add the string representer to SafeDumper yaml.add_representer(str, represent_str, Dumper=yaml.SafeDumper) + # Configure dumper options for better formatting + class CustomDumper(yaml.SafeDumper): + def increase_indent(self, flow=False, indentless=False): + return super(CustomDumper, self).increase_indent(flow, False) + # Write to file with settings that ensure proper formatting with open(output_file, 'w', encoding='utf-8') as file: yaml.dump( processed_data, file, - Dumper=yaml.SafeDumper, + Dumper=CustomDumper, sort_keys=False, # Preserve dictionary order default_flow_style=False, # Use block style for mappings - width=float('inf'), # Don't wrap long lines + width=1000, # Set a reasonable line width allow_unicode=True, # Support Unicode characters + indent=2, # Set consistent indentation ) -def _process_complex_strings(data): +def _normalize_complex_data(data): """ - Process dictionary data recursively with comprehensive string normalization. - Handles various types of strings and special formatting. + Recursively normalize strings in complex data structures. + + Handles dictionaries, lists, and strings, applying various text normalization + rules while preserving important formatting elements. 
Args: - data: The data to process (dict, list, str, or other) + data: Any data type (dict, list, str, or primitive) Returns: - Processed data with normalized strings + Data with all strings normalized according to defined rules """ if isinstance(data, dict): - return {k: _process_complex_strings(v) for k, v in data.items()} + return {key: _normalize_complex_data(value) for key, value in data.items()} + elif isinstance(data, list): - return [_process_complex_strings(item) for item in data] + return [_normalize_complex_data(item) for item in data] + elif isinstance(data, str): - # Step 1: Normalize line endings to \n - normalized = data.replace('\r\n', '\n').replace('\r', '\n') + return _normalize_string_content(data) - # Step 2: Handle escaped newlines with robust regex - normalized = re.sub(r'(? Dict[str, str]: diff --git a/flixopt/linear_converters.py b/flixopt/linear_converters.py index 3fd032632..b096921f0 100644 --- a/flixopt/linear_converters.py +++ b/flixopt/linear_converters.py @@ -165,7 +165,7 @@ def __init__( label, inputs=[P_el, Q_th], outputs=[], - conversion_factors=[{P_el.label: 1, Q_th.label: -specific_electricity_demand}], + conversion_factors=[{P_el.label: -1, Q_th.label: specific_electricity_demand}], on_off_parameters=on_off_parameters, meta_data=meta_data, ) @@ -177,12 +177,12 @@ def __init__( @property def specific_electricity_demand(self): - return -self.conversion_factors[0][self.Q_th.label] + return self.conversion_factors[0][self.Q_th.label] @specific_electricity_demand.setter def specific_electricity_demand(self, value): check_bounds(value, 'specific_electricity_demand', self.label_full, 0, 1) - self.conversion_factors[0][self.Q_th.label] = -value + self.conversion_factors[0][self.Q_th.label] = value @register_class_for_io diff --git a/flixopt/plotting.py b/flixopt/plotting.py index e4c440aaf..91fc5e7e7 100644 --- a/flixopt/plotting.py +++ b/flixopt/plotting.py @@ -209,7 +209,7 @@ def process_colors( def with_plotly( data: pd.DataFrame, - mode: Literal['bar', 'line', 'area'] = 'area', + style: Literal['stacked_bar', 'line', 'area', 'grouped_bar'] = 'stacked_bar', colors: ColorType = 'viridis', title: str = '', ylabel: str = '', @@ -222,7 +222,7 @@ def with_plotly( Args: data: A DataFrame containing the data to plot, where the index represents time (e.g., hours), and each column represents a separate data series. - mode: The plotting mode. Use 'bar' for stacked bar charts, 'line' for stepped lines, + style: The plotting style. Use 'stacked_bar' for stacked bar charts, 'line' for stepped lines, or 'area' for stacked area charts. colors: Color specification, can be: - A string with a colorscale name (e.g., 'viridis', 'plasma') @@ -235,7 +235,8 @@ def with_plotly( Returns: A Plotly figure object containing the generated plot. 
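+
+ Example:
+ A minimal, illustrative sketch of the new `style` parameter (the data is made up):
+
+ >>> df = pd.DataFrame({'CHP': [1.0, 2.0], 'Boiler': [3.0, 4.0]})
+ >>> fig = with_plotly(df, style='grouped_bar', title='Flow rates')
+ >>> fig.show()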
""" - assert mode in ['bar', 'line', 'area'], f"'mode' must be one of {['bar', 'line', 'area']}" + if style not in ['stacked_bar', 'line', 'area', 'grouped_bar']: + raise ValueError(f"'style' must be one of {['stacked_bar', 'line', 'area', 'grouped_bar']}") if data.empty: return go.Figure() @@ -243,23 +244,40 @@ def with_plotly( fig = fig if fig is not None else go.Figure() - if mode == 'bar': + if style == 'stacked_bar': for i, column in enumerate(data.columns): fig.add_trace( go.Bar( x=data.index, y=data[column], name=column, - marker=dict(color=processed_colors[i]), + marker=dict(color=processed_colors[i], + line=dict(width=0, color='rgba(0,0,0,0)')), #Transparent line with 0 width ) ) fig.update_layout( - barmode='relative' if mode == 'bar' else None, + barmode='relative', bargap=0, # No space between bars - bargroupgap=0, # No space between groups of bars + bargroupgap=0, # No space between grouped bars ) - elif mode == 'line': + if style == 'grouped_bar': + for i, column in enumerate(data.columns): + fig.add_trace( + go.Bar( + x=data.index, + y=data[column], + name=column, + marker=dict(color=processed_colors[i]) + ) + ) + + fig.update_layout( + barmode='group', + bargap=0.2, # No space between bars + bargroupgap=0, # space between grouped bars + ) + elif style == 'line': for i, column in enumerate(data.columns): fig.add_trace( go.Scatter( @@ -270,7 +288,7 @@ def with_plotly( line=dict(shape='hv', color=processed_colors[i]), ) ) - elif mode == 'area': + elif style == 'area': data = data.copy() data[(data > -1e-5) & (data < 1e-5)] = 0 # Preventing issues with plotting # Split columns into positive, negative, and mixed categories @@ -330,14 +348,6 @@ def with_plotly( plot_bgcolor='rgba(0,0,0,0)', # Transparent background paper_bgcolor='rgba(0,0,0,0)', # Transparent paper background font=dict(size=14), # Increase font size for better readability - legend=dict( - orientation='h', # Horizontal legend - yanchor='bottom', - y=-0.3, # Adjusts how far below the plot it appears - xanchor='center', - x=0.5, - title_text=None, # Removes legend title for a cleaner look - ), ) return fig @@ -345,7 +355,7 @@ def with_plotly( def with_matplotlib( data: pd.DataFrame, - mode: Literal['bar', 'line'] = 'bar', + style: Literal['stacked_bar', 'line'] = 'stacked_bar', colors: ColorType = 'viridis', title: str = '', ylabel: str = '', @@ -360,7 +370,7 @@ def with_matplotlib( Args: data: A DataFrame containing the data to plot. The index should represent time (e.g., hours), and each column represents a separate data series. - mode: Plotting mode. Use 'bar' for stacked bar charts or 'line' for stepped lines. + style: Plotting style. Use 'stacked_bar' for stacked bar charts or 'line' for stepped lines. colors: Color specification, can be: - A string with a colormap name (e.g., 'viridis', 'plasma') - A list of color strings (e.g., ['#ff0000', '#00ff00']) @@ -376,19 +386,18 @@ def with_matplotlib( A tuple containing the Matplotlib figure and axes objects used for the plot. Notes: - - If `mode` is 'bar', bars are stacked for both positive and negative values. + - If `style` is 'stacked_bar', bars are stacked for both positive and negative values. Negative values are stacked separately without extra labels in the legend. - - If `mode` is 'line', stepped lines are drawn for each data series. - - The legend is placed below the plot to accommodate multiple data series. + - If `style` is 'line', stepped lines are drawn for each data series. 
""" - assert mode in ['bar', 'line'], f"'mode' must be one of {['bar', 'line']} for matplotlib" + assert style in ['stacked_bar', 'line'], f"'style' must be one of {['stacked_bar', 'line']} for matplotlib" if fig is None or ax is None: fig, ax = plt.subplots(figsize=figsize) processed_colors = ColorProcessor(engine='matplotlib').process_colors(colors, list(data.columns)) - if mode == 'bar': + if style == 'stacked_bar': cumulative_positive = np.zeros(len(data)) cumulative_negative = np.zeros(len(data)) width = data.index.to_series().diff().dropna().min() # Minimum time difference @@ -419,7 +428,7 @@ def with_matplotlib( ) cumulative_negative += negative_values.values - elif mode == 'line': + elif style == 'line': for i, column in enumerate(data.columns): ax.step(data.index, data[column], where='post', color=processed_colors[i], label=column) @@ -1119,7 +1128,6 @@ def create_pie_trace(data_series, side): paper_bgcolor='rgba(0,0,0,0)', # Transparent paper background font=dict(size=14), margin=dict(t=80, b=50, l=30, r=30), - legend=dict(orientation='h', yanchor='bottom', y=-0.2, xanchor='center', x=0.5, font=dict(size=12)), ) return fig diff --git a/flixopt/results.py b/flixopt/results.py index 223e3708e..bd0abaa5e 100644 --- a/flixopt/results.py +++ b/flixopt/results.py @@ -2,7 +2,8 @@ import json import logging import pathlib -from typing import TYPE_CHECKING, Dict, List, Literal, Optional, Tuple, Union +import warnings +from typing import TYPE_CHECKING, Any, Dict, List, Literal, Optional, Tuple, Union import linopy import matplotlib.pyplot as plt @@ -14,7 +15,8 @@ from . import io as fx_io from . import plotting -from .core import TimeSeriesCollection +from .core import DataConverter, TimeSeriesCollection +from .flow_system import FlowSystem if TYPE_CHECKING: import pyvis @@ -25,6 +27,11 @@ logger = logging.getLogger('flixopt') +class _FlowSystemRestorationError(Exception): + """Exception raised when a FlowSystem cannot be restored from dataset.""" + pass + + class CalculationResults: """Results container for Calculation results. @@ -37,7 +44,7 @@ class CalculationResults: Attributes: solution (xr.Dataset): Dataset containing optimization results. - flow_system (xr.Dataset): Dataset containing the flow system. + flow_system_data (xr.Dataset): Dataset containing the flow system. summary (Dict): Information about the calculation. name (str): Name identifier for the calculation. model (linopy.Model): The optimization model (if available). 
@@ -92,7 +99,7 @@ def from_file(cls, folder: Union[str, pathlib.Path], name: str): return cls( solution=fx_io.load_dataset_from_netcdf(paths.solution), - flow_system=fx_io.load_dataset_from_netcdf(paths.flow_system), + flow_system_data=fx_io.load_dataset_from_netcdf(paths.flow_system), name=name, folder=folder, model=model, @@ -118,7 +125,7 @@ def from_calculation(cls, calculation: 'Calculation'): """ return cls( solution=calculation.model.solution, - flow_system=calculation.flow_system.as_dataset(constants_in_dataset=True), + flow_system_data=calculation.flow_system.as_dataset(constants_in_dataset=True), summary=calculation.summary, model=calculation.model, name=calculation.name, @@ -128,47 +135,83 @@ def __init__( self, solution: xr.Dataset, - flow_system: xr.Dataset, + flow_system_data: xr.Dataset, name: str, summary: Dict, folder: Optional[pathlib.Path] = None, model: Optional[linopy.Model] = None, + **kwargs, # To accept the old "flow_system" parameter ): """ Args: solution: The solution of the optimization. - flow_system: The flow_system that was used to create the calculation as a datatset. + flow_system_data: The flow_system that was used to create the calculation, as a dataset. name: The name of the calculation. summary: Information about the calculation, folder: The folder where the results are saved. model: The linopy model that was used to solve the calculation. + Deprecated: + flow_system: Use flow_system_data instead. """ + # Handle the old "flow_system" parameter for backward compatibility + if 'flow_system' in kwargs and flow_system_data is None: + flow_system_data = kwargs.pop('flow_system') + warnings.warn( + "The 'flow_system' parameter is deprecated. Use 'flow_system_data' instead. " + "Access is now via '.flow_system_data', while '.flow_system' returns the restored FlowSystem.", + DeprecationWarning, + stacklevel=2, + ) + self.solution = solution - self.flow_system = flow_system + self.flow_system_data = flow_system_data self.summary = summary self.name = name self.model = model self.folder = pathlib.Path(folder) if folder is not None else pathlib.Path.cwd() / 'results' self.components = { - label: ComponentResults.from_json(self, infos) for label, infos in self.solution.attrs['Components'].items() + label: ComponentResults(self, **infos) for label, infos in self.solution.attrs['Components'].items() } - self.buses = {label: BusResults.from_json(self, infos) for label, infos in self.solution.attrs['Buses'].items()} + self.buses = {label: BusResults(self, **infos) for label, infos in self.solution.attrs['Buses'].items()} self.effects = { - label: EffectResults.from_json(self, infos) for label, infos in self.solution.attrs['Effects'].items() + label: EffectResults(self, **infos) for label, infos in self.solution.attrs['Effects'].items() } + if 'Flows' not in self.solution.attrs: + warnings.warn( + 'No data about flows found in the results. This data is only included since v2.2.0. Some functionality ' + 'is not available.
We recommend evaluating your results with a version <2.2.0.', + stacklevel=2, + ) + self.flows = {} + else: + self.flows = { + label: FlowResults(self, **infos) for label, infos in self.solution.attrs.get('Flows', {}).items() + } + self.timesteps_extra = self.solution.indexes['time'] self.hours_per_timestep = TimeSeriesCollection.calculate_hours_per_timestep(self.timesteps_extra) + self.scenarios = self.solution.indexes['scenario'] if 'scenario' in self.solution.indexes else None + + self._effect_share_factors = None + self._flow_system = None + + self._flow_rates = None + self._flow_hours = None + self._sizes = None + self._effects_per_component = {'operation': None, 'invest': None, 'total': None} - def __getitem__(self, key: str) -> Union['ComponentResults', 'BusResults', 'EffectResults']: + def __getitem__(self, key: str) -> Union['ComponentResults', 'BusResults', 'EffectResults', 'FlowResults']: if key in self.components: return self.components[key] if key in self.buses: return self.buses[key] if key in self.effects: return self.effects[key] + if key in self.flows: + return self.flows[key] raise KeyError(f'No element with label {key} found.') @property @@ -195,20 +238,365 @@ def constraints(self) -> linopy.Constraints: raise ValueError('The linopy model is not available.') return self.model.constraints + @property + def effect_share_factors(self): + if self._effect_share_factors is None: + effect_share_factors = self.flow_system.effects.calculate_effect_share_factors() + self._effect_share_factors = {'operation': effect_share_factors[0], + 'invest': effect_share_factors[1]} + return self._effect_share_factors + + @property + def flow_system(self) -> 'FlowSystem': + """The restored flow_system that was used to create the calculation. + Contains all input parameters.""" + if self._flow_system is None: + from . import FlowSystem + current_logger_level = logger.getEffectiveLevel() + try: + logger.setLevel(logging.CRITICAL) + self._flow_system = FlowSystem.from_dataset(self.flow_system_data) + self._flow_system._connect_network() + except Exception as e: + logger.critical(f'Not able to restore FlowSystem from dataset. Some functionality is not available. {e}') + raise _FlowSystemRestorationError(f'Not able to restore FlowSystem from dataset. {e}') from e + finally: + logger.setLevel(current_logger_level) + return self._flow_system + def filter_solution( self, variable_dims: Optional[Literal['scalar', 'time', 'scenario', 'timeonly', 'scenarioonly']] = None, element: Optional[str] = None, timesteps: Optional[pd.DatetimeIndex] = None, scenarios: Optional[pd.Index] = None, contains: Optional[Union[str, List[str]]] = None, startswith: Optional[Union[str, List[str]]] = None, ) -> xr.Dataset: """ Filter the solution to a specific variable dimension and element. If no element is specified, all elements are included. Args: - variable_dims: The dimension of the variables to filter for. + variable_dims: The dimensionality of the variables to select: + - 'scalar': Get scalar variables (without dimensions) + - 'time': Get time-dependent variables (with a time dimension) + - 'scenario': Get scenario-dependent variables (with a scenario dimension) + - 'timeonly': Get time-dependent variables (with ONLY a time dimension) + - 'scenarioonly': Get scenario-dependent variables (with ONLY a scenario dimension) element: The element to filter for. + timesteps: Optional time indexes to select.
Can be: + - pd.DatetimeIndex: Multiple timesteps + - str/pd.Timestamp: Single timestep + Defaults to all available timesteps. + scenarios: Optional scenario indexes to select. Can be: + - pd.Index: Multiple scenarios + - str/int: Single scenario (int is treated as a label, not an index position) + Defaults to all available scenarios. + contains: Filter variables that contain this string or strings. + If a list is provided, variables must contain ALL strings in the list. + startswith: Filter variables that start with this string or strings. + If a list is provided, variables must start with ANY of the strings in the list. + """ + return filter_dataset( + self.solution if element is None else self[element].solution, + variable_dims=variable_dims, + timesteps=timesteps, + scenarios=scenarios, + contains=contains, + startswith=startswith, + ) + + def effects_per_component(self, mode: Literal['operation', 'invest', 'total'] = 'total') -> xr.Dataset: + """Returns a dataset containing effect totals for each component (including its flows). + + Args: + mode: Which effects to include: 'operation', 'invest', or 'total'. + + Returns: + An xarray Dataset with an additional component dimension and effects as variables. + """ + if mode not in ['operation', 'invest', 'total']: + raise ValueError(f'Invalid mode {mode}') + if self._effects_per_component[mode] is None: + self._effects_per_component[mode] = self._create_effects_dataset(mode) + return self._effects_per_component[mode] + + def flow_rates( + self, + start: Optional[Union[str, List[str]]] = None, + end: Optional[Union[str, List[str]]] = None, + component: Optional[Union[str, List[str]]] = None, + ) -> xr.DataArray: + """Returns a DataArray containing the flow rates of each Flow. + + Args: + start: Optional source node(s) to filter by. Can be a single node name or a list of names. + end: Optional destination node(s) to filter by. Can be a single node name or a list of names. + component: Optional component(s) to filter by. Can be a single component name or a list of names. + + Further usage: + Convert the dataarray to a dataframe: + >>> results.flow_rates().to_pandas() + Get the max or min over time: + >>> results.flow_rates().max('time') + Sum up the flow rates of flows with the same start and end: + >>> results.flow_rates(end='Fernwärme').groupby('start').sum(dim='flow') + To recombine filtered dataarrays, use `xr.concat` with dim 'flow': + >>> xr.concat([results.flow_rates(start='Fernwärme'), results.flow_rates(end='Fernwärme')], dim='flow') + """ + if self._flow_rates is None: + self._flow_rates = self._assign_flow_coords( + xr.concat([flow.flow_rate.rename(flow.label) for flow in self.flows.values()], + dim=pd.Index(self.flows.keys(), name='flow')) + ).rename('flow_rates') + filters = {k: v for k, v in {'start': start, 'end': end, 'component': component}.items() if v is not None} + return filter_dataarray_by_coord(self._flow_rates, **filters) + + def flow_hours( + self, + start: Optional[Union[str, List[str]]] = None, + end: Optional[Union[str, List[str]]] = None, + component: Optional[Union[str, List[str]]] = None, + ) -> xr.DataArray: + """Returns a DataArray containing the flow hours of each Flow. + + Flow hours represent the total energy/material transferred over time, + calculated by multiplying flow rates by the duration of each timestep. + + Args: + start: Optional source node(s) to filter by. Can be a single node name or a list of names. + end: Optional destination node(s) to filter by. Can be a single node name or a list of names.
+ component: Optional component(s) to filter by. Can be a single component name or a list of names. + + Further usage: + Convert the dataarray to a dataframe: + >>> results.flow_hours().to_pandas() + Sum up the flow hours over time: + >>> results.flow_hours().sum('time') + Sum up the flow hours of flows with the same start and end: + >>> results.flow_hours(end='Fernwärme').groupby('start').sum(dim='flow') + To recombine filtered dataarrays, use `xr.concat` with dim 'flow': + >>> xr.concat([results.flow_hours(start='Fernwärme'), results.flow_hours(end='Fernwärme')], dim='flow') + + """ + if self._flow_hours is None: + self._flow_hours = (self.flow_rates() * self.hours_per_timestep).rename('flow_hours') + filters = {k: v for k, v in {'start': start, 'end': end, 'component': component}.items() if v is not None} + return filter_dataarray_by_coord(self._flow_hours, **filters) + + def sizes( + self, + start: Optional[Union[str, List[str]]] = None, + end: Optional[Union[str, List[str]]] = None, + component: Optional[Union[str, List[str]]] = None + ) -> xr.DataArray: + """Returns a DataArray with the sizes of the Flows. + Args: + start: Optional source node(s) to filter by. Can be a single node name or a list of names. + end: Optional destination node(s) to filter by. Can be a single node name or a list of names. + component: Optional component(s) to filter by. Can be a single component name or a list of names. + + Further usage: + Convert the dataarray to a dataframe: + >>> results.sizes().to_pandas() + To recombine filtered dataarrays, use `xr.concat` with dim 'flow': + >>> xr.concat([results.sizes(start='Fernwärme'), results.sizes(end='Fernwärme')], dim='flow') + + """ + if self._sizes is None: + self._sizes = self._assign_flow_coords( + xr.concat([flow.size.rename(flow.label) for flow in self.flows.values()], + dim=pd.Index(self.flows.keys(), name='flow')) + ).rename('flow_sizes') + filters = {k: v for k, v in {'start': start, 'end': end, 'component': component}.items() if v is not None} + return filter_dataarray_by_coord(self._sizes, **filters) + + def _assign_flow_coords(self, da: xr.DataArray): + # Add start and end coordinates + da = da.assign_coords({ + 'start': ('flow', [flow.start for flow in self.flows.values()]), + 'end': ('flow', [flow.end for flow in self.flows.values()]), + 'component': ('flow', [flow.component for flow in self.flows.values()]), + }) + + # Ensure flow is the last dimension if needed + existing_dims = [d for d in da.dims if d != 'flow'] + da = da.transpose(*(existing_dims + ['flow'])) + return da + + def get_effect_shares( + self, + element: str, + effect: str, + mode: Optional[Literal['operation', 'invest']] = None, + include_flows: bool = False + ) -> xr.Dataset: + """Retrieves individual effect shares for a specific element and effect. + Either for operation, investment, or both modes combined. + Only includes the direct shares. + + Args: + element: The element identifier for which to retrieve effect shares. + effect: The effect identifier for which to retrieve shares. + mode: Optional. The mode to retrieve shares for. Can be 'operation', 'invest', + or None to retrieve both. Defaults to None. + include_flows: Whether to also include the shares of the flows connected to the + element (only valid if the element is a component). + + Returns: + An xarray Dataset containing the requested effect shares. If mode is None, + returns a merged Dataset containing both operation and investment shares. + + Raises: + ValueError: If the specified effect is not available or if mode is invalid.
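+
+ Example:
+ An illustrative sketch (the labels are made up):
+
+ >>> shares = results.get_effect_shares('Boiler', 'costs', mode='operation')
+ >>> # -> Dataset containing 'Boiler->costs(operation)' if that share exists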
+ """ + if effect not in self.effects: + raise ValueError(f'Effect {effect} is not available.') + + if mode is None: + return xr.merge([self.get_effect_shares(element=element, effect=effect, mode='operation', include_flows=include_flows), + self.get_effect_shares(element=element, effect=effect, mode='invest', include_flows=include_flows)]) + + if mode not in ['operation', 'invest']: + raise ValueError(f'Mode {mode} is not available. Choose between "operation" and "invest".') + + ds = xr.Dataset() + + label = f'{element}->{effect}({mode})' + if label in self.solution: + ds = xr.Dataset({label: self.solution[label]}) + + if include_flows: + if element not in self.components: + raise ValueError(f'Only use Components when retrieving Effects including flows. Got {element}') + flows = [label.split('|')[0] for label in self.components[element].inputs + self.components[element].outputs] + return xr.merge( + [ds] + [self.get_effect_shares(element=flow, effect=effect, mode=mode, include_flows=False) + for flow in flows] + ) + + return ds + + def _compute_effect_total( + self, + element: str, + effect: str, + mode: Literal['operation', 'invest', 'total'] = 'total', + include_flows: bool = False, + ) -> xr.DataArray: + """Calculates the total effect for a specific element and effect. + + This method computes the total direct and indirect effects for a given element + and effect, considering the conversion factors between different effects. + + Args: + element: The element identifier for which to calculate total effects. + effect: The effect identifier to calculate. + mode: The calculation mode. Options are: + 'operation': Returns operation-specific effects. + 'invest': Returns investment-specific effects. + 'total': Returns the sum of operation effects (across all timesteps) + and investment effects. Defaults to 'total'. + include_flows: Whether to include effects from flows connected to this element. + + Returns: + An xarray DataArray containing the total effects, named with pattern + '{element}->{effect}' for mode='total' or '{element}->{effect}({mode})' + for other modes. + + Raises: + ValueError: If the specified effect is not available. 
""" - if element is not None: - return filter_dataset(self[element].solution, variable_dims) - return filter_dataset(self.solution, variable_dims) + if effect not in self.effects: + raise ValueError(f'Effect {effect} is not available.') + + if mode == 'total': + operation = self._compute_effect_total(element=element, effect=effect, mode='operation', include_flows=include_flows) + invest = self._compute_effect_total(element=element, effect=effect, mode='invest', include_flows=include_flows) + if invest.isnull().all() and operation.isnull().all(): + return xr.DataArray(np.nan) + if operation.isnull().all(): + return invest.rename(f'{element}->{effect}') + operation = operation.sum('time') + if invest.isnull().all(): + return operation.rename(f'{element}->{effect}') + if 'time' in operation.indexes: + operation = operation.sum('time') + return invest + operation + + total = xr.DataArray(0) + share_exists = False + + relevant_conversion_factors = { + key[0]: value for key, value in self.effect_share_factors[mode].items() if key[1] == effect + } + relevant_conversion_factors[effect] = 1 # Share to itself is 1 + + for target_effect, conversion_factor in relevant_conversion_factors.items(): + label = f'{element}->{target_effect}({mode})' + if label in self.solution: + share_exists = True + da = self.solution[label] + total = da * conversion_factor + total + + if include_flows: + if element not in self.components: + raise ValueError(f'Only use Components when retrieving Effects including flows. Got {element}') + flows = [label.split('|')[0] for label in + self.components[element].inputs + self.components[element].outputs] + for flow in flows: + label = f'{flow}->{target_effect}({mode})' + if label in self.solution: + share_exists = True + da = self.solution[label] + total = da * conversion_factor + total + if not share_exists: + total = xr.DataArray(np.nan) + return total.rename(f'{element}->{effect}({mode})') + + def _create_effects_dataset(self, mode: Literal['operation', 'invest', 'total'] = 'total') -> xr.Dataset: + """Creates a dataset containing effect totals for all components (including their flows). + The dataset does contain the direct as well as the indirect effects of each component. + + Args: + mode: The calculation mode ('operation', 'invest', or 'total'). + + Returns: + An xarray Dataset with components as dimension and effects as variables. 
+ """ + # Create an empty dataset + ds = xr.Dataset() + + # Add each effect as a variable to the dataset + for effect in self.effects: + # Create a list of DataArrays, one for each component + component_arrays = [ + self._compute_effect_total(element=component, effect=effect, mode=mode, include_flows=True).expand_dims( + component=[component] + ) # Add component dimension to each array + for component in list(self.components) + ] + + # Combine all components into one DataArray for this effect + if component_arrays: + effect_array = xr.concat(component_arrays, dim='component', coords='minimal') + # Add this effect as a variable to the dataset + ds[effect] = effect_array + + # For now include a test to ensure correctness + suffix = { + 'operation': '(operation)|total_per_timestep', + 'invest': '(invest)|total', + 'total': '|total', + } + for effect in self.effects: + label = f'{effect}{suffix[mode]}' + computed = ds[effect].sum('component') + found = self.solution[label] + if not np.allclose(computed.values, found.fillna(0).values): + logger.critical( + f'Results for {effect}({mode}) in effects_dataset doesnt match {label}\n{computed=}\n, {found=}' + ) + + return ds def plot_heatmap( self, @@ -219,10 +607,32 @@ def plot_heatmap( save: Union[bool, pathlib.Path] = False, show: bool = True, engine: plotting.PlottingEngine = 'plotly', + scenario: Optional[Union[str, int]] = None, ) -> Union[plotly.graph_objs.Figure, Tuple[plt.Figure, plt.Axes]]: + """ + Plots a heatmap of the solution of a variable. + + Args: + variable_name: The name of the variable to plot. + heatmap_timeframes: The timeframes to use for the heatmap. + heatmap_timesteps_per_frame: The timesteps per frame to use for the heatmap. + color_map: The color map to use for the heatmap. + save: Whether to save the plot or not. If a path is provided, the plot will be saved at that location. + show: Whether to show the plot or not. + engine: The engine to use for plotting. Can be either 'plotly' or 'matplotlib'. + scenario: The scenario to plot. Defaults to the first scenario. 
Has no effect without scenarios present. + """ + dataarray = self.solution[variable_name] + + scenario_suffix = '' + if 'scenario' in dataarray.indexes: + chosen_scenario = scenario or self.scenarios[0] + dataarray = dataarray.sel(scenario=chosen_scenario).drop_vars('scenario') + scenario_suffix = f'--{chosen_scenario}' + return plot_heatmap( - dataarray=self.solution[variable_name], - name=variable_name, + dataarray=dataarray, + name=f'{variable_name}{scenario_suffix}', folder=self.folder, heatmap_timeframes=heatmap_timeframes, heatmap_timesteps_per_frame=heatmap_timesteps_per_frame, @@ -244,16 +654,9 @@ def plot_network( show: bool = False, ) -> 'pyvis.network.Network': """See flixopt.flow_system.FlowSystem.plot_network""" - try: - from .flow_system import FlowSystem - - flow_system = FlowSystem.from_dataset(self.flow_system) - except Exception as e: - logger.critical(f'Could not reconstruct the flow_system from dataset: {e}') - return None if path is None: path = self.folder / f'{self.name}--network.html' - return flow_system.plot_network(controls=controls, path=path, show=show) + return self.flow_system.plot_network(controls=controls, path=path, show=show) def to_file( self, @@ -286,7 +689,7 @@ paths = fx_io.CalculationResultsPaths(folder, name) fx_io.save_dataset_to_netcdf(self.solution, paths.solution, compression=compression) - fx_io.save_dataset_to_netcdf(self.flow_system, paths.flow_system, compression=compression) + fx_io.save_dataset_to_netcdf(self.flow_system_data, paths.flow_system, compression=compression) with open(paths.summary, 'w', encoding='utf-8') as f: yaml.dump(self.summary, f, allow_unicode=True, sort_keys=False, indent=4, width=1000) @@ -307,10 +710,6 @@ class _ElementResults: - @classmethod - def from_json(cls, calculation_results, json_data: Dict) -> '_ElementResults': - return cls(calculation_results, json_data['label'], json_data['variables'], json_data['constraints']) - def __init__( self, calculation_results: CalculationResults, label: str, variables: List[str], constraints: List[str] ): @@ -345,28 +744,49 @@ def constraints(self) -> linopy.Constraints: raise ValueError('The linopy model is not available.') return self._calculation_results.model.constraints[self._constraint_names] - def filter_solution(self, variable_dims: Optional[Literal['scalar', 'time']] = None) -> xr.Dataset: + def filter_solution( + self, + variable_dims: Optional[Literal['scalar', 'time', 'scenario', 'timeonly', 'scenarioonly']] = None, + timesteps: Optional[pd.DatetimeIndex] = None, + scenarios: Optional[pd.Index] = None, + contains: Optional[Union[str, List[str]]] = None, + startswith: Optional[Union[str, List[str]]] = None, + ) -> xr.Dataset: """ - Filter the solution of the element by dimension. + Filter the solution of this element by dimension, index, and variable name. Args: - variable_dims: The dimension of the variables to filter for. + variable_dims: The dimensionality of the variables to select: + - 'scalar': Get scalar variables (without dimensions) + - 'time': Get time-dependent variables (with a time dimension) + - 'scenario': Get scenario-dependent variables (with a scenario dimension) + - 'timeonly': Get time-dependent variables (with ONLY a time dimension) + - 'scenarioonly': Get scenario-dependent variables (with ONLY a scenario dimension) + timesteps: Optional time indexes to select.
Can be: + - pd.DatetimeIndex: Multiple timesteps + - str/pd.Timestamp: Single timestep + Defaults to all available timesteps. + scenarios: Optional scenario indexes to select. Can be: + - pd.Index: Multiple scenarios + - str/int: Single scenario (int is treated as a label, not an index position) + Defaults to all available scenarios. + contains: Filter variables that contain this string or strings. + If a list is provided, variables must contain ALL strings in the list. + startswith: Filter variables that start with this string or strings. + If a list is provided, variables must start with ANY of the strings in the list. """ - return filter_dataset(self.solution, variable_dims) + return filter_dataset( + self.solution, + variable_dims=variable_dims, + timesteps=timesteps, + scenarios=scenarios, + contains=contains, + startswith=startswith, + ) class _NodeResults(_ElementResults): - @classmethod - def from_json(cls, calculation_results, json_data: Dict) -> '_NodeResults': - return cls( - calculation_results, - json_data['label'], - json_data['variables'], - json_data['constraints'], - json_data['inputs'], - json_data['outputs'], - ) - def __init__( self, calculation_results: CalculationResults, @@ -375,10 +795,12 @@ def __init__( constraints: List[str], inputs: List[str], outputs: List[str], + flows: List[str], ): super().__init__(calculation_results, label, variables, constraints) self.inputs = inputs self.outputs = outputs + self.flows = flows def plot_node_balance( self, @@ -386,28 +808,47 @@ def plot_node_balance( show: bool = True, colors: plotting.ColorType = 'viridis', engine: plotting.PlottingEngine = 'plotly', + scenario: Optional[Union[str, int]] = None, + mode: Literal['flow_rate', 'flow_hours'] = 'flow_rate', + style: Literal['area', 'stacked_bar', 'line'] = 'stacked_bar', + drop_suffix: bool = True, ) -> Union[plotly.graph_objs.Figure, Tuple[plt.Figure, plt.Axes]]: """ Plots the node balance of the Component or Bus. Args: save: Whether to save the plot or not. If a path is provided, the plot will be saved at that location. show: Whether to show the plot or not. + colors: The colors to use for the plot. See `flixopt.plotting.ColorType` for options. engine: The engine to use for plotting. Can be either 'plotly' or 'matplotlib'. + scenario: The scenario to plot. Defaults to the first scenario. Has no effect without scenarios present + mode: The mode to use for the dataset. Can be 'flow_rate' or 'flow_hours'. + - 'flow_rate': Returns the flow_rates of the Node. + - 'flow_hours': Returns the flow_hours of the Node. [flow_hours(t) = flow_rate(t) * dt(t)]. Renames suffixes to |flow_hours. + drop_suffix: Whether to drop the suffix from the variable names. 
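+
+ Example:
+ An illustrative sketch (`element` stands for any ComponentResults/BusResults instance;
+ the variable name is made up):
+
+ >>> element.filter_solution(variable_dims='time', contains='flow_rate')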
""" + ds = self.node_balance(with_last_timestep=True, mode=mode, drop_suffix=drop_suffix) + + title = f'{self.label} (flow rates)' if mode == 'flow_rate' else f'{self.label} (flow hours)' + + if 'scenario' in ds.indexes: + chosen_scenario = scenario or self._calculation_results.scenarios[0] + ds = ds.sel(scenario=chosen_scenario).drop_vars('scenario') + title = f'{title} - {chosen_scenario}' + if engine == 'plotly': figure_like = plotting.with_plotly( - self.node_balance(with_last_timestep=True).to_dataframe(), + ds.to_dataframe(), colors=colors, - mode='area', - title=f'Flow rates of {self.label}', + style=style, + title=title, ) default_filetype = '.html' elif engine == 'matplotlib': figure_like = plotting.with_matplotlib( - self.node_balance(with_last_timestep=True).to_dataframe(), + ds.to_dataframe(), colors=colors, - mode='bar', - title=f'Flow rates of {self.label}', + style=style, + title=title, ) default_filetype = '.png' else: @@ -415,7 +856,7 @@ def plot_node_balance( return plotting.export_figure( figure_like=figure_like, - default_path=self._calculation_results.folder / f'{self.label} (flow rates)', + default_path=self._calculation_results.folder / title, default_filetype=default_filetype, user_path=None if isinstance(save, bool) else pathlib.Path(save), show=show, @@ -430,6 +871,7 @@ def plot_node_balance_pie( save: Union[bool, pathlib.Path] = False, show: bool = True, engine: plotting.PlottingEngine = 'plotly', + scenario: Optional[Union[str, int]] = None, ) -> plotly.graph_objects.Figure: """ Plots a pie chart of the flow hours of the inputs and outputs of buses or components. @@ -441,32 +883,40 @@ def plot_node_balance_pie( save: Whether to save the figure. show: Whether to show the figure. engine: Plotting engine to use. Only 'plotly' is implemented atm. + scenario: If scenarios are present: The scenario to plot. If None, the first scenario is used. + drop_suffix: Whether to drop the suffix from the variable names. 
""" - inputs = ( - sanitize_dataset( - ds=self.solution[self.inputs], - threshold=1e-5, - drop_small_vars=True, - zero_small_values=True, - ) - * self._calculation_results.hours_per_timestep + inputs = sanitize_dataset( + ds=self.solution[self.inputs] * self._calculation_results.hours_per_timestep, + threshold=1e-5, + drop_small_vars=True, + zero_small_values=True, + drop_suffix='|', ) - outputs = ( - sanitize_dataset( - ds=self.solution[self.outputs], - threshold=1e-5, - drop_small_vars=True, - zero_small_values=True, - ) - * self._calculation_results.hours_per_timestep + outputs = sanitize_dataset( + ds=self.solution[self.outputs] * self._calculation_results.hours_per_timestep, + threshold=1e-5, + drop_small_vars=True, + zero_small_values=True, + drop_suffix='|', ) + inputs = inputs.sum('time') + outputs = outputs.sum('time') + + title = f'{self.label} (total flow hours)' + + if 'scenario' in inputs.indexes: + chosen_scenario = scenario or self._calculation_results.scenarios[0] + inputs = inputs.sel(scenario=chosen_scenario).drop_vars('scenario') + outputs = outputs.sel(scenario=chosen_scenario).drop_vars('scenario') + title = f'{title} - {chosen_scenario}' if engine == 'plotly': figure_like = plotting.dual_pie_with_plotly( - inputs.to_dataframe().sum(), - outputs.to_dataframe().sum(), + data_left=inputs.to_pandas(), + data_right=outputs.to_pandas(), colors=colors, - title=f'Flow hours of {self.label}', + title=title, text_info=text_info, subtitles=('Inputs', 'Outputs'), legend_title='Flows', @@ -476,10 +926,10 @@ def plot_node_balance_pie( elif engine == 'matplotlib': logger.debug('Parameter text_info is not supported for matplotlib') figure_like = plotting.dual_pie_with_matplotlib( - inputs.to_dataframe().sum(), - outputs.to_dataframe().sum(), + data_left=inputs.to_pandas(), + data_right=outputs.to_pandas(), colors=colors, - title=f'Total flow hours of {self.label}', + title=title, subtitles=('Inputs', 'Outputs'), legend_title='Flows', lower_percentage_group=lower_percentage_group, @@ -490,7 +940,7 @@ def plot_node_balance_pie( return plotting.export_figure( figure_like=figure_like, - default_path=self._calculation_results.folder / f'{self.label} (total flow hours)', + default_path=self._calculation_results.folder / title, default_filetype=default_filetype, user_path=None if isinstance(save, bool) else pathlib.Path(save), show=show, @@ -503,9 +953,25 @@ def node_balance( negate_outputs: bool = False, threshold: Optional[float] = 1e-5, with_last_timestep: bool = False, + mode: Literal['flow_rate', 'flow_hours'] = 'flow_rate', + drop_suffix: bool = False, ) -> xr.Dataset: - return sanitize_dataset( - ds=self.solution[self.inputs + self.outputs], + """ + Returns a dataset with the node balance of the Component or Bus. + Args: + negate_inputs: Whether to negate the input flow_rates of the Node. + negate_outputs: Whether to negate the output flow_rates of the Node. + threshold: The threshold for small values. Variables with all values below the threshold are dropped. + with_last_timestep: Whether to include the last timestep in the dataset. + mode: The mode to use for the dataset. Can be 'flow_rate' or 'flow_hours'. + - 'flow_rate': Returns the flow_rates of the Node. + - 'flow_hours': Returns the flow_hours of the Node. [flow_hours(t) = flow_rate(t) * dt(t)]. Renames suffixes to |flow_hours. + drop_suffix: Whether to drop the suffix from the variable names. 
+ """ + ds = self.solution[self.inputs + self.outputs] + + ds = sanitize_dataset( + ds=ds, threshold=threshold, timesteps=self._calculation_results.timesteps_extra if with_last_timestep else None, negate=( @@ -517,8 +983,15 @@ def node_balance( if negate_inputs else None ), + drop_suffix='|' if drop_suffix else None, ) + if mode == 'flow_hours': + ds = ds * self._calculation_results.hours_per_timestep + ds = ds.rename_vars({var: var.replace('flow_rate', 'flow_hours') for var in ds.data_vars}) + + return ds + class BusResults(_NodeResults): """Results for a Bus""" @@ -548,6 +1021,8 @@ def plot_charge_state( show: bool = True, colors: plotting.ColorType = 'viridis', engine: plotting.PlottingEngine = 'plotly', + style: Literal['area', 'stacked_bar', 'line'] = 'stacked_bar', + scenario: Optional[Union[str, int]] = None, ) -> plotly.graph_objs.Figure: """ Plots the charge state of a Storage. @@ -556,37 +1031,56 @@ def plot_charge_state( show: Whether to show the plot or not. colors: The c engine: Plotting engine to use. Only 'plotly' is implemented atm. + style: The plotting mode for the flow_rate + scenario: The scenario to plot. Defaults to the first scenario. Has no effect without scenarios present Raises: ValueError: If the Component is not a Storage. """ - if engine != 'plotly': - raise NotImplementedError( - f'Plotting engine "{engine}" not implemented for ComponentResults.plot_charge_state.' - ) - if not self.is_storage: raise ValueError(f'Cant plot charge_state. "{self.label}" is not a storage') - fig = plotting.with_plotly( - self.node_balance(with_last_timestep=True).to_dataframe(), - colors=colors, - mode='area', - title=f'Operation Balance of {self.label}', - ) + ds = self.node_balance(with_last_timestep=True) + charge_state = self.charge_state + + scenario_suffix = '' + if 'scenario' in ds.indexes: + chosen_scenario = scenario or self._calculation_results.scenarios[0] + ds = ds.sel(scenario=chosen_scenario).drop_vars('scenario') + charge_state = charge_state.sel(scenario=chosen_scenario).drop_vars('scenario') + scenario_suffix = f'--{chosen_scenario}' + if engine == 'plotly': + fig = plotting.with_plotly( + ds.to_dataframe(), + colors=colors, + style=style, + title=f'Operation Balance of {self.label}{scenario_suffix}', + ) - # TODO: Use colors for charge state? + # TODO: Use colors for charge state? 
- charge_state = self.charge_state.to_dataframe() - fig.add_trace( - plotly.graph_objs.Scatter( - x=charge_state.index, y=charge_state.values.flatten(), mode='lines', name=self._charge_state + charge_state = charge_state.to_dataframe() + fig.add_trace( + plotly.graph_objs.Scatter( + x=charge_state.index, y=charge_state.values.flatten(), mode='lines', name=self._charge_state + ) + ) + elif engine == 'matplotlib': + fig, ax = plotting.with_matplotlib( + ds.to_dataframe(), + colors=colors, + style=style, + title=f'Operation Balance of {self.label}{scenario_suffix}', ) - ) + + charge_state = charge_state.to_dataframe() + ax.plot(charge_state.index, charge_state.values.flatten(), label=self._charge_state) + fig.tight_layout() + fig = fig, ax return plotting.export_figure( fig, - default_path=self._calculation_results.folder / f'{self.label} (charge state)', + default_path=self._calculation_results.folder / f'{self.label} (charge state){scenario_suffix}', default_filetype='.html', user_path=None if isinstance(save, bool) else pathlib.Path(save), show=show, @@ -633,6 +1127,45 @@ def get_shares_from(self, element: str): return self.solution[[name for name in self._variable_names if name.startswith(f'{element}->')]] +class FlowResults(_ElementResults): + def __init__( + self, + calculation_results: CalculationResults, + label: str, + variables: List[str], + constraints: List[str], + start: str, + end: str, + component: str, + ): + super().__init__(calculation_results, label, variables, constraints) + self.start = start + self.end = end + self.component = component + + @property + def flow_rate(self) -> xr.DataArray: + return self.solution[f'{self.label}|flow_rate'] + + @property + def flow_hours(self) -> xr.DataArray: + return (self.flow_rate * self._calculation_results.hours_per_timestep).rename(f'{self.label}|flow_hours') + + @property + def size(self) -> xr.DataArray: + name = f'{self.label}|size' + if name in self.solution: + return self.solution[name] + try: + return DataConverter.as_dataarray( + self._calculation_results.flow_system.flows[self.label].size, + scenarios=self._calculation_results.scenarios + ).rename(name) + except _FlowSystemRestorationError: + logger.critical(f'Size of flow {self.label} not available. Returning NaN') + return xr.DataArray(np.nan).rename(name) + + class SegmentedCalculationResults: """ Class to store the results of a SegmentedCalculation. @@ -824,6 +1357,7 @@ def sanitize_dataset( negate: Optional[List[str]] = None, drop_small_vars: bool = True, zero_small_values: bool = False, + drop_suffix: Optional[str] = None, ) -> xr.Dataset: """ Sanitizes a dataset by handling small values (dropping or zeroing) and optionally reindexing the time axis. @@ -835,6 +1369,7 @@ negate: The variables to negate. If None, no variables are negated. drop_small_vars: If True, drops variables where all values are below threshold. zero_small_values: If True, sets values below threshold to zero. + drop_suffix: Drop the suffix from data variable names, splitting at the provided string. Returns: xr.Dataset: The sanitized dataset. @@ -873,26 +1408,177 @@ if timesteps is not None and not ds.indexes['time'].equals(timesteps): ds = ds.reindex({'time': timesteps}, fill_value=np.nan) + if drop_suffix is not None: + if not isinstance(drop_suffix, str): + raise ValueError(f'Only pass str values to drop suffixes.
Got {drop_suffix}') + unique_dict = {} + for var in ds.data_vars: + new_name = var.split(drop_suffix)[0] + + # If name already exists, keep original name + if new_name in unique_dict.values(): + unique_dict[var] = var + else: + unique_dict[var] = new_name + ds = ds.rename(unique_dict) + return ds def filter_dataset( ds: xr.Dataset, - variable_dims: Optional[Literal['scalar', 'time']] = None, + variable_dims: Optional[Literal['scalar', 'time', 'scenario', 'timeonly', 'scenarioonly']] = None, + timesteps: Optional[Union[pd.DatetimeIndex, str, pd.Timestamp]] = None, + scenarios: Optional[Union[pd.Index, str, int]] = None, + contains: Optional[Union[str, List[str]]] = None, + startswith: Optional[Union[str, List[str]]] = None, ) -> xr.Dataset: """ - Filters a dataset by its dimensions. + Filters a dataset by its dimensions, indexes, and with string filters for variable names. Args: ds: The dataset to filter. - variable_dims: The dimension of the variables to filter for. + variable_dims: The dimensionality of the variables to select: + - 'scalar': Get scalar variables (without dimensions) + - 'time': Get time-dependent variables (with a time dimension) + - 'scenario': Get scenario-dependent variables (with a scenario dimension) + - 'timeonly': Get time-dependent variables (with ONLY a time dimension) + - 'scenarioonly': Get scenario-dependent variables (with ONLY a scenario dimension) + timesteps: Optional time indexes to select. Can be: + - pd.DatetimeIndex: Multiple timesteps + - str/pd.Timestamp: Single timestep + Defaults to all available timesteps. + scenarios: Optional scenario indexes to select. Can be: + - pd.Index: Multiple scenarios + - str/int: Single scenario (int is treated as a label, not an index position) + Defaults to all available scenarios. + contains: Filter variables that contain this string or strings. + If a list is provided, variables must contain ALL strings in the list. + startswith: Filter variables that start with this string or strings. + If a list is provided, variables must start with ANY of the strings in the list. + + Returns: + Filtered dataset with specified variables and indexes.
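+
+ Examples:
+ Illustrative sketches (variable and scenario names are made up):
+
+ >>> filter_dataset(ds, variable_dims='timeonly', startswith='Boiler')
+ >>> filter_dataset(ds, contains=['flow_rate'], scenarios='base')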
""" - if variable_dims is None: - return ds + # First filter by dimensions + filtered_ds = ds.copy() + if variable_dims is not None: + if variable_dims == 'scalar': + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if not filtered_ds[v].dims]] + elif variable_dims == 'time': + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if 'time' in filtered_ds[v].dims]] + elif variable_dims == 'scenario': + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if 'scenario' in filtered_ds[v].dims]] + elif variable_dims == 'timeonly': + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if filtered_ds[v].dims == ('time',)]] + elif variable_dims == 'scenarioonly': + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if filtered_ds[v].dims == ('scenario',)]] + else: + raise ValueError(f'Unknown variable_dims "{variable_dims}" for filter_dataset') + + # Filter by 'contains' parameter + if contains is not None: + if isinstance(contains, str): + # Single string - keep variables that contain this string + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if contains in v]] + elif isinstance(contains, list) and all(isinstance(s, str) for s in contains): + # List of strings - keep variables that contain ALL strings in the list + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if all(s in v for s in contains)]] + else: + raise TypeError(f"'contains' must be a string or list of strings, got {type(contains)}") + + # Filter by 'startswith' parameter + if startswith is not None: + if isinstance(startswith, str): + # Single string - keep variables that start with this string + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if v.startswith(startswith)]] + elif isinstance(startswith, list) and all(isinstance(s, str) for s in startswith): + # List of strings - keep variables that start with ANY of the strings in the list + filtered_ds = filtered_ds[[v for v in filtered_ds.data_vars if any(v.startswith(s) for s in startswith)]] + else: + raise TypeError(f"'startswith' must be a string or list of strings, got {type(startswith)}") - if variable_dims == 'scalar': - return ds[[name for name, da in ds.data_vars.items() if len(da.dims) == 0]] - elif variable_dims == 'time': - return ds[[name for name, da in ds.data_vars.items() if 'time' in da.dims]] - else: - raise ValueError(f'Not allowed value for "filter_dataset()": {variable_dims=}') + # Handle time selection if needed + if timesteps is not None and 'time' in filtered_ds.dims: + try: + filtered_ds = filtered_ds.sel(time=timesteps) + except KeyError as e: + available_times = set(filtered_ds.indexes['time']) + requested_times = set([timesteps]) if not isinstance(timesteps, pd.Index) else set(timesteps) + missing_times = requested_times - available_times + raise ValueError( + f'Timesteps not found in dataset: {missing_times}. Available times: {available_times}' + ) from e + + # Handle scenario selection if needed + if scenarios is not None and 'scenario' in filtered_ds.dims: + try: + filtered_ds = filtered_ds.sel(scenario=scenarios) + except KeyError as e: + available_scenarios = set(filtered_ds.indexes['scenario']) + requested_scenarios = set([scenarios]) if not isinstance(scenarios, pd.Index) else set(scenarios) + missing_scenarios = requested_scenarios - available_scenarios + raise ValueError( + f'Scenarios not found in dataset: {missing_scenarios}. 
Available scenarios: {available_scenarios}' + ) from e + + return filtered_ds + + +def filter_dataarray_by_coord( + da: xr.DataArray, + **kwargs: Optional[Union[str, List[str]]] +) -> xr.DataArray: + """Filter a flow DataArray by its coordinate values (e.g. start, end, component). + + Filters are applied in the order they are specified. All filters must match for a flow to be included. + + To recombine filtered dataarrays, use `xr.concat`. + + xr.concat([res.sizes(start='Fernwärme'), res.sizes(end='Fernwärme')], dim='flow') + + Args: + da: Flow DataArray with network metadata coordinates. + **kwargs: Coord filters as name=value pairs. + + Returns: + Filtered DataArray with matching flows. + + Raises: + AttributeError: If required coordinates are missing. + ValueError: If specified values don't exist or no matches are found. + """ + # Helper function to process filters + def apply_filter(array, coord_name: str, coord_values: Union[Any, List[Any]]): + # Verify coord exists + if coord_name not in array.coords: + raise AttributeError(f"Missing required coordinate '{coord_name}'") + + # Convert single value to list + val_list = [coord_values] if isinstance(coord_values, str) else coord_values + + # Verify coord_values exist + available = set(array[coord_name].values) + missing = [v for v in val_list if v not in available] + if missing: + raise ValueError(f"{coord_name.title()} value(s) not found: {missing}") + + # Apply filter + return array.where( + array[coord_name].isin(val_list) if isinstance(coord_values, list) else array[coord_name] == coord_values, + drop=True + ) + + # Apply filters from kwargs + filters = {k: v for k, v in kwargs.items() if v is not None} + try: + for coord, values in filters.items(): + da = apply_filter(da, coord, values) + except ValueError as e: + raise ValueError(f"No flows match criteria: {filters}") from e + + # Verify results exist + if da.size == 0: + raise ValueError(f"No flows match criteria: {filters}") + + return da diff --git a/flixopt/structure.py b/flixopt/structure.py index 1d0f2324f..8d4779457 100644 --- a/flixopt/structure.py +++ b/flixopt/structure.py @@ -19,7 +19,7 @@ from rich.pretty import Pretty from .config import CONFIG -from .core import NumericData, Scalar, TimeSeries, TimeSeriesCollection, TimeSeriesData +from .core import Scalar, TimeSeries, TimeSeriesCollection, TimeSeriesData, TimestepData if TYPE_CHECKING: # for type checking and preventing circular imports from .effects import EffectCollectionModel @@ -58,6 +58,7 @@ def __init__(self, flow_system: 'FlowSystem'): self.flow_system = flow_system self.time_series_collection = flow_system.time_series_collection self.effects: Optional[EffectCollectionModel] = None + self.scenario_weights = self._calculate_scenario_weights(flow_system.scenario_weights) def do_modeling(self): self.effects = self.flow_system.effects.create_model(self) @@ -69,6 +70,24 @@ for bus_model in bus_models: # Buses after Components, because FlowModels are created in ComponentModels bus_model.do_modeling() + def _calculate_scenario_weights(self, weights: Optional[TimeSeries] = None) -> xr.DataArray: + """Calculates the weights of the scenarios, normalized to sum to 1. If None, all scenarios get the same weight. + If no scenarios are present, a single weight of 1 is returned.
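+ Example: three scenarios passed with weights [2, 1, 1] are normalized to
+ [0.5, 0.25, 0.25]; the objective then weights each scenario's total accordingly.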
+ """ + if weights is not None and not isinstance(weights, TimeSeries): + raise TypeError(f'Weights must be a TimeSeries or None, got {type(weights)}') + if self.time_series_collection.scenarios is None: + return xr.DataArray(1) + if weights is None: + weights = xr.DataArray( + np.ones(len(self.time_series_collection.scenarios)), + coords={'scenario': self.time_series_collection.scenarios} + ) + elif isinstance(weights, TimeSeries): + weights = weights.selected_data + + return weights / weights.sum() + @property def solution(self): solution = super().solution @@ -87,6 +106,10 @@ def solution(self): effect.label_full: effect.model.results_structure() for effect in sorted(self.flow_system.effects, key=lambda effect: effect.label_full.upper()) }, + 'Flows': { + flow.label_full: flow.model.results_structure() + for flow in sorted(self.flow_system.flows.values(), key=lambda flow: flow.label_full.upper()) + }, } return solution.reindex(time=self.time_series_collection.timesteps_extra) @@ -98,13 +121,40 @@ def hours_per_step(self): def hours_of_previous_timesteps(self): return self.time_series_collection.hours_of_previous_timesteps - @property - def coords(self) -> Tuple[pd.DatetimeIndex]: - return (self.time_series_collection.timesteps,) + def get_coords( + self, scenario_dim=True, time_dim=True, extra_timestep=False + ) -> Optional[Union[Tuple[pd.Index], Tuple[pd.Index, pd.Index]]]: + """ + Returns the coordinates of the model - @property - def coords_extra(self) -> Tuple[pd.DatetimeIndex]: - return (self.time_series_collection.timesteps_extra,) + Args: + scenario_dim: If True, the scenario dimension is included in the coordinates + time_dim: If True, the time dimension is included in the coordinates + extra_timestep: If True, the extra timesteps are used instead of the regular timesteps + + Returns: + The coordinates of the model. 
Might also be None if no scenarios are present and time_dim is False + """ + if not scenario_dim and not time_dim: + return None + scenarios = self.time_series_collection.scenarios + timesteps = ( + self.time_series_collection.timesteps if not extra_timestep else self.time_series_collection.timesteps_extra + ) + + if scenario_dim and time_dim: + if scenarios is None: + return (timesteps,) + return timesteps, scenarios + + if scenario_dim and not time_dim: + if scenarios is None: + return None + return (scenarios,) + if time_dim and not scenario_dim: + return (timesteps,) + + raise ValueError(f'Cannot get coordinates with both {scenario_dim=} and {time_dim=}') class Interface: @@ -449,8 +499,7 @@ def __init__(self, model: SystemModel, element: Element): def results_structure(self): return { - 'label': self.label, - 'label_full': self.label_full, + 'label': self.label_full, 'variables': list(self.variables), 'constraints': list(self.constraints), } @@ -534,9 +583,12 @@ def copy_and_convert_datatypes(data: Any, use_numpy: bool = True, use_element_la return copy_and_convert_datatypes(data.tolist(), use_numpy, use_element_label) elif isinstance(data, TimeSeries): - return copy_and_convert_datatypes(data.active_data, use_numpy, use_element_label) + return copy_and_convert_datatypes(data.selected_data, use_numpy, use_element_label) elif isinstance(data, TimeSeriesData): return copy_and_convert_datatypes(data.data, use_numpy, use_element_label) + elif isinstance(data, (pd.Series, pd.DataFrame)): + #TODO: This can be improved + return copy_and_convert_datatypes(data.values, use_numpy, use_element_label) elif isinstance(data, Interface): if use_element_label and isinstance(data, Element): @@ -584,6 +636,8 @@ def describe_numpy_arrays(arr: np.ndarray) -> Union[str, List]: def normalized_center_of_mass(array: Any) -> float: # position in array (0 bis 1 normiert) + if array.ndim >= 2: # No good way to calculate center of mass for 2D arrays + return np.nan positions = np.linspace(0, 1, len(array)) # weights w_i # mass center if np.sum(array) == 0: diff --git a/flixopt/utils.py b/flixopt/utils.py index bb6e8ec40..3e65328a4 100644 --- a/flixopt/utils.py +++ b/flixopt/utils.py @@ -11,15 +11,6 @@ logger = logging.getLogger('flixopt') -def is_number(number_alias: Union[int, float, str]): - """Returns True is string is a number.""" - try: - float(number_alias) - return True - except ValueError: - return False - - def round_floats(obj, decimals=2): if isinstance(obj, dict): return {k: round_floats(v, decimals) for k, v in obj.items()} @@ -27,6 +18,12 @@ def round_floats(obj, decimals=2): return [round_floats(v, decimals) for v in obj] elif isinstance(obj, float): return round(obj, decimals) + elif isinstance(obj, int): + return obj + elif isinstance(obj, np.ndarray): + return np.round(obj, decimals).tolist() + elif isinstance(obj, xr.DataArray): + return obj.round(decimals).values.tolist() return obj diff --git a/tests/conftest.py b/tests/conftest.py index 5399be72a..b2ceb1c1d 100644 --- a/tests/conftest.py +++ b/tests/conftest.py @@ -121,6 +121,80 @@ def simple_flow_system() -> fx.FlowSystem: return flow_system +@pytest.fixture +def simple_flow_system_scenarios() -> fx.FlowSystem: + """ + Create a simple energy system for testing + """ + base_thermal_load = np.array([30.0, 0.0, 90.0, 110, 110, 20, 20, 20, 20]) + base_electrical_price = np.array([0.08, 0.1, 0.15]) + base_timesteps = pd.date_range('2020-01-01', periods=9, freq='h', name='time') + # Define effects + costs = fx.Effect('costs', '€', 'Kosten', 
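+        # Note (assumption, inferred from usage in this suite): is_standard makes this
+        # the default effect for unlabeled shares, and is_objective selects its
+        # weighted total as the optimization objective.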
is_standard=True, is_objective=True) + co2 = fx.Effect( + 'CO2', + 'kg', + 'CO2_e-Emissionen', + specific_share_to_other_effects_operation={costs.label: 0.2}, + maximum_operation_per_hour=1000, + ) + + # Create components + boiler = fx.linear_converters.Boiler( + 'Boiler', + eta=0.5, + Q_th=fx.Flow( + 'Q_th', + bus='Fernwärme', + size=50, + relative_minimum=5 / 50, + relative_maximum=1, + on_off_parameters=fx.OnOffParameters(), + ), + Q_fu=fx.Flow('Q_fu', bus='Gas'), + ) + + chp = fx.linear_converters.CHP( + 'CHP_unit', + eta_th=0.5, + eta_el=0.4, + P_el=fx.Flow('P_el', bus='Strom', size=60, relative_minimum=5 / 60, on_off_parameters=fx.OnOffParameters()), + Q_th=fx.Flow('Q_th', bus='Fernwärme'), + Q_fu=fx.Flow('Q_fu', bus='Gas'), + ) + + storage = fx.Storage( + 'Speicher', + charging=fx.Flow('Q_th_load', bus='Fernwärme', size=1e4), + discharging=fx.Flow('Q_th_unload', bus='Fernwärme', size=1e4), + capacity_in_flow_hours=fx.InvestParameters(fix_effects=20, fixed_size=30, optional=False), + initial_charge_state=0, + relative_maximum_charge_state=1 / 100 * np.array([80.0, 70.0, 80.0, 80, 80, 80, 80, 80, 80, 80]), + eta_charge=0.9, + eta_discharge=1, + relative_loss_per_hour=0.08, + prevent_simultaneous_charge_and_discharge=True, + ) + + heat_load = fx.Sink( + 'Wärmelast', sink=fx.Flow('Q_th_Last', bus='Fernwärme', size=1, fixed_relative_profile=base_thermal_load) + ) + + gas_tariff = fx.Source( + 'Gastarif', source=fx.Flow('Q_Gas', bus='Gas', size=1000, effects_per_flow_hour={'costs': 0.04, 'CO2': 0.3}) + ) + + electricity_feed_in = fx.Sink( + 'Einspeisung', sink=fx.Flow('P_el', bus='Strom', effects_per_flow_hour=-1 * base_electrical_price) + ) + + # Create flow system + flow_system = fx.FlowSystem(base_timesteps, scenarios=pd.Index(['A', 'B', 'C']), scenario_weights=np.array([0.5, 0.25, 0.25])) + flow_system.add_elements(fx.Bus('Strom'), fx.Bus('Fernwärme'), fx.Bus('Gas')) + flow_system.add_elements(storage, costs, co2, boiler, heat_load, gas_tariff, electricity_feed_in, chp) + + return flow_system + @pytest.fixture def basic_flow_system() -> fx.FlowSystem: diff --git a/tests/run_all_tests.py b/tests/run_all_tests.py index 5597a47f3..83b6dfacf 100644 --- a/tests/run_all_tests.py +++ b/tests/run_all_tests.py @@ -7,4 +7,4 @@ import pytest if __name__ == '__main__': - pytest.main(['test_functional.py', '--disable-warnings']) + pytest.main(['test_integration.py', '--disable-warnings']) diff --git a/tests/test_cycle_detection.py b/tests/test_cycle_detection.py new file mode 100644 index 000000000..71c775b99 --- /dev/null +++ b/tests/test_cycle_detection.py @@ -0,0 +1,226 @@ +import pytest + +from flixopt.effects import detect_cycles + + +def test_empty_graph(): + """Test that an empty graph has no cycles.""" + assert detect_cycles({}) == [] + + +def test_single_node(): + """Test that a graph with a single node and no edges has no cycles.""" + assert detect_cycles({"A": []}) == [] + + +def test_self_loop(): + """Test that a graph with a self-loop has a cycle.""" + cycles = detect_cycles({"A": ["A"]}) + assert len(cycles) == 1 + assert cycles[0] == ["A", "A"] + + +def test_simple_cycle(): + """Test that a simple cycle is detected.""" + graph = { + "A": ["B"], + "B": ["C"], + "C": ["A"] + } + cycles = detect_cycles(graph) + assert len(cycles) == 1 + assert cycles[0] == ["A", "B", "C", "A"] or cycles[0] == ["B", "C", "A", "B"] or cycles[0] == ["C", "A", "B", "C"] + + +def test_no_cycles(): + """Test that a directed acyclic graph has no cycles.""" + graph = { + "A": ["B", "C"], + "B": ["D", "E"], + 
"C": ["F"], + "D": [], + "E": [], + "F": [] + } + assert detect_cycles(graph) == [] + + +def test_multiple_cycles(): + """Test that a graph with multiple cycles is detected.""" + graph = { + "A": ["B", "D"], + "B": ["C"], + "C": ["A"], + "D": ["E"], + "E": ["D"] + } + cycles = detect_cycles(graph) + assert len(cycles) == 2 + + # Check that both cycles are detected (order might vary) + cycle_strings = [",".join(cycle) for cycle in cycles] + assert any("A,B,C,A" in s for s in cycle_strings) or any("B,C,A,B" in s for s in cycle_strings) or any( + "C,A,B,C" in s for s in cycle_strings) + assert any("D,E,D" in s for s in cycle_strings) or any("E,D,E" in s for s in cycle_strings) + + +def test_hidden_cycle(): + """Test that a cycle hidden deep in the graph is detected.""" + graph = { + "A": ["B", "C"], + "B": ["D"], + "C": ["E"], + "D": ["F"], + "E": ["G"], + "F": ["H"], + "G": ["I"], + "H": ["J"], + "I": ["K"], + "J": ["L"], + "K": ["M"], + "L": ["N"], + "M": ["N"], + "N": ["O"], + "O": ["P"], + "P": ["Q"], + "Q": ["O"] # Hidden cycle O->P->Q->O + } + cycles = detect_cycles(graph) + assert len(cycles) == 1 + + # Check that the O-P-Q cycle is detected + cycle = cycles[0] + assert "O" in cycle and "P" in cycle and "Q" in cycle + + # Check that they appear in the correct order + o_index = cycle.index("O") + p_index = cycle.index("P") + q_index = cycle.index("Q") + + # Check the cycle order is correct (allowing for different starting points) + cycle_len = len(cycle) + assert (p_index == (o_index + 1) % cycle_len and q_index == (p_index + 1) % cycle_len) or \ + (q_index == (o_index + 1) % cycle_len and p_index == (q_index + 1) % cycle_len) or \ + (o_index == (p_index + 1) % cycle_len and q_index == (o_index + 1) % cycle_len) + + +def test_disconnected_graph(): + """Test with a disconnected graph.""" + graph = { + "A": ["B"], + "B": ["C"], + "C": [], + "D": ["E"], + "E": ["F"], + "F": [] + } + assert detect_cycles(graph) == [] + + +def test_disconnected_graph_with_cycle(): + """Test with a disconnected graph containing a cycle in one component.""" + graph = { + "A": ["B"], + "B": ["C"], + "C": [], + "D": ["E"], + "E": ["F"], + "F": ["D"] # Cycle in D->E->F->D + } + cycles = detect_cycles(graph) + assert len(cycles) == 1 + + # Check that the D-E-F cycle is detected + cycle = cycles[0] + assert "D" in cycle and "E" in cycle and "F" in cycle + + # Check if they appear in the correct order + d_index = cycle.index("D") + e_index = cycle.index("E") + f_index = cycle.index("F") + + # Check the cycle order is correct (allowing for different starting points) + cycle_len = len(cycle) + assert (e_index == (d_index + 1) % cycle_len and f_index == (e_index + 1) % cycle_len) or \ + (f_index == (d_index + 1) % cycle_len and e_index == (f_index + 1) % cycle_len) or \ + (d_index == (e_index + 1) % cycle_len and f_index == (d_index + 1) % cycle_len) + + +def test_complex_dag(): + """Test with a complex directed acyclic graph.""" + graph = { + "A": ["B", "C", "D"], + "B": ["E", "F"], + "C": ["E", "G"], + "D": ["G", "H"], + "E": ["I", "J"], + "F": ["J", "K"], + "G": ["K", "L"], + "H": ["L", "M"], + "I": ["N"], + "J": ["N", "O"], + "K": ["O", "P"], + "L": ["P", "Q"], + "M": ["Q"], + "N": ["R"], + "O": ["R", "S"], + "P": ["S"], + "Q": ["S"], + "R": [], + "S": [] + } + assert detect_cycles(graph) == [] + + +def test_missing_node_in_connections(): + """Test behavior when a node referenced in edges doesn't have its own key.""" + graph = { + "A": ["B", "C"], + "B": ["D"] + # C and D don't have their own entries + } + assert 
detect_cycles(graph) == [] + + +def test_non_string_keys(): + """Test with non-string keys to ensure the algorithm is generic.""" + graph = { + 1: [2, 3], + 2: [4], + 3: [4], + 4: [] + } + assert detect_cycles(graph) == [] + + graph_with_cycle = { + 1: [2], + 2: [3], + 3: [1] + } + cycles = detect_cycles(graph_with_cycle) + assert len(cycles) == 1 + assert cycles[0] == [1, 2, 3, 1] or cycles[0] == [2, 3, 1, 2] or cycles[0] == [3, 1, 2, 3] + + +def test_complex_network_with_many_nodes(): + """Test with a large network to check performance and correctness.""" + graph = {} + # Create a large DAG + for i in range(100): + # Connect each node to the next few nodes + graph[i] = [j for j in range(i + 1, min(i + 5, 100))] + + # No cycles in this arrangement + assert detect_cycles(graph) == [] + + # Add a single back edge to create a cycle + graph[99] = [0] # This creates a cycle + cycles = detect_cycles(graph) + assert len(cycles) >= 1 + # The cycle might include many nodes, but must contain both 0 and 99 + any_cycle_has_both = any(0 in cycle and 99 in cycle for cycle in cycles) + assert any_cycle_has_both + + +if __name__ == "__main__": + pytest.main(["-v"]) diff --git a/tests/test_dataconverter.py b/tests/test_dataconverter.py index 49f1438e7..0484d4aac 100644 --- a/tests/test_dataconverter.py +++ b/tests/test_dataconverter.py @@ -3,69 +3,712 @@ import pytest import xarray as xr -from flixopt.core import ConversionError, DataConverter # Adjust this import to match your project structure +from flixopt.core import ( # Adjust this import to match your project structure + ConversionError, + DataConverter, + TimeSeries, +) @pytest.fixture -def sample_time_index(request): - index = pd.date_range('2024-01-01', periods=5, freq='D', name='time') - return index +def sample_time_index(): + return pd.date_range('2024-01-01', periods=5, freq='D', name='time') -def test_scalar_conversion(sample_time_index): - # Test scalar conversion - result = DataConverter.as_dataarray(42, sample_time_index) - assert isinstance(result, xr.DataArray) - assert result.shape == (len(sample_time_index),) - assert result.dims == ('time',) - assert np.all(result.values == 42) +@pytest.fixture +def sample_scenario_index(): + return pd.Index(['baseline', 'high_demand', 'low_price'], name='scenario') + + +@pytest.fixture +def multi_index(sample_time_index, sample_scenario_index): + """Create a sample MultiIndex combining scenarios and times.""" + return pd.MultiIndex.from_product([sample_scenario_index, sample_time_index], names=['scenario', 'time']) + + +class TestSingleDimensionConversion: + """Tests for converting data without scenarios (1D: time only).""" + + def test_scalar_conversion(self, sample_time_index): + """Test converting a scalar value.""" + # Test with integer + result = DataConverter.as_dataarray(42, sample_time_index) + assert isinstance(result, xr.DataArray) + assert result.shape == (len(sample_time_index),) + assert result.dims == ('time',) + assert np.all(result.values == 42) + + # Test with float + result = DataConverter.as_dataarray(42.5, sample_time_index) + assert np.all(result.values == 42.5) + + # Test with numpy scalar types + result = DataConverter.as_dataarray(np.int64(42), sample_time_index) + assert np.all(result.values == 42) + result = DataConverter.as_dataarray(np.float32(42.5), sample_time_index) + assert np.all(result.values == 42.5) + + def test_ndarray_conversion(self, sample_time_index): + """Test converting a numpy ndarray.""" + # Test with integer 1D array + arr_1d = np.array([1, 2, 3, 4, 5]) 
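+        # Sketch of the equivalence exercised below (assumption: a matching-length
+        # 1-D array becomes a plain time-indexed DataArray):
+        expected_1d = xr.DataArray(arr_1d, coords={'time': sample_time_index}, dims=['time'])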
+ result = DataConverter.as_dataarray(arr_1d, sample_time_index) + assert result.shape == (5,) + assert result.dims == ('time',) + assert np.array_equal(result.values, arr_1d) + + # Test with float 1D array + arr_1d = np.array([1.1, 2.2, 3.3, 4.4, 5.5]) + result = DataConverter.as_dataarray(arr_1d, sample_time_index) + assert np.array_equal(result.values, arr_1d) + + # Test with array containing NaN + arr_1d = np.array([1, np.nan, 3, np.nan, 5]) + result = DataConverter.as_dataarray(arr_1d, sample_time_index) + assert np.array_equal(np.isnan(result.values), np.isnan(arr_1d)) + assert np.array_equal(result.values[~np.isnan(result.values)], arr_1d[~np.isnan(arr_1d)]) + + def test_dataarray_conversion(self, sample_time_index): + """Test converting an existing xarray DataArray.""" + # Create original DataArray + original = xr.DataArray(data=np.array([1, 2, 3, 4, 5]), coords={'time': sample_time_index}, dims=['time']) + + # Convert and check + result = DataConverter.as_dataarray(original, sample_time_index) + assert result.shape == (5,) + assert result.dims == ('time',) + assert np.array_equal(result.values, original.values) + + # Ensure it's a copy + result[0] = 999 + assert original[0].item() == 1 # Original should be unchanged + + # Test with different time coordinates but same length + different_times = pd.date_range('2025-01-01', periods=5, freq='D', name='time') + original = xr.DataArray(data=np.array([1, 2, 3, 4, 5]), coords={'time': different_times}, dims=['time']) + + # Should raise an error for mismatched time coordinates + with pytest.raises(ConversionError): + DataConverter.as_dataarray(original, sample_time_index) + + +class TestMultiDimensionConversion: + """Tests for converting data with scenarios (2D: scenario × time).""" + + def test_scalar_with_scenarios(self, sample_time_index, sample_scenario_index): + """Test converting scalar values with scenario dimension.""" + # Test with integer + result = DataConverter.as_dataarray(42, sample_time_index, sample_scenario_index) + + assert isinstance(result, xr.DataArray) + assert result.shape == (len(sample_time_index), len(sample_scenario_index)) + assert result.dims == ('time', 'scenario') + assert np.all(result.values == 42) + assert set(result.coords['scenario'].values) == set(sample_scenario_index.values) + assert set(result.coords['time'].values) == set(sample_time_index.values) + + # Test with float + result = DataConverter.as_dataarray(42.5, sample_time_index, sample_scenario_index) + assert np.all(result.values == 42.5) + + def test_1d_array_with_scenarios(self, sample_time_index, sample_scenario_index): + """Test converting 1D array with scenario dimension (broadcasting).""" + # Create 1D array matching timesteps length + arr_1d = np.array([1, 2, 3, 4, 5]) + + # Convert with scenarios + result = DataConverter.as_dataarray(arr_1d, sample_time_index, sample_scenario_index) + + assert result.shape == (len(sample_time_index), len(sample_scenario_index)) + assert result.dims == ('time', 'scenario') + + # Each scenario should have the same values (broadcasting) + for scenario in sample_scenario_index: + scenario_slice = result.sel(scenario=scenario) + assert np.array_equal(scenario_slice.values, arr_1d) + + def test_2d_array_with_scenarios(self, sample_time_index, sample_scenario_index): + """Test converting 2D array with scenario dimension.""" + # Create 2D array with different values per scenario + arr_2d = np.array( + [ + [1, 2, 3, 4, 5], # baseline scenario + [6, 7, 8, 9, 10], # high_demand scenario + [11, 12, 13, 14, 15], # 
low_price scenario + ] + ) + + # Convert to DataArray + result = DataConverter.as_dataarray(arr_2d.T, sample_time_index, sample_scenario_index) + + assert result.shape == (5, 3) + assert result.dims == ('time', 'scenario') + + # Check that each scenario has correct values + assert np.array_equal(result.sel(scenario='baseline').values, arr_2d[0]) + assert np.array_equal(result.sel(scenario='high_demand').values, arr_2d[1]) + assert np.array_equal(result.sel(scenario='low_price').values, arr_2d[2]) + + def test_dataarray_with_scenarios(self, sample_time_index, sample_scenario_index): + """Test converting an existing DataArray with scenarios.""" + # Create a multi-scenario DataArray + original = xr.DataArray( + data=np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]), + coords={'scenario': sample_scenario_index, 'time': sample_time_index}, + dims=['scenario', 'time'], + ) + + # Test conversion + result = DataConverter.as_dataarray(original, sample_time_index, sample_scenario_index) + + assert result.shape == (5, 3) + assert result.dims == ('time', 'scenario') + assert np.array_equal(result.values, original.values.T) + + # Ensure it's a copy + result.loc[:, 'baseline'] = 999 + assert original.sel(scenario='baseline')[0].item() == 1 # Original should be unchanged + + +class TestSeriesConversion: + """Tests for converting pandas Series to DataArray.""" + + def test_series_single_dimension(self, sample_time_index): + """Test converting a pandas Series with time index.""" + # Create a Series with matching time index + series = pd.Series([10, 20, 30, 40, 50], index=sample_time_index) + + # Convert and check + result = DataConverter.as_dataarray(series, sample_time_index) + assert isinstance(result, xr.DataArray) + assert result.shape == (5,) + assert result.dims == ('time',) + assert np.array_equal(result.values, series.values) + assert np.array_equal(result.coords['time'].values, sample_time_index.values) + + # Test with scenario index + scenario_index = pd.Index(['baseline', 'high_demand', 'low_price'], name='scenario') + series = pd.Series([100, 200, 300], index=scenario_index) + + result = DataConverter.as_dataarray(series, scenarios=scenario_index) + assert result.shape == (3,) + assert result.dims == ('scenario',) + assert np.array_equal(result.values, series.values) + assert np.array_equal(result.coords['scenario'].values, scenario_index.values) + + def test_series_mismatched_index(self, sample_time_index): + """Test converting a Series with mismatched index.""" + # Create Series with different time index + different_times = pd.date_range('2025-01-01', periods=5, freq='D', name='time') + series = pd.Series([10, 20, 30, 40, 50], index=different_times) + + # Should raise error for mismatched index + with pytest.raises(ConversionError): + DataConverter.as_dataarray(series, sample_time_index) + + def test_series_broadcast_to_scenarios(self, sample_time_index, sample_scenario_index): + """Test broadcasting a time-indexed Series across scenarios.""" + # Create a Series with time index + series = pd.Series([10, 20, 30, 40, 50], index=sample_time_index) + + # Convert with scenarios + result = DataConverter.as_dataarray(series, sample_time_index, sample_scenario_index) + + assert result.shape == (5, 3) + assert result.dims == ('time', 'scenario') + + # Check broadcasting - each scenario should have the same values + for scenario in sample_scenario_index: + scenario_slice = result.sel(scenario=scenario) + assert np.array_equal(scenario_slice.values, series.values) + + def 
test_series_broadcast_to_time(self, sample_time_index, sample_scenario_index): + """Test broadcasting a scenario-indexed Series across time.""" + # Create a Series with scenario index + series = pd.Series([100, 200, 300], index=sample_scenario_index) + + # Convert with time + result = DataConverter.as_dataarray(series, sample_time_index, sample_scenario_index) + + assert result.shape == (5, 3) + assert result.dims == ('time', 'scenario') + + # Check broadcasting - each time should have the same scenario values + for time in sample_time_index: + time_slice = result.sel(time=time) + assert np.array_equal(time_slice.values, series.values) + + def test_series_dimension_order(self, sample_time_index, sample_scenario_index): + """Test that dimension order is respected with Series conversions.""" + # Create custom dimensions tuple with reversed order + dims = ('scenario', 'time',) + coords = {'time': sample_time_index, 'scenario': sample_scenario_index} + + # Time-indexed series + series = pd.Series([10, 20, 30, 40, 50], index=sample_time_index) + with pytest.raises(ConversionError, match="only supports time and scenario dimensions"): + _ = DataConverter._convert_series(series, coords, dims) + + # Scenario-indexed series + series = pd.Series([100, 200, 300], index=sample_scenario_index) + with pytest.raises(ConversionError, match="only supports time and scenario dimensions"): + _ = DataConverter._convert_series(series, coords, dims) + + +class TestDataFrameConversion: + """Tests for converting pandas DataFrame to DataArray.""" + + def test_dataframe_single_column(self, sample_time_index): + """Test converting a DataFrame with a single column.""" + # Create DataFrame with one column + df = pd.DataFrame({'value': [10, 20, 30, 40, 50]}, index=sample_time_index) + + # Convert and check + result = DataConverter.as_dataarray(df, sample_time_index) + assert isinstance(result, xr.DataArray) + assert result.shape == (5,) + assert result.dims == ('time',) + assert np.array_equal(result.values, df['value'].values) + + def test_dataframe_multi_column_fails(self, sample_time_index): + """Test that converting a multi-column DataFrame to 1D fails.""" + # Create DataFrame with multiple columns + df = pd.DataFrame({'val1': [10, 20, 30, 40, 50], 'val2': [15, 25, 35, 45, 55]}, index=sample_time_index) + + # Should raise error + with pytest.raises(ConversionError): + DataConverter.as_dataarray(df, sample_time_index) + + def test_dataframe_time_scenario(self, sample_time_index, sample_scenario_index): + """Test converting a DataFrame with time index and scenario columns.""" + # Create DataFrame with time as index and scenarios as columns + data = {'baseline': [10, 20, 30, 40, 50], 'high_demand': [15, 25, 35, 45, 55], 'low_price': [5, 15, 25, 35, 45]} + df = pd.DataFrame(data, index=sample_time_index) + + # Make sure columns are named properly + df.columns.name = 'scenario' + + # Convert and check + result = DataConverter.as_dataarray(df, sample_time_index, sample_scenario_index) + + assert result.shape == (5, 3) + assert result.dims == ('time', 'scenario') + assert np.array_equal(result.values, df.values) + + # Check values for specific scenarios + assert np.array_equal(result.sel(scenario='baseline').values, df['baseline'].values) + assert np.array_equal(result.sel(scenario='high_demand').values, df['high_demand'].values) + + def test_dataframe_mismatched_coordinates(self, sample_time_index, sample_scenario_index): + """Test conversion fails with mismatched coordinates.""" + # Create DataFrame with different time 
index + different_times = pd.date_range('2025-01-01', periods=5, freq='D', name='time') + data = {'baseline': [10, 20, 30, 40, 50], 'high_demand': [15, 25, 35, 45, 55], 'low_price': [5, 15, 25, 35, 45]} + df = pd.DataFrame(data, index=different_times) + df.columns = sample_scenario_index + + # Should raise error + with pytest.raises(ConversionError): + DataConverter.as_dataarray(df, sample_time_index, sample_scenario_index) + + # Create DataFrame with different scenario columns + different_scenarios = pd.Index(['scenario1', 'scenario2', 'scenario3'], name='scenario') + data = {'scenario1': [10, 20, 30, 40, 50], 'scenario2': [15, 25, 35, 45, 55], 'scenario3': [5, 15, 25, 35, 45]} + df = pd.DataFrame(data, index=sample_time_index) + df.columns = different_scenarios + + # Should raise error + with pytest.raises(ConversionError): + DataConverter.as_dataarray(df, sample_time_index, sample_scenario_index) + + def test_ensure_copy(self, sample_time_index, sample_scenario_index): + """Test that the returned DataArray is a copy.""" + # Create DataFrame + data = {'baseline': [10, 20, 30, 40, 50], 'high_demand': [15, 25, 35, 45, 55], 'low_price': [5, 15, 25, 35, 45]} + df = pd.DataFrame(data, index=sample_time_index) + df.columns = sample_scenario_index + # Convert + result = DataConverter.as_dataarray(df, sample_time_index, sample_scenario_index) -def test_series_conversion(sample_time_index): - series = pd.Series([1, 2, 3, 4, 5], index=sample_time_index) + # Modify the result + result.loc[dict(time=sample_time_index[0], scenario='baseline')] = 999 - # Test Series conversion - result = DataConverter.as_dataarray(series, sample_time_index) - assert isinstance(result, xr.DataArray) - assert result.shape == (5,) - assert result.dims == ('time',) - assert np.array_equal(result.values, series.values) + # Original should be unchanged + assert df.loc[sample_time_index[0], 'baseline'] == 10 -def test_dataframe_conversion(sample_time_index): - # Create a single-column DataFrame - df = pd.DataFrame({'A': [1, 2, 3, 4, 5]}, index=sample_time_index) +class TestInvalidInputs: + """Tests for invalid inputs and error handling.""" - # Test DataFrame conversion - result = DataConverter.as_dataarray(df, sample_time_index) - assert isinstance(result, xr.DataArray) - assert result.shape == (5,) - assert result.dims == ('time',) - assert np.array_equal(result.values.flatten(), df['A'].values) + def test_time_index_validation(self): + """Test validation of time index.""" + # Test with unnamed index + unnamed_index = pd.date_range('2024-01-01', periods=5, freq='D') + with pytest.raises(ConversionError): + DataConverter.as_dataarray(42, unnamed_index) + # Test with empty index + empty_index = pd.DatetimeIndex([], name='time') + with pytest.raises(ConversionError): + DataConverter.as_dataarray(42, empty_index) -def test_ndarray_conversion(sample_time_index): - # Test 1D array conversion - arr_1d = np.array([1, 2, 3, 4, 5]) - result = DataConverter.as_dataarray(arr_1d, sample_time_index) - assert result.shape == (5,) - assert result.dims == ('time',) - assert np.array_equal(result.values, arr_1d) + # Test with non-DatetimeIndex + wrong_type_index = pd.Index([1, 2, 3, 4, 5], name='time') + with pytest.raises(ConversionError): + DataConverter.as_dataarray(42, wrong_type_index) + def test_scenario_index_validation(self, sample_time_index): + """Test validation of scenario index.""" + # Test with unnamed scenario index + unnamed_index = pd.Index(['baseline', 'high_demand']) + with pytest.raises(ConversionError): + 
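+            # Assumption behind this check: the converter requires the scenario
+            # index to carry name='scenario', so the unnamed index above is rejected.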
DataConverter.as_dataarray(42, sample_time_index, unnamed_index) -def test_dataarray_conversion(sample_time_index): - # Create a DataArray - original = xr.DataArray(data=np.array([1, 2, 3, 4, 5]), coords={'time': sample_time_index}, dims=['time']) + # Test with empty scenario index + empty_index = pd.Index([], name='scenario') + with pytest.raises(ConversionError): + DataConverter.as_dataarray(42, sample_time_index, empty_index) - # Test DataArray conversion - result = DataConverter.as_dataarray(original, sample_time_index) - assert result.shape == (5,) - assert result.dims == ('time',) - assert np.array_equal(result.values, original.values) + # Test with non-Index scenario + with pytest.raises(ConversionError): + DataConverter.as_dataarray(42, sample_time_index, ['baseline', 'high_demand']) - # Ensure it's a copy - result[0] = 999 - assert original[0].item() == 1 # Original should be unchanged + def test_invalid_data_types(self, sample_time_index, sample_scenario_index): + """Test handling of invalid data types.""" + # Test invalid input type (string) + with pytest.raises(ConversionError): + DataConverter.as_dataarray('invalid_string', sample_time_index) + + # Test invalid input type with scenarios + with pytest.raises(ConversionError): + DataConverter.as_dataarray('invalid_string', sample_time_index, sample_scenario_index) + + # Test unsupported complex object + with pytest.raises(ConversionError): + DataConverter.as_dataarray(object(), sample_time_index) + + # Test None value + with pytest.raises(ConversionError): + DataConverter.as_dataarray(None, sample_time_index) + + def test_mismatched_input_dimensions(self, sample_time_index, sample_scenario_index): + """Test handling of mismatched input dimensions.""" + # Test mismatched Series index + mismatched_series = pd.Series( + [1, 2, 3, 4, 5, 6], index=pd.date_range('2025-01-01', periods=6, freq='D', name='time') + ) + with pytest.raises(ConversionError): + DataConverter.as_dataarray(mismatched_series, sample_time_index) + + # Test DataFrame with multiple columns + df_multi_col = pd.DataFrame({'A': [1, 2, 3, 4, 5], 'B': [6, 7, 8, 9, 10]}, index=sample_time_index) + with pytest.raises(ConversionError): + DataConverter.as_dataarray(df_multi_col, sample_time_index) + + # Test mismatched array shape for time-only + with pytest.raises(ConversionError): + DataConverter.as_dataarray(np.array([1, 2, 3]), sample_time_index) # Wrong length + + # Test mismatched array shape for scenario × time + # Array shape should be (n_scenarios, n_timesteps) + wrong_shape_array = np.array( + [ + [1, 2, 3, 4], # Missing a timestep + [5, 6, 7, 8], + [9, 10, 11, 12], + ] + ) + with pytest.raises(ConversionError): + DataConverter.as_dataarray(wrong_shape_array, sample_time_index, sample_scenario_index) + + # Test array with too many dimensions + with pytest.raises(ConversionError): + # 3D array not allowed + DataConverter.as_dataarray(np.ones((3, 5, 2)), sample_time_index, sample_scenario_index) + + def test_dataarray_dimension_mismatch(self, sample_time_index, sample_scenario_index): + """Test handling of mismatched DataArray dimensions.""" + # Create DataArray with wrong dimensions + wrong_dims = xr.DataArray(data=np.array([1, 2, 3, 4, 5]), coords={'wrong_dim': range(5)}, dims=['wrong_dim']) + with pytest.raises(ConversionError): + DataConverter.as_dataarray(wrong_dims, sample_time_index) + + # Create DataArray with scenario but no time + wrong_dims_2 = xr.DataArray(data=np.array([1, 2, 3]), coords={'scenario': ['a', 'b', 'c']}, dims=['scenario']) + with 
pytest.raises(ConversionError):
+            DataConverter.as_dataarray(wrong_dims_2, sample_time_index, sample_scenario_index)
+
+        # Create DataArray with right dims but wrong length
+        wrong_length = xr.DataArray(
+            data=np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
+            coords={
+                'scenario': sample_scenario_index,
+                'time': pd.date_range('2024-01-01', periods=3, freq='D', name='time'),
+            },
+            dims=['scenario', 'time'],
+        )
+        with pytest.raises(ConversionError):
+            DataConverter.as_dataarray(wrong_length, sample_time_index, sample_scenario_index)
+
+
+class TestDataArrayBroadcasting:
+    """Tests for broadcasting DataArrays."""
+
+    def test_broadcast_1d_array_to_2d(self, sample_time_index, sample_scenario_index):
+        """Test broadcasting a 1D array to all scenarios."""
+        arr_1d = np.array([1, 2, 3, 4, 5])
+
+        xr.testing.assert_equal(
+            DataConverter.as_dataarray(arr_1d, sample_time_index, sample_scenario_index),
+            xr.DataArray(
+                np.array([arr_1d] * 3).T,
+                coords=(sample_time_index, sample_scenario_index)
+            )
+        )
+
+        arr_1d = np.array([1, 2, 3])
+        xr.testing.assert_equal(
+            DataConverter.as_dataarray(arr_1d, sample_time_index, sample_scenario_index),
+            xr.DataArray(
+                np.array([arr_1d] * 5),
+                coords=(sample_time_index, sample_scenario_index)
+            )
+        )
+
+    def test_broadcast_1d_array_to_1d(self, sample_time_index):
+        """Test converting a 1D array when no scenario dimension is requested."""
+        arr_1d = np.array([1, 2, 3, 4, 5])
+
+        xr.testing.assert_equal(
+            DataConverter.as_dataarray(arr_1d, sample_time_index),
+            xr.DataArray(
+                arr_1d,
+                coords=(sample_time_index,)
+            )
+        )
+
+        arr_1d = np.array([1, 2, 3])
+        with pytest.raises(ConversionError):
+            DataConverter.as_dataarray(arr_1d, sample_time_index)
+
+
+class TestEdgeCases:
+    """Tests for edge cases and special scenarios."""
+
+    def test_single_timestep(self, sample_scenario_index):
+        """Test with a single timestep."""
+        # Test with only one timestep
+        single_timestep = pd.DatetimeIndex(['2024-01-01'], name='time')
+
+        # Scalar conversion
+        result = DataConverter.as_dataarray(42, single_timestep)
+        assert result.shape == (1,)
+        assert result.dims == ('time',)
+
+        # With scenarios
+        result_with_scenarios = DataConverter.as_dataarray(42, single_timestep, sample_scenario_index)
+        assert result_with_scenarios.shape == (1, len(sample_scenario_index))
+        assert result_with_scenarios.dims == ('time', 'scenario')
+
+    def test_single_scenario(self, sample_time_index):
+        """Test with a single scenario."""
+        # Test with only one scenario
+        single_scenario = pd.Index(['baseline'], name='scenario')
+
+        # Scalar conversion with single scenario
+        result = DataConverter.as_dataarray(42, sample_time_index, single_scenario)
+        assert result.shape == (len(sample_time_index), 1)
+        assert result.dims == ('time', 'scenario')
+
+        # Array conversion with single scenario
+        arr = np.array([1, 2, 3, 4, 5])
+        result_arr = DataConverter.as_dataarray(arr, sample_time_index, single_scenario)
+        assert result_arr.shape == (5, 1)
+        assert np.array_equal(result_arr.sel(scenario='baseline').values, arr)
+
+        # 2D array with single scenario
+        arr_2d = np.array([[1, 2, 3, 4, 5]])  # Note the extra dimension
+        result_arr_2d = DataConverter.as_dataarray(arr_2d.T, sample_time_index, single_scenario)
+        assert result_arr_2d.shape == (5, 1)
+        assert np.array_equal(result_arr_2d.sel(scenario='baseline').values, arr_2d[0])
+
+    def test_different_scenario_order(self, sample_time_index):
+        """Test that scenario order is preserved."""
+        # Test with different scenario orders
+        scenarios1 = pd.Index(['a', 'b', 'c'], name='scenario')
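+        # Sketch of the pairing this test relies on (assumption): values are matched
+        # to the index positionally, as if xr.DataArray(data, coords=(sample_time_index,
+        # scenarios1)) were built, so reordering the index relabels rows rather than
+        # reordering the underlying data.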
scenarios2 = pd.Index(['c', 'b', 'a'], name='scenario') + + # Create DataArray with first order + data = np.array( + [ + [1, 2, 3, 4, 5], # a + [6, 7, 8, 9, 10], # b + [11, 12, 13, 14, 15], # c + ] + ).T + + result1 = DataConverter.as_dataarray(data, sample_time_index, scenarios1) + assert np.array_equal(result1.sel(scenario='a').values, [1, 2, 3, 4, 5]) + assert np.array_equal(result1.sel(scenario='c').values, [11, 12, 13, 14, 15]) + + # Create DataArray with second order + result2 = DataConverter.as_dataarray(data, sample_time_index, scenarios2) + # First row should match 'c' now + assert np.array_equal(result2.sel(scenario='c').values, [1, 2, 3, 4, 5]) + # Last row should match 'a' now + assert np.array_equal(result2.sel(scenario='a').values, [11, 12, 13, 14, 15]) + + def test_all_nan_data(self, sample_time_index, sample_scenario_index): + """Test handling of all-NaN data.""" + # Create array of all NaNs + all_nan_array = np.full(5, np.nan) + result = DataConverter.as_dataarray(all_nan_array, sample_time_index) + assert np.all(np.isnan(result.values)) + + # With scenarios + result = DataConverter.as_dataarray(all_nan_array, sample_time_index, sample_scenario_index) + assert result.shape == (len(sample_time_index), len(sample_scenario_index)) + assert np.all(np.isnan(result.values)) + + # Series of all NaNs + result = DataConverter.as_dataarray( + np.array([np.nan, np.nan, np.nan, np.nan, np.nan]), sample_time_index, sample_scenario_index + ) + assert np.all(np.isnan(result.values)) + + def test_mixed_data_types(self, sample_time_index, sample_scenario_index): + """Test conversion of mixed integer and float data.""" + # Create array with mixed types + mixed_array = np.array([1, 2.5, 3, 4.5, 5]) + result = DataConverter.as_dataarray(mixed_array, sample_time_index) + + # Result should be float dtype + assert np.issubdtype(result.dtype, np.floating) + assert np.array_equal(result.values, mixed_array) + + # With scenarios + result = DataConverter.as_dataarray(mixed_array, sample_time_index, sample_scenario_index) + assert np.issubdtype(result.dtype, np.floating) + for scenario in sample_scenario_index: + assert np.array_equal(result.sel(scenario=scenario).values, mixed_array) + + +class TestFunctionalUseCase: + """Tests for realistic use cases combining multiple features.""" + + def test_large_dataset(self, sample_scenario_index): + """Test with a larger dataset to ensure performance.""" + # Create a larger timestep array (e.g., hourly for a year) + large_timesteps = pd.date_range( + '2024-01-01', + periods=8760, # Hours in a year + freq='H', + name='time', + ) + + # Create large 2D array (3 scenarios × 8760 hours) + large_data = np.random.rand(len(sample_scenario_index), len(large_timesteps)) + + # Convert and check + result = DataConverter.as_dataarray(large_data.T, large_timesteps, sample_scenario_index) + + assert result.shape == (len(large_timesteps), len(sample_scenario_index)) + assert result.dims == ('time', 'scenario') + assert np.array_equal(result.values, large_data.T) + + +class TestMultiScenarioArrayConversion: + """Tests specifically focused on array conversion with scenarios.""" + + def test_1d_array_broadcasting(self, sample_time_index, sample_scenario_index): + """Test that 1D arrays are properly broadcast to all scenarios.""" + arr_1d = np.array([1, 2, 3, 4, 5]) + result = DataConverter.as_dataarray(arr_1d, sample_time_index, sample_scenario_index) + + assert result.shape == (len(sample_time_index), len(sample_scenario_index)) + + # Each scenario should have identical 
values + for i, scenario in enumerate(sample_scenario_index): + assert np.array_equal(result.sel(scenario=scenario).values, arr_1d) + + # Modify one scenario's values + result.loc[dict(scenario=scenario)] = np.ones(len(sample_time_index)) * i + + # Ensure modifications are isolated to each scenario + for i, scenario in enumerate(sample_scenario_index): + assert np.all(result.sel(scenario=scenario).values == i) + + def test_2d_array_different_shapes(self, sample_time_index): + """Test different scenario shapes with 2D arrays.""" + # Test with 1 scenario + single_scenario = pd.Index(['baseline'], name='scenario') + arr_1_scenario = np.array([[1, 2, 3, 4, 5]]) + + result = DataConverter.as_dataarray(arr_1_scenario.T, sample_time_index, single_scenario) + assert result.shape == (len(sample_time_index), 1) + + # Test with 2 scenarios + two_scenarios = pd.Index(['baseline', 'high_demand'], name='scenario') + arr_2_scenarios = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10]]) + + result = DataConverter.as_dataarray(arr_2_scenarios.T, sample_time_index, two_scenarios) + assert result.shape == (len(sample_time_index), 2) + assert np.array_equal(result.sel(scenario='baseline').values, arr_2_scenarios[0]) + assert np.array_equal(result.sel(scenario='high_demand').values, arr_2_scenarios[1]) + + # Test mismatched scenarios count + three_scenarios = pd.Index(['a', 'b', 'c'], name='scenario') + with pytest.raises(ConversionError): + DataConverter.as_dataarray(arr_2_scenarios, sample_time_index, three_scenarios) + + def test_array_handling_edge_cases(self, sample_time_index, sample_scenario_index): + """Test array edge cases.""" + # Test with boolean array + bool_array = np.array([True, False, True, False, True]) + result = DataConverter.as_dataarray(bool_array, sample_time_index, sample_scenario_index) + assert result.dtype == bool + assert result.shape == (len(sample_time_index), len(sample_scenario_index)) + + # Test with array containing infinite values + inf_array = np.array([1, np.inf, 3, -np.inf, 5]) + result = DataConverter.as_dataarray(inf_array, sample_time_index, sample_scenario_index) + for scenario in sample_scenario_index: + scenario_data = result.sel(scenario=scenario) + assert np.isinf(scenario_data[1].item()) + assert np.isinf(scenario_data[3].item()) + assert scenario_data[3].item() < 0 # Negative infinity + + +class TestScenarioReindexing: + """Tests for reindexing and coordinate preservation in DataConverter.""" + + def test_preserving_scenario_order(self, sample_time_index): + """Test that scenario order is preserved in converted DataArrays.""" + # Define scenarios in a specific order + scenarios = pd.Index(['scenario3', 'scenario1', 'scenario2'], name='scenario') + + # Create 2D array + data = np.array( + [ + [1, 2, 3, 4, 5], # scenario3 + [6, 7, 8, 9, 10], # scenario1 + [11, 12, 13, 14, 15], # scenario2 + ] + ) + + # Convert to DataArray + result = DataConverter.as_dataarray(data.T, sample_time_index, scenarios) + + # Verify order of scenarios is preserved + assert list(result.coords['scenario'].values) == list(scenarios) + + # Verify data for each scenario + assert np.array_equal(result.sel(scenario='scenario3').values, data[0]) + assert np.array_equal(result.sel(scenario='scenario1').values, data[1]) + assert np.array_equal(result.sel(scenario='scenario2').values, data[2]) + + +if __name__ == '__main__': + pytest.main() def test_invalid_inputs(sample_time_index): @@ -100,12 +743,12 @@ def test_time_index_validation(): # Test with empty index empty_index = pd.DatetimeIndex([], 
name='time') - with pytest.raises(ValueError): + with pytest.raises(ConversionError): DataConverter.as_dataarray(42, empty_index) # Test with non-DatetimeIndex wrong_type_index = pd.Index([1, 2, 3, 4, 5], name='time') - with pytest.raises(ValueError): + with pytest.raises(ConversionError): DataConverter.as_dataarray(42, wrong_type_index) diff --git a/tests/test_effect.py b/tests/test_effect.py index 5cbc04ac6..ff85d0556 100644 --- a/tests/test_effect.py +++ b/tests/test_effect.py @@ -5,10 +5,10 @@ import flixopt as fx -from .conftest import assert_conequal, assert_var_equal, create_linopy_model +from .conftest import assert_conequal, assert_var_equal, create_calculation_and_solve, create_linopy_model -class TestBusModel: +class TestEffectModel: """Test the FlowModel class.""" def test_minimal(self, basic_flow_system_linopy): @@ -140,3 +140,85 @@ def test_shares(self, basic_flow_system_linopy): ) +class TestEffectResults: + def test_shares(self, basic_flow_system_linopy): + flow_system = basic_flow_system_linopy + flow_system.effects['Costs'].specific_share_to_other_effects_operation['Effect1'] = 0.5 + flow_system.add_elements( + fx.Effect('Effect1', '€', 'Testing Effect', + specific_share_to_other_effects_operation={ + 'Effect2': 1.1, + 'Effect3': 1.2 + }, + specific_share_to_other_effects_invest={ + 'Effect2': 2.1, + 'Effect3': 2.2 + } + ), + fx.Effect('Effect2', '€', 'Testing Effect', specific_share_to_other_effects_operation={'Effect3': 5}), + fx.Effect('Effect3', '€', 'Testing Effect'), + fx.linear_converters.Boiler( + 'Boiler', eta=0.5, Q_th=fx.Flow('Q_th', bus='Fernwärme', ),Q_fu=fx.Flow('Q_fu', bus='Gas'), + ) + ) + + results = create_calculation_and_solve(flow_system, fx.solvers.HighsSolver(0.01, 60), 'Sim1').results + + effect_share_factors = { + 'operation': { + ('Costs', 'Effect1'): 0.5, + ('Costs', 'Effect2'): 0.5 * 1.1, + ('Costs', 'Effect3'): 0.5 * 1.1 * 5 + 0.5 * 1.2, #This is where the issue lies + ('Effect1', 'Effect2'): 1.1, + ('Effect1', 'Effect3'): 1.2 + 1.1 * 5, + ('Effect2', 'Effect3'): 5, + }, + 'invest': { + ('Effect1', 'Effect2'): 2.1, + ('Effect1', 'Effect3'): 2.2, + } + } + for key, value in effect_share_factors['operation'].items(): + np.testing.assert_allclose(results.effect_share_factors['operation'][key].values, value) + + for key, value in effect_share_factors['invest'].items(): + np.testing.assert_allclose(results.effect_share_factors['invest'][key].values, value) + + xr.testing.assert_allclose(results.effects_per_component('operation').sum('component')['Costs'], + results.solution['Costs(operation)|total_per_timestep'].fillna(0)) + + xr.testing.assert_allclose(results.effects_per_component('operation').sum('component')['Effect1'], + results.solution['Effect1(operation)|total_per_timestep'].fillna(0)) + + xr.testing.assert_allclose(results.effects_per_component('operation').sum('component')['Effect2'], + results.solution['Effect2(operation)|total_per_timestep'].fillna(0)) + + xr.testing.assert_allclose(results.effects_per_component('operation').sum('component')['Effect3'], + results.solution['Effect3(operation)|total_per_timestep'].fillna(0)) + + # Invest mode checks + xr.testing.assert_allclose(results.effects_per_component('invest').sum('component')['Costs'], + results.solution['Costs(invest)|total']) + + xr.testing.assert_allclose(results.effects_per_component('invest').sum('component')['Effect1'], + results.solution['Effect1(invest)|total']) + + xr.testing.assert_allclose(results.effects_per_component('invest').sum('component')['Effect2'], + 
results.solution['Effect2(invest)|total']) + + xr.testing.assert_allclose(results.effects_per_component('invest').sum('component')['Effect3'], + results.solution['Effect3(invest)|total']) + + # Total mode checks + xr.testing.assert_allclose(results.effects_per_component('total').sum('component')['Costs'], + results.solution['Costs|total']) + + xr.testing.assert_allclose(results.effects_per_component('total').sum('component')['Effect1'], + results.solution['Effect1|total']) + + xr.testing.assert_allclose(results.effects_per_component('total').sum('component')['Effect2'], + results.solution['Effect2|total']) + + xr.testing.assert_allclose(results.effects_per_component('total').sum('component')['Effect3'], + results.solution['Effect3|total']) + diff --git a/tests/test_effects_shares_summation.py b/tests/test_effects_shares_summation.py new file mode 100644 index 000000000..e2dada7e9 --- /dev/null +++ b/tests/test_effects_shares_summation.py @@ -0,0 +1,236 @@ +from typing import Dict, Tuple + +import numpy as np +import pytest +import xarray as xr + +from flixopt.effects import calculate_all_conversion_paths + + +def test_direct_conversions(): + """Test direct conversions with simple scalar values.""" + conversion_dict = { + 'A': {'B': xr.DataArray(2.0)}, + 'B': {'C': xr.DataArray(3.0)} + } + + result = calculate_all_conversion_paths(conversion_dict) + + # Check direct conversions + assert ('A', 'B') in result + assert ('B', 'C') in result + assert result[('A', 'B')].item() == 2.0 + assert result[('B', 'C')].item() == 3.0 + + # Check indirect conversion + assert ('A', 'C') in result + assert result[('A', 'C')].item() == 6.0 # 2.0 * 3.0 + + +def test_multiple_paths(): + """Test multiple paths between nodes that should be summed.""" + conversion_dict = { + 'A': {'B': xr.DataArray(2.0), 'C': xr.DataArray(3.0)}, + 'B': {'D': xr.DataArray(4.0)}, + 'C': {'D': xr.DataArray(5.0)} + } + + result = calculate_all_conversion_paths(conversion_dict) + + # A to D should sum two paths: A->B->D (2*4=8) and A->C->D (3*5=15) + assert ('A', 'D') in result + assert result[('A', 'D')].item() == 8.0 + 15.0 + + +def test_xarray_conversions(): + """Test with xarray DataArrays that have dimensions.""" + # Create DataArrays with a time dimension + time_points = [1, 2, 3] + a_to_b = xr.DataArray([2.0, 2.1, 2.2], dims=['time'], coords={'time': time_points}) + b_to_c = xr.DataArray([3.0, 3.1, 3.2], dims=['time'], coords={'time': time_points}) + + conversion_dict = { + 'A': {'B': a_to_b}, + 'B': {'C': b_to_c} + } + + result = calculate_all_conversion_paths(conversion_dict) + + # Check indirect conversion preserves dimensions + assert ('A', 'C') in result + assert result[('A', 'C')].dims == ('time',) + + # Check values at each time point + for i, t in enumerate(time_points): + expected = a_to_b.values[i] * b_to_c.values[i] + assert pytest.approx(result[('A', 'C')].sel(time=t).item()) == expected + + +def test_long_paths(): + """Test with longer paths (more than one intermediate node).""" + conversion_dict = { + 'A': {'B': xr.DataArray(2.0)}, + 'B': {'C': xr.DataArray(3.0)}, + 'C': {'D': xr.DataArray(4.0)}, + 'D': {'E': xr.DataArray(5.0)} + } + + result = calculate_all_conversion_paths(conversion_dict) + + # Check the full path A->B->C->D->E + assert ('A', 'E') in result + expected = 2.0 * 3.0 * 4.0 * 5.0 # 120.0 + assert result[('A', 'E')].item() == expected + + +def test_diamond_paths(): + """Test with a diamond shape graph with multiple paths to the same destination.""" + conversion_dict = { + 'A': {'B': xr.DataArray(2.0), 
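+        # Each path contributes the product of its edge factors and parallel paths
+        # sum: here A->B->D->E and A->C->D->E, checked below.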
'C': xr.DataArray(3.0)},
+        'B': {'D': xr.DataArray(4.0)},
+        'C': {'D': xr.DataArray(5.0)},
+        'D': {'E': xr.DataArray(6.0)}
+    }
+
+    result = calculate_all_conversion_paths(conversion_dict)
+
+    # A to E should go through both paths:
+    # A->B->D->E (2*4*6=48) and A->C->D->E (3*5*6=90)
+    assert ('A', 'E') in result
+    expected = 48.0 + 90.0  # 138.0
+    assert result[('A', 'E')].item() == expected
+
+
+def test_effect_shares_example():
+    """Test the specific example from the effects share factors test."""
+    # Create the conversion dictionary based on test example
+    conversion_dict = {
+        'Costs': {'Effect1': xr.DataArray(0.5)},
+        'Effect1': {'Effect2': xr.DataArray(1.1), 'Effect3': xr.DataArray(1.2)},
+        'Effect2': {'Effect3': xr.DataArray(5.0)}
+    }
+
+    result = calculate_all_conversion_paths(conversion_dict)
+
+    # Test direct paths
+    assert result[('Costs', 'Effect1')].item() == 0.5
+    assert result[('Effect1', 'Effect2')].item() == 1.1
+    assert result[('Effect2', 'Effect3')].item() == 5.0
+
+    # Test indirect paths
+    # Costs -> Effect2 = Costs -> Effect1 -> Effect2 = 0.5 * 1.1
+    assert result[('Costs', 'Effect2')].item() == 0.5 * 1.1
+
+    # Costs -> Effect3 has two paths:
+    # 1. Costs -> Effect1 -> Effect3 = 0.5 * 1.2 = 0.6
+    # 2. Costs -> Effect1 -> Effect2 -> Effect3 = 0.5 * 1.1 * 5 = 2.75
+    # Total = 0.6 + 2.75 = 3.35
+    assert result[('Costs', 'Effect3')].item() == 0.5 * 1.2 + 0.5 * 1.1 * 5
+
+    # Effect1 -> Effect3 has two paths:
+    # 1. Effect1 -> Effect2 -> Effect3 = 1.1 * 5.0 = 5.5
+    # 2. Effect1 -> Effect3 = 1.2
+    # Total = 5.5 + 1.2 = 6.7
+    assert result[('Effect1', 'Effect3')].item() == 1.2 + 1.1 * 5.0
+
+
+def test_empty_conversion_dict():
+    """Test with an empty conversion dictionary."""
+    result = calculate_all_conversion_paths({})
+    assert len(result) == 0
+
+
+def test_no_indirect_paths():
+    """Test with a dictionary that has no indirect paths."""
+    conversion_dict = {
+        'A': {'B': xr.DataArray(2.0)},
+        'C': {'D': xr.DataArray(3.0)}
+    }
+
+    result = calculate_all_conversion_paths(conversion_dict)
+
+    # Only direct paths should exist
+    assert len(result) == 2
+    assert ('A', 'B') in result
+    assert ('C', 'D') in result
+    assert result[('A', 'B')].item() == 2.0
+    assert result[('C', 'D')].item() == 3.0
+
+
+def test_complex_network():
+    """Test with a complex network of many nodes and multiple paths, without circular references."""
+    # Create a directed acyclic graph with many nodes
+    # Structure resembles a layered network with multiple paths
+    conversion_dict = {
+        'A': {'B': xr.DataArray(1.5), 'C': xr.DataArray(2.0), 'D': xr.DataArray(0.5)},
+        'B': {'E': xr.DataArray(3.0), 'F': xr.DataArray(1.2)},
+        'C': {'E': xr.DataArray(0.8), 'G': xr.DataArray(2.5)},
+        'D': {'G': xr.DataArray(1.8), 'H': xr.DataArray(3.2)},
+        'E': {'I': xr.DataArray(0.7), 'J': xr.DataArray(1.4)},
+        'F': {'J': xr.DataArray(2.2), 'K': xr.DataArray(0.9)},
+        'G': {'K': xr.DataArray(1.6), 'L': xr.DataArray(2.8)},
+        'H': {'L': xr.DataArray(0.4), 'M': xr.DataArray(1.1)},
+        'I': {'N': xr.DataArray(2.3)},
+        'J': {'N': xr.DataArray(1.9), 'O': xr.DataArray(0.6)},
+        'K': {'O': xr.DataArray(3.5), 'P': xr.DataArray(1.3)},
+        'L': {'P': xr.DataArray(2.7), 'Q': xr.DataArray(0.8)},
+        'M': {'Q': xr.DataArray(2.1)},
+        'N': {'R': xr.DataArray(1.7)},
+        'O': {'R': xr.DataArray(2.9), 'S': xr.DataArray(1.0)},
+        'P': {'S': xr.DataArray(2.4)},
+        'Q': {'S': xr.DataArray(1.5)}
+    }
+
+    result = calculate_all_conversion_paths(conversion_dict)
+
+    # Check some direct paths
+    assert result[('A', 'B')].item() == 1.5
+    assert result[('D', 'H')].item() 
== 3.2 + assert result[('G', 'L')].item() == 2.8 + + # Check some two-step paths + assert result[('A', 'E')].item() == 1.5 * 3.0 + 2.0 * 0.8 # A->B->E + A->C->E + assert result[('B', 'J')].item() == 3.0 * 1.4 + 1.2 * 2.2 # B->E->J + B->F->J + + # Check some three-step paths + # A->B->E->I + # A->C->E->I + expected_a_to_i = 1.5 * 3.0 * 0.7 + 2.0 * 0.8 * 0.7 + assert pytest.approx(result[('A', 'I')].item()) == expected_a_to_i + + # Check some four-step paths + # A->B->E->I->N + # A->C->E->I->N + expected_a_to_n = 1.5 * 3.0 * 0.7 * 2.3 + 2.0 * 0.8 * 0.7 * 2.3 + expected_a_to_n += 1.5 * 3.0 * 1.4 * 1.9 + 2.0 * 0.8 * 1.4 * 1.9 # A->B->E->J->N + A->C->E->J->N + expected_a_to_n += 1.5 * 1.2 * 2.2 * 1.9 # A->B->F->J->N + assert pytest.approx(result[('A', 'N')].item()) == expected_a_to_n + + # Check a very long path from A to S + # This should include: + # A->B->E->J->O->S + # A->B->F->K->O->S + # A->C->E->J->O->S + # A->C->G->K->O->S + # A->D->G->K->O->S + # A->D->H->L->P->S + # A->D->H->M->Q->S + # And many more + assert ('A', 'S') in result + + # There are many paths to R from A - check their existence + assert ('A', 'R') in result + + # Check that there's no direct path from A to R + # But there should be indirect paths + assert ('A', 'R') in result + assert 'A' not in conversion_dict.get('R', {}) + + # Count the number of paths calculated to verify algorithm explored all connections + # In a DAG with 19 nodes (A through S), the maximum number of pairs is 19*18 = 342 + # But we won't have all possible connections due to the structure + # Just verify we have a reasonable number + assert len(result) > 50 + +if __name__ == '__main__': + pytest.main() diff --git a/tests/test_io.py b/tests/test_io.py index 2e6c61ccf..541e5b2a4 100644 --- a/tests/test_io.py +++ b/tests/test_io.py @@ -11,10 +11,11 @@ flow_system_long, flow_system_segments_of_flows_2, simple_flow_system, + simple_flow_system_scenarios, ) -@pytest.fixture(params=[flow_system_base, flow_system_segments_of_flows_2, simple_flow_system, flow_system_long]) +@pytest.fixture(params=[flow_system_base, simple_flow_system_scenarios, flow_system_segments_of_flows_2, simple_flow_system, flow_system_long]) def flow_system(request): fs = request.getfixturevalue(request.param.__name__) if isinstance(fs, fx.FlowSystem): @@ -27,6 +28,7 @@ def test_flow_system_file_io(flow_system, highs_solver): calculation_0 = fx.FullCalculation('IO', flow_system=flow_system) calculation_0.do_modeling() calculation_0.solve(highs_solver) + calculation_0.flow_system.plot_network() calculation_0.results.to_file() paths = CalculationResultsPaths(calculation_0.folder, calculation_0.name) @@ -35,6 +37,7 @@ def test_flow_system_file_io(flow_system, highs_solver): calculation_1 = fx.FullCalculation('Loaded_IO', flow_system=flow_system_1) calculation_1.do_modeling() calculation_1.solve(highs_solver) + calculation_1.flow_system.plot_network() assert_almost_equal_numeric( calculation_0.results.model.objective.value, diff --git a/tests/test_plots.py b/tests/test_plots.py index 840b4e7b3..4e00f9a51 100644 --- a/tests/test_plots.py +++ b/tests/test_plots.py @@ -53,15 +53,15 @@ def get_sample_data( def test_bar_plots(self): data = self.get_sample_data(nr_of_columns=10, nr_of_periods=1, time_steps_per_period=24) - plotly.offline.plot(plotting.with_plotly(data, 'bar')) - plotting.with_matplotlib(data, 'bar') + plotly.offline.plot(plotting.with_plotly(data, 'stacked_bar')) + plotting.with_matplotlib(data, 'stacked_bar') plt.show() data = self.get_sample_data( nr_of_columns=10, 
nr_of_periods=5, time_steps_per_period=24, drop_fraction_of_indices=0.3 ) - plotly.offline.plot(plotting.with_plotly(data, 'bar')) - plotting.with_matplotlib(data, 'bar') + plotly.offline.plot(plotting.with_plotly(data, 'stacked_bar')) + plotting.with_matplotlib(data, 'stacked_bar') plt.show() def test_line_plots(self): diff --git a/tests/test_results_plots.py b/tests/test_results_plots.py index 855944a48..fe8d27c3b 100644 --- a/tests/test_results_plots.py +++ b/tests/test_results_plots.py @@ -58,12 +58,7 @@ def test_results_plots(flow_system, plotting_engine, show, save, color_spec): ) results['Speicher'].plot_node_balance_pie(engine=plotting_engine, save=save, show=show, colors=color_spec) - - if plotting_engine == 'matplotlib': - with pytest.raises(NotImplementedError): - results['Speicher'].plot_charge_state(engine=plotting_engine) - else: - results['Speicher'].plot_charge_state(engine=plotting_engine) + results['Speicher'].plot_charge_state(engine=plotting_engine) plt.close('all') diff --git a/tests/test_scenarios.py b/tests/test_scenarios.py new file mode 100644 index 000000000..4abdafa5f --- /dev/null +++ b/tests/test_scenarios.py @@ -0,0 +1,332 @@ +import matplotlib.pyplot as plt +import numpy as np +import pandas as pd +import pytest +import xarray as xr +from linopy.testing import assert_linequal + +import flixopt as fx +from flixopt.commons import Effect, FullCalculation, InvestParameters, Sink, Source, Storage, TimeSeriesData, solvers +from flixopt.elements import Bus, Flow +from flixopt.flow_system import FlowSystem + +from .conftest import create_calculation_and_solve, create_linopy_model + + +@pytest.fixture +def test_system(): + """Create a basic test system with scenarios.""" + # Create a two-day time index with hourly resolution + timesteps = pd.date_range( + "2023-01-01", periods=48, freq="h", name="time" + ) + + # Create two scenarios + scenarios = pd.Index(["Scenario A", "Scenario B"], name="scenario") + + # Create scenario weights as TimeSeriesData + # Using TimeSeriesData to avoid conversion issues + scenario_weights = TimeSeriesData(np.array([0.7, 0.3])) + + # Create a flow system with scenarios + flow_system = FlowSystem( + timesteps=timesteps, + scenarios=scenarios, + scenario_weights=scenario_weights # Use TimeSeriesData for weights + ) + + # Create demand profiles that differ between scenarios + # Scenario A: Higher demand in first day, lower in second day + # Scenario B: Lower demand in first day, higher in second day + demand_profile_a = np.concatenate([ + np.sin(np.linspace(0, 2*np.pi, 24)) * 5 + 10, # Day 1, max ~15 + np.sin(np.linspace(0, 2*np.pi, 24)) * 2 + 5 # Day 2, max ~7 + ]) + + demand_profile_b = np.concatenate([ + np.sin(np.linspace(0, 2*np.pi, 24)) * 2 + 5, # Day 1, max ~7 + np.sin(np.linspace(0, 2*np.pi, 24)) * 5 + 10 # Day 2, max ~15 + ]) + + # Stack the profiles into a 2D array (time, scenario) + demand_profiles = np.column_stack([demand_profile_a, demand_profile_b]) + + # Create the necessary model elements + # Create buses + electricity_bus = Bus("Electricity") + + # Create a demand sink with scenario-dependent profiles + demand = Flow( + label="Demand", + bus=electricity_bus.label_full, + fixed_relative_profile=demand_profiles + ) + demand_sink = Sink("Demand", sink=demand) + + # Create a power source with investment option + power_gen = Flow( + label="Generation", + bus=electricity_bus.label_full, + size=InvestParameters( + minimum_size=0, + maximum_size=20, + specific_effects={"Costs": 100} # €/kW + ), + effects_per_flow_hour={"Costs": 20} 
# €/MWh + ) + generator = Source("Generator", source=power_gen) + + # Create a storage for electricity + storage_charge = Flow( + label="Charge", + bus=electricity_bus.label_full, + size=10 + ) + storage_discharge = Flow( + label="Discharge", + bus=electricity_bus.label_full, + size=10 + ) + storage = Storage( + label="Battery", + charging=storage_charge, + discharging=storage_discharge, + capacity_in_flow_hours=InvestParameters( + minimum_size=0, + maximum_size=50, + specific_effects={"Costs": 50} # €/kWh + ), + eta_charge=0.95, + eta_discharge=0.95, + initial_charge_state="lastValueOfSim" + ) + + # Create effects and objective + cost_effect = Effect( + label="Costs", + unit="€", + description="Total costs", + is_standard=True, + is_objective=True + ) + + # Add all elements to the flow system + flow_system.add_elements( + electricity_bus, + generator, + demand_sink, + storage, + cost_effect + ) + + # Return the created system and its components + return { + "flow_system": flow_system, + "timesteps": timesteps, + "scenarios": scenarios, + "electricity_bus": electricity_bus, + "demand": demand, + "demand_sink": demand_sink, + "generator": generator, + "power_gen": power_gen, + "storage": storage, + "storage_charge": storage_charge, + "storage_discharge": storage_discharge, + "cost_effect": cost_effect + } + +@pytest.fixture +def flow_system_complex_scenarios() -> fx.FlowSystem: + """ + Helper method to create a base model with configurable parameters + """ + thermal_load = np.array([30, 0, 90, 110, 110, 20, 20, 20, 20]) + electrical_load = np.array([40, 40, 40, 40, 40, 40, 40, 40, 40]) + flow_system = fx.FlowSystem(pd.date_range('2020-01-01', periods=9, freq='h', name='time'), + pd.Index(['A', 'B', 'C'], name='scenario')) + # Define the components and flow_system + flow_system.add_elements( + fx.Effect('costs', '€', 'Kosten', is_standard=True, is_objective=True), + fx.Effect('CO2', 'kg', 'CO2_e-Emissionen', specific_share_to_other_effects_operation={'costs': 0.2}), + fx.Effect('PE', 'kWh_PE', 'Primärenergie', maximum_total=3.5e3), + fx.Bus('Strom'), + fx.Bus('Fernwärme'), + fx.Bus('Gas'), + fx.Sink('Wärmelast', sink=fx.Flow('Q_th_Last', 'Fernwärme', size=1, fixed_relative_profile=thermal_load)), + fx.Source( + 'Gastarif', source=fx.Flow('Q_Gas', 'Gas', size=1000, effects_per_flow_hour={'costs': 0.04, 'CO2': 0.3}) + ), + fx.Sink('Einspeisung', sink=fx.Flow('P_el', 'Strom', effects_per_flow_hour=-1 * electrical_load)), + ) + + boiler = fx.linear_converters.Boiler( + 'Kessel', + eta=0.5, + on_off_parameters=fx.OnOffParameters(effects_per_running_hour={'costs': 0, 'CO2': 1000}), + Q_th=fx.Flow( + 'Q_th', + bus='Fernwärme', + load_factor_max=1.0, + load_factor_min=0.1, + relative_minimum=5 / 50, + relative_maximum=1, + previous_flow_rate=50, + size=fx.InvestParameters( + fix_effects=1000, fixed_size=50, optional=False, specific_effects={'costs': 10, 'PE': 2} + ), + on_off_parameters=fx.OnOffParameters( + on_hours_total_min=0, + on_hours_total_max=1000, + consecutive_on_hours_max=10, + consecutive_on_hours_min=1, + consecutive_off_hours_max=10, + effects_per_switch_on=0.01, + switch_on_total_max=1000, + ), + flow_hours_total_max=1e6, + ), + Q_fu=fx.Flow('Q_fu', bus='Gas', size=200, relative_minimum=0, relative_maximum=1), + ) + + invest_speicher = fx.InvestParameters( + fix_effects=0, + piecewise_effects=fx.PiecewiseEffects( + piecewise_origin=fx.Piecewise([fx.Piece(5, 25), fx.Piece(25, 100)]), + piecewise_shares={ + 'costs': fx.Piecewise([fx.Piece(50, 250), fx.Piece(250, 800)]), + 'PE': 
fx.Piecewise([fx.Piece(5, 25), fx.Piece(25, 100)]), + }, + ), + optional=False, + specific_effects={'costs': 0.01, 'CO2': 0.01}, + minimum_size=0, + maximum_size=1000, + ) + speicher = fx.Storage( + 'Speicher', + charging=fx.Flow('Q_th_load', bus='Fernwärme', size=1e4), + discharging=fx.Flow('Q_th_unload', bus='Fernwärme', size=1e4), + capacity_in_flow_hours=invest_speicher, + initial_charge_state=0, + maximal_final_charge_state=10, + eta_charge=0.9, + eta_discharge=1, + relative_loss_per_hour=0.08, + prevent_simultaneous_charge_and_discharge=True, + ) + + flow_system.add_elements(boiler, speicher) + + return flow_system + + +@pytest.fixture +def flow_system_piecewise_conversion_scenarios(flow_system_complex_scenarios) -> fx.FlowSystem: + """ + Use segments/Piecewise with numeric data + """ + flow_system = flow_system_complex_scenarios + + flow_system.add_elements( + fx.LinearConverter( + 'KWK', + inputs=[fx.Flow('Q_fu', bus='Gas')], + outputs=[ + fx.Flow('P_el', bus='Strom', size=60, relative_maximum=55, previous_flow_rate=10), + fx.Flow('Q_th', bus='Fernwärme'), + ], + piecewise_conversion=fx.PiecewiseConversion( + { + 'P_el': fx.Piecewise( + [ + fx.Piece(np.linspace(5, 6, len(flow_system.time_series_collection.timesteps)), 30), + fx.Piece(40, np.linspace(60, 70, len(flow_system.time_series_collection.timesteps))), + ] + ), + 'Q_th': fx.Piecewise([fx.Piece(6, 35), fx.Piece(45, 100)]), + 'Q_fu': fx.Piecewise([fx.Piece(12, 70), fx.Piece(90, 200)]), + } + ), + on_off_parameters=fx.OnOffParameters(effects_per_switch_on=0.01), + ) + ) + + return flow_system + + +def test_scenario_weights(flow_system_piecewise_conversion_scenarios): + """Test that scenario weights are correctly used in the model.""" + scenarios = flow_system_piecewise_conversion_scenarios.time_series_collection.scenarios + weights = np.linspace(0.5, 1, len(scenarios)) / np.sum(np.linspace(0.5, 1, len(scenarios))) + flow_system_piecewise_conversion_scenarios.scenario_weights = weights + model = create_linopy_model(flow_system_piecewise_conversion_scenarios) + np.testing.assert_allclose(model.scenario_weights.values, weights) + assert_linequal(model.objective.expression, + (model.variables['costs|total'] * weights).sum() + model.variables['Penalty|total']) + assert np.isclose(model.scenario_weights.sum().item(), 1.0) + +def test_scenario_dimensions_in_variables(flow_system_piecewise_conversion_scenarios): + """Test that all time variables are correctly broadcasted to scenario dimensions.""" + model = create_linopy_model(flow_system_piecewise_conversion_scenarios) + for var in model.variables: + assert model.variables[var].dims in [('time', 'scenario'), ('scenario',), ()] + +def test_full_scenario_optimization(flow_system_piecewise_conversion_scenarios): + """Test a full optimization with scenarios and verify results.""" + scenarios = flow_system_piecewise_conversion_scenarios.time_series_collection.scenarios + weights = np.linspace(0.5, 1, len(scenarios)) / np.sum(np.linspace(0.5, 1, len(scenarios))) + flow_system_piecewise_conversion_scenarios.scenario_weights = weights + calc = create_calculation_and_solve(flow_system_piecewise_conversion_scenarios, + solver=fx.solvers.GurobiSolver(mip_gap=0.01, time_limit_seconds=60), + name='test_full_scenario') + calc.results.to_file() + + res = fx.results.CalculationResults.from_file('results', 'test_full_scenario') + fx.FlowSystem.from_dataset(res.flow_system_data) + calc = create_calculation_and_solve( + flow_system_piecewise_conversion_scenarios, + 
solver=fx.solvers.GurobiSolver(mip_gap=0.01, time_limit_seconds=60),
+        name='test_full_scenario',
+    )
+
+
+@pytest.mark.skip(reason="This test is taking too long with highs and is too big for gurobipy free")
+def test_io_persistence(flow_system_piecewise_conversion_scenarios):
+    """Test that scenario results persist through file IO: solve, save, reload, re-solve, and compare objectives."""
+    scenarios = flow_system_piecewise_conversion_scenarios.time_series_collection.scenarios
+    weights = np.linspace(0.5, 1, len(scenarios)) / np.sum(np.linspace(0.5, 1, len(scenarios)))
+    flow_system_piecewise_conversion_scenarios.scenario_weights = weights
+    calc = create_calculation_and_solve(flow_system_piecewise_conversion_scenarios,
+                                        solver=fx.solvers.HighsSolver(mip_gap=0.001, time_limit_seconds=60),
+                                        name='test_full_scenario')
+    calc.results.to_file()
+
+    res = fx.results.CalculationResults.from_file('results', 'test_full_scenario')
+    flow_system_2 = fx.FlowSystem.from_dataset(res.flow_system_data)
+    calc_2 = create_calculation_and_solve(
+        flow_system_2,
+        solver=fx.solvers.HighsSolver(mip_gap=0.001, time_limit_seconds=60),
+        name='test_full_scenario_2',
+    )
+
+    np.testing.assert_allclose(calc.results.objective, calc_2.results.objective, rtol=0.001)
+
+
+def test_scenarios_selection(flow_system_piecewise_conversion_scenarios):
+    """Test that a calculation can be restricted to a subset of scenarios and that the selection survives IO."""
+    flow_system = flow_system_piecewise_conversion_scenarios
+    scenarios = flow_system_piecewise_conversion_scenarios.time_series_collection.scenarios
+    weights = np.linspace(0.5, 1, len(scenarios)) / np.sum(np.linspace(0.5, 1, len(scenarios)))
+    flow_system_piecewise_conversion_scenarios.scenario_weights = weights
+    calc = fx.FullCalculation(flow_system=flow_system_piecewise_conversion_scenarios,
+                              selected_scenarios=flow_system.time_series_collection.scenarios[0:2],
+                              name='test_full_scenario')
+    calc.do_modeling()
+    calc.solve(fx.solvers.GurobiSolver(mip_gap=0.01, time_limit_seconds=60))
+
+    calc.results.to_file()
+    flow_system_2 = fx.FlowSystem.from_dataset(calc.results.flow_system_data)
+
+    assert calc.results.solution.indexes['scenario'].equals(flow_system.time_series_collection.scenarios[0:2])
+
+    assert flow_system_2.time_series_collection.scenarios.equals(flow_system.time_series_collection.scenarios[0:2])
+
+    np.testing.assert_allclose(flow_system_2.scenario_weights.selected_data.values, weights[0:2])
diff --git a/tests/test_timeseries.py b/tests/test_timeseries.py
index a8bc5fa85..bb35231a6 100644
--- a/tests/test_timeseries.py
+++ b/tests/test_timeseries.py
@@ -8,7 +8,7 @@
 import pytest
 import xarray as xr

-from flixopt.core import ConversionError, DataConverter, TimeSeries, TimeSeriesCollection, TimeSeriesData
+from flixopt.core import ConversionError, DataConverter, TimeSeries, TimeSeriesCollection


 @pytest.fixture
@@ -44,13 +44,13 @@ def test_initialization(self, simple_dataarray):
         # Check data initialization
         assert isinstance(ts.stored_data, xr.DataArray)
         assert ts.stored_data.equals(simple_dataarray)
-        assert ts.active_data.equals(simple_dataarray)
+        assert ts.selected_data.equals(simple_dataarray)

         # Check backup was created
         assert ts._backup.equals(simple_dataarray)

         # Check active timesteps
-        assert ts.active_timesteps.equals(simple_dataarray.indexes['time'])
+        assert ts._valid_selector == {}  # No selections initially

     def test_initialization_with_aggregation_params(self, simple_dataarray):
         """Test initialization with aggregation parameters."""
@@ -66,60 +66,60 @@ def test_initialization_validation(self, sample_timesteps):
         """Test validation during initialization."""
         # Test missing time dimension
         invalid_data
= xr.DataArray([1, 2, 3], dims=['invalid_dim']) - with pytest.raises(ValueError, match='must have a "time" index'): + with pytest.raises(ValueError, match='DataArray dimensions must be subset of'): TimeSeries(invalid_data, name='Invalid Series') # Test multi-dimensional data multi_dim_data = xr.DataArray( [[1, 2, 3], [4, 5, 6]], coords={'dim1': [0, 1], 'time': sample_timesteps[:3]}, dims=['dim1', 'time'] ) - with pytest.raises(ValueError, match='dimensions of DataArray must be 1'): + with pytest.raises(ValueError, match='DataArray dimensions must be subset of'): TimeSeries(multi_dim_data, name='Multi-dim Series') - def test_active_timesteps_getter_setter(self, sample_timeseries, sample_timesteps): - """Test active_timesteps getter and setter.""" - # Initial state should use all timesteps - assert sample_timeseries.active_timesteps.equals(sample_timesteps) + def test_selection_methods(self, sample_timeseries, sample_timesteps): + """Test selection methods.""" + # Initial state should have no selections + assert sample_timeseries._selected_timesteps is None + assert sample_timeseries._selected_scenarios is None # Set to a subset subset_index = sample_timesteps[1:3] - sample_timeseries.active_timesteps = subset_index - assert sample_timeseries.active_timesteps.equals(subset_index) + sample_timeseries.set_selection(timesteps=subset_index) + assert sample_timeseries._selected_timesteps.equals(subset_index) # Active data should reflect the subset - assert sample_timeseries.active_data.equals(sample_timeseries.stored_data.sel(time=subset_index)) + assert sample_timeseries.selected_data.equals(sample_timeseries.stored_data.sel(time=subset_index)) - # Reset to full index - sample_timeseries.active_timesteps = None - assert sample_timeseries.active_timesteps.equals(sample_timesteps) - - # Test invalid type - with pytest.raises(TypeError, match='must be a pandas DatetimeIndex'): - sample_timeseries.active_timesteps = 'invalid' + # Clear selection + sample_timeseries.set_selection() + assert sample_timeseries._selected_timesteps is None + assert sample_timeseries.selected_data.equals(sample_timeseries.stored_data) def test_reset(self, sample_timeseries, sample_timesteps): """Test reset method.""" # Set to subset first subset_index = sample_timesteps[1:3] - sample_timeseries.active_timesteps = subset_index + sample_timeseries.set_selection(timesteps=subset_index) # Reset sample_timeseries.reset() - # Should be back to full index - assert sample_timeseries.active_timesteps.equals(sample_timesteps) - assert sample_timeseries.active_data.equals(sample_timeseries.stored_data) + # Should be back to full index (all selections cleared) + assert sample_timeseries._selected_timesteps is None + assert sample_timeseries.selected_data.equals(sample_timeseries.stored_data) def test_restore_data(self, sample_timeseries, simple_dataarray): """Test restore_data method.""" # Modify the stored data - new_data = xr.DataArray([1, 2, 3, 4, 5], coords={'time': sample_timeseries.active_timesteps}, dims=['time']) + new_data = xr.DataArray( + [1, 2, 3, 4, 5], coords={'time': sample_timeseries.stored_data.coords['time']}, dims=['time'] + ) # Store original data for comparison original_data = sample_timeseries.stored_data - # Set new data - sample_timeseries.stored_data = new_data + # Update data + sample_timeseries.update_stored_data(new_data) assert sample_timeseries.stored_data.equals(new_data) # Restore from backup @@ -127,42 +127,42 @@ def test_restore_data(self, sample_timeseries, simple_dataarray): # Should be back to 
original data assert sample_timeseries.stored_data.equals(original_data) - assert sample_timeseries.active_data.equals(original_data) + assert sample_timeseries.selected_data.equals(original_data) - def test_stored_data_setter(self, sample_timeseries, sample_timesteps): - """Test stored_data setter with different data types.""" + def test_update_stored_data(self, sample_timeseries, sample_timesteps): + """Test update_stored_data method with different data types.""" # Test with a Series series_data = pd.Series([5, 6, 7, 8, 9], index=sample_timesteps) - sample_timeseries.stored_data = series_data + sample_timeseries.update_stored_data(series_data) assert np.array_equal(sample_timeseries.stored_data.values, series_data.values) # Test with a single-column DataFrame df_data = pd.DataFrame({'col1': [15, 16, 17, 18, 19]}, index=sample_timesteps) - sample_timeseries.stored_data = df_data + sample_timeseries.update_stored_data(df_data) assert np.array_equal(sample_timeseries.stored_data.values, df_data['col1'].values) # Test with a NumPy array array_data = np.array([25, 26, 27, 28, 29]) - sample_timeseries.stored_data = array_data + sample_timeseries.update_stored_data(array_data) assert np.array_equal(sample_timeseries.stored_data.values, array_data) # Test with a scalar - sample_timeseries.stored_data = 42 + sample_timeseries.update_stored_data(42) assert np.all(sample_timeseries.stored_data.values == 42) # Test with another DataArray another_dataarray = xr.DataArray([30, 31, 32, 33, 34], coords={'time': sample_timesteps}, dims=['time']) - sample_timeseries.stored_data = another_dataarray + sample_timeseries.update_stored_data(another_dataarray) assert sample_timeseries.stored_data.equals(another_dataarray) def test_stored_data_setter_no_change(self, sample_timeseries): - """Test stored_data setter when data doesn't change.""" + """Test update_stored_data method when data doesn't change.""" # Get current data current_data = sample_timeseries.stored_data current_backup = sample_timeseries._backup # Set the same data - sample_timeseries.stored_data = current_data + sample_timeseries.update_stored_data(current_data) # Backup shouldn't change assert sample_timeseries._backup is current_backup # Should be the same object @@ -229,35 +229,37 @@ def test_all_equal(self, sample_timesteps): def test_arithmetic_operations(self, sample_timeseries): """Test arithmetic operations.""" # Create a second TimeSeries for testing - data2 = xr.DataArray([1, 2, 3, 4, 5], coords={'time': sample_timeseries.active_timesteps}, dims=['time']) + data2 = xr.DataArray( + [1, 2, 3, 4, 5], coords={'time': sample_timeseries.stored_data.coords['time']}, dims=['time'] + ) ts2 = TimeSeries(data2, 'Second Series') # Test operations between two TimeSeries objects assert np.array_equal( - (sample_timeseries + ts2).values, sample_timeseries.active_data.values + ts2.active_data.values + (sample_timeseries + ts2).values, sample_timeseries.selected_data.values + ts2.selected_data.values ) assert np.array_equal( - (sample_timeseries - ts2).values, sample_timeseries.active_data.values - ts2.active_data.values + (sample_timeseries - ts2).values, sample_timeseries.selected_data.values - ts2.selected_data.values ) assert np.array_equal( - (sample_timeseries * ts2).values, sample_timeseries.active_data.values * ts2.active_data.values + (sample_timeseries * ts2).values, sample_timeseries.selected_data.values * ts2.selected_data.values ) assert np.array_equal( - (sample_timeseries / ts2).values, sample_timeseries.active_data.values / 
ts2.active_data.values + (sample_timeseries / ts2).values, sample_timeseries.selected_data.values / ts2.selected_data.values ) # Test operations with DataArrays - assert np.array_equal((sample_timeseries + data2).values, sample_timeseries.active_data.values + data2.values) - assert np.array_equal((data2 + sample_timeseries).values, data2.values + sample_timeseries.active_data.values) + assert np.array_equal((sample_timeseries + data2).values, sample_timeseries.selected_data.values + data2.values) + assert np.array_equal((data2 + sample_timeseries).values, data2.values + sample_timeseries.selected_data.values) # Test operations with scalars - assert np.array_equal((sample_timeseries + 5).values, sample_timeseries.active_data.values + 5) - assert np.array_equal((5 + sample_timeseries).values, 5 + sample_timeseries.active_data.values) + assert np.array_equal((sample_timeseries + 5).values, sample_timeseries.selected_data.values + 5) + assert np.array_equal((5 + sample_timeseries).values, 5 + sample_timeseries.selected_data.values) # Test unary operations - assert np.array_equal((-sample_timeseries).values, -sample_timeseries.active_data.values) - assert np.array_equal((+sample_timeseries).values, +sample_timeseries.active_data.values) - assert np.array_equal((abs(sample_timeseries)).values, abs(sample_timeseries.active_data.values)) + assert np.array_equal((-sample_timeseries).values, -sample_timeseries.selected_data.values) + assert np.array_equal((+sample_timeseries).values, +sample_timeseries.selected_data.values) + assert np.array_equal((abs(sample_timeseries)).values, abs(sample_timeseries.selected_data.values)) def test_comparison_operations(self, sample_timesteps): """Test comparison operations.""" @@ -279,327 +281,466 @@ def test_comparison_operations(self, sample_timesteps): def test_numpy_ufunc(self, sample_timeseries): """Test numpy ufunc compatibility.""" # Test basic numpy functions - assert np.array_equal(np.add(sample_timeseries, 5).values, np.add(sample_timeseries.active_data, 5).values) + assert np.array_equal(np.add(sample_timeseries, 5).values, np.add(sample_timeseries.selected_data, 5).values) assert np.array_equal( - np.multiply(sample_timeseries, 2).values, np.multiply(sample_timeseries.active_data, 2).values + np.multiply(sample_timeseries, 2).values, np.multiply(sample_timeseries.selected_data, 2).values ) # Test with two TimeSeries objects - data2 = xr.DataArray([1, 2, 3, 4, 5], coords={'time': sample_timeseries.active_timesteps}, dims=['time']) + data2 = xr.DataArray( + [1, 2, 3, 4, 5], coords={'time': sample_timeseries.stored_data.coords['time']}, dims=['time'] + ) ts2 = TimeSeries(data2, 'Second Series') assert np.array_equal( - np.add(sample_timeseries, ts2).values, np.add(sample_timeseries.active_data, ts2.active_data).values + np.add(sample_timeseries, ts2).values, np.add(sample_timeseries.selected_data, ts2.selected_data).values ) def test_sel_and_isel_properties(self, sample_timeseries): """Test sel and isel properties.""" # Test that sel property works - selected = sample_timeseries.sel(time=sample_timeseries.active_timesteps[0]) - assert selected.item() == sample_timeseries.active_data.values[0] + selected = sample_timeseries.sel(time=sample_timeseries.stored_data.coords['time'][0]) + assert selected.item() == sample_timeseries.selected_data.values[0] # Test that isel property works indexed = sample_timeseries.isel(time=0) - assert indexed.item() == sample_timeseries.active_data.values[0] + assert indexed.item() == sample_timeseries.selected_data.values[0] + 
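+
+# Illustrative sketch (an assumption for exposition, not flixopt code): the selection
+# API exercised above behaves like a thin wrapper around xarray's .sel(), where an
+# empty selector (cf. `_valid_selector == {}`) returns the stored data unchanged.
+def _selection_semantics_sketch():
+    timesteps = pd.date_range('2023-01-01', periods=5, freq='D', name='time')
+    stored = xr.DataArray([10, 20, 30, 40, 50], coords={'time': timesteps}, dims=['time'])
+    assert stored.sel(**{}).equals(stored)  # no selection -> full stored data
+    selector = {'time': timesteps[1:3]}  # an active timestep selection
+    assert stored.sel(**selector).equals(stored.isel(time=slice(1, 3)))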
+ +@pytest.fixture +def sample_scenario_index(): + """Create a sample scenario index with the required 'scenario' name.""" + return pd.Index(['baseline', 'high_demand', 'low_price'], name='scenario') + + +@pytest.fixture +def simple_scenario_dataarray(sample_timesteps, sample_scenario_index): + """Create a DataArray with both scenario and time dimensions.""" + data = np.array( + [ + [10, 20, 30, 40, 50], # baseline + [15, 25, 35, 45, 55], # high_demand + [5, 15, 25, 35, 45], # low_price + ] + ) + return xr.DataArray( + data=data, coords={'scenario': sample_scenario_index, 'time': sample_timesteps}, dims=['scenario', 'time'] + ) @pytest.fixture -def sample_collection(sample_timesteps): +def sample_scenario_timeseries(simple_scenario_dataarray): + """Create a sample TimeSeries object with scenario dimension.""" + return TimeSeries(simple_scenario_dataarray, name='Test Scenario Series') + + +@pytest.fixture +def sample_allocator(sample_timesteps): """Create a sample TimeSeriesCollection.""" return TimeSeriesCollection(sample_timesteps) @pytest.fixture -def populated_collection(sample_collection): - """Create a TimeSeriesCollection with test data.""" - # Add a constant time series - sample_collection.create_time_series(42, 'constant_series') - - # Add a varying time series - varying_data = np.array([10, 20, 30, 40, 50]) - sample_collection.create_time_series(varying_data, 'varying_series') - - # Add a time series with extra timestep - sample_collection.create_time_series( - np.array([1, 2, 3, 4, 5, 6]), 'extra_timestep_series', needs_extra_timestep=True - ) +def sample_scenario_allocator(sample_timesteps, sample_scenario_index): + """Create a sample TimeSeriesCollection with scenarios.""" + return TimeSeriesCollection(sample_timesteps, scenarios=sample_scenario_index) - # Add series with aggregation settings - sample_collection.create_time_series( - TimeSeriesData(np.array([5, 5, 5, 5, 5]), agg_group='group1'), 'group1_series1' - ) - sample_collection.create_time_series( - TimeSeriesData(np.array([6, 6, 6, 6, 6]), agg_group='group1'), 'group1_series2' - ) - sample_collection.create_time_series( - TimeSeriesData(np.array([10, 10, 10, 10, 10]), agg_weight=0.5), 'weighted_series' - ) - return sample_collection +class TestTimeSeriesWithScenarios: + """Test suite for TimeSeries class with scenarios.""" + + def test_initialization_with_scenarios(self, simple_scenario_dataarray): + """Test initialization of TimeSeries with scenario dimension.""" + ts = TimeSeries(simple_scenario_dataarray, name='Scenario Series') + + # Check basic properties + assert ts.name == 'Scenario Series' + assert ts.has_scenario_dim is True + assert ts._selected_scenarios is None # No selection initially + + # Check data initialization + assert isinstance(ts.stored_data, xr.DataArray) + assert ts.stored_data.equals(simple_scenario_dataarray) + assert ts.selected_data.equals(simple_scenario_dataarray) + + # Check backup was created + assert ts._backup.equals(simple_scenario_dataarray) + + def test_reset_with_scenarios(self, sample_scenario_timeseries, simple_scenario_dataarray): + """Test reset method with scenarios.""" + # Get original full indexes + full_timesteps = simple_scenario_dataarray.coords['time'] + full_scenarios = simple_scenario_dataarray.coords['scenario'] + + # Set to subset timesteps and scenarios + subset_timesteps = full_timesteps[1:3] + subset_scenarios = full_scenarios[:2] + + sample_scenario_timeseries.set_selection(timesteps=subset_timesteps, scenarios=subset_scenarios) + + # Verify subsets were set + 
assert sample_scenario_timeseries._selected_timesteps.equals(subset_timesteps) + assert sample_scenario_timeseries._selected_scenarios.equals(subset_scenarios) + assert sample_scenario_timeseries.selected_data.shape == (len(subset_scenarios), len(subset_timesteps)) + + # Reset + sample_scenario_timeseries.reset() + + # Should be back to full indexes + assert sample_scenario_timeseries._selected_timesteps is None + assert sample_scenario_timeseries._selected_scenarios is None + assert sample_scenario_timeseries.selected_data.shape == (len(full_scenarios), len(full_timesteps)) + + def test_scenario_selection(self, sample_scenario_timeseries, sample_scenario_index): + """Test scenario selection.""" + # Initial state should use all scenarios + assert sample_scenario_timeseries._selected_scenarios is None + + # Set to a subset + subset_index = sample_scenario_index[:2] # First two scenarios + sample_scenario_timeseries.set_selection(scenarios=subset_index) + assert sample_scenario_timeseries._selected_scenarios.equals(subset_index) + + # Active data should reflect the subset + assert sample_scenario_timeseries.selected_data.equals( + sample_scenario_timeseries.stored_data.sel(scenario=subset_index) + ) + + # Clear selection + sample_scenario_timeseries.set_selection() + assert sample_scenario_timeseries._selected_scenarios is None + + def test_all_equal_with_scenarios(self, sample_timesteps, sample_scenario_index): + """Test all_equal property with scenarios.""" + # All values equal across all scenarios + equal_data = np.full((3, 5), 5) # All values are 5 + equal_dataarray = xr.DataArray( + data=equal_data, + coords={'scenario': sample_scenario_index, 'time': sample_timesteps}, + dims=['scenario', 'time'], + ) + ts_equal = TimeSeries(equal_dataarray, 'Equal Scenario Series') + assert ts_equal.all_equal is True + + # Equal within each scenario but different between scenarios + per_scenario_equal = np.array( + [ + [5, 5, 5, 5, 5], # baseline - all 5 + [10, 10, 10, 10, 10], # high_demand - all 10 + [15, 15, 15, 15, 15], # low_price - all 15 + ] + ) + per_scenario_dataarray = xr.DataArray( + data=per_scenario_equal, + coords={'scenario': sample_scenario_index, 'time': sample_timesteps}, + dims=['scenario', 'time'], + ) + ts_per_scenario = TimeSeries(per_scenario_dataarray, 'Per-Scenario Equal Series') + assert ts_per_scenario.all_equal is False + + def test_arithmetic_with_scenarios(self, sample_scenario_timeseries, sample_timesteps, sample_scenario_index): + """Test arithmetic operations with scenarios.""" + # Create a second TimeSeries with scenarios + data2 = np.ones((3, 5)) # All ones + second_dataarray = xr.DataArray( + data=data2, coords={'scenario': sample_scenario_index, 'time': sample_timesteps}, dims=['scenario', 'time'] + ) + ts2 = TimeSeries(second_dataarray, 'Second Series') + + # Test operations between two scenario TimeSeries objects + result = sample_scenario_timeseries + ts2 + assert result.shape == (3, 5) + assert result.dims == ('scenario', 'time') + + # First scenario values should be increased by 1 + baseline_original = sample_scenario_timeseries.sel(scenario='baseline').values + baseline_result = result.sel(scenario='baseline').values + assert np.array_equal(baseline_result, baseline_original + 1) -class TestTimeSeriesCollection: - """Test suite for TimeSeriesCollection.""" +class TestTimeSeriesAllocator: + """Test suite for TimeSeriesCollection class.""" def test_initialization(self, sample_timesteps): """Test basic initialization.""" - collection = 
TimeSeriesCollection(sample_timesteps) + allocator = TimeSeriesCollection(sample_timesteps) - assert collection.all_timesteps.equals(sample_timesteps) - assert len(collection.all_timesteps_extra) == len(sample_timesteps) + 1 - assert isinstance(collection.all_hours_per_timestep, xr.DataArray) - assert len(collection) == 0 + assert allocator.timesteps.equals(sample_timesteps) + assert len(allocator.timesteps_extra) == len(sample_timesteps) + 1 + assert isinstance(allocator.hours_per_timestep, xr.DataArray) + assert len(allocator._time_series) == 0 def test_initialization_with_custom_hours(self, sample_timesteps): """Test initialization with custom hour settings.""" # Test with last timestep duration last_timestep_hours = 12 - collection = TimeSeriesCollection(sample_timesteps, hours_of_last_timestep=last_timestep_hours) + allocator = TimeSeriesCollection(sample_timesteps, hours_of_last_timestep=last_timestep_hours) # Verify the last timestep duration - extra_step_delta = collection.all_timesteps_extra[-1] - collection.all_timesteps_extra[-2] + extra_step_delta = allocator.timesteps_extra[-1] - allocator.timesteps_extra[-2] assert extra_step_delta == pd.Timedelta(hours=last_timestep_hours) # Test with previous timestep duration hours_per_step = 8 - collection2 = TimeSeriesCollection(sample_timesteps, hours_of_previous_timesteps=hours_per_step) + allocator2 = TimeSeriesCollection(sample_timesteps, hours_of_previous_timesteps=hours_per_step) - assert collection2.hours_of_previous_timesteps == hours_per_step + assert allocator2.hours_of_previous_timesteps == hours_per_step - def test_create_time_series(self, sample_collection): - """Test creating time series.""" + def test_add_time_series(self, sample_allocator, sample_timesteps): + """Test adding time series.""" # Test scalar - ts1 = sample_collection.create_time_series(42, 'scalar_series') + ts1 = sample_allocator.add_time_series('scalar_series', 42) assert ts1.name == 'scalar_series' - assert np.all(ts1.active_data.values == 42) + assert np.all(ts1.selected_data.values == 42) # Test numpy array data = np.array([1, 2, 3, 4, 5]) - ts2 = sample_collection.create_time_series(data, 'array_series') - assert np.array_equal(ts2.active_data.values, data) + ts2 = sample_allocator.add_time_series('array_series', data) + assert np.array_equal(ts2.selected_data.values, data) - # Test with TimeSeriesData - ts3 = sample_collection.create_time_series(TimeSeriesData(10, agg_weight=0.7), 'weighted_series') - assert ts3.aggregation_weight == 0.7 + # Test with existing TimeSeries + existing_ts = TimeSeries.from_datasource(10, 'original_name', sample_timesteps, aggregation_weight=0.7) + ts3 = sample_allocator.add_time_series('weighted_series', existing_ts) + assert ts3.name == 'weighted_series' # Name changed + assert ts3.aggregation_weight == 0.7 # Weight preserved # Test with extra timestep - ts4 = sample_collection.create_time_series(5, 'extra_series', needs_extra_timestep=True) - assert ts4.needs_extra_timestep - assert len(ts4.active_data) == len(sample_collection.timesteps_extra) + ts4 = sample_allocator.add_time_series('extra_series', 5, has_extra_timestep=True) + assert ts4.name == 'extra_series' + assert ts4.has_extra_timestep + assert len(ts4.selected_data) == len(sample_allocator.timesteps_extra) # Test duplicate name - with pytest.raises(ValueError, match='already exists'): - sample_collection.create_time_series(1, 'scalar_series') + with pytest.raises(KeyError, match='already exists'): + sample_allocator.add_time_series('scalar_series', 1) - def 
test_access_time_series(self, populated_collection): + def test_access_time_series(self, sample_allocator): """Test accessing time series.""" + # Add a few time series + sample_allocator.add_time_series('series1', 42) + sample_allocator.add_time_series('series2', np.array([1, 2, 3, 4, 5])) + # Test __getitem__ - ts = populated_collection['varying_series'] - assert ts.name == 'varying_series' + ts = sample_allocator['series1'] + assert ts.name == 'series1' # Test __contains__ with string - assert 'constant_series' in populated_collection - assert 'nonexistent_series' not in populated_collection + assert 'series1' in sample_allocator + assert 'nonexistent_series' not in sample_allocator # Test __contains__ with TimeSeries object - assert populated_collection['varying_series'] in populated_collection - - # Test __iter__ - names = [ts.name for ts in populated_collection] - assert len(names) == 6 - assert 'varying_series' in names + assert sample_allocator['series2'] in sample_allocator # Test access to non-existent series - with pytest.raises(KeyError): - populated_collection['nonexistent_series'] - - def test_constants_and_non_constants(self, populated_collection): - """Test constants and non_constants properties.""" - # Test constants - constants = populated_collection.constants - assert len(constants) == 4 # constant_series, group1_series1, group1_series2, weighted_series - assert all(ts.all_equal for ts in constants) - - # Test non_constants - non_constants = populated_collection.non_constants - assert len(non_constants) == 2 # varying_series, extra_timestep_series - assert all(not ts.all_equal for ts in non_constants) - - # Test modifying a series changes the results - populated_collection['constant_series'].stored_data = np.array([1, 2, 3, 4, 5]) - updated_constants = populated_collection.constants - assert len(updated_constants) == 3 # One less constant - assert 'constant_series' not in [ts.name for ts in updated_constants] - - def test_timesteps_properties(self, populated_collection, sample_timesteps): - """Test timestep-related properties.""" - # Test default (all) timesteps - assert populated_collection.timesteps.equals(sample_timesteps) - assert len(populated_collection.timesteps_extra) == len(sample_timesteps) + 1 - - # Test activating a subset - subset = sample_timesteps[1:3] - populated_collection.activate_timesteps(subset) - - assert populated_collection.timesteps.equals(subset) - assert len(populated_collection.timesteps_extra) == len(subset) + 1 - - # Check that time series were updated - assert populated_collection['varying_series'].active_timesteps.equals(subset) - assert populated_collection['extra_timestep_series'].active_timesteps.equals( - populated_collection.timesteps_extra - ) - - # Test reset - populated_collection.reset() - assert populated_collection.timesteps.equals(sample_timesteps) + with pytest.raises(ValueError): + sample_allocator['nonexistent_series'] - def test_to_dataframe_and_dataset(self, populated_collection): - """Test conversion to DataFrame and Dataset.""" - # Test to_dataset - ds = populated_collection.to_dataset() - assert isinstance(ds, xr.Dataset) - assert len(ds.data_vars) == 6 + def test_selection_propagation(self, sample_allocator, sample_timesteps): + """Test that selections propagate to TimeSeries.""" + # Add a few time series + ts1 = sample_allocator.add_time_series('series1', 42) + ts2 = sample_allocator.add_time_series('series2', np.array([1, 2, 3, 4, 5])) + ts3 = sample_allocator.add_time_series('series3', 5, has_extra_timestep=True) - # 
Test to_dataframe with different filters - df_all = populated_collection.to_dataframe(filtered='all') - assert len(df_all.columns) == 6 + # Initially no selections + assert ts1._selected_timesteps is None + assert ts2._selected_timesteps is None + assert ts3._selected_timesteps is None - df_constant = populated_collection.to_dataframe(filtered='constant') - assert len(df_constant.columns) == 4 + # Apply selection + subset_timesteps = sample_timesteps[1:3] + sample_allocator.set_selection(timesteps=subset_timesteps) - df_non_constant = populated_collection.to_dataframe(filtered='non_constant') - assert len(df_non_constant.columns) == 2 + # Check selection propagated to regular time series + assert ts1._selected_timesteps.equals(subset_timesteps) + assert ts2._selected_timesteps.equals(subset_timesteps) - # Test invalid filter - with pytest.raises(ValueError): - populated_collection.to_dataframe(filtered='invalid') - - def test_calculate_aggregation_weights(self, populated_collection): - """Test aggregation weight calculation.""" - weights = populated_collection.calculate_aggregation_weights() - - # Group weights should be 0.5 each (1/2) - assert populated_collection.group_weights['group1'] == 0.5 - - # Series in group1 should have weight 0.5 - assert weights['group1_series1'] == 0.5 - assert weights['group1_series2'] == 0.5 - - # Series with explicit weight should have that weight - assert weights['weighted_series'] == 0.5 - - # Series without group or weight should have weight 1 - assert weights['constant_series'] == 1 - - def test_insert_new_data(self, populated_collection, sample_timesteps): - """Test inserting new data.""" - # Create new data - new_data = pd.DataFrame( - { - 'constant_series': [100, 100, 100, 100, 100], - 'varying_series': [5, 10, 15, 20, 25], - # extra_timestep_series is omitted to test partial updates - }, - index=sample_timesteps, - ) + # Check selection with extra timestep + assert ts3._selected_timesteps is not None + assert len(ts3._selected_timesteps) == len(subset_timesteps) + 1 - # Insert data - populated_collection.insert_new_data(new_data) - - # Verify updates - assert np.all(populated_collection['constant_series'].active_data.values == 100) - assert np.array_equal(populated_collection['varying_series'].active_data.values, np.array([5, 10, 15, 20, 25])) - - # Series not in the DataFrame should be unchanged - assert np.array_equal( - populated_collection['extra_timestep_series'].active_data.values[:-1], np.array([1, 2, 3, 4, 5]) - ) + # Clear selection + sample_allocator.set_selection() - # Test with mismatched index - bad_index = pd.date_range('2023-02-01', periods=5, freq='D', name='time') - bad_data = pd.DataFrame({'constant_series': [1, 1, 1, 1, 1]}, index=bad_index) - - with pytest.raises(ValueError, match='must match collection timesteps'): - populated_collection.insert_new_data(bad_data) - - def test_restore_data(self, populated_collection): - """Test restoring original data.""" - # Capture original data - original_values = {name: ts.stored_data.copy() for name, ts in populated_collection.time_series_data.items()} - - # Modify data - new_data = pd.DataFrame( - { - name: np.ones(len(populated_collection.timesteps)) * 999 - for name in populated_collection.time_series_data - if not populated_collection[name].needs_extra_timestep - }, - index=populated_collection.timesteps, - ) + # Check selection cleared + assert ts1._selected_timesteps is None + assert ts2._selected_timesteps is None + assert ts3._selected_timesteps is None - 
populated_collection.insert_new_data(new_data) + def test_update_time_series(self, sample_allocator): + """Test updating a time series.""" + # Add a time series + ts = sample_allocator.add_time_series('series', 42) - # Verify data was changed - assert np.all(populated_collection['constant_series'].active_data.values == 999) + # Update it + sample_allocator.update_time_series('series', np.array([1, 2, 3, 4, 5])) - # Restore data - populated_collection.restore_data() + # Check update was applied + assert np.array_equal(ts.selected_data.values, np.array([1, 2, 3, 4, 5])) - # Verify data was restored - for name, original in original_values.items(): - restored = populated_collection[name].stored_data - assert np.array_equal(restored.values, original.values) + # Test updating non-existent series + with pytest.raises(KeyError): + sample_allocator.update_time_series('nonexistent', 42) - def test_class_method_with_uniform_timesteps(self): - """Test the with_uniform_timesteps class method.""" - collection = TimeSeriesCollection.with_uniform_timesteps( - start_time=pd.Timestamp('2023-01-01'), periods=24, freq='h', hours_per_step=1 - ) + def test_as_dataset(self, sample_allocator): + """Test as_dataset method.""" + # Add some time series + sample_allocator.add_time_series('series1', 42) + sample_allocator.add_time_series('series2', np.array([1, 2, 3, 4, 5])) - assert len(collection.timesteps) == 24 - assert collection.hours_of_previous_timesteps == 1 - assert (collection.timesteps[1] - collection.timesteps[0]) == pd.Timedelta(hours=1) + # Get dataset + ds = sample_allocator.as_dataset(with_extra_timestep=False) - def test_hours_per_timestep(self, populated_collection): - """Test hours_per_timestep calculation.""" - # Standard case - uniform timesteps - hours = populated_collection.hours_per_timestep.values - assert np.allclose(hours, 24) # Default is daily timesteps + # Check dataset contents + assert isinstance(ds, xr.Dataset) + assert 'series1' in ds + assert 'series2' in ds + assert np.all(ds['series1'].values == 42) + assert np.array_equal(ds['series2'].values, np.array([1, 2, 3, 4, 5])) - # Create non-uniform timesteps - non_uniform_times = pd.DatetimeIndex( - [ - pd.Timestamp('2023-01-01'), - pd.Timestamp('2023-01-02'), - pd.Timestamp('2023-01-03 12:00:00'), # 1.5 days from previous - pd.Timestamp('2023-01-04'), # 0.5 days from previous - pd.Timestamp('2023-01-06'), # 2 days from previous - ], - name='time', - ) - collection = TimeSeriesCollection(non_uniform_times) - hours = collection.hours_per_timestep.values +class TestTimeSeriesAllocatorWithScenarios: + """Test suite for TimeSeriesCollection with scenarios.""" - # Expected hours between timestamps - expected = np.array([24, 36, 12, 48, 48]) - assert np.allclose(hours, expected) + def test_initialization_with_scenarios(self, sample_timesteps, sample_scenario_index): + """Test initialization with scenarios.""" + allocator = TimeSeriesCollection(sample_timesteps, scenarios=sample_scenario_index) - def test_validation_and_errors(self, sample_timesteps): - """Test validation and error handling.""" - # Test non-DatetimeIndex - with pytest.raises(TypeError, match='must be a pandas DatetimeIndex'): - TimeSeriesCollection(pd.Index([1, 2, 3, 4, 5])) + assert allocator.timesteps.equals(sample_timesteps) + assert allocator.scenarios.equals(sample_scenario_index) + assert len(allocator._time_series) == 0 - # Test too few timesteps - with pytest.raises(ValueError, match='must contain at least 2 timestamps'): - 
TimeSeriesCollection(pd.DatetimeIndex([pd.Timestamp('2023-01-01')], name='time'))
+    def test_add_time_series_with_scenarios(self, sample_scenario_allocator):
+        """Test creating time series with scenarios."""
+        # Test scalar (broadcasts to all scenarios)
+        ts1 = sample_scenario_allocator.add_time_series('scalar_series', 42)
+        assert ts1.has_scenario_dim
+        assert ts1.name == 'scalar_series'
+        assert ts1.selected_data.shape == (5, 3)  # 5 timesteps, 3 scenarios
+        assert np.all(ts1.selected_data.values == 42)
-        # Test invalid active_timesteps
-        collection = TimeSeriesCollection(sample_timesteps)
-        invalid_timesteps = pd.date_range('2024-01-01', periods=3, freq='D', name='time')
+        # Test 1D array (broadcasts to all scenarios)
+        data = np.array([1, 2, 3, 4, 5])
+        ts2 = sample_scenario_allocator.add_time_series('array_series', data)
+        assert ts2.has_scenario_dim
+        assert ts2.selected_data.shape == (5, 3)
+        # Each scenario should have the same values
+        for scenario in sample_scenario_allocator.scenarios:
+            assert np.array_equal(ts2.sel(scenario=scenario).values, data)
+
+        # Test 2D array (one column per scenario)
+        data_2d = np.array([[10, 20, 30, 40, 50], [15, 25, 35, 45, 55], [5, 15, 25, 35, 45]]).T
+        ts3 = sample_scenario_allocator.add_time_series('scenario_specific_series', data_2d)
+        assert ts3.has_scenario_dim
+        assert ts3.selected_data.shape == (5, 3)
+        # Each scenario should have its own values
+        assert np.array_equal(ts3.sel(scenario='baseline').values, data_2d[:, 0])
+        assert np.array_equal(ts3.sel(scenario='high_demand').values, data_2d[:, 1])
+        assert np.array_equal(ts3.sel(scenario='low_price').values, data_2d[:, 2])
+
+    def test_selection_propagation_with_scenarios(
+        self, sample_scenario_allocator, sample_timesteps, sample_scenario_index
+    ):
+        """Test scenario selection propagation."""
+        # Add some time series
+        ts1 = sample_scenario_allocator.add_time_series('series1', 42)
+        ts2 = sample_scenario_allocator.add_time_series('series2', np.array([1, 2, 3, 4, 5]))
+
+        # Initial state - no selections
+        assert ts1._selected_scenarios is None
+        assert ts2._selected_scenarios is None
+
+        # Select scenarios
+        subset_scenarios = sample_scenario_index[:2]
+        sample_scenario_allocator.set_selection(scenarios=subset_scenarios)
+
+        # Check selections propagated
+        assert ts1._selected_scenarios.equals(subset_scenarios)
+        assert ts2._selected_scenarios.equals(subset_scenarios)
+
+        # Check data is filtered
+        assert ts1.selected_data.shape == (5, 2)  # 5 timesteps, 2 scenarios
+        assert ts2.selected_data.shape == (5, 2)
+
+        # Apply combined selection
+        subset_timesteps = sample_timesteps[1:3]
+        sample_scenario_allocator.set_selection(timesteps=subset_timesteps, scenarios=subset_scenarios)
+
+        # Check combined selection applied
+        assert ts1._selected_timesteps.equals(subset_timesteps)
+        assert ts1._selected_scenarios.equals(subset_scenarios)
+        assert ts1.selected_data.shape == (2, 2)  # 2 timesteps, 2 scenarios
+
+        # Clear selections
+        sample_scenario_allocator.set_selection()
+        assert ts1._selected_timesteps is None
+        assert ts1.selected_timesteps.equals(sample_scenario_allocator.timesteps)
+        assert ts1._selected_scenarios is None
+        assert ts1.selected_scenarios.equals(sample_scenario_allocator.scenarios)
+        assert ts1.selected_data.shape == (5, 3)  # Back to full shape
+
+    def test_as_dataset_with_scenarios(self, sample_scenario_allocator):
+        """Test as_dataset method with scenarios."""
+        # Add some time series
+        sample_scenario_allocator.add_time_series('scalar_series', 42)
+
sample_scenario_allocator.add_time_series( + 'varying_series', np.array([[10, 20, 30, 40, 50], [15, 25, 35, 45, 55], [5, 15, 25, 35, 45]]).T + ) - with pytest.raises(ValueError, match='must be a subset'): - collection.activate_timesteps(invalid_timesteps) + # Get dataset + ds = sample_scenario_allocator.as_dataset(with_extra_timestep=False) + + # Check dataset dimensions + assert 'scenario' in ds.dims + assert 'time' in ds.dims + assert ds.dims['scenario'] == 3 + assert ds.dims['time'] == 5 + + # Check dataset variables + assert 'scalar_series' in ds + assert 'varying_series' in ds + + # Check values + assert np.all(ds['scalar_series'].values == 42) + baseline_values = ds['varying_series'].sel(scenario='baseline').values + assert np.array_equal(baseline_values, np.array([10, 20, 30, 40, 50])) + + def test_contains_and_iteration(self, sample_scenario_allocator): + """Test __contains__ and __iter__ methods.""" + # Add some time series + ts1 = sample_scenario_allocator.add_time_series('series1', 42) + sample_scenario_allocator.add_time_series('series2', 10) + + # Test __contains__ + assert 'series1' in sample_scenario_allocator + assert ts1 in sample_scenario_allocator + assert 'nonexistent' not in sample_scenario_allocator + + # Test behavior with invalid type + with pytest.raises(TypeError): + assert 42 in sample_scenario_allocator + + def test_update_time_series_with_scenarios(self, sample_scenario_allocator, sample_scenario_index): + """Test updating a time series with scenarios.""" + # Add a time series + ts = sample_scenario_allocator.add_time_series('series', 42) + assert ts.has_scenario_dim + assert np.all(ts.selected_data.values == 42) + + # Update with scenario-specific data + new_data = np.array([[1, 2, 3, 4, 5], [6, 7, 8, 9, 10], [11, 12, 13, 14, 15]]).T + sample_scenario_allocator.update_time_series('series', new_data) + + # Check update was applied + assert np.array_equal(ts.selected_data.values, new_data) + assert ts.has_scenario_dim + + # Check scenario-specific values + assert np.array_equal(ts.sel(scenario='baseline').values, new_data[:,0]) + assert np.array_equal(ts.sel(scenario='high_demand').values, new_data[:,1]) + assert np.array_equal(ts.sel(scenario='low_price').values, new_data[:,2]) + + +if __name__ == '__main__': + pytest.main()
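
For reference, a minimal sketch of the path-combination rule that the conversion tests above encode (inferred from the test expectations, not necessarily the library's actual implementation): every (origin, target) pair receives the sum, over all directed paths between them, of the product of the edge factors along each path. For an acyclic conversion graph, a depth-first walk is one straightforward way to realize this:

import xarray as xr

def calculate_all_conversion_paths_sketch(conversion_dict: dict) -> dict:
    """Sum the factor products of every directed path in an acyclic conversion graph."""
    results = {}

    def walk(origin, node, factor):
        # Extend the current path along every outgoing edge of `node`,
        # accumulating the product of the edge factors seen so far.
        for target, edge_factor in conversion_dict.get(node, {}).items():
            path_factor = factor * edge_factor
            if (origin, target) in results:
                results[(origin, target)] = results[(origin, target)] + path_factor
            else:
                results[(origin, target)] = path_factor
            walk(origin, target, path_factor)

    for origin in conversion_dict:
        walk(origin, origin, xr.DataArray(1.0))
    return results

# Example, mirroring test_effect_shares_example above:
# {'Costs': {'Effect1': xr.DataArray(0.5)}, 'Effect1': {'Effect2': xr.DataArray(1.1)}}
# yields ('Costs', 'Effect2') -> 0.5 * 1.1 alongside the two direct pairs.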