Add transform accessor and move sel, isel and resample to this accessor (#510)

FBumann · web-flow · commit dedfeb89ffbf · 2025-12-08T14:36:17.000+01:00
* Add planning doc * Finalize planning * Add plotting acessor * Add plotting acessor * Add tests * Improve * Improve * Update docs * Updated the plotting API so that .data always returns xarray DataArray or Dataset instead of pandas DataFrame. * All .data now returns xr.Dataset consistently. * Fixed Inconsistencies and Unused Parameters * New Plot Accessors results.plot.variable(pattern) Plots the same variable type across multiple elements for easy comparison. # All binary operation states across all components results.plot.variable('on') # All flow_rates, filtered to Boiler-related elements results.plot.variable('flow_rate', include='Boiler') # All storage charge states results.plot.variable('charge_state') # With aggregation results.plot.variable('flow_rate', aggregate='sum') Key features: - Searches all elements for variables matching the pattern - Filter with include/exclude on element names - Supports aggregation, faceting, and animation - Returns Dataset with element names as variables results.plot.duration_curve(variables) Plots load duration curves (sorted time series) showing utilization patterns. # Single variable results.plot.duration_curve('Boiler(Q_th)|flow_rate') # Multiple variables results.plot.duration_curve(['CHP|on', 'Boiler|on']) # Normalized x-axis (0-100%) results.plot.duration_curve('demand', normalize=True) Key features: - Sorts values from highest to lowest - Shows how often each power level is reached - normalize=True shows percentage of time on x-axis - Returns Dataset with duration_hours or duration_pct dimension * Fix duration curve * Fix duration curve * Fix duration curve * Fix duration curve * xr.apply_ufunc to sort along the time axis while preserving all other dimensions automatically * ⏺ The example runs successfully. Now let me summarize the fixes: Summary of Fixes I addressed the actionable code review comments from CodeRabbitAI: 1. Documentation Issue - reshape parameter name ✓ (No fix needed) The CodeRabbitAI comment was incorrect. The public API parameter in PlotAccessor.heatmap() is correctly named reshape (line 335). The reshape_time parameter exists in the lower-level heatmap_with_plotly function, but the documentation correctly shows the public API parameter. 2. Fixed simple_example.py (lines 129-130) Problem: The example called balance() and duration_curve() without required arguments, which would cause TypeError at runtime. Fix: Added the required arguments: - optimization.results.plot.balance('Fernwärme') - specifying the bus to plot - optimization.results.plot.duration_curve('Boiler(Q_th)|flow_rate') - specifying the variable to plot 3. Fixed variable collision in plot_accessors.py (lines 985-988) Problem: When building the Dataset in the variable() method, using element names as keys could cause collisions if multiple variables from the same element matched the pattern (e.g., 'Boiler|flow_rate' and 'Boiler|flow_rate_max' would both map to 'Boiler', with the latter silently overwriting the former). Fix: Changed to use the full variable names as keys instead of just element names: ds = xr.Dataset({var_name: self._results.solution[var_name] for var_name in filtered_vars}) All tests pass (40 passed, 1 skipped) and the example runs successfully. * make variable names public in results * Fix sankey * Fix effects() * Fix effects * Remove bargaps * made faceting consistent across all plot methods: | Method | facet_col | facet_row | |------------------|-------------------------------------------|-----------------------------| | balance() | 'scenario' | 'period' | | heatmap() | 'scenario' | 'period' | | storage() | 'scenario' | 'period' | | flows() | 'scenario' | 'period' | | effects() | 'scenario' | 'period' | | variable() | 'scenario' | 'period' | | duration_curve() | 'scenario' | 'period' (already had this) | | compare() | N/A (uses its own mode='overlay'/'facet') | | | sankey() | N/A (aggregates to single diagram) | | Removed animate_by parameter from all methods * Update storage method * Remove mode parameter for simpli | Method | Default mode | |------------------|---------------------------------------------------| | balance() | stacked_bar | | storage() | stacked_bar (flows) + line (charge state overlay) | | flows() | line | | variable() | line | | duration_curve() | line | | effects() | bar | * Make plotting_accessors.py more self contained * Use faster to_long() * Add 0-dim case * sankey diagram now properly handles scenarios and periods: Changes made: 1. Period aggregation with weights: Uses period_weights from flow_system to properly weight periods by their duration 2. Scenario aggregation with weights: Uses normalized scenario_weights to compute a weighted average across scenarios 3. Selection support: Users can filter specific scenarios/periods via select parameter before aggregation Weighting logic: - Periods (for aggregate='sum'): (da * period_weights).sum(dim='period') - this gives the total energy across all periods weighted by their duration - Periods (for aggregate='mean'): (da * period_weights).sum(dim='period') / period_weights.sum() - weighted mean - Scenarios: Always uses normalized weights (sum to 1) for weighted averaging, since scenarios represent probability-weighted alternatives * Add colors to sankey * Add sizes * Add size filtering * Include storage sizes * Remove storage sizes * Add charge state and status accessor * Summary of Changes 1. Added new methods to PlotAccessor (plot_accessors.py) charge_states() (line 658): - Returns a Dataset with each storage's charge state as a variable - Supports filtering with include/exclude parameters - Default plot: line chart on_states() (line 753): - Returns a Dataset with each component's |status variable - Supports filtering with include/exclude parameters - Default plot: heatmap (good for binary data visualization) 2. Added data building helper functions (plot_accessors.py) build_flow_rates(results) (line 315): - Builds a DataArray containing flow rates for all flows - Used internally by PlotAccessor methods build_flow_hours(results) (line 333): - Builds a DataArray containing flow hours for all flows build_sizes(results) (line 347): - Builds a DataArray containing sizes for all flows _filter_dataarray_by_coord(da, **kwargs) (line 284): - Helper for filtering DataArrays by coordinate values 3. Deprecated old Results methods (results.py) The following methods now emit DeprecationWarning: - results.flow_rates() → Use results.plot.flows(plot=False).data - results.flow_hours() → Use results.plot.flows(unit='flow_hours', plot=False).data - results.sizes() → Use results.plot.sizes(plot=False).data 4. Updated PlotAccessor methods to use new helpers - flows() now uses build_flow_rates() / build_flow_hours() directly - sizes() now uses build_sizes() directly - sankey() now uses build_flow_hours() directly This ensures the deprecation warnings only fire when users directly call the old methods, not when using the plot accessor * 1. New methods added to PlotAccessor - charge_states(): Returns Dataset with all storage charge states - on_states(): Returns Dataset with all component status variables (heatmap display) 2. Data building helper functions (plot_accessors.py) - build_flow_rates(results): Builds DataArray of flow rates - build_flow_hours(results): Builds DataArray of flow hours - build_sizes(results): Builds DataArray of sizes - _filter_dataarray_by_coord(da, **kwargs): Filter helper - _assign_flow_coords(da, results): Add flow coordinates 3. Caching in PlotAccessor Added lazy-cached properties for expensive computations: - _all_flow_rates - cached DataArray of all flow rates - _all_flow_hours - cached DataArray of all flow hours - _all_sizes - cached DataArray of all sizes - _all_charge_states - cached Dataset of all storage charge states - _all_status_vars - cached Dataset of all status variables 4. Deprecated methods in Results class Added deprecation warnings to: - results.flow_rates() → Use results.plot.flows(plot=False).data - results.flow_hours() → Use results.plot.flows(unit='flow_hours', plot=False).data - results.sizes() → Use results.plot.sizes(plot=False).data 5. Updated PlotAccessor methods to use cached properties - flows() uses _all_flow_rates / _all_flow_hours - sankey() uses _all_flow_hours - sizes() uses _all_sizes - charge_states() uses _all_charge_states - on_states() uses _all_status_vars * Move deprectated functionality into results.py instead of porting to the new module * Revert to simply deprectae old methods without forwarding to new code * Remove planning file * Update plotting methods for new datasets * 1. Renamed data properties in PlotAccessor to use all_ prefix: - all_flow_rates - All flow rates as Dataset - all_flow_hours - All flow hours as Dataset - all_sizes - All flow sizes as Dataset - all_charge_states - All storage charge states as Dataset - all_on_states - All component on/off status as Dataset 2. Updated internal references - All usages in flows(), sankey(), sizes(), charge_states(), and on_states() methods now use the new names. 3. Updated deprecation messages in results.py to point to the new API: - results.flow_rates() → results.plot.all_flow_rates - results.flow_hours() → results.plot.all_flow_hours - results.sizes() → results.plot.all_sizes 4. Updated docstring examples in PlotAccessor to use the new all_* names. * Update deprecations messages * Update deprecations messages * Thsi seems much better. * Updaet docstrings and variable name generation in plotting acessor * Change __ to _ in private dataset caching * Revert breaking io changes * New solution storing interface * Add new focused statistics and plot accessors * Renamed all properties: - all_flow_rates → flow_rates - all_flow_hours → flow_hours - all_sizes → sizes - all_charge_states → charge_states * Cache Statistics * Invalidate caches * Add effect related statistics * Simplify statistics accessor to rely on flow_system directly instead of solution attrs * Fix heatma fallback for 1D Data * Add topology accessor * All deprecation warnings in the codebase now consistently use the format will be removed in v{DEPRECATION_REMOVAL_VERSION}. * Update tests * created comprehensive documentation for all FlowSystem accessors * Update results documentation * Update results documentation * Update effect statistics * Update effect statistics * Update effect statistics * Add mkdocs plotly plugin * Add section about custom constraints * documentation updates: docs/user-guide/results/index.md: - Updated table to replace effects_per_component with temporal_effects, periodic_effects, total_effects, and effect_share_factors - Fixed flow_hours['Boiler(Q_th)|flow_rate'] → flow_hours['Boiler(Q_th)'] - Fixed sizes['Boiler(Q_th)|size'] → sizes['Boiler(Q_th)'] - Replaced effects_per_component example with new effect properties and groupby examples - Updated complete example to use total_effects docs/user-guide/results-plotting.md: - Fixed colors example from 'Boiler(Q_th)|flow_rate' → 'Boiler(Q_th)' - Fixed duration_curve examples to use clean labels docs/user-guide/migration-guide-v6.md: - Added new "Statistics Accessor" section explaining the clean labels and new effect properties * implemented the effects() method in StatisticsPlotAccessor at flixopt/statistics_accessor.py:1132-1258. Summary of what was done: 1. Implemented effects() method in StatisticsPlotAccessor class that was missing but documented - Takes aspect parameter: 'total', 'temporal', or 'periodic' - Takes effect parameter to filter to a specific effect (e.g., 'costs', 'CO2') - Takes by parameter: 'component' or 'time' for grouping - Supports all standard plotting parameters: select, colors, facet_col, facet_row, show - Returns PlotResult with both data and figure 2. Verified the implementation works with all parameter combinations: - Default call: flow_system.statistics.plot.effects() - Specific effect: flow_system.statistics.plot.effects(effect='costs') - Temporal aspect: flow_system.statistics.plot.effects(aspect='temporal') - Temporal by time: flow_system.statistics.plot.effects(aspect='temporal', by='time') - Periodic aspect: flow_system.statistics.plot.effects(aspect='periodic') * Remove intermediate plot accessor * 1. pyproject.toml: Removed duplicate mkdocs-plotly-plugin>=0.1.3 entry (kept the exact pin ==0.1.3) 2. flixopt/plotting.py: Fixed dimension name consistency by using squeezed_data.name instead of data.name in the fallback heatmap logic 3. flixopt/statistics_accessor.py: - Fixed _dataset_to_long_df() to only use coordinates that are actually present as columns after reset_index() - Fixed the nested loop inefficiency with include_flows by pre-computing the flows list outside the loop - (Previously fixed) Fixed asymmetric NaN handling in validation check * _create_effects_dataset method in statistics_accessor.py was simplified: 1. Detect contributors from solution data variables instead of assuming they're only flows - Uses regex pattern to find {contributor}->{effect}(temporal|periodic) variables - Contributors can be flows OR components (e.g., components with effects_per_active_hour) 2. Exclude effect-to-effect shares - Filters out contributors whose base name matches any effect label - For example, costs(temporal) is excluded because costs is an effect label - These intermediate shares are already included in the computation 3. Removed the unused _compute_effect_total method - The new simplified implementation directly looks up shares from the solution - Uses effect_share_factors for conversion between effects 4. Key insight from user: The solution already contains properly computed share values including all effect-to-effect conversions. The computation uses conversion factors because derived effects (like Effect1 which shares 0.5 from costs) don't have direct {flow}->Effect1(temporal) variables - only the source effect shares exist ({flow}->costs(temporal)). * Update docs * Improve to_netcdf method * Update examples * Fix IIS computaion flag * Fix examples * Fix faceting in heatmap and use period as facet col everywhere * Inline plotting methods to deprecate plotting.py * Fix test * Simplify Color Management * ColorType is now defined in color_processing.py and imported into statistics_accessor.py. * Fix ColorType typing * Add color accessor * Ensure io * Add carrier class * implemented Carrier as a proper Interface subclass with container support. Here's what was done: 1. Carrier class (flixopt/carrier.py) - Now inherits from Interface for serialization capabilities - Has transform_data() method (no-op since carriers have no time-series data) - Has label property for container keying - Maintains equality comparison with both Carrier objects and strings 2. CarrierContainer class (flixopt/carrier.py) - Inherits from ContainerMixin['Carrier'] - Provides dict-like access with nice repr and error messages - Uses carrier.name for keying 3. FlowSystem updates (flixopt/flow_system.py) - _carriers is now a CarrierContainer instead of a plain dict - carriers property returns the CarrierContainer - add_carrier() uses the container's add() method - Serialization updated to include carriers in to_dataset() and restore them in from_dataset() 4. Exports (flixopt/__init__.py) - Both Carrier and CarrierContainer are now exported * Inline plotting methods to deprecate plotting.py (#508) * Inline plotting methods to deprecate plotting.py * Fix test * Simplify Color Management * ColorType is now defined in color_processing.py and imported into statistics_accessor.py. * Fix ColorType typing * statistics_accessor.py - Heatmap colors type safety (lines 121-148, 820-853) - Changed _heatmap_figure() parameter type from colors: ColorType = None to colors: str | list[str] | None = None - Changed heatmap() method parameter type similarly - Updated docstrings to clarify that dicts are not supported for heatmaps since px.imshow's color_continuous_scale only accepts colorscale names or lists 2. statistics_accessor.py - Use configured qualitative colorscale (lines 284, 315) - Updated _create_stacked_bar() to use CONFIG.Plotting.default_qualitative_colorscale as the default colorscale - Updated _create_line() similarly - This ensures user-configured CONFIG.Plotting.default_qualitative_colorscale affects all bar/line plots consistently 3. topology_accessor.py - Path type alignment (lines 219-222) - Added normalization of path=False to None before calling _plot_network() - This resolves the type mismatch where TopologyAccessor.plot() accepts bool | str | Path but _plot_network() only accepts str | Path | None * fix usage if index name in aggregation plot * Add to docs * Improve carrier colors and defaults * Update default carriers and colors * Update config * Update config * Move default carriers to config.py * Change default carrier handling * Add color handling * Rmeove meta_data color handling * Add carrierst to examples * Improve plotting acessor * Improve _resolve_variable_names * Improve _resolve_variable_names * Simplify coloring and remove color accessor * Add connected_and_transformed handling * Improve error message in container * Methods moved to TransformAccessor (transform_accessor.py): - sel() - select by label - isel() - select by integer index - resample() - resample time dimension - Helper methods: _dataset_sel, _dataset_isel, _dataset_resample, _resample_by_dimension_groups 2. Solution is dropped: All transform methods return a new FlowSystem with no solution - the user must re-optimize the transformed system. 3. Deprecation warnings: The old flow_system.sel(), flow_system.isel(), and flow_system.resample() methods now emit deprecation warnings and forward to the new TransformAccessor methods. 4. Backward compatible: Existing code still works, just with deprecation warnings. * Documentation updated * Re-add _dataset_sel and other helper methods for proper deprectation. ALso fix new methods to be classmethods * BUGFIX: Carrier from dataset * Update docs
diff --git a/docs/user-guide/core-concepts.md b/docs/user-guide/core-concepts.md
@@ -180,6 +180,55 @@ flow_system.statistics.plot.balance('HeatBus')
 | **Effect** | Metric to track/optimize | Costs, emissions, energy use |
 | **FlowSystem** | Complete model | Your entire system |
 
+## FlowSystem API at a Glance
+
+The `FlowSystem` is the central object in flixOpt. After building your model, all operations are accessed through the FlowSystem and its **accessors**:
+
+```python
+flow_system = fx.FlowSystem(timesteps)
+flow_system.add_elements(...)
+
+# Optimize
+flow_system.optimize(solver)
+
+# Access results
+flow_system.solution                    # Raw xarray Dataset
+flow_system.statistics.flow_hours       # Aggregated statistics
+flow_system.statistics.plot.balance()   # Visualization
+
+# Transform (returns new FlowSystem)
+fs_subset = flow_system.transform.sel(time=slice(...))
+
+# Inspect structure
+flow_system.topology.plot()
+```
+
+### Accessor Overview
+
+| Accessor | Purpose | Key Methods |
+|----------|---------|-------------|
+| **`solution`** | Raw optimization results | xarray Dataset with all variables |
+| **`statistics`** | Aggregated data | `flow_rates`, `flow_hours`, `sizes`, `charge_states`, `total_effects` |
+| **`statistics.plot`** | Visualization | `balance()`, `heatmap()`, `sankey()`, `effects()`, `storage()` |
+| **`transform`** | Create modified copies | `sel()`, `isel()`, `resample()`, `cluster()` |
+| **`topology`** | Network structure | `plot()`, `start_app()`, `infos()` |
+
+### Element Access
+
+Access elements directly from the FlowSystem:
+
+```python
+# Access by label
+flow_system.components['Boiler']        # Get a component
+flow_system.buses['Heat']               # Get a bus
+flow_system.flows['Boiler(Q_th)']       # Get a flow
+flow_system.effects['costs']            # Get an effect
+
+# Element-specific solutions
+flow_system.components['Boiler'].solution
+flow_system.flows['Boiler(Q_th)'].solution
+```
+
 ## Beyond Energy Systems
 
 While our example used a heating system, flixOpt works for any flow-based optimization:
diff --git a/docs/user-guide/migration-guide-v6.md b/docs/user-guide/migration-guide-v6.md
@@ -343,20 +343,52 @@ stats.total_effects['costs'].groupby('component_type').sum()
 | **Replace Optimization class** | Use `flow_system.optimize(solver)` instead |
 | **Update results access** | Use `flow_system.solution['var_name']` pattern |
 | **Update I/O code** | Use `to_netcdf()` / `from_netcdf()` |
+| **Update transform methods** | Use `flow_system.transform.sel/isel/resample()` instead |
 | **Test thoroughly** | Verify results match v5.x outputs |
 | **Remove deprecated imports** | Remove `fx.Optimization`, `fx.Results` |
 
 ---
 
+## Transform Methods Moved to Accessor
+
+The `sel()`, `isel()`, and `resample()` methods have been moved from `FlowSystem` to the `TransformAccessor`:
+
+=== "Old (deprecated)"
+    ```python
+    # These still work but emit deprecation warnings
+    fs_subset = flow_system.sel(time=slice('2023-01-01', '2023-06-30'))
+    fs_indexed = flow_system.isel(time=slice(0, 24))
+    fs_resampled = flow_system.resample(time='4h', method='mean')
+    ```
+
+=== "New (recommended)"
+    ```python
+    # Use the transform accessor
+    fs_subset = flow_system.transform.sel(time=slice('2023-01-01', '2023-06-30'))
+    fs_indexed = flow_system.transform.isel(time=slice(0, 24))
+    fs_resampled = flow_system.transform.resample(time='4h', method='mean')
+    ```
+
+!!! info "Solution is dropped"
+    All transform methods return a **new FlowSystem without a solution**. You must re-optimize the transformed system:
+    ```python
+    fs_subset = flow_system.transform.sel(time=slice('2023-01-01', '2023-01-31'))
+    fs_subset.optimize(solver)  # Re-optimize the subset
+    ```
+
+---
+
 ## Deprecation Timeline
 
 | Version | Status |
 |---------|--------|
 | v5.x | `Optimization` class deprecated with warning |
-| v6.0.0 | `Optimization` class removed |
+| v6.0.0 | `Optimization` class removed, `sel/isel/resample` methods deprecated |
 
 !!! warning "Update your code"
     The `Optimization` and `Results` classes are deprecated and will be removed in a future version.
+    The `flow_system.sel()`, `flow_system.isel()`, and `flow_system.resample()` methods are deprecated
+    in favor of `flow_system.transform.sel/isel/resample()`.
     Update your code to the new API to avoid breaking changes when upgrading.
 
 ---
diff --git a/docs/user-guide/optimization/index.md b/docs/user-guide/optimization/index.md
@@ -93,6 +93,56 @@ print(clustered_fs.solution)
 | Standard | Small-Medium | Slow | Optimal |
 | Clustered | Very Large | Fast | Approximate |
 
+## Transform Accessor
+
+The `transform` accessor provides methods to create modified copies of your FlowSystem. All transform methods return a **new FlowSystem without a solution** — you must re-optimize the transformed system.
+
+### Selecting Subsets
+
+Select a subset of your data by label or index:
+
+```python
+# Select by label (like xarray.sel)
+fs_january = flow_system.transform.sel(time=slice('2024-01-01', '2024-01-31'))
+fs_scenario = flow_system.transform.sel(scenario='base')
+
+# Select by integer index (like xarray.isel)
+fs_first_week = flow_system.transform.isel(time=slice(0, 168))
+fs_first_scenario = flow_system.transform.isel(scenario=0)
+
+# Re-optimize the subset
+fs_january.optimize(fx.solvers.HighsSolver())
+```
+
+### Resampling Time Series
+
+Change the temporal resolution of your FlowSystem:
+
+```python
+# Resample to 4-hour intervals
+fs_4h = flow_system.transform.resample(time='4h', method='mean')
+
+# Resample to daily
+fs_daily = flow_system.transform.resample(time='1D', method='mean')
+
+# Re-optimize with new resolution
+fs_4h.optimize(fx.solvers.HighsSolver())
+```
+
+**Available resampling methods:** `'mean'`, `'sum'`, `'max'`, `'min'`, `'first'`, `'last'`
+
+### Clustering
+
+See [Clustered Optimization](#clustered-optimization) above.
+
+### Use Cases
+
+| Method | Use Case |
+|--------|----------|
+| `sel()` / `isel()` | Analyze specific time periods, scenarios, or periods |
+| `resample()` | Reduce problem size, test at lower resolution |
+| `cluster()` | Investment planning with typical periods |
+
 ## Custom Constraints
 
 flixOpt is built on [linopy](https://github.com/PyPSA/linopy), allowing you to add custom constraints beyond what's available through the standard API.
diff --git a/examples/01_Simple/simple_example.py b/examples/01_Simple/simple_example.py
@@ -116,7 +116,6 @@
     flow_system.statistics.plot.balance('Storage')
     flow_system.statistics.plot.heatmap('CHP(Q_th)')
     flow_system.statistics.plot.heatmap('Storage')
-    flow_system.statistics.plot.heatmap('Storage')
 
     # Access data as xarray Datasets
     print(flow_system.statistics.flow_rates)
diff --git a/flixopt/flow_system.py b/flixopt/flow_system.py
diff --git a/flixopt/transform_accessor.py b/flixopt/transform_accessor.py