# Optimization Problem Formulation

First, we describe how to load provided or your own non-time-series dependent data as `OptDataCEP`. Second, we describe the data types within the `OptDataCEP` and how to access it.

## General

The capacity expansion problem (CEP) is designed as a linear optimization model. It is implemented in the algebraic modelling language JuMP. The implementation within JuMP makes it possible to optimize multiple models in parallel and to handle the steps from data input to result analysis and diagram export in one open-source programming language. The coding of the model enables scalability based on the provided data input, single-command configuration of the model setup, collection of results and configuration for further analysis, and the opportunity to run design and operation in different optimizations.

The basic idea is to resolve the energy system spatially into discrete nodes. Each node has demand, non-dispatchable generation, dispatchable generation and storage capacities of varying technologies connected to it. The different energy system nodes are interconnected with each other by transmission lines. The model is designed to minimize social costs by minimizing the objective function described in the Mathematical formulation section.

## Sets

The model's scalability relies on the usage of sets. The elements of the sets are extracted from the input data and scale the different variables. An overview of the sets is provided in the table. Depending on the model's configuration, only the necessary sets are initialized.

The sets are set up as a dictionary and organized as `set[tech_name][tech_group]=[elements...]`, where:

- `tech_name` is the name of the dimension, e.g. `tech` or `node`
- `tech_group` is the name of a group of elements within each dimension, e.g. `["all", "generation"]`. The group "all" always contains all elements of the dimension
- `[elements...]` is the Array with the different elements, e.g. `["pv", "wind", "gas"]`
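The nested structure described above can be sketched in plain Julia. This is a minimal illustration with made-up elements, not the package's own setup code: the technologies, carriers and group contents are assumptions, and in the package the sets are extracted from the input data.

```julia
# Hypothetical example data; the real sets are extracted from the input files.
set = Dict{String,Dict{String,Array}}()
set["tech"] = Dict{String,Array}(
    "all" => ["pv", "wind", "gas"],
    "generation" => ["pv", "wind", "gas"],
)
set["carrier"] = Dict{String,Array}("all" => ["electricity"])

# Every dimension has an "all" group that holds all of its elements.
for (dimension, groups) in set
    println(dimension, " => ", groups["all"])
end

# Access the elements of one group of one dimension.
generation_techs = set["tech"]["generation"]
```

Indexing first by dimension and then by group keeps the lookup the same for every dimension, which is what lets the variables scale with the input data.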

name | description |
---|---|
lines | transmission lines connecting the nodes |
nodes | spatial energy system nodes |
tech | generation, conversion, storage, and transmission technologies |
carrier | carrier that an energy balance is calculated for, e.g. `electricity`, `hydrogen`, ... |
impact | impact categories like EUR or USD, CO2-eq., ... |
account | fixed costs for installation and yearly expenses, variable costs |
infrastruct | infrastructure status being either new or existing |
time K | numeration of the representative periods |
time T period | numeration of the time intervals within a period |
time T point | numeration of the time points within a period |
time I period | numeration of the time intervals of the full input data periods |
time I point | numeration of the time points of the full input data periods |
dir transmission | direction of the flow uniform with or opposite to the direction of the line |

## Variables

The variables can have different types:

- `cv`: cost variable - information of the costs
- `dv`: design variable - information of the energy system design
- `ov`: operation variable - information of the energy system operation
- `sv`: slack variable - information of unmet demands or exceeded emission limits

An overview of the variables used in the CEP is provided in the table:

name | type | dimensions | unit | description |
---|---|---|---|---|
COST | `cv` | [account,impact,tech] | EUR or USD, kg-LCA-categories | Costs |
CAP | `dv` | [tech,infrastruct,node] | MW | Capacity |
GEN | `ov` | [tech,carrier,t,k,node] | MW | Generation |
SLACK | `sv` | [carrier,t,k,node] | MW | Power gap, not provided by installed CAP |
LL | `sv` | [carrier] | MWh | Lost load: generation gap, not provided by installed CAP |
LE | `sv` | [impact] | LCA-categories | Lost emission: amount of emissions by which the installed CAP exceeds the emission constraint |
INTRASTOR | `ov` | [tech,carrier,t,k,node] | MWh | Storage level within a period |
INTERSTOR | `ov` | [tech,carrier,i,node] | MWh | Storage level between periods of the full time series |
FLOW | `ov` | [tech,carrier,dir,t,k,line] | MW | Flow over transmission line |
TRANS | `ov` | [tech,infrastruct,lines] | MW | Maximum capacity of transmission lines |

## Mathematical formulation

The mathematical formulation depends on the specific model configuration. The different configurations are introduced in Running the Capacity Expansion Problem. The specific equations that are applied are tracked by the model itself and can be viewed as explained in Equations.

We explain the equations used for a simple optimization model with dispatchable generation, non-dispatchable generation, and a given demand:

The objective function minimizes total system costs, where `COST` is the cost of the different technologies, `LL` is the lost load, `c_{ll}` is the variable cost of lost load, `LE` is the lost emissions, and `c_{le}` is the variable cost of lost emissions.

The variable costs are calculated from the generation `GEN`, the time step length `\Delta t`, and the variable cost per electric energy `c_{acc,tech,imp}`. The fixed costs are calculated from the installed capacity `CAP` and the year factor `yf`, which states how many years are represented by the original time series.

The generation is limited for dispatchable and non-dispatchable technologies by the installed capacities, and additionally by an availability factor `z` for the non-dispatchable generation. The existing capacity is fixed to the provided input values. The demand is multiplied by the installed demand capacity and fixed as a negative generation. The emissions are limited to the emission constraints, which can be exceeded by the lost emissions. The sum of generation and slack is fixed to zero. The slack is positive if the dispatchable and non-dispatchable generation cannot meet the demand.
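The relations described in this section can be written down roughly as follows. This is a sketch reconstructed from the prose, not the package's exact formulation. The account labels `var` and `fix`, the cost impact `imp_{cost}`, the sets `\mathcal{D}` (dispatchable) and `\mathcal{N}` (non-dispatchable), and the symbols `cap^{ex}` (existing capacity input), `d` (demand time series) and `\overline{E}` (emission limit) are assumptions introduced for illustration. Any weighting of representative periods is left out, and applying the fixed costs to new capacity only is likewise an assumption.

```latex
\begin{align*}
\min \quad
& \sum_{acc,\,tech} COST_{acc,\,imp_{cost},\,tech}
  + \sum_{carrier} LL_{carrier} \cdot c_{ll}
  + \sum_{imp} LE_{imp} \cdot c_{le} \\
\text{s.t.} \quad
& COST_{var,\,imp,\,tech}
  = \sum_{t,\,k,\,node} GEN_{tech,\,carrier,\,t,\,k,\,node} \cdot \Delta t \cdot c_{var,\,tech,\,imp} \\
& COST_{fix,\,imp,\,tech}
  = \sum_{node} CAP_{tech,\,new,\,node} \cdot yf \cdot c_{fix,\,tech,\,imp} \\
& 0 \le GEN_{tech,\,carrier,\,t,\,k,\,node}
  \le \sum_{infrastruct} CAP_{tech,\,infrastruct,\,node}
  && tech \in \mathcal{D} \\
& 0 \le GEN_{tech,\,carrier,\,t,\,k,\,node}
  \le z_{tech,\,t,\,k,\,node} \cdot \sum_{infrastruct} CAP_{tech,\,infrastruct,\,node}
  && tech \in \mathcal{N} \\
& CAP_{tech,\,ex,\,node} = cap^{ex}_{tech,\,node} \\
& GEN_{demand,\,carrier,\,t,\,k,\,node}
  = - d_{carrier,\,t,\,k,\,node} \cdot \sum_{infrastruct} CAP_{demand,\,infrastruct,\,node} \\
& \sum_{acc,\,tech} COST_{acc,\,imp,\,tech} \le \overline{E}_{imp} + LE_{imp} \\
& \sum_{tech} GEN_{tech,\,carrier,\,t,\,k,\,node} + SLACK_{carrier,\,t,\,k,\,node} = 0
\end{align*}
```

Because the demand enters as a negative generation, the last line acts as the energy balance: a positive `SLACK` fills whatever the generation technologies cannot cover.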

## Running the Capacity Expansion Problem

The CEP model can be run with many configurations. The configurations themselves do not interfere with each other, though the provided input data must support the chosen configuration, e.g. contain lines in order for transmission to work.

An overview is provided in the following table:

description | unit | configuration | values | type | default value |
---|---|---|---|---|---|
enforce an emission-limit | kg-impact/MWh-carrier | `limit_emission` | Dict{String,Number}(impact/carrier=>value) | ::Dict{String,Number} | Dict{String,Number}() |
including existing infrastructure (no extra costs) and limit infrastructure | - | `infrastructure` | Dict{String,Array}("existing"=>[tech-groups...], "limit"=>[tech-groups...]) | ::Dict{String,Array} | Dict{String,Array}("existing"=>["demand"]) |
type of storage implementation | - | `storage_type` | "none", "simple" or "seasonal" | ::String | "none" |
allowing conversion (necessary for storage) | - | `conversion` | `true` or `false` | ::Bool | false |
allowing demand | - | `demand` | `true` or `false` | ::Bool | true |
allowing dispatchable generation | - | `dispatchable_generation` | `true` or `false` | ::Bool | true |
allowing non dispatchable generation | - | `non_dispatchable_generation` | `true` or `false` | ::Bool | true |
allowing transmission | - | `transmission` | `true` or `false` | ::Bool | false |
fix. installed capacities to dispatch problem | - | `fixed_design_variables` | design variables from design run or nothing | ::OptVariables | nothing |
allowing lost load (necessary for dispatch) | price/MWh-carrier | `lost_load_cost` | Dict{String,Number}(carrier=>value) | ::Dict{String,Number} | Dict{String,Number}() |
allowing lost emission (necessary for dispatch) | price/kg-impact | `lost_emission_cost` | Dict{String,Number}(impact=>value) | ::Dict{String,Number} | Dict{String,Number}() |

They can be applied in the following way:

`CapacityExpansion.run_opt` (function)

`run_opt(ts_data::ClustData,opt_data::OptDataCEP,config::Dict{String,Any},optimizer::DataType)`

Organizing the actual setup and run of the CEP-Problem. This function shouldn't be called by a user, but from within the other `run_opt`-functions. Required elements are:

- `ts_data`: The time-series data.
- `opt_data`: In this case the OptDataCEP that contains information on costs, nodes, techs and for transmission also on lines.
- `config`: This includes all the settings for the design optimization problem formulation.
- `optimizer`: The used optimizer, which could e.g. be Clp: `using Clp` `optimizer=Clp.Optimizer` or Gurobi: `using Gurobi` `optimizer=Gurobi.Optimizer`.

`run_opt(ts_data::ClustData,opt_data::OptDataCEP,config::Dict{String,Any},fixed_design_variables::Dict{String,Any},optimizer::DataType;lost_el_load_cost::Number=Inf,lost_CO2_emission_cost::Number)`

This method runs the operational optimization problem only, with fixed design variables. Provide the fixed design variables and the `config` of the previous step (design run or another operational run). Required elements are:

- `ts_data`: The time-series data, which should be the original time-series data for this operational run. The `keys(ts_data.data)` need to match the pattern `[time_series_name]-[node]`
- `opt_data`: In this case the OptDataCEP that contains information on costs, nodes, techs and for transmission also on lines. Should be the same as in the design run.
- `config`: This includes all the previous settings for the design optimization problem formulation and ensures that the configuration is the same.
- `fixed_design_variables`: All the design variables that are determined by the previous design run.
- `optimizer`: The used optimizer, which could e.g. be Clp: `using Clp` `optimizer=Clp.Optimizer` or Gurobi: `using Gurobi` `optimizer=Gurobi.Optimizer`.

What you can change in the `config`:

- `lost_load_cost`: Dictionary with numbers indicating the lost load price per carrier (e.g. `electricity` in price/MWh should be greater than 1e6), give Inf for no SLACK and LL (Lost Load - a variable for unmet demand by the installed capacities)
- `lost_emission_cost`: Dictionary with numbers indicating the emission price/kg-emission (Suggestion: around 700), give Inf for no LE (Lost Emissions - a variable for emissions that will exceed the limit in order to provide the demand with the installed capacities)
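Adjusting the `config` of a design run for a subsequent operational run might look like the sketch below. It uses plain dictionaries only. It assumes the `config` keys carry the same names as the configuration options, and `design_config` is a hypothetical stand-in for the config returned by an earlier design run.

```julia
# Hypothetical stand-in for the config of a previous design run.
design_config = Dict{String,Any}(
    "transmission" => false,
    "storage_type" => "none",
    "lost_load_cost" => Dict{String,Number}(),
    "lost_emission_cost" => Dict{String,Number}(),
)

# Copy the config so the design settings stay untouched, then allow
# lost load and lost emissions for the operational run.
operation_config = copy(design_config)
operation_config["lost_load_cost"] = Dict{String,Number}("electricity" => 1e6)
operation_config["lost_emission_cost"] = Dict{String,Number}("CO2" => 700)
```

All other entries stay identical, which keeps the operational run consistent with the design run.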

```
run_opt(ts_data::ClustData,
opt_data::OptDataCEP,
optimizer::DataType;
descriptor::String="",
storage_type::String="none",
demand::Bool=true,
dispatchable_generation::Bool=true,
non_dispatchable_generation::Bool=true,
conversion::Bool=false,
transmission::Bool=false,
lost_load_cost::Dict{String,Number}=Dict{String,Number}(),
lost_emission_cost::Dict{String,Number}=Dict{String,Number}(),
limit_emission::Dict{String,Number}=Dict{String,Number}(),
infrastructure::Dict{String,Array}=Dict{String,Array}("existing"=>["demand"],"limit"=>Array{String,1}()),
scale::Dict{Symbol,Int}=Dict{Symbol,Int}(:COST => 1e9, :CAP => 1e3, :GEN => 1e3, :SLACK => 1e3, :INTRASTOR => 1e3, :INTERSTOR => 1e6, :FLOW => 1e3, :TRANS =>1e3, :LL => 1e6, :LE => 1e9),
print_flag::Bool=true,
optimizer_config::Dict{Symbol,Any}=Dict{Symbol,Any}(),
round_sigdigits::Int=9)
```

Wrapper function for type of optimization problem for the CEP-Problem (NOTE: identifier is the type of `opt_data`, in this case OptDataCEP, so identification as CEP problem). Required elements are:

- `ts_data`: The time-series data, which could either be the original input data or some aggregated time-series data. The `keys(ts_data.data)` need to match the pattern `[time_series_name]-[node]`
- `opt_data`: The OptDataCEP that contains information on costs, nodes, techs and for transmission also on lines.
- `optimizer`: The used optimizer, which could e.g. be Clp: `using Clp` `optimizer=Clp.Optimizer` or Gurobi: `using Gurobi` `optimizer=Gurobi.Optimizer`.

Options to tweak the model are:

- `descriptor`: String with the name of this particular model like "kmeans-10-co2-500"
- `storage_type`: String `"none"` for no storage, `"simple"` to include simple (only intra-day storage), or `"seasonal"` to include seasonal storage (inter-day)
- `demand`: Bool `true` or `false` for technology-group
- `dispatchable_generation`: Bool `true` or `false` for technology-group
- `non_dispatchable_generation`: Bool `true` or `false` for technology-group
- `conversion`: Bool `true` or `false` for technology-group
- `transmission`: Bool `true` or `false` for technology-group. If no transmission should be modeled, a 'copperplate' is assumed with no transmission restrictions between the nodes
- `limit_emission`: Dictionary with numbers limiting the kg-emission-eq./MWh (e.g. `CO2` normally in a range from 5-1250 kg-CO2-eq/MWh), give Inf or omit the keyword if unlimited
- `lost_load_cost`: Dictionary with numbers indicating the lost load price per carrier (e.g. `electricity` in price/MWh should be greater than 1e6), give Inf for no SLACK and LL (Lost Load - a variable for unmet demand by the installed capacities). Example: `lost_load_cost=Dict{String,Number}("electricity"=>1e6)`
- `lost_emission_cost`: Dictionary with numbers indicating the emission price/kg-emission (Suggestion: around 700), give Inf for no LE (Lost Emissions - a variable for emissions that will exceed the limit in order to provide the demand with the installed capacities). Example: `lost_emission_cost=Dict{String,Number}("CO2"=>700)`
- `infrastructure`: Dictionary with Arrays indicating which technology groups should have `existing` infrastructure (`"existing" => ["demand","dispatchable_generation"]`) and which technology groups should have infrastructure `limit`ed (`"limit" => ["non_dispatchable_generation"]`)
- `scale`: Dict{Symbol,Int} with a number for each variable (like `:COST`) to scale the variables and equations to similar quantities. Try to achieve that the numerical model only has to solve numerical variables in a range between 0.01 and 100. The following equation is used as a relationship between the real value, which is provided in the solution (real-VAR), and the numerical variable, which is used within the model formulation (VAR): real-VAR [`EUR`, `MW` or `MWh`] = scale[:VAR] ⋅ VAR.
- `print_flag`: Bool to decide if a summary of the optimization result should be printed.
- `optimizer_config`: Each Symbol and the corresponding value in the Dictionary is passed on to the `with_optimizer` function in addition to the `optimizer`. For Gurobi an example Dictionary could look like `Dict{Symbol,Any}(:Method => 2, :OutputFlag => 0, :Threads => 2)`; more information can be found in the optimizer-specific documentation.
- `round_sigdigits`: Can be used to round the values of the result to a certain number of `sigdigits`.
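A set of keyword options for a design run could be collected as in the following sketch. The values are illustrative rather than recommended, the key format `"CO2/electricity"` follows the `impact/carrier` pattern from the configuration table, and the commented call assumes that `ts_clust_data`, `cep_data` and `optimizer` have been prepared beforehand.

```julia
# Keyword options for a design run, collected as a NamedTuple.
# The values are illustrative, not recommendations.
options = (
    descriptor = "kmeans-10-co2-500",
    storage_type = "simple",
    conversion = true,            # necessary for storage
    transmission = false,         # copperplate assumption
    limit_emission = Dict{String,Number}("CO2/electricity" => 500),
    lost_load_cost = Dict{String,Number}("electricity" => Inf),
    print_flag = false,
)

# With the package loaded and the data prepared, the options could be splatted:
# result = run_opt(ts_clust_data, cep_data, optimizer; options...)
```

Keeping the options in one named collection makes it easy to store them next to the results for later comparison between runs.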

## Transmission

A CapacityExpansion model can be run with or without technology transmission.

If the technology `transmission` is not modelled (`transmission=false`), the transmission between nodes is not restricted, which is equivalent to a copperplate assumption.

Include `transmission=true` and `infrastructure = Dict{String,Array}("existing"=>[...,"transmission"], "limit"=>[...,"transmission"])` to model existing `transmission`. This sets the existing transmission `TRANS` to the values defined in the `lines.csv` file in column `power_ex`, and limits the transmission by the values defined in `lines.csv` in the column `power_lim`. If no new transmission should be set up, use the same values for existing transmission (column `power_ex`) and the limit (column `power_lim`).
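The transmission settings can be sketched as follows. The `infrastructure` dictionary mirrors the form given above. The `lines` rows are hypothetical stand-ins for the contents of `lines.csv` (the field name `line` is an assumption), used only to show how equal `power_ex` and `power_lim` values rule out new transmission.

```julia
# Technology groups with existing and with limited infrastructure.
infrastructure = Dict{String,Array}(
    "existing" => ["demand", "transmission"],
    "limit" => ["transmission"],
)
transmission = true
# result = run_opt(ts_clust_data, cep_data, optimizer;
#                  transmission=transmission, infrastructure=infrastructure)

# Hypothetical rows of lines.csv: equal values prevent new transmission.
lines = [(line = "L1", power_ex = 500.0, power_lim = 500.0),
         (line = "L2", power_ex = 300.0, power_lim = 800.0)]

# Only lines whose limit exceeds the existing capacity can be expanded.
expandable = [l.line for l in lines if l.power_lim > l.power_ex]
```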

## Solver

The package provides no `optimizer`, and a solver has to be added separately. For the linear optimization problem suggestions are:

- `Clp` as an open-source solver
- `Gurobi` as a proprietary solver with free academic licenses. Gurobi is faster than Clp, and we prefer it in the academic setting.
- `CPLEX` as an alternative proprietary solver

Install the corresponding julia-package for the solver and call its `optimizer`, e.g.:

```
using Pkg
Pkg.add("Clp")
using Clp
optimizer=Clp.Optimizer
```

## Solver Configuration

Depending on the solver, different solver configurations are possible. The information is always provided as `Dict{Symbol,Any}`. The keys of the dictionary are the parameters and the values of the dictionary are the values passed to the solver.

For example, the `Gurobi` solver can be configured to suppress its output (`OutputFlag` set to 0) and to run on two threads (per julia thread) the following way:

`optimizer_config=Dict{Symbol,Any}(:OutputFlag => 0, :Threads => 2)`

Further information on possible keys for Gurobi can be found in the Gurobi parameter description.
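How a `Dict{Symbol,Any}` turns into solver parameters can be illustrated with keyword splatting in plain Julia. The function `fake_optimizer` is a hypothetical stand-in for a solver constructor. Whether the package forwards the dictionary in exactly this way is an assumption.

```julia
optimizer_config = Dict{Symbol,Any}(:OutputFlag => 0, :Threads => 2)

# Hypothetical stand-in for a solver constructor that accepts keyword parameters.
fake_optimizer(; kwargs...) = Dict{Symbol,Any}(kwargs)

# Splatting turns each key into a keyword name and each value into its argument.
received = fake_optimizer(; optimizer_config...)
```

This is why the keys have to be Symbols that match the parameter names of the chosen solver.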

## Scaling

The package features the scaling of variables and equations. Scaling variables, which are used in the numerical model, to `0.01 ≤ x ≤ 100` and scaling equations to `3⋅x = 1` instead of `3000⋅x = 1000` improves the shape of the optimization space and significantly reduces the computational time used to solve the numerical model.

The values are only scaled within the numerical model formulation, where we call the variable `VAR`, but the values are unscaled in the solution, which we call `real-VAR`. The following logic is used to scale the variables:

`real-VAR [EUR, USD, MW, or MWh] = scale[:VAR] ⋅ VAR`

`0.01 ≤ VAR ≤ 100`

`⇔ 0.01 ≤ real-VAR / scale[:VAR] ≤ 100`

The equations are scaled with the scaling parameter of the first variable, which is `scale[:COST]` in the following example:

`scale[:COST]⋅COST = 10⋅scale[:CAP]⋅CAP`

`⇔ COST = 10⋅(scale[:CAP]/scale[:COST])⋅CAP`
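A small numeric illustration of this logic, using the default values of `scale[:COST]` and `scale[:CAP]` and a hypothetical capacity value:

```julia
# Default scaling parameters of two variables, taken from the `scale` default.
scale = Dict{Symbol,Int}(:COST => 1e9, :CAP => 1e3)

# Hypothetical solution value: 50 GW of installed capacity, given in MW.
real_CAP = 5.0e4
CAP = real_CAP / scale[:CAP]          # value seen by the solver: 50.0
in_range = 0.01 <= CAP <= 100

# Scaling the equation real-COST = 10 * real-CAP by scale[:COST].
coefficient = 10 * (scale[:CAP] / scale[:COST])
COST = coefficient * CAP

# Unscaling recovers the real value reported in the solution.
real_COST = scale[:COST] * COST
```

The solver only sees `CAP`, `COST` and the rescaled coefficient, while the reported solution contains `real_CAP` and `real_COST`.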

### Change scaling parameters

Changing the scaling parameters is useful if the data you use represents a much smaller or bigger energy system than the ones representing Germany and California provided in this package. Determine the right scaling parameters by checking the real values of COST, CAP, GEN... (real-VAR) in a solution using your data. Select the scaling parameters to match the following: `0.01 ≤ real-VAR / scale[:VAR] ≤ 100`

Create a dictionary with the new scaling parameters for EACH variable and include it as the optional `scale` input to overwrite the default scale in `run_opt`:

```
scale=Dict{Symbol,Int}(:COST => 1e9, :CAP => 1e3, :GEN => 1e3, :SLACK => 1e3, :INTRASTOR => 1e3, :INTERSTOR => 1e6, :FLOW => 1e3, :TRANS =>1e3, :LL => 1e6, :LE => 1e9)
scale_result = run_opt(ts_clust_data,cep_data,optimizer;scale=scale)
```
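One possible way to pick a scaling parameter from the real values of a solution is sketched below. The helper names `suggest_scale` and `well_scaled` are hypothetical and not part of the package, and centering the typical magnitude near 1 is an assumption about what a good choice looks like.

```julia
# Suggest a power-of-ten scaling parameter from solution values (real-VAR).
# Assumption: centering the typical magnitude near 1 is a reasonable target.
function suggest_scale(real_values)
    magnitudes = [abs(v) for v in real_values if v != 0]
    isempty(magnitudes) && return 1
    mean_exponent = sum(log10.(magnitudes)) / length(magnitudes)
    # The scale dictionary holds integers, so the exponent is kept non-negative.
    return 10^max(0, round(Int, mean_exponent))
end

# Check whether all nonzero values fall into the target range after scaling.
well_scaled(real_values, s) =
    all(0.01 <= abs(v) / s <= 100 for v in real_values if v != 0)

real_CAP = [120.0, 4500.0, 30000.0]   # hypothetical capacities in MW
s = suggest_scale(real_CAP)
ok = well_scaled(real_CAP, s)
```

If `well_scaled` returns `false`, the values span more than four orders of magnitude and no single scaling parameter can bring all of them into the target range.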

### Adding another variable

- Extend the default `scale`-dictionary in the `src/optim_problems/run_opt`-file to include the new variable as well.
- Include the new variable in the problem formulation in the `src/optim_problems/opt_cep`-file. Reformulate the equations by dividing them by the scaling parameter of the first variable, which is `scale[:COST]` in the following example:

`scale[:COST]⋅COST = 10⋅scale[:CAP]⋅CAP + 100`

`⇔ COST = 10⋅(scale[:CAP]/scale[:COST])⋅CAP + 100/scale[:COST]`
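Both steps can be sketched in plain Julia. The variable name `:NEWVAR` and its scaling parameter are hypothetical, and `default_scale` is shortened to two entries for brevity.

```julia
# Shortened stand-in for the default scale dictionary.
default_scale = Dict{Symbol,Int}(:COST => 1e9, :CAP => 1e3)

# :NEWVAR is a hypothetical variable name used only for illustration.
scale = merge(default_scale, Dict{Symbol,Int}(:NEWVAR => 1e3))

# Coefficients of real-COST = 10 * real-CAP + 100 after dividing by scale[:COST].
cap_coefficient = 10 * (scale[:CAP] / scale[:COST])
constant_term = 100 / scale[:COST]
```

Note that the constant term is divided by `scale[:COST]` as well, since the whole equation is divided by the scaling parameter of its first variable.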