Preparing ClustData
Here, we first describe how to load time-series input data, either the data provided with this package or your own, as ClustData. We then describe how to aggregate the loaded time-series input data.
Provided Data
load_timeseries_data_provided() loads the data for a given region for which data is provided in this package. The optional input parameters to load_timeseries_data_provided() are the number of time steps per period T and the years to be imported.
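To illustrate what "number of time steps per period T" means, the following minimal sketch splits a made-up hourly series into periods of T steps using only base Julia. The series `hourly` and the variable names are hypothetical, and the sketch only shows the idea of periods; it is not a description of how the package stores or processes data internally.

```julia
# Hypothetical hourly series covering 3 days (72 values); not real package data.
hourly = collect(1.0:72.0)

T = 24                     # time steps per period, as in the T keyword
K = length(hourly) ÷ T     # number of complete periods in the series

# Arrange the series so that column k holds the T values of period k.
periods = reshape(hourly[1:T*K], T, K)

println(size(periods))
```

With T=24 and hourly data, each period corresponds to one day.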
CapacityExpansion.load_timeseries_data_provided — Function
load_timeseries_data_provided(region::String="GER_1"; T::Int=24, years::Array{Int,1}=[2016], att::Array{String,1}=Array{String,1}())
- Adding the information in the *.csv file at data_path to the data dictionary
The *.csv files shall have the following structure and must have the same length:
| Timestamp | Year | [column names...] |
|---|---|---|
| [iterator] | [year] | [values] |
The first column should be called Timestamp if it contains a time iterator. The other columns can specify the individual time series, for example at a specific geolocation.
Available regions:
- "GER_1": Germany 1 node
- "GER_18": Germany 18 nodes
- "CA_1": California 1 node
- "CA_14": California 14 nodes
- "TX_1": Texas 1 node
Your Own Data
For details, refer to TimeSeriesClustering.
The keys of {your-time-series}.data have to match "{time_series (as declared in techs.csv)}-{node}"
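The naming rule above can be checked before running a model. The following is a minimal sketch in base Julia; the time series names, the node name, the placeholder data, and the helper `missing_keys` are hypothetical and serve only to illustrate the "{time_series}-{node}" pattern.

```julia
# Hypothetical time series names (as they might be declared in techs.csv) and nodes.
time_series = ["solar", "wind"]
nodes = ["germany"]

# Build the keys that the data dictionary is expected to contain.
expected_keys = ["$(ts)-$(node)" for ts in time_series for node in nodes]

# Hypothetical helper: return the expected keys that are absent from `data`.
missing_keys(data::AbstractDict, expected) = [k for k in expected if !haskey(data, k)]

# Stand-in for {your-time-series}.data with only one of the two keys present.
data = Dict("solar-germany" => zeros(24, 2))

absent = missing_keys(data, expected_keys)
println(absent)
```

An empty result would indicate that every declared time series has a matching key for every node.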
TimeSeriesClustering.load_timeseries_data — Function
function load_timeseries_data(data_path::String;
region::String="none",
T::Int=24,
years::Array{Int,1}=[2016],
att::Array{String,1}=Array{String,1}())
Return all time series as ClustData struct that are stored as csv files in the specified path.
- Loads *.csv files in the folder or the file data_path
- Loads all attributes (all *.csv files) if the att-Array is empty or only the files specified in att
- The *.csv files shall have the following structure and must have the same length:
| Timestamp | Year | [column names...] |
|---|---|---|
| [iterator] | [year] | [values] |
- The first column of a .csv file should be called Timestamp if it contains a time iterator
- The second column should be called Year and contains the corresponding year
- Each other column should contain the time series data. For one node systems, only one column is used; for an N-node system, N columns need to be used. In an N-node system, each column specifies time series data at a specific geolocation.
- Returns time series as ClustData struct
- The .data field of the ClustData struct is a Dictionary where each column in [file name].csv file is the key (called "[file name]-[column name]"). file name should correspond to the attribute name, and column name should correspond to the node name.
Optional inputs to load_timeseries_data:
- region: region descriptor
- T: number of segments
- years::Array{Int,1}: the years to be selected from the csv file as specified in the years column
- att::Array{String,1}: the attributes to be loaded. If left empty, all attributes will be loaded.
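The file layout described above can be sketched with base Julia alone. The example below writes two made-up one-node attribute files and checks them against the stated rules. The attribute names, values, the comma delimiter, and the helper `check_csv_folder` are assumptions for illustration; the check is also stricter than the description, which only asks for the Timestamp name when the column holds a time iterator.

```julia
# Write two hypothetical one-node attribute files into a temporary folder.
data_dir = mktempdir()
for (att, value) in [("solar", 0.2), ("wind", 0.4)]   # made-up attributes and values
    open(joinpath(data_dir, att * ".csv"), "w") do io
        println(io, "Timestamp,Year,germany")          # one column per node
        for t in 1:48                                  # two days of hourly steps
            println(io, "$t,2016,$value")
        end
    end
end

# Hypothetical helper: check column names and that all files have the same length.
function check_csv_folder(dir::String)
    lengths = Dict{String,Int}()
    for file in filter(f -> endswith(f, ".csv"), readdir(dir))
        lines = readlines(joinpath(dir, file))
        header = split(lines[1], ",")
        header[1] == "Timestamp" || error("$file: first column must be Timestamp")
        header[2] == "Year" || error("$file: second column must be Year")
        lengths[file] = length(lines) - 1              # data rows without the header
    end
    length(unique(values(lengths))) <= 1 || error("files differ in length: $lengths")
    return lengths
end

lengths = check_csv_folder(data_dir)
println(lengths)
```

A folder prepared this way could then be passed as data_path, for example load_timeseries_data(data_dir; T=24, years=[2016]), which according to the description above should yield the keys "solar-germany" and "wind-germany".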
function load_timeseries_data(existing_data::Symbol;
region::String="none",
T::Int=24,
years::Array{Int,1}=[2016],
att::Array{String,1}=Array{String,1}())
Return time series of example data sets as ClustData struct.
The choice of example data set is given by e.g. existing_data=:CEP_GER1. Example data sets are:
- :DAM_CA: Hourly Day Ahead Market Electricity prices for California-Stanford 2015
- :DAM_GER: Hourly Day Ahead Market Electricity prices for Germany 2015
- :CEP_GER1: Hourly Wind, Solar, Demand data Germany one node
- :CEP_GER18: Hourly Wind, Solar, Demand data Germany 18 nodes
Optional inputs to load_timeseries_data:
- region: region descriptor
- T: number of segments
- years::Array{Int,1}: the years to be selected from the csv file as specified in the years column
- att::Array{String,1}: the attributes to be loaded. If left empty, all attributes will be loaded.
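As a minimal sketch of the Symbol-based method, the following loads one of the example data sets listed above and prints the keys of its .data dictionary. It assumes that TimeSeriesClustering is installed and that the :CEP_GER1 data set contains the default year 2016; the variable name `ts_example` is arbitrary.

```julia
using TimeSeriesClustering  # assumed to be installed

# Load the one-node Germany example data set with 24 time steps per period.
ts_example = load_timeseries_data(:CEP_GER1; T=24, years=[2016])

# List the available keys, which follow the "[file name]-[column name]" pattern.
for key in sort(collect(keys(ts_example.data)))
    println(key)
end
```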
Aggregation
Time series aggregation can be applied to reduce the temporal dimension while, if done in a problem-specific way, keeping the output precise. Aggregation methods are explained in TimeSeriesClustering. We strongly encourage running a second stage validation step if you use aggregation on your model; see Second stage operational validation step.
Examples
Loading time series data
using CapacityExpansion
state="GER_1"
# load ts-input-data
ts_input_data = load_timeseries_data_provided(state; T=24, years=[2016])
using Plots
plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")
Aggregating time series data
ts_clust_data = run_clust(ts_input_data;method="kmeans",representation="centroid",n_init=50,n_clust=5).clust_data
plot(ts_clust_data.data["solar-germany"], legend=false, linestyle=:solid, width=3, xlabel="Time [h]", ylabel="Solar availability factor [%]")