Preparing ClustData

This section first describes how to load the provided time-series input data, or your own time-series input data, as ClustData. It then describes how to aggregate the loaded time-series input data.

Provided Data

load_timeseries_data_provided() loads the data for a given region for which data is provided in this package. The optional input parameters to load_timeseries_data_provided() are the number of time steps per period T and the years to be imported.
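To make the role of T concrete, here is a minimal sketch in plain Julia of how an hourly series splits into periods of T time steps. The toy data and the T x K matrix layout are assumptions for illustration only and are not taken from the package source.

```julia
# Toy hourly series for one non-leap year (8760 values); hypothetical data.
hourly = collect(1.0:8760.0)

T = 24                      # time steps per period
K = length(hourly) ÷ T      # number of complete periods (here: days)

# One column per period, one row per time step within the period.
# Assumption: this T x K layout only illustrates the idea of periods.
periods = reshape(hourly[1:T*K], T, K)

println(size(periods))      # (24, 365)
```

With T=24 each period is one day; a different T would group the same series into longer or shorter periods.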

CapacityExpansion.load_timeseries_data_provided - Function
    load_timeseries_data_provided(region::String="GER_1"; T::Int=24, years::Array{Int,1}=[2016], att::Array{String,1}=Array{String,1}())
  • Adding the information in the *.csv file at data_path to the data dictionary

The *.csv files shall have the following structure and must have the same length:

Timestamp    Year    [column names...]
[iterator]   [year]  [values]

The first column should be called Timestamp if it contains a time iterator. Each of the other columns can specify a single time series, for example at a specific geolocation. Data is provided for the following regions:

  • "GER_1": Germany 1 node
  • "GER_18": Germany 18 nodes
  • "CA_1": California 1 node
  • "CA_14": California 14 nodes
  • "TX_1": Texas 1 node

Your Own Data

For details, refer to TimeSeriesClustering.

Note

The keys of {your-time-series}.data have to match "{time_series (as declared in techs.csv)}-{node}"
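The naming rule in the note above can be checked with a few lines of plain Julia. This is a sketch under stated assumptions: the attribute names, the node name, and the stand-in dictionary are hypothetical and only mimic the keys of a loaded .data field.

```julia
# Hypothetical attribute names (time_series entries in techs.csv) and node names.
attributes = ["solar", "wind", "el_demand"]
nodes = ["germany"]

# Build the keys that {your-time-series}.data is expected to contain.
expected_keys = ["$(a)-$(n)" for a in attributes for n in nodes]

# Stand-in for the .data dictionary of a loaded ClustData (hypothetical values).
data = Dict("solar-germany" => zeros(24, 1), "wind-germany" => zeros(24, 1))

# Report every expected key that the data does not provide.
missing_keys = [k for k in expected_keys if !haskey(data, k)]
println(missing_keys)   # ["el_demand-germany"]
```

Running such a check before building a model makes a mismatch between techs.csv and the time-series files visible early.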

TimeSeriesClustering.load_timeseries_data - Function
function load_timeseries_data(data_path::String;
                          region::String="none",
                          T::Int=24,
                          years::Array{Int,1}=[2016],
                          att::Array{String,1}=Array{String,1}())

Return all time series as ClustData struct that are stored as csv files in the specified path.

  • Loads *.csv files in the folder or the file data_path
  • Loads all attributes (all *.csv files) if the att-Array is empty or only the files specified in att
  • The *.csv files shall have the following structure and must have the same length:
    Timestamp    Year    [column names...]
    [iterator]   [year]  [values]
  • The first column of a .csv file should be called Timestamp if it contains a time iterator
  • The second column should be called Year and contains the corresponding year
  • Each other column should contain the time series data. For one node systems, only one column is used; for an N-node system, N columns need to be used. In an N-node system, each column specifies time series data at a specific geolocation.
  • Returns time series as ClustData struct
  • The .data field of the ClustData struct is a dictionary with one entry per column of each [file name].csv file. The key of each entry is "[file name]-[column name]", where file name should correspond to the attribute name and column name should correspond to the node name.

Optional inputs to load_timeseries_data:

  • region::String: region descriptor
  • T::Int: number of segments (time steps per period)
  • years::Array{Int,1}: the years to be selected from the csv file, as specified in the Year column
  • att::Array{String,1}: the attributes to be loaded. If left empty, all attributes are loaded.
function load_timeseries_data(existing_data::Symbol;
                          region::String="none",
                          T::Int=24,
                          years::Array{Int,1}=[2016],
                          att::Array{String,1}=Array{String,1}())

Return time series of example data sets as ClustData struct.

The example data set is chosen with, e.g., existing_data=:CEP_GER1. Example data sets are:

  • :DAM_CA : Hourly Day Ahead Market Electricity prices for California-Stanford 2015
  • :DAM_GER : Hourly Day Ahead Market Electricity prices for Germany 2015
  • :CEP_GER1 : Hourly Wind, Solar, Demand data Germany one node
  • :CEP_GER18: Hourly Wind, Solar, Demand data Germany 18 nodes

Optional inputs to load_timeseries_data:

  • region::String: region descriptor
  • T::Int: number of segments (time steps per period)
  • years::Array{Int,1}: the years to be selected from the csv file, as specified in the Year column
  • att::Array{String,1}: the attributes to be loaded. If left empty, all attributes are loaded.
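The file layout that load_timeseries_data expects for your own data can be sketched with Base Julia alone. The folder, the file name solar.csv, the node name germany, the values, and the comma delimiter are all assumptions for illustration; the commented call at the end uses the signature documented above and needs TimeSeriesClustering installed.

```julia
# Write a minimal attribute file with Timestamp, Year, and one node column.
data_path = mktempdir()                      # hypothetical data folder
file = joinpath(data_path, "solar.csv")      # file name = attribute name

open(file, "w") do io
    println(io, "Timestamp,Year,germany")    # third column name = node name
    for t in 1:48                            # two periods of T = 24 time steps
        value = round(0.5 + 0.5 * sin(2pi * t / 24); digits=3)
        println(io, t, ",", 2016, ",", value)
    end
end

# Read the header back to confirm the column layout.
header = split(readline(file), ",")
println(header)

# With TimeSeriesClustering installed, the folder could then be loaded as:
# using TimeSeriesClustering
# ts_input_data = load_timeseries_data(data_path; T=24, years=[2016])
# The key in ts_input_data.data would be expected to be "solar-germany".
```

For an N-node system, the same file would carry N node columns after Year, one per geolocation.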

Aggregation

Time series aggregation can be applied to reduce the temporal dimension while, if it is done in a way that suits the problem, keeping the output precise. Aggregation methods are explained in TimeSeriesClustering. We strongly encourage running a second stage validation step if you use aggregation on your model; see Second stage operational validation step.

Examples

Loading time series data

using CapacityExpansion
using Plots
# region for which the provided data is loaded
state = "GER_1"
# load the time-series input data as ClustData
ts_input_data = load_timeseries_data_provided(state; T=24, years=[2016])
# plot the solar time series of the loaded input data
plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")

Plot

Aggregating time series data

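# aggregate the input data to 5 representative periods with k-means clustering
ts_clust_data = run_clust(ts_input_data; method="kmeans", representation="centroid", n_init=50, n_clust=5).clust_data
# plot the solar time series of the aggregated data
plot(ts_clust_data.data["solar-germany"], legend=false, linestyle=:solid, width=3, xlabel="Time [h]", ylabel="Solar availability factor [%]")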

Plot