Preparing ClustData
This section first describes how to load the provided time-series input data, or your own time-series input data, as ClustData. It then describes how to aggregate the loaded time-series input data.
Provided Data
load_timeseries_data_provided() loads the data for a given region for which data is provided in this package. The optional input parameters to load_timeseries_data_provided() are the number of time steps per period T and the years to be imported.
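To make "time steps per period" concrete, the following minimal sketch splits an hourly series into periods of T steps using only base Julia. The series `hourly` is made up for illustration, and the sketch does not claim to reproduce how the package stores periods internally.

```julia
# Hypothetical hourly series covering three days (72 values); not package data.
hourly = collect(1.0:72.0)

T = 24  # time steps per period, as in the keyword argument T

# Column-major reshape: each column holds one period of T consecutive steps.
periods = reshape(hourly, T, :)

println(size(periods))  # (24, 3)
```

With T=24 and hourly input, each period corresponds to one day.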
CapacityExpansion.load_timeseries_data_provided - Function
load_timeseries_data_provided(region::String="GER_1"; T::Int=24, years::Array{Int,1}=[2016], att::Array{String,1}=Array{String,1}())
- Adds the information in the *.csv file at data_path to the data dictionary
The *.csv files shall have the following structure and must have the same length:
Timestamp | Year | [column names...] |
---|---|---|
[iterator] | [year] | [values] |
The first column should be called Timestamp if it contains a time iterator. The other columns can specify the individual time series, for example for a specific geolocation. Data is provided for the following regions:
- "GER_1": Germany 1 node
- "GER_18": Germany 18 nodes
- "CA_1": California 1 node
- "CA_14": California 14 nodes
- "TX_1": Texas 1 node
Your Own Data
For details, refer to TimeSeriesClustering.
The keys of {your-time-series}.data have to match "{time_series (as declared in techs.csv)}-{node}".
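The key convention can be checked with a few lines of base Julia. This is a minimal sketch: the names in `time_series` and `nodes` are hypothetical, and `data` is a plain Dict standing in for the .data field of a loaded ClustData.

```julia
# Hypothetical names for illustration: time series as declared in techs.csv, and node names.
time_series = ["solar", "wind"]
nodes = ["germany"]

# Keys the convention requires: "{time_series}-{node}".
expected_keys = ["$(ts)-$(node)" for ts in time_series for node in nodes]

# Stand-in for {your-time-series}.data; a real ClustData holds the loaded values.
data = Dict("solar-germany" => rand(24, 365))

# Any key listed here would need a matching csv file and column.
missing_keys = setdiff(expected_keys, keys(data))
println(missing_keys)  # ["wind-germany"]
```

An empty `missing_keys` would indicate that every declared time series has data for every node.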
TimeSeriesClustering.load_timeseries_data - Function
function load_timeseries_data(data_path::String;
region::String="none",
T::Int=24,
years::Array{Int,1}=[2016],
att::Array{String,1}=Array{String,1}())
Returns all time series that are stored as csv files in the specified path as a ClustData struct.
- Loads the *.csv files in the folder data_path, or the single file data_path
- Loads all attributes (all *.csv files) if the att array is empty, or only the files specified in att
- The *.csv files shall have the following structure and must have the same length:
Timestamp | Year | [column names...] |
---|---|---|
[iterator] | [year] | [values] |
- The first column of a .csv file should be called Timestamp if it contains a time iterator
- The second column should be called Year and contains the corresponding year
- Each other column should contain the time series data. For one-node systems, only one column is used; for an N-node system, N columns need to be used. In an N-node system, each column specifies time series data at a specific geolocation.
- Returns time series as ClustData struct
- The .data field of the ClustData struct is a Dictionary where each column in the [file name].csv file is a key (called "[file name]-[column name]"). file name should correspond to the attribute name, and column name should correspond to the node name.
Optional inputs to load_timeseries_data:
- region: region descriptor
- T: number of segments (time steps per period)
- years::Array{Int,1}: the years to be selected from the csv file, as specified in the Year column
- att::Array{String,1}: the attributes to be loaded. If left empty, all attributes will be loaded.
function load_timeseries_data(existing_data::Symbol;
region::String="none",
T::Int=24,
years::Array{Int,1}=[2016],
att::Array{String,1}=Array{String,1}())
Return time series of example data sets as ClustData struct.
The choice of example data set is given by, e.g., existing_data=:CEP_GER1. Example data sets are:
- :DAM_CA: Hourly Day Ahead Market Electricity prices for California-Stanford 2015
- :DAM_GER: Hourly Day Ahead Market Electricity prices for Germany 2015
- :CEP_GER1: Hourly Wind, Solar, Demand data Germany one node
- :CEP_GER18: Hourly Wind, Solar, Demand data Germany 18 nodes
Optional inputs to load_timeseries_data:
- region: region descriptor
- T: number of segments (time steps per period)
- years::Array{Int,1}: the years to be selected from the csv file, as specified in the Year column
- att::Array{String,1}: the attributes to be loaded. If left empty, all attributes will be loaded.
Aggregation
Time series aggregation can be applied to reduce the temporal dimension while, if done correctly for the specific problem, keeping the output precise. Aggregation methods are explained in TimeSeriesClustering. We strongly encourage running a second-stage validation step if you use aggregation on your model (see Second stage operational validation step).
Examples
Loading time series data
using CapacityExpansion
state="GER_1"
# load ts-input-data
ts_input_data = load_timeseries_data_provided(state; T=24, years=[2016])
using Plots
plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")
Aggregating time series data
ts_clust_data = run_clust(ts_input_data;method="kmeans",representation="centroid",n_init=50,n_clust=5).clust_data
plot(ts_clust_data.data["solar-germany"], legend=false, linestyle=:solid, width=3, xlabel="Time [h]", ylabel="Solar availability factor [%]")