Preparing ClustData
Here, we first describe how to load the provided time-series input data, or your own time-series input data, as ClustData. We then describe how to aggregate the loaded time-series input data.
Provided Data
load_timeseries_data_provided() loads the data for a given region for which data is provided in this package. The optional input parameters to load_timeseries_data_provided() are the number of time steps per period T and the years to be imported.
CapacityExpansion.load_timeseries_data_provided (Function)
load_timeseries_data_provided(region::String="GER_1"; T::Int=24, years::Array{Int,1}=[2016], att::Array{String,1}=Array{String,1}())
- Adding the information in the *.csv file at data_path to the data dictionary
The *.csv files shall have the following structure and must have the same length:
Timestamp | Year | [column names...] |
---|---|---|
[iterator] | [year] | [values] |
The first column should be called Timestamp if it contains a time iterator. Each of the other columns specifies a single time series, for example at a specific geolocation.
Data is provided for the following regions:
- "GER_1": Germany 1 node
- "GER_18": Germany 18 nodes
- "CA_1": California 1 node
- "CA_14": California 14 nodes
- "TX_1": Texas 1 node
Your Own Data
For details, refer to TimeSeriesClustering.
The keys of {your-time-series}.data have to match "{time_series (as declared in techs.csv)}-{node}".
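The key convention can be checked before a model is built. The following is a minimal sketch using only Base Julia; the time series names, the node name, and the plain Dict standing in for {your-time-series}.data are assumptions made for illustration, not part of the package.

```julia
# Hypothetical time series names (as they would be declared in techs.csv) and node names.
time_series = ["solar", "wind", "el_demand"]
nodes = ["germany"]

# Keys the data dictionary is expected to contain: "{time_series}-{node}".
expected_keys = ["$ts-$node" for ts in time_series for node in nodes]

# Stand-in for {your-time-series}.data; a plain Dict is used here instead of a ClustData struct.
data = Dict("solar-germany" => rand(24, 2), "wind-germany" => rand(24, 2))

# Collect every expected key that the loaded data does not provide.
missing_keys = setdiff(expected_keys, keys(data))
println("Missing keys: ", missing_keys)
```

An empty missing_keys would indicate that every declared time series has data for every node.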
TimeSeriesClustering.load_timeseries_data (Function)
function load_timeseries_data(data_path::String;
region::String="none",
T::Int=24,
years::Array{Int,1}=[2016],
att::Array{String,1}=Array{String,1}())
Return all time series that are stored as csv files in the specified path as a ClustData struct.
- Loads *.csv files in the folder or the file data_path
- Loads all attributes (all *.csv files) if the att-Array is empty, or only the files specified in att
- The *.csv files shall have the following structure and must have the same length:
Timestamp | Year | [column names...] |
---|---|---|
[iterator] | [year] | [values] |
- The first column of a .csv file should be called Timestamp if it contains a time iterator
- The second column should be called Year and contains the corresponding year
- Each other column should contain the time series data. For one-node systems, only one column is used; for an N-node system, N columns need to be used. In an N-node system, each column specifies time series data at a specific geolocation.
- Returns time series as ClustData struct
- The .data field of the ClustData struct is a Dictionary where each column in a [file name].csv file is a key (called "[file name]-[column name]"). file name should correspond to the attribute name, and column name should correspond to the node name.
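To make the file layout concrete, here is a minimal sketch in Base Julia that writes a file in the structure described above and derives the dictionary keys that the naming rule implies. The folder, the file name solar.csv, the node names, the comma delimiter, and the values are all assumptions for illustration; the sketch does not call the package itself.

```julia
# Hypothetical folder and file name, chosen only for illustration.
dir = mktempdir()
path = joinpath(dir, "solar.csv")   # file name corresponds to the attribute name

# Write a two-node file: Timestamp, Year, then one column per node.
# A comma delimiter is assumed here.
open(path, "w") do io
    println(io, "Timestamp,Year,node1,node2")
    for t in 1:48
        println(io, join((t, 2016, 0.1, 0.2), ","))
    end
end

# Derive the keys described above: "[file name]-[column name]".
header = split(readline(path), ",")
attribute = splitext(basename(path))[1]
expected_keys = ["$attribute-$node" for node in header[3:end]]
println(expected_keys)
```

With a folder prepared this way, passing dir as data_path to load_timeseries_data would be the next step.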
Optional inputs to load_timeseries_data:
- region: region descriptor
- T: number of segments
- years::Array{Int,1}: the years to be selected from the csv file, as specified in the Year column
- att::Array{String,1}: the attributes to be loaded. If left empty, all attributes will be loaded.
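As a rough illustration of what years and T control, the sketch below filters a toy series by year and arranges the remainder into periods of T time steps. This is one plausible reading of the two options, written in Base Julia with made-up values; it is not the package's implementation.

```julia
# Toy hourly records as (year, value) pairs; the values are made up for illustration.
records = vcat([(2015, 0.1) for _ in 1:48], [(2016, 0.5) for _ in 1:72])

T = 24            # time steps per period
years = [2016]    # years to keep

# Keep only the values whose year was requested.
selected = [v for (y, v) in records if y in years]

# Arrange the remaining series as a T × K matrix with one column per period.
periods = reshape(selected, T, :)
println(size(periods))
```

Here 72 hourly values from 2016 become three periods of 24 time steps each. The reshape only works when the series length is a multiple of T.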
function load_timeseries_data(existing_data::Symbol;
region::String="none",
T::Int=24,
years::Array{Int,1}=[2016],
att::Array{String,1}=Array{String,1}())
Return time series of example data sets as ClustData struct.
The choice of example data set is given by e.g. existing_data=:CEP_GER1. Example data sets are:
- :DAM_CA: Hourly Day Ahead Market Electricity prices for California-Stanford 2015
- :DAM_GER: Hourly Day Ahead Market Electricity prices for Germany 2015
- :CEP_GER1: Hourly Wind, Solar, Demand data Germany one node
- :CEP_GER18: Hourly Wind, Solar, Demand data Germany 18 nodes
Optional inputs to load_timeseries_data:
- region: region descriptor
- T: number of segments
- years::Array{Int,1}: the years to be selected from the csv file, as specified in the Year column
- att::Array{String,1}: the attributes to be loaded. If left empty, all attributes will be loaded.
Aggregation
Time series aggregation can be applied to reduce the temporal dimension while, if done correctly for the specific problem, keeping the output precise. Aggregation methods are explained in TimeSeriesClustering. We strongly encourage running a second stage validation step if you use aggregation on your model; see Second stage operational validation step.
Examples
Loading time series data
using CapacityExpansion
state="GER_1"
# load ts-input-data
ts_input_data = load_timeseries_data_provided(state; T=24, years=[2016])
using Plots
plot(ts_input_data.data["solar-germany"], legend=false, linestyle=:dot, xlabel="Time [h]", ylabel="Solar availability factor [%]")
Aggregating time series data
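# aggregate into 5 representative periods using k-means with centroid representation and 50 initializations
ts_clust_data = run_clust(ts_input_data; method="kmeans", representation="centroid", n_init=50, n_clust=5).clust_data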
plot(ts_clust_data.data["solar-germany"], legend=false, linestyle=:solid, width=3, xlabel="Time [h]", ylabel="Solar availability factor [%]")
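The call above uses representation="centroid". As a minimal sketch of what that term usually means, the following Julia code computes the mean profile of each cluster and how many original periods each representative stands for. The toy profiles and the cluster assignment are assumptions for illustration; the clustering step itself is not shown, and this is not how run_clust is implemented internally.

```julia
using Statistics

# Four toy daily profiles (columns) with T = 3 time steps each; values are made up.
periods = [0.0 0.1 0.8 0.9;
           0.5 0.6 1.0 1.0;
           0.1 0.0 0.7 0.8]

# Hypothetical cluster assignment of each period, as a clustering step might return it.
assignments = [1, 1, 2, 2]
n_clust = maximum(assignments)

# Centroid representation: each cluster is represented by the mean of its member periods.
centroids = hcat([vec(mean(periods[:, assignments .== k], dims=2)) for k in 1:n_clust]...)

# Weight of each representative period: the number of original periods it stands for.
weights = [count(==(k), assignments) for k in 1:n_clust]

println(centroids)
println(weights)
```

The weights matter downstream, since each representative period has to be scaled by the number of periods it replaces.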