The config file
===============

The ``config.yaml`` file is the file where all settings for the *cd2es* tool are saved.
To create your first ``config.yaml``, copy the file ``config_default.yaml`` to the same folder and rename it to ``config.yaml``.
In the folder ``examples`` in the repository, you can find some example config files with the corresponding results.
In this chapter we will walk you through the ``config.yaml`` in more detail:

.. code:: yaml

    # Select from "Backbone", "PyPSA-Eur" or "Techfiles_only"
    desired_ESM: Techfiles_only

* select the energy system model (ESM) for which you want to use cd2es

  * ``Techfiles_only``: Techfiles only
  * ``Backbone``: https://gitlab.vtt.fi/backbone/backbone
  * ``PyPSA-Eur``: https://github.com/PyPSA/pypsa-eur (requires a modified version of ``PyPSA-Eur``. If you are interested in details, contact paul.puetz@ee.ruhr-uni-bochum.de)

.. code:: yaml

    # ------- General: Climate Data -----

    # Directory to download the climate data into - careful, those files are very large, consider using an external hard drive
    data_dir: "path/to/climate_data_storage"

* insert a valid path on your machine where the climate data will be stored
* be careful: the files can be very large, so choose a sufficiently large partition

.. code:: yaml

    # Settings for run without specified output file
    scenario:
    # Technologies to be calculated
    # Possible technologies: 'csp', 'pv', 'wind' (onshore), 'wind_offshore', 'hydro' (hydro not for desired_ESM == "PyPSA-Eur")
    # and tpp(CoolingType)_p{Plant} with cooling type in [CL, OT] (closed loop, once through)
    # and plant in [Nuclear, Coal, CCGT, Biomass, H2]
    technologies: ['pv', 'wind', 'wind_offshore', 'hydro', 'tppCL_pNuclear', 'tppOT_pNuclear', 'tppCL_pCoal', 'tppOT_pCoal', 'tppCL_pCCGT', 'tppOT_pCCGT', 'tppCL_pBiomass', 'tppOT_pBiomass', 'tppCL_pH2', 'tppOT_pH2']
    # Add 'demand' for calculation of demand (only working for Europe + desired_ESM == "Backbone" or "Techfiles_only")
    demand: ['demand']
    # Name of climate model
    # For starters, we recommend using NCC-NorESM1-M
    model: ['NCC-NorESM1-M']
    # RCP without decimal separator
    rcp: ['85']
    # Years
    year: [2041, 2042, 2043, 2044, 2045, 2046, 2047, 2048, 2049, 2050]
    # Domain
    # Choose from CORDEX domains (southamerica, centralamerica, northamerica, europe, africa, eastasia, centralasia, southasia, southeastasia, australasia)
    domain: ["europe"]

* the scenario settings only apply when you use the rule ``runAll``, i.e. when you do not request a specific output file
* choose the technologies to be calculated, the climate model, the RCP scenario, the years and the domain
* for simple applications, it is enough to edit the config file up to this point
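
The scenario lists can be read as a set of combinations. The following is a minimal sketch, assuming that ``runAll`` requests one result per combination of domain, model, RCP, year and technology; the function ``expand_scenario`` is hypothetical and not part of *cd2es*.

.. code:: python

    from itertools import product

    # Values copied from the scenario section above (shortened lists).
    scenario = {
        "technologies": ["pv", "wind"],
        "model": ["NCC-NorESM1-M"],
        "rcp": ["85"],
        "year": [2041, 2042],
        "domain": ["europe"],
    }


    def expand_scenario(scenario):
        """Return one dict per combination of the scenario lists (assumed behaviour)."""
        keys = ["domain", "model", "rcp", "year", "technologies"]
        value_lists = [scenario[key] for key in keys]
        return [dict(zip(keys, combo)) for combo in product(*value_lists)]


    runs = expand_scenario(scenario)
    print(len(runs))  # 1 domain * 1 model * 1 rcp * 2 years * 2 technologies = 4
    print(runs[0])

Under this assumption, every entry added to one of the lists multiplies the number of results to compute, so long lists of years and technologies increase the run time accordingly.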

.. code:: yaml

    # enable automatic download of cordex climate data from https://esgf-data.dkrz.de/search/cordex-dkrz/
    # otherwise, the climate data must be present in the data_dir in the following folder: {data_dir}/{domain_name}/original/{model_name}/
    # cordex data will be automatically renamed to the names needed for the workflow, if the original names from esgf are kept
    download_cordex: True
    cordexParameter: 
    # to be specified on the available datasets under https://esgf-data.dkrz.de/search/cordex-dkrz/ 
    # for the combination ensemble: "r1i1p1", institut: "GERICS", RCMModel: "REMO2015", downscalingRealisation: "v1" there is sufficient data for the whole globe for
    # the driving models MPI-M-MPI-ESM-LR and NCC-NorESM1-M and RCPs 2.6 and 8.5
        ensemble: "r1i1p1"
        institute: "GERICS"
        RCMModel: "REMO2015"
        downscalingRealisation: "v1"

* section of the config dedicated to downloading climate data from CORDEX
* if you enable ``download_cordex``, the tool automatically downloads the CORDEX climate data
* make sure that the parameters given under ``cordexParameter`` match the climate model and RCP you are looking at (either specified by file name or in the scenario section of the config)
* you can check which datasets exist under https://esgf-data.dkrz.de/search/cordex-dkrz/
* if you disable ``download_cordex``, the tool instead renames already downloaded CORDEX files to include them in the workflow, provided the original file names from ESGF were kept
* in this case, the data has to be present in ``data_dir`` in the following folder: ``{data_dir}/{domain_name}/original/{climate_model_name}/``
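
If you provide the CORDEX files yourself, it can help to check the expected folder before starting the workflow. A minimal sketch; the helper ``cordex_input_dir`` is hypothetical and only rebuilds the folder pattern given above.

.. code:: python

    from pathlib import Path


    def cordex_input_dir(data_dir, domain_name, climate_model_name):
        """Build {data_dir}/{domain_name}/original/{climate_model_name}/."""
        return Path(data_dir) / domain_name / "original" / climate_model_name


    folder = cordex_input_dir("path/to/climate_data_storage", "europe", "NCC-NorESM1-M")
    print(folder.as_posix())
    # is_dir() is False as long as the placeholder path does not exist
    print(folder.is_dir())

Replace the placeholder path with your own ``data_dir`` before relying on the ``is_dir()`` check.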

.. code:: yaml

    download_era5: True
    # enable automatic download of era5 data for regression and bias adaption - otherwise the era5 data must be present in the folder 
    # {data_dir}/observed with file name {era5 variable name}_era5_{year}.nc (see bias_adaption/climateVariablesDict in this config for names)
    loginDataAsParams: False
    # only necessary when building .cdsapirc in home folder is not possible
    # then, the url and the key for the copernicus data store must be written before the snakemake command in the following manner
    # cdsUrl="yourUrl" cdsKey="yourKey" snakemake -j...

* section of the config dedicated to downloading reanalysis data from ERA5 (for bias adaption and regression)
* if you enable ``download_era5``, the tool automatically downloads the ERA5 data
* you must be registered with the Copernicus Climate Data Store (see :ref:`installation` on how to do it)
* if you have trouble creating the ``.cdsapirc`` file in your home folder, you can pass the CDS login data as parameters to the snakemake workflow by setting the flag ``loginDataAsParams`` to ``True``
* in this case, the CDS url and key must be added at the beginning of the snakemake command in the following manner: ``cdsUrl="yourUrl" cdsKey="yourKey" snakemake -j...``
* if you disable ``download_era5``, the reanalysis data must be present in the folder ``{data_dir}/observed`` with the file name ``{era5 variable name}_era5_{year}.nc`` (see ``bias_adaption/climateVariablesDict`` in the config for names)
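
Writing ``cdsUrl="yourUrl" cdsKey="yourKey"`` in front of a shell command sets two environment variables for that command. How *cd2es* reads them internally is not documented here, so the following is only a sketch of one way a script could pick them up; the function ``cds_credentials`` is hypothetical.

.. code:: python

    import os


    def cds_credentials(environ=os.environ):
        """Return (url, key) from the variables set in front of the snakemake call."""
        url = environ.get("cdsUrl")
        key = environ.get("cdsKey")
        if not url or not key:
            raise RuntimeError("cdsUrl and cdsKey must both be set")
        return url, key


    # Example with an explicit mapping instead of the real environment.
    print(cds_credentials({"cdsUrl": "yourUrl", "cdsKey": "yourKey"}))

The variables are only visible to the command they precede, so they have to be repeated for every snakemake call.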

.. code:: yaml

    # enable or disable bias adaption
    use_bias_adaption: True

    bias_adaption:
        yearHistStart: 2011
        yearHistEnd: 2015
        climateVariablesDict: # corresponding era5 names to cordex variable names
            sfcWind: ['v10', 'u10']
            tas: 't2m' 
            rsds: 'ssr'
            mrro: 'ro'

* section of the config dedicated to the bias adaption process (see :ref:`biasAdaption` for the method)
* enable or disable bias adaption via ``use_bias_adaption``
* choose the years to be included in the comparison of climate model and reanalysis
* ``climateVariablesDict`` maps each CORDEX variable name to the corresponding ERA5 variable name(s)
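
When ``download_era5`` is disabled, the ERA5 files have to follow the naming pattern ``{era5 variable name}_era5_{year}.nc``. The sketch below lists the file names this implies for the settings above. It assumes that ``yearHistStart`` and ``yearHistEnd`` are both inclusive; the helper names are hypothetical.

.. code:: python

    # Values copied from the bias_adaption section above.
    bias_adaption = {
        "yearHistStart": 2011,
        "yearHistEnd": 2015,
        "climateVariablesDict": {
            "sfcWind": ["v10", "u10"],
            "tas": "t2m",
            "rsds": "ssr",
            "mrro": "ro",
        },
    }


    def era5_names(variables_dict):
        """Flatten the dict values, which are either a string or a list of strings."""
        names = []
        for era5 in variables_dict.values():
            names.extend(era5 if isinstance(era5, list) else [era5])
        return names


    def expected_era5_files(cfg):
        """List the file names expected in {data_dir}/observed (years assumed inclusive)."""
        years = range(cfg["yearHistStart"], cfg["yearHistEnd"] + 1)
        names = era5_names(cfg["climateVariablesDict"])
        return [f"{name}_era5_{year}.nc" for name in names for year in years]


    files = expected_era5_files(bias_adaption)
    print(len(files))  # 5 ERA5 variables * 5 years = 25
    print(files[0])    # v10_era5_2011.nc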

.. code:: yaml

    # xarray can read the climate data in chunks to preserve memory - more chunks mean less memory usage but slower execution
    numberOfChunks: 10

* as the climate data files can be very large, there might not be enough RAM to load them at once
* the Python package ``xarray`` (used to read the files) can read a file in chunks, i.e. not as one block but as many separate chunks of a dask array
* increasing the number of chunks reduces memory usage but increases execution time
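
To get a feeling for the trade-off, the following sketch computes how many time steps end up in one chunk. It assumes that the data is split along the time dimension, which is not confirmed by this page; ``chunk_length`` is a hypothetical helper.

.. code:: python

    import math


    def chunk_length(n_time_steps, number_of_chunks):
        """Number of time steps per chunk when splitting along time (assumption)."""
        return math.ceil(n_time_steps / number_of_chunks)


    # One year of hourly data, split with the default and with a larger setting.
    print(chunk_length(8760, 10))   # 876 time steps per chunk
    print(chunk_length(8760, 100))  # 88 time steps per chunk

With ``xarray``, such a length could be passed as ``chunks={"time": length}`` to ``xarray.open_dataset``, so that only one chunk at a time has to fit into memory.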

.. code:: yaml

    # parallelized dask calculations tend to get stuck
    # after maxTime passed, rules are killed and restarted
    # for bias adaption (most resource intensive rule) this time is doubled
    maxTime: 600

* to avoid overloading memory, most calculations on the large datasets work with dask/xarray
* parallelized dask calculations sometimes get stuck, therefore all dask rules are terminated if they have not completed after ``maxTime`` has passed
* snakemake retries those rules three times before it finally terminates
* on slow machines, you might need to increase the ``maxTime`` parameter
* the rule ``bias_adopt`` is always allowed ``2*maxTime`` as it is the most resource-intensive rule
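
The kill-and-restart behaviour can be illustrated with the standard library. This is not how snakemake implements it; it is a minimal sketch that assumes ``maxTime`` is given in seconds and that a rule gets three attempts in total. Both function names are hypothetical.

.. code:: python

    import subprocess
    import sys


    def time_limit(rule_name, max_time):
        """Time limit of a rule; the bias adaption rule gets twice maxTime."""
        return 2 * max_time if rule_name == "bias_adopt" else max_time


    def run_with_retries(cmd, limit, attempts=3):
        """Run cmd, kill it after `limit` seconds and start it again.

        Returns the number of the attempt that succeeded.
        """
        for attempt in range(1, attempts + 1):
            try:
                subprocess.run(cmd, timeout=limit, check=True)
                return attempt
            except subprocess.TimeoutExpired:
                print(f"attempt {attempt} exceeded {limit} s and was killed")
        raise RuntimeError(f"command did not finish within {attempts} attempts")


    quick_job = [sys.executable, "-c", "print('done')"]
    print(run_with_retries(quick_job, time_limit("bias_adopt", 600)))

A job that regularly needs more than the limit never finishes in this scheme, which is why slow machines need a larger ``maxTime``.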

.. code:: yaml

    # ------- General: Methods -----

    # renewables are preferably built at locations with high capacity factors (cfs). Aggregating over a whole country can therefore lead to an underestimation.
    # Here you can choose that only the best x percent of data points per node are aggregated
    choose_only_x_percent_of_country:
        onwind: 0.05
        offwind: 0.05
        pv: 0.2

* in the aggregation process, it is possible to consider only the best x percent of the points within a bus for wind and pv
* this accounts for the fact that renewables are preferably built at locations with high capacity factors
* the values given here were determined by comparing the model output to data from renewables ninja :cite:`Staffell.2016`, :cite:`Pfenninger.2016` (see :doc:`methods` for more details)
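
The effect of the setting can be sketched with plain Python. The example assumes that the config values are fractions (``0.2`` meaning the best 20 % of the points) and that the points are ranked by their mean capacity factor; the aggregation actually used is described in :doc:`methods`. ``mean_of_best_share`` is a hypothetical helper.

.. code:: python

    import math


    def mean_of_best_share(capacity_factors, share):
        """Average only the best `share` of the points (at least one point)."""
        n_best = max(1, math.ceil(share * len(capacity_factors)))
        best = sorted(capacity_factors, reverse=True)[:n_best]
        return sum(best) / len(best)


    # Mean capacity factors of ten grid points inside one bus (made-up numbers).
    cfs = [0.10, 0.20, 0.30, 0.40, 0.50, 0.25, 0.35, 0.15, 0.45, 0.05]
    print(mean_of_best_share(cfs, 1.0))  # all points
    print(mean_of_best_share(cfs, 0.2))  # only the two best points, 0.50 and 0.45

Restricting the aggregation to the best points raises the resulting capacity factor, which counteracts the underestimation described above.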

.. code:: yaml

    onwind:
        hub_height: 100 # in m
        v_in: 2 # cut in velocity in m/s
        v_r: 9.5 # rated velocity in m/s
        v_out: 12 # cut out velocity in m/s

    offwind:
        hub_height: 100 # in m
        v_in: 0 # cut in velocity in m/s
        v_r: 13 # rated velocity in m/s
        v_out: 15 # cut out velocity in m/s
        max_depth: 50 # maximal water depth

* parameters for the calculation of wind profiles
* values were determined by comparing the model output to data from renewables ninja :cite:`Staffell.2016` (see :ref:`wind` for more details)
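
The role of the parameters can be illustrated with a generic power curve. This is a textbook-style sketch and not necessarily the curve used by *cd2es* (see :ref:`wind`): it assumes a power-law extrapolation from 10 m to hub height with an exponent of 1/7 and a cubic ramp between cut-in and rated velocity. Both functions are hypothetical.

.. code:: python

    def wind_at_hub_height(v_ref, hub_height, ref_height=10.0, alpha=1 / 7):
        """Extrapolate wind speed to hub height with a power law (assumed method)."""
        return v_ref * (hub_height / ref_height) ** alpha


    def wind_capacity_factor(v, v_in, v_r, v_out):
        """Generic power curve: 0 below cut-in and from cut-out on, 1 from rated on."""
        if v < v_in or v >= v_out:
            return 0.0
        if v >= v_r:
            return 1.0
        # cubic ramp between cut-in and rated velocity (assumed shape)
        return (v**3 - v_in**3) / (v_r**3 - v_in**3)


    # Values copied from the onwind section above.
    onwind = {"hub_height": 100, "v_in": 2, "v_r": 9.5, "v_out": 12}

    v_hub = wind_at_hub_height(5.0, onwind["hub_height"])
    print(wind_capacity_factor(v_hub, onwind["v_in"], onwind["v_r"], onwind["v_out"]))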

.. code:: yaml

    solar:
        # following climix model https://doi.org/10.1016/j.rser.2014.09.041
        option: 3 
        GStc: 1000 # irradiance at standard testing conditions in W/m^2
        # if option = 1, everything below is not necessary
        # give all temperatures in °C
        c1: 4.3 
        c2: 0.943
        c3: 0.028
        c4: -1.528 # only necessary for option 3
        TStc: 25  # temperature at standard testing conditions in °C
        beta: 0.0045 # only necessary for option 2
        gamma: -0.005 # !!! something different in option 2 and 3

* parameters for the calculation of photovoltaic profiles
* choose between three different calculation methods from :cite:`Jerez.2015` via ``option`` (see :ref:`pv` for more details)
* be careful: the meaning of the parameters may differ for the different options
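
As an illustration of how the parameters interact, the sketch below uses the cell-temperature formulation that is commonly published together with the coefficients ``c1`` to ``c4`` and ``gamma`` shown above. That this is exactly what ``option: 3`` computes is an assumption; see :ref:`pv` for the implemented methods. The function ``pv_potential`` is hypothetical.

.. code:: python

    # Values copied from the solar section above.
    solar = {
        "GStc": 1000, "TStc": 25,
        "c1": 4.3, "c2": 0.943, "c3": 0.028, "c4": -1.528,
        "gamma": -0.005,
    }


    def pv_potential(rsds, tas, wind, cfg):
        """PV output relative to rated power.

        rsds: irradiance in W/m^2, tas: air temperature in °C, wind: wind speed in m/s.
        """
        # cell temperature rises with air temperature and irradiance, falls with wind
        t_cell = cfg["c1"] + cfg["c2"] * tas + cfg["c3"] * rsds + cfg["c4"] * wind
        # performance ratio drops by |gamma| per degree above standard test conditions
        performance_ratio = 1 + cfg["gamma"] * (t_cell - cfg["TStc"])
        return performance_ratio * rsds / cfg["GStc"]


    print(pv_potential(rsds=1000, tas=25, wind=0, cfg=solar))
    print(pv_potential(rsds=1000, tas=25, wind=5, cfg=solar))  # wind cools the cell

With these coefficients, a higher wind speed lowers the cell temperature and therefore increases the output.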

.. code:: yaml

    tpp:
    # https://doi.org/10.1016/j.rser.2019.06.006 
    OT:
        OTyearStart: 2011
        OTyearEnd: 2015
        # give all temperatures in K
        nuclear:
            temp_const: 0.0044 # loss of efficiency per temperature rise in 1/K
            T_health: 288
            T_shutdown: 305
            T_outmax: 305
            deltaT: 10
        coal:
            temp_const: 0.0097 # loss of efficiency per temperature rise in 1/K
            T_health: 288
            T_shutdown: 305
            T_outmax: 305
            deltaT: 10    
        ccgt:
            temp_const: 0.0031 # loss of efficiency per temperature rise in 1/K
            T_health: 288
            T_shutdown: 305
            T_outmax: 305
            deltaT: 10
    CL:
        nuclear:
            temp_const: 0.0044 # loss of efficiency per temperature rise in 1/K
            T_health: 283 # temperature in K, above which degradation of efficiency starts
        coal:
            temp_const: 0.0094 # loss of efficiency per temperature rise in 1/K
            T_health: 283 # temperature in K, above which degradation of efficiency starts
        ccgt:
            temp_const: 0.0030 # loss of efficiency per temperature rise in 1/K
            T_health: 283 # temperature in K, above which degradation of efficiency starts

* parameters for the calculation of the availability of thermal power plants, taken from :cite:`Abdin.2019`
* separated into closed-loop and once-through cooling
* for once-through cooling, you can also decide which historic years are used to calculate the historic water availability
* see :ref:`tpp` for more details on methods
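
The meaning of ``temp_const`` and ``T_health`` can be sketched as a linear derating above a threshold temperature. The example is a simplification under stated assumptions: it treats ``T_shutdown`` as a hard shutdown limit and ignores ``T_outmax``, ``deltaT`` and the water availability, which the method in :ref:`tpp` takes into account. ``tpp_availability`` is a hypothetical helper.

.. code:: python

    def tpp_availability(temp, temp_const, t_health, t_shutdown=None):
        """Available share of rated power at temperature `temp` (all values in K)."""
        if t_shutdown is not None and temp >= t_shutdown:
            return 0.0  # assumed: plant is shut down from this temperature on
        # efficiency loss per kelvin above the threshold temperature
        loss = temp_const * max(0.0, temp - t_health)
        return max(0.0, 1.0 - loss)


    # Closed-loop nuclear plant from the config above at 293 K.
    print(tpp_availability(293, temp_const=0.0044, t_health=283))
    # Once-through nuclear plant at its shutdown temperature.
    print(tpp_availability(305, temp_const=0.0044, t_health=288, t_shutdown=305))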

.. code:: yaml

    # ----- ESM Specific -----

    # -- Only affecting if desired ESM == "PyPSA-Eur" --

    # Requires a modified version of PyPSA-Eur. If you are interested in details, contact paul.puetz@ee.ruhr-uni-bochum.de
    pypsa:
    # Experimental mode requiring training of a neural network. Results not verified yet, use with caution!
    # Last time checked, soiltemp only required for ground heat pumps
    predict_soiltemp: False
    # Template for PyPSA cutout (with heights) -> Grid for cd2es. Coordinates have to be within CORDEX domain
    cutout_template_path: 'resources/pypsa_europe_cutout_height.nc'

* settings that only take effect if ``desired_ESM`` is ``PyPSA-Eur``

.. code:: yaml

    # -- Only affecting if desired ESM == "Backbone" or "Techfiles_only" --

    hydro:
        yearStart: 2011
        yearEnd: 2015
        factor: 1.7

    demand:
        reference_year: 2017
        countries: ['ALB', 'AUT', 'BIH', 'BEL', 'BGR', 'CHE', 'CZE', 'DEU', 'DNK', 'EST', 'ESP', 'FIN', 'FRA', 'GBR', 'GRC', 'HRV', 'HUN', 'IRL', 'ITA', 'LTU', 'LUX', 'LVA', 'MNE', 'MKD', 'NLD', 'NOR', 'POL', 'PRT', 'ROU', 'SRB', 'SWE', 'SVN', 'SVK']
        yearStart: 2015
        yearEnd: 2019
        makePlots: True
        scaleDemand: False # scale reference demand to meet projections for future before applying cd2es
        scaleDemandCountries: ["BEL", "BGR", "CZE", "DNK", "DEU", "EST", "IRL", "GRC", "ESP", "FRA", "HRV", "ITA", "LVA", "LTU", "LUX", "HUN", "NLD", "AUT", "POL", "PRT", "ROU", "SVN", "SVK", "FIN", "SWE", "GBR", "NOR", "CHE", "Balkan"]
        scaleDemandNewLoad: [124.2, 40.9, 90.9, 51.1, 666.2, 11.3, 39.0, 64.8, 334.3, 629.1, 23.6, 453.8, 11.4, 13.4, 13.8, 54.2, 152.6, 95.1, 232.5, 58.6, 71.6, 19.8, 39.3, 110.4, 190.5, 471.5, 147.1, 60.8, 73.4] # given in TWh, based on Pietzcker et al 2021


* settings that only take effect if ``desired_ESM`` is ``Backbone`` or ``Techfiles_only``
* parameters for the hydro and demand calculation
* hydro power output is calibrated based on historic data; you can choose which years are considered
* hydro power is calculated by comparing historic runoff to future runoff at the plants' locations
* ``factor`` determines how much of the average historic inflow corresponds to a plant's rated capacity (a value of 1.7 means that a plant reaches its rated capacity at 170% of the average historic inflow)
* the value 1.7 was found by comparing the output of *cd2es* to :cite:`Formayer.2023`
* the demand is calculated based on a regression of historic demand and temperatures; you can choose which countries to look at, which years to do the regression for and whether or not you want plots as output
* the reference year is the year whose demand you use in your model
* see :ref:`demand` and :ref:`hydropower` for more details on methods
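
The role of ``factor`` can be made concrete with a small example. The sketch assumes that the output scales linearly with inflow below the rated capacity; the calculation actually used is described in :ref:`hydropower`. ``hydro_capacity_factor`` is a hypothetical helper.

.. code:: python

    def hydro_capacity_factor(inflow, mean_historic_inflow, factor=1.7):
        """Share of rated capacity reached at a given inflow (linear below rated)."""
        return min(1.0, inflow / (factor * mean_historic_inflow))


    mean_inflow = 100.0  # average historic inflow in arbitrary units
    print(hydro_capacity_factor(85.0, mean_inflow))   # half of the rated capacity
    print(hydro_capacity_factor(170.0, mean_inflow))  # rated capacity is reached
    print(hydro_capacity_factor(300.0, mean_inflow))  # capped at rated capacity

A larger ``factor`` shifts the rated capacity to higher inflows and therefore lowers the capacity factor at a given inflow.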


.. code:: yaml

    # put in custom map in geojson format if desired. Else a map aggregated on country level will be used.
    # To calculate offshore cfs you also need to provide an offshore map.
    # the geojson files must contain a column "name" with the name of the node and a column "geometry" with the shape of the node
    # for hydro and demand calculation, they must also contain a column with a three letter country code and the name "countryCode"
    use_custom_bus_map: False
    bus_map: "path/to/onshore_map.geojson"
    offshore_map: "path/to/offshore_map.geojson"

* settings that only take effect if ``desired_ESM`` is ``Backbone`` or ``Techfiles_only``
* in this section you can activate the use of custom bus maps
* if you disable ``use_custom_bus_map``, the tool uses a map on country level
* if you provide a custom bus map, it must contain a column "name" with the name of your bus and a column "geometry" with the shape of the bus
* for hydro and demand calculation, you also need a column "countryCode" with the three-letter country code of the country the bus belongs to
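
A custom map can be checked for the required columns before it is used. The sketch below relies only on the standard library and assumes that the columns correspond to the ``properties`` of each GeoJSON feature, as is the case when such a file is read into a table. ``bus_map_problems`` is a hypothetical helper and not part of *cd2es*.

.. code:: python

    import json


    def bus_map_problems(geojson_text, need_country_code=False):
        """Return (feature index, missing field) pairs for a GeoJSON bus map."""
        required = ["name"] + (["countryCode"] if need_country_code else [])
        problems = []
        for index, feature in enumerate(json.loads(geojson_text)["features"]):
            properties = feature.get("properties") or {}
            for key in required:
                if key not in properties:
                    problems.append((index, key))
            if not feature.get("geometry"):
                problems.append((index, "geometry"))
        return problems


    # Two made-up buses; the second one lacks the country code.
    example = json.dumps({
        "type": "FeatureCollection",
        "features": [
            {"type": "Feature",
             "properties": {"name": "north", "countryCode": "DEU"},
             "geometry": {"type": "Polygon",
                          "coordinates": [[[8, 53], [10, 53], [10, 55], [8, 53]]]}},
            {"type": "Feature",
             "properties": {"name": "south"},
             "geometry": {"type": "Polygon",
                          "coordinates": [[[8, 47], [10, 47], [10, 49], [8, 47]]]}},
        ],
    })

    print(bus_map_problems(example))                          # []
    print(bus_map_problems(example, need_country_code=True))  # [(1, 'countryCode')]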

.. code:: yaml

    geography:
        # how many points in each direction?
        x_size: 150
        y_size: 150

* settings that only take effect if ``desired_ESM`` is ``Backbone`` or ``Techfiles_only``
* the CORDEX climate data is remapped from rotated coordinates to a rectangular coordinate system
* you can decide how many points the rectangular coordinate system has in x and y direction
* more points improve accuracy but also increase file size and slow down the conversion process

.. code:: yaml

    # -- Only affecting if desired ESM == "Backbone"

    # it is possible to directly insert the time series into a backbone (https://doi.org/10.3390/en12173388) model
    # path to backbone file
    backbone_input: "path/to/backbone_input_file.xlsx"
    backbone_tpps: 
        types: ["Nuclear", "Biomass"] # plants in [Nuclear, Coal, CCGT, Biomass]
        inv_cost_per_cooling_type: # in €/MW, taken from https://restservice.epri.com/publicdownload/000000000001005358/0/Product
            once-through: 0
            closed-loop: 11163
            dry-cooling: 43005
        efficiency_drop_per_cooling_type: # given as factors to multiply with given efficiency
        # taken from http://dx.doi.org/10.1016/j.gloenvcha.2014.01.005 with the assumption, that once-through is the given efficiency
            once-through: 1
            closed-loop: 0.9741
            dry-cooling: 0.9421
        aggregateNodes: False
        nodesToAggregate: {"BL0 0": ["EE6 0", "LT6 0", "LV6 0"], "BK0 0": ["AL1 0", "BA1 0", "HR1 0", "RS1 0", "ME1 0", "MK1 0"], "ES0 0": ["ES1 0", "ES4 0"], "LB0 0": ["BE1 0", "LU1 0"], "DK0 0": ["DK1 0", "DK2 0"], "IT0 0": ["IT1 0", "IT3 0"], "GB1 0": ["GB0 0", "GB5 0"]}

* settings that only take effect if ``desired_ESM`` is ``Backbone``
* this last section is only necessary if you want to insert your data directly into a Backbone (a specific energy system model :cite:`Helisto.2019b`) Excel file
* you must give the location of the existing Excel file
* the remaining parameters adjust the costs and efficiencies of thermal power plants based on their cooling type
* this will probably only work smoothly if your Backbone input file was prepared with the tool https://gitlab.ruhr-uni-bochum.de/ee/backbone-tools
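
How the cooling-type parameters could act on a plant is sketched below. The efficiency factor is multiplied with the given efficiency, as stated in the config comment. That the investment cost per cooling type is added on top of the plant's own investment cost is an assumption. ``adjust_plant`` is a hypothetical helper.

.. code:: python

    # Values copied from the backbone_tpps section above.
    backbone_tpps = {
        "inv_cost_per_cooling_type": {  # in €/MW
            "once-through": 0, "closed-loop": 11163, "dry-cooling": 43005,
        },
        "efficiency_drop_per_cooling_type": {  # factors on the given efficiency
            "once-through": 1, "closed-loop": 0.9741, "dry-cooling": 0.9421,
        },
    }


    def adjust_plant(efficiency, inv_cost, cooling_type, cfg):
        """Return (efficiency, investment cost in €/MW) for one cooling type."""
        factor = cfg["efficiency_drop_per_cooling_type"][cooling_type]
        extra_cost = cfg["inv_cost_per_cooling_type"][cooling_type]
        # assumed: the cooling system cost is added to the plant's investment cost
        return efficiency * factor, inv_cost + extra_cost


    for cooling_type in ("once-through", "closed-loop", "dry-cooling"):
        print(cooling_type, adjust_plant(0.40, 1_000_000, cooling_type, backbone_tpps))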

Bibliography
------------

.. bibliography:: leoniePromo.bib