Copulas for hydroclimatic analysis: A practice‐oriented overview

A warming climate is associated with increasing hydroclimatic extremes, which are often interconnected through complex processes, prompting their concurrence and/or succession, and causing compound extreme events. It is critical to analyze the risks of compound events, given their disproportionately high adverse impacts. To account for the variability in two or more hydroclimatic variables (e.g., temperature and precipitation) and their dependence, a rising number of publications focuses on multivariate analysis, among which the notion of copula‐based probability distribution has attracted tremendous interest. Copula is a mathematical function that expresses the joint cumulative probability distribution of multiple variables. Our focus is to re‐emphasize the fundamental requirements and limitations of applying copulas. Confusion about these requirements may lead to misconceptions and pitfalls, which can potentially compromise the robustness of risk analyses for environmental processes and natural hazards. We conducted a systematic literature review of copulas, as a prominent tool in the arsenal of multivariate methods used for compound event analysis, and underpinned them with a hydroclimatic case study in Sweden to illustrate a practical approach to copula‐based modeling. Here, we (1) provide end‐users with a didactic overview of necessary requirements, statistical assumptions and consequential limitations of copulas, (2) synthesize common perceptions and practices, and (3) offer a user‐friendly decision support framework to employ copulas, thereby support researchers and practitioners in addressing hydroclimatic hazards, hence demystify what can be an area of confusion.


| INTRODUCTION
Many hydroclimatic variables are interconnected and have combined socio-ecological impacts that are disproportionately larger than the sum of their impacts individually (Alizadeh et al., 2020). Studying these variables separately may lead to an insufficient characterization of their variability and change, and may therefore underestimate their risks (Hao et al., 2018). Trying to understand dependence structure among hydroclimatic variables and underpin risks associated with their joint extreme states, many multivariate methods have been adopted in the literature (De Michele et al., 2005;De Michele & Salvadori, 2003;Favre et al., 2004;Gräler et al., 2013;Salvadori et al., 2014;Salvadori & De Michele, 2004). A keyword search in Scopus (Burnham, 2006), resulted in 964 published articles since 1970 until the end of 2020 when combining the terms "multivariate" and "hydrology," and 6590 publications when combining the terms "multivariate" and "climate." The total number of articles published per year has nearly doubled since 2010 ( Figure 1a). These articles utilize a number of different multivariate methods, including, for instance, multivariate analysis of variance (Shukla & Gedam, 2019), multivariate regression (Abdulelah Al-Sudani et al., 2019), or principal component analysis (Westra et al., 2007).
In line with the increasing use of multivariate methods, the application of copula-based analyses-a specific type of multivariate approaches-has also attracted tremendous interest (Figure 1b). The copula concept is particularly advantageous, since it allows for constructing a joint distribution of random variables without any constraint on their marginal distributions ( Figure 2) (AghaKouchak et al., 2010). Copula is a function that connects a multivariate distribution to its one-dimensional marginal distributions (Nelsen, 2006). It is based on the early works of Frechet (1951), later formalized by Sklar (1959), and has been widely used in different fields of study (e.g., Ghoddusi, 2017;Patton, 2012). Since the early 2000's, copula methods have been adopted in hydrological modeling (e.g., Favre et al., 2004;Salvadori & De Michele, 2004). Since then, researchers have applied copulas to study interactions between a variety of different hydroclimatic variables.
Despite the growing number of copula studies, practitioners mostly rely on the statistical literature, which often focuses on the theoretical foundation (e.g., Genest et al., 2009;Genest & Boies, 2003;Joe, 2014;Nelsen, 2006), and only a few articles focus on practical challenges encountered in hydroclimatic research, including for instance how to explore the dependence between variables at various scales, how to pretreat data to fulfill particular copula F I G U R E 1 Number of annually published peer-reviewed articles listed by Scopus (a) when combining the search term "multivariate" with either "hydrology" (brown bars) or "climate" (turquoise bars), and (b) when combining the search term "copula" with either "hydrology" (brown bars) or "climate" (turquoise bars) assumptions, how to robustly select copula models, or how to validate the fit of a copula model to data (e.g., Dupuis, 2007;Schölzel & Friederichs, 2008). Here, we bridge this gap and provide a comprehensive literature review, aiming to (1) provide a didactic overview of the theoretical background of copulas, (2) clarify common perceptions and practices in copula modeling, and (3) offer clear guidelines and a user-friendly decision support framework on how to implement copulas. The goal of this article is to support researchers and decision makers in using copula modeling, in particular for studies of hydroclimatic impacts. We offer a step-by-step guidance to using copulas through a precipitation-temperature analysis for a case study of the Vattholma catchment in Sweden. Specifically, we demonstrate how to approach the assessment of spatiotemporal controls of dependence, pretreatment of data, copula model choice and parameter estimation, and goodness-of-fit evaluation.

| What is copula?
A copula is a powerful tool to capture and model dependence structures among variables, by coupling marginal distributions and forming their joint cumulative distribution ( Figure 2). Copulas have been formally introduced in the early work of Sklar (1959), who showed that any multivariate joint cumulative distribution function (CDF) can be specified as a copula function of the marginal variables, assuming that these variables are continuous and time-independent. If H is an m-dimensional CDF with one-dimensional marginal distributions F i x i ð Þ, then an m-dimensional copula (C) exists such that: where x 1 , x 2 , …, x m ð Þ ϵℝ are random variables of interest, and F 1 x 1 ð Þ,F 2 x 2 ð Þ, …, F m x m ð Þ-denoted by u 1 , u 2 , …,u m ð Þ -are marginal distributions (i.e., variables contributing to the joint distribution), for which the dependence structure is to be modeled by a copula C (Joe, 2014).
Although copulas can be defined for any number of variables m (so-called m-dimensional copulas), twodimensional copulas are of particular interest for hydroclimatic applications due to their simplicity and yet robustness in representing dependence structures. Therefore, two-dimensional copulas are a focal point of this article. In the following sections, we provide a brief overview of both empirical and theoretical copula distributions. F I G U R E 2 Empirical two-dimensional copula visualized schematically as (a) a 3D surface and (b) a two dimensional contour with uniformly distributed variables u 1 and u 2 (gray histograms) that are rank transformed of random variables of interest x 1 , x 2 ð Þϵℝ. Here, Þis a two-dimensional cumulative distribution function, and F 1 x 1 ð Þ,F 2 x 2 ð Þ are marginal distributions, for which the dependence structure is modeled by a copula C

| Empirical copula
To capture the underlying joint probability (i.e., the dependence structure) between two sets of random variables x 1 and x 2 , the joint rank dependence of the variables, termed empirical copula , is derived. As the marginal distribution functions u 1 ¼ F 1 x 1 ð Þ and u 2 ¼ F 2 x 2 ð Þ are typically unknown, pseudo observations U 1 and U 2 need to be constructed by transforming original marginal distributions u 1 and u 2 to standard uniform using the Weibull plotting position: where R 1,i is the rank of x 1,i among observations of the first variable x 1,1 , …, x 1,n and R 2,i the rank of x 2,i among observations of the second variable x 2,1 , …, x 2,n . Note that several other forms of plotting positions, such as Gringorten, can be used for this purpose, which are mainly application dependent. The choice of plotting positions may impact estimates of the average exceedance probabilities associated with the historical data and the future projections (England Jr et al., 2019). Then, the empirical bivariate copula is defined as the discrete function C n given by (Deheuvels, 1979;Weiß, 2011): with 1 Á ð Þ being a logical indicator function (Genest et al., 2009) and C n u ð Þ being a rank-based estimator for all observations i ¼ 1, 2, 3, … to n. The result is the joint CDF of the uniformly distributed marginal variables (indicated by gray bars in Figure 2), which can be interpreted as the cumulative distribution of rank-transformed data (Charpentier et al., 2007). Originally introduced by Deheuvels (1979) and later slightly modified for independent cases (Tsukahara, 2005), the result of Equation (3) is usually called the empirical copula which arguably is the most objective approximation to the true underlying copula (Genest et al., 2009).

| Theoretical copula families
The dependence structure captured by the empirical copula can guide the selection of suitable theoretical, or so-called parametric copulas C θ ð Þ. A variety of theoretical copulas with different characteristics exist that can represent the structure and the strength of dependence between variables (Vandenberghe et al., 2010;Vogl et al., 2012). The structure is provided by the choice of the copula family, while the copula parameter reflects upon the strength of the dependence. Consequently, different theoretical copula families capture various features of the dependence structure differently, with some copulas being more suitable to model the overall dependence while others are better suited to represent the dependence at the tails of the distribution, which is particularly relevant for modeling compound extreme events (Zscheischler et al., 2020). These copula families differ in their tail behavior, dependence structure, symmetry of the dependence and suitability to model positive and/or negative dependence. Symmetry here implies that variables are exchangeable, that is, C u 1 , u 2 ð Þ¼C u 2 ,u 1 ð Þ (Wu, 2014) and, therefore, tails show the same behavior in terms of (in) dependence. Most of the existing copula families that are typically used for hydroclimatic modeling studies either belong to the class of Elliptical or to the class of Archimedean copulas, which are different in their complexity and symmetry of the dependence, as explained hereafter.

| Archimedean copulas
Archimedean copulas are frequently used in hydroclimatic applications as they have a simple mathematical form and are specifically easy to sample from, when considering only two variables (i.e., bivariate case; Joe, 2014). Furthermore, they allow for different dependence structures, such as concordance and tail dependence (Hofert, 2008). Several Archimedean families exhibit asymptotic dependence in at least one of the two tails (Charpentier & Segers, 2009). This makes them especially suitable for modeling of extreme events (Hofert, 2008) such as droughts or floods.
The general form for Archimedean copulas is: where ψ is a continuous strictly decreasing convex function, known as the generator function and its pseudo-inverse ψ À1 (McNeil & Nešlehov a, 2009). Three common types of Archimedean copulas are Gumbel, Clayton, and Frank copulas . Their main features are shortly summarized here and more details and equations can be found in Figure 3a. The Gumbel copula (Gumbel, 1960) is an asymmetric copula with higher probability concentrated in the upper tail, while Clayton copulas (Clayton, 1978) have higher probability in the lower tail. Frank copulas (Frank, 1979), which are symmetric, tend to have lower density in the tails compared to Clayton and Gumbel and are capable of capturing negative correlations as well.

| Elliptical copulas
Elliptical copulas are commonly used to account for overall dependence structures, for example in bias adjustment of climate models (e.g., B ardossy & Pegram, 2012;Li et al., 2014), as they are capable of representing positive and negative dependence. However, they are less suitable for modeling hydroclimatic events if there is no evidence of radial symmetry. Additionally, they are also more difficult to apply, because they do not have closed form expressions (Embrechts et al., 2003;Favre et al., 2018). The governing function for Elliptical copulas is: where Φ is a suitable multivariate distribution and Φ À1 denotes the inverses of univariate marginal distribution. The two most widely used types of Elliptical copulas are Gaussian and Student-t copulas ( Figure 3b). Gaussian copulas-just like bivariate Gaussian distributions-can model positive and negative dependencies and are symmetric. However, unlike Gaussian bivariate distributions, the marginal variables can follow any distribution and are, thus, not limited to solely follow a Gaussian distribution. The Student-t copula is characterized by a greater modeling flexibility in terms of tail dependence (Carreau & Bouvier, 2016), which is enabled by an additional degree-of-freedom parameter compared to the Gaussian copula.

| Other families
The families discussed above represent the most commonly applied theoretical copulas in the hydroclimatic literature to date. Typically, they are readily available in many statistical software packages, for example, in the Statistics and Machine Learning Toolbox of MATLAB (https://www.mathworks.com/help/stats/), in the MATLAB MVCAT toolbox (https://se.mathworks.com/matlabcentral/fileexchange/69217-multivariate-copula-analysis-toolbox-mvcat) introduced by Sadegh et al. (2017) or in the "copula" package in R (https://cran.r-project.org/web/packages/copula/ index.html).
However, theoretical copulas are not limited solely to those families. In fact, the literature holds a variety of other copulas, such as Gumbel-Hougaard or Galambos copula that belongs to the class of extreme-value copulas (appropriate to model dependence between extreme events (Kamnitui et al., 2019)), the Ali-Mikhail-Haq copula (Ali et al., 1978) (belongs to Archimedean class and allows for both positive and negative dependence) or the Independence copula (specifically used for independent variables)-to mention only a few. For a comprehensive overview of available theoretical copulas, we refer the reader to Nelsen (2006).

| Copula for conditional simulation
Copulas can be used to compute the conditional distribution of one variable given a specific value of other variable(s). For example, this approach can be used to estimate the conditional probability distribution of precipitation, given a particular temperature value. This can be especially useful for calculating conditional return periods (e.g., AghaKouchak et al., 2014;Brunner et al., 2016), and for correcting biases in climate model outputs (e.g., Li et al., 2014;Räty et al., 2018) or to simply assess the goodness-of-fit of copula models. Note that calculation of F I G U R E 3 Commonly applied (a) Archimedean and (b) elliptical copulas in the hydroclimatic literature, including their shapes, dependence structures (upper or lower tail), symmetry, their theoretical parameters based on Kendall's τ (here, τ ¼ 0:75), and their probability distribution. The top right plot is the same three dimensional surface as in Figure 2a. In Student-t copula, ϑ ¼ 1 is the parameter associated with the degree of freedom conditional probably may involve Monte Carlo estimates of the quantities of interest (Brechmann et al., 2013), when closed form copulas and marginal distributions are not available. Here we explain its parametric derivation.
Considering the first partial derivative of copula (Salvadori et al., 2007): Then a sample including conditional u 2 j u 1 is derived by solving Equation (6) for u 2 . To this end, two random samples of the same size are drawn from uniform distribution (0 1) representing c u 2 ju 1 and u 1 , which allows solving for c u 2 ju 1 .
For example, for the Clayton copula, this would look as: By solving Equation (7) for u 2 : The simulated pair from the Clayton copula is then u 1 , u 2 ju 1clayton À Á : The schematic process of conditional simulation is illustrated in Figure 4.
A review of conditional simulation from other Archimedean copulas can be found in Salvadori et al. (2007) or Zhang and Singh (2019). In the case of Gaussian copulas, the conditional probability can be calculated by using Cholesky decomposition of the Gaussian copula parameter . For a t-copula, a multivariate t- Þ 0 is generated where t υ denotes the degree of freedom of a standard univariate t-distribution. For more F I G U R E 4 Schematic illustration of Monte Carlo simulation of conditional u 2 given known u 1 (u 2 j u 1 ) and known copula probability C u 1 ,u 2 ð Þ: (a) A sample (here of size n = 15) is drawn from the first partial derivative of the copula. c u2ju1 and u 1 . Then the sample is distributed within the boundaries of C u 1 ,u 2 ð Þ. (b) This procedure is repeated until a sample of u 2 is uniformly distributed in (0 1) range. The result is then a simulated conditional distribution of u 2 information regarding simulation of the conditional distribution based on the Student-t copula and Gaussian, we refer to Charpentier et al. (2007) and Demarta and Mcneil (2004).

| METHODS
To model the dependence structure between hydroclimatic variables, several data analysis, processing, and fitting steps are required, which are-in practice-often presented in complex language in statistical/theoretical literature. Here, we synthesize common perception and practices in existing theoretical and applied literature, and provide a comprehensive overview of practical aspects when adopting a copula framework in hydroclimatic studies by combining a literature review and a case study.

| Literature review: identifying common practices
To identify a formal procedure for applying copulas, in particular for precipitation and temperature, a systematic literature review was conducted. We critically reviewed all articles returned from the Scopus search engine by the end of 2020 on keywords "Copula + Precipitation + Temperature." These keywords resulted in a list of 95 English peer-reviewed journal articles, of which 19 were identified as not relevant for the purpose of the current study after an initial visual screening. From each of the remaining 76 publications, the procedure applied by the authors (including all individual sub-steps) was extracted and collected in a database. Additionally, the underlying statistical assumptions, necessary requirements, and consequential limitations (if mentioned in the articles) were gathered. By directly comparing all steps and assumptions/ limitations, we identified recurring practical patterns, synthesized a common workflow, and recognized typical pitfalls.

| Demo case study: applying common practices
The extracted workflow with all sub-steps was then applied on a real demo case study: the Vattholma river catchment in South-eastern Sweden ( Figure 5). This particular catchment is located close to the city of Uppsala, approximately 70 km north of the capital Stockholm. With an area of 293 km 2 this catchment can be considered small-scale. It is predominantly flat and mostly covered by forest (81%). To assess the dependence structure between temperature and precipitation, we constructed a copula analysis for the "climate normal period" of 1961-1990 (corresponding to a 30-year average of the Earth's climate as defined by the World Meteorological Organization, available at community.wmo.int/ wmo-climatological-normals). Daily precipitation and temperature data were downloaded from a freely available database (www.vattenwebb.smhi.se) provided by the Swedish Meteorological and Hydrological Institute (SMHI). According to this data, the catchment is characterized by a warm-summer continental climate with an average annual temperature of 5.2 C and an annual precipitation sum of 633 mm.

| Development of a decision support framework
Based on the literature review and the experience from the demo case study, a didactic decision support framework was developed to provide a step-by-step guide for the application of copulas in hydroclimatic studies. The proposed framework considers the recurring practical patterns and typical perceptions, and offers a generic workflow that can-although being optimized for precipitation and temperature data-be applied to any other set of hydroclimatic variables.

| COMMON PRACTICE AND POTENTIAL MISCONCEPTIONS
Derived from the literature review, we identified four common practical steps in copula analyses include: (1) exploratory analysis of the dependence between two variables, (2) pretreatment of the data, (3) copula selection and parameter estimation, and (4) goodness-of-fit examination of selected copula models.
While all reviewed articles featured different experimental setups and, thus, included different combinations of the above steps and typically covered a varying number of sub-steps, only few articles consistently incorporated all of these steps, and others either did not follow all steps or did not report them transparently. Failure to consider all four aspects can compromise the robustness of the fitted copulas and, thus, result in misconception and malpractice, as we will demonstrate in the following sections.
Hereafter, we discuss each of the four main steps (spatiotemporal controls of dependence, pretreatment of data, copula model choice and parameter estimation, and goodness-of-fit evaluation) in detail. We first provide a short background, explaining the importance and practical implications of each step. Thereafter, we summarize the results from the literature review, that is, how has each step been addressed and reported by other authors. Finally, we demonstrate how to practically approach each step based on our demo case study.

| Background
To apply a copula model to a given set of variables, it is crucial to initially explore the relationship of these variables and assess their dependence Serinaldi, 2016). This is typically done at the spatio-temporal scale that is required by the purpose of the application. If variables are statistically independent, their joint distribution is simply the product of the marginal distributions (Brunner et al., 2016). However, if variables are statistically dependent, their combined behavior and properties should be taken into consideration.
In case of precipitation and temperature, there have been many attempts to study their combined behavior in a compound context for different purposes (e.g., Benestad Singh et al., 2020;Zscheischler et al., 2017). Their commonality is that they all highlight the importance of jointly considering the compound effects of precipitation and temperature dynamics in hydroclimatic studies. At the same time, hydroclimatic variables exhibit scale-dependent behavior, which is influenced by external forcing, coupling (A. Berg et al., 2015) and intrinsic variability of the climate system (Alberti et al., 2021;Franzke et al., 2020). Therefore, an accurate assessment of their joint distribution-especially in the context of a changing climate-is not straightforward. For example, the dependence between precipitation and temperature might vary regionally and depends on both considered spatial and temporal scales (Huang et al., 2009;Isaac & Stuart, 1992;C. Li et al., 2014;Trenberth, 2018;Trenberth & Shea, 2005). Barnston (1993) specifically argues that averaging over longer periods is tantamount to a filtering process and more reliable to measure the consistency of an underlying dependence. Accordingly, more consistent behavior can be yielded at an aggregated spatiotemporal scale (Fischer et al., 2013;Franzke et al., 2020;Markonis et al., 2018), because large scales commonly diminish the effect of natural variability (Myhre et al., 2019;van Oldenborgh et al., 2021).
Traditionally, the parametric Pearson linear correlation coefficient has been used to measure the dependence strength between two variables, which is, however, dependent upon the marginal distributions (Salvadori et al., 2014) and bounded by the assumption that marginal variables approximately follow normal distribution. This can result in misleading evaluations of the dependence nature of the variables in hand when such assumption is violated. The most widely used nonparametric measures employed in copula modeling for evaluating dependence strength are therefore rank correlations, including Kendall's τ (Kendall, 1938) and Spearman's ρ (Spearman, 1904). In many cases, these two measures result in different values, as they essentially measure different aspects of dependence. Spearman's ρ is simply Pearson correlation on the rank-transformed marginal variables (Fredricks & Nelsen, 2007), while Kendall's τ is the difference between probability of concordance, and the probability of discordance for two pairs (X i , Y i Þ and (X j , Y j Þ of observations (Nelsen, 2006). Both Kendall's τ and Spearman's ρ provide measures of strength of the association between random variables (Serinaldi, 2016), typically referred to as correlation. Thus, they do not fully reflect the complexity of dependence structure between variables, as they are not able to reveal the dependence structure and as such do not acknowledge dependence at specific parts of the distribution. For example, they do not specifically capture tail dependence, which refers to the strength of dependence in the upper-right-quadrant tail or lower-left-quadrant tail of a bivariate distribution (Joe, 2014;Nelsen, 2006;Poulin et al., 2007). These aspects need further investigation in the application of copulas and will be referred to in the subsequent sections.

| Common practices
Despite the importance of assessing dependence strength between two variables at a given spatiotemporal scale, 35% of the reviewed articles did not indicate the strength of correlation by any of the aforementioned correlation measures. Twenty-one percent reported dependence strength in the form of Pearson correlation coefficient, and less than half of all articles (44%) transparently stated (or visualized) the correlation strength between variables with application of nonparametric measures.
Each article was also scanned for the spatial scale of the investigated study site(s) and the temporal resolution of the data. Roughly 54% of the reviewed articles were based on meso-to large-scale studies (i.e., >300 km 2 ), while 22% were conducted on smaller scale (i.e., <300 km 2 ) and the remaining 24% focused on particular hydroclimatic stations.
In addition, we found that most reviewed publications (55%) used daily data, 26% considered monthly data, 4% (three publications) considered hourly data, and 5% (four publications) considered annual data of temperature and precipitation. Additionally, 3% (two publication) assessed the dependence structure in different temporal scales (i.e., monthly, seasonal, and yearly). The remaining publications (roughly 7%) did not specify their temporal scale of the data. There were no marked differences in the reporting rate of the dependence between studies at lower (i.e., monthly) and higher (i.e., daily) temporal resolution.

| Demo case study
For the demo case study of the Vattholma river basin, we selected daily resolution to further generate copula dependence structures, because daily data can readily be used for finer-resolution hydrological purposes (e.g., bias adjustment, daily runoff estimation). For illustrative purposes, we also investigated how dependence strength between precipitation and temperature, as measured by Kendall's τ, is modulated by the temporal resolution ( Figure 6) which showed that correlation for monthly scale was generally stronger than that at the daily scale ( Figure 6, lines 1 and 2). Measured correlation also showed seasonal variations: at the monthly scale, we observed negative correlations throughout the entire year except for the months of February and March. At daily resolution, temperature and precipitation were positively correlated in autumn and winter (September-March) and negatively in spring and summer (April-August).

| Pretreatment of the data
The application of copulas requires the hydroclimatic variables to be continuous and time-independent (Nelsen, 2006). Therefore, it is essential to assess the existence of ties (i.e., data with the same rank) or other characteristics such as autocorrelation and stationarity of variables. Otherwise, the violation of particular assumptions (explained in the following subsections) in relation to these characteristics can potentially compromise the robustness of their joint copulas and the resulting conclusion.

| Ties (data with the same rank)
Background Copulas must be built on continuous sets of variables (Equation 3) and are based upon rank dependence between variables (see Section 2.2). Numerous practical studies highlighted that the presence of ties in rank-scaled samples can cause bias in the copula, and later affect the goodness-of-fit examination of selected copula models (De Michele et al., 2013;Genest et al., 2011;Pappadà et al., 2017).
Therefore, the existence of ties can cause issues to rank-based inference methods that are intended for data with no ties. In many cases, real data often contains ties. For example, precipitation or streamflow data-especially on daily resolution-is bound by zero and typically contains numerous zero-data values (i.e., dry days) with the same rank. However, issue with such ties are not limited to dry days only: They can also occur because of measurement imprecision (e.g., due to rounding to the next integer value) (Vandenberghe et al., 2010). Also, many climate models generate a large number of days with very low precipitation values (often of the same rank)-so-called drizzle days with low-intensity, but nonzero precipitation (Ines & Hansen, 2006;Teutschbein & Seibert, 2012). In the presence of ties, a specific F I G U R E 6 Kendall correlation (τ) between precipitation and temperature at different temporal scales. Filled markers indicate statistical significance at the 5% level. Line 2 represents Kendall correlation in daily resolution prior to tie removal and line 3 after eliminating dry days (to avoid ties) version of Kendall's τ can be applied that allows for ties in the data (Hamed, 2008(Hamed, , 2011Nešlehov a, 2007). However, the effect of ties is not easy to account for in case of persistent data (Hamed, 2011).
The literature offers several solutions when encountering ties, specifically for the case of copula analysis (e.g., Bücher & Kojadinovic, 2016;Kojadinovic, 2017). For example: (1) to stop the analysis, (2) to discard the ties, (3) to use minimum, average or maximum ranks for the tied observations, or (4) break the ties randomly.
Option (1) is often not preferred as it essentially stops the practitioner from any further copula-based analysis.
Option (2) implies that parts of the data are thrown away. Several studies have concentrated on the joint distribution of precipitation and temperature on wet days only (e.g., Li et al., 2014;Piani & Haerter, 2012;Räty et al., 2018). However, depending on the region of study and the temporal scale, there can be a considerable number of zero-precipitation values and removing those can often eliminate a large portion of the precipitation-temperature spectrum, especially in drier regions. This method typically alters the underlying dependence structure (Genest et al., 2011) and requires an added postprocessing step (which adds additional uncertainty) to handle the data points that were eliminated.
Option (3) assigns tied observations a rank equal to either the minimum, the average or maximum rank that they would have if they were broken (Kojadinovic, 2017;Mcneil et al., 2015): In a sample of five observations {3, 1, 4, 3, 9} the two occurrences of number three would have gotten second (minimum) and third (maximum) rank. Thus, using the minimum rank, the sample would have ranks {2, 1, 4, 2, 5}. In the case of using the average rank, the sample would receive ranks {2.5, 1, 4, 2.5, 5}, while it would have ranks {3, 1, 4, 3, 5} in case of using the maximum rank.
Overall, handling observations with similar ranks has not received the attention it deserves, but is receiving more attention recently and new approaches are being proposed. For example, a recently modified test for the goodness-of-fit of copula models in presence of ties has been introduced by Kojadinovic (2017) and has been implemented in the 'copula' R package (Hofert et al., 2020).

Common practice
Our literature review revealed that 76% of the publications did not mention the issue of dealing with ties. However, as mentioned in Section 4.1.2, roughly 35% of the reviewed articles analyzed monthly or yearly data, in which ties are less likely to be observed. Among the articles that dealt with the issue with ties, 20% discarded where the data was bounded by ties (dry days) and 7% (five articles) used a jittering algorithm to break the ties.

Demo case study
To check for ties, we assessed the number of dry days in the Vattholma catchment, which varied from month to month, ranging from 51% dry days in May to 26% in November. Except for the dry days, the data did not feature any other duplicates (like drizzle) that would have caused ties. The removal of dry days affected the dependence for some months, with the largest impacts in the warmer season from April to August ( Figure 6, line 3). The Kendall correlation was much more sensitive to the elimination of dry days during these months as they contained more dry days.
Removing dry days resulted in generally weaker correlations between precipitation and temperature ( Figure 6, line 3). Kendall correlations, after removing zeros, were significant for the warm season of May to August. Colder months indicated only weak and insignificant correlations. Thus, the subsequent copula-based analysis will be executed for the month of June, which featured the strongest negative (and significant) correlation.

| Autocorrelation
Background Autocorrelation can be defined as the degree of similarity (i.e., the correlation) between a time series and a lagged version of it over successive intervals (Box et al., 2015). The presence of autocorrelation reduces the efficiency of a copula, because it boosts the variances of residuals and estimated coefficients (Cong & Brady, 2012;Rakonczai et al., 2012). In some studies, copulas have been applied to autocorrelated time series using so-called serial copulas (Genest & Rémillard, 2004;Ghoudi & Rémillard, 2004) -sometimes also referred to as autocopulas (Rakonczai et al., 2012;Sugimoto et al., 2016), or using dynamic copulas (Serinaldi & Kilsby, 2017), which allow for a time-varying dependence structure instead of assuming it to be fixed (van den Goorbergh et al., 2005). Relying on the concept of conditional copulas (Patton, 2006), these copulas allow for a conditional distribution of variables and their dependence to be applied on lagged version of the marginal variables (Größer & Okhrin, 2021). However, the mathematical procedure to obtain these copulas are generally more complex, and further uncertainties may emerge (Rakonczai et al., 2012). For more information and an advanced review of copulas for autocorrelated time series, the reader is referred to Größer and Okhrin (2021), Oh and Patton (2018), and Patton (2006Patton ( , 2012. Some articles argued that a classic copula approach is still applicable in case of only low-degree (removable) autocorrelation in time series (Laux et al., 2011;Vogl et al., 2012). Therefore, it is important to consider the degree of autocorrelation present in the studied data and to potentially take measures to remove autocorrelation or consider this aspect as the added source of uncertainty. Such measures may include the adoption of different filters, such as autoregressive (AR), moving average (MA), or autoregressive moving average (ARMA) models in different variants and time lags (Hamed, 2009). Then copula analysis is performed on the residuals extracted from this process (e.g., Manning et al., 2018;Serinaldi, 2016).
If the data also features heteroskedasticity, models that are capable of handling such behavior, such as the Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) process, or a combination of ARMA and GARCH approaches (ARMA-GARCH) can also be considered (e.g., Elek & M arkus, 2008;Laux et al., 2011).

Common practice
Of the reviewed articles, 81% did not mention if their data exhibited autocorrelation. The remaining 19% dealt with autocorrelation in different ways: 9% (seven articles) prewhitened the data by applying an AR model, 3% (two articles) used dynamic copulas, 5% (four articles) selected the data in a way that it did not show autocorrelated behavior, and 2% (one article) mentioned not dealing with autocorrelation as limitation of their framework.

Demo case study
To check if the daily observations in the Vattholma demo case study showed autocorrelated behavior, we examined the autocorrelation function (ACF) for precipitation and temperature data in the month of June. Precipitation did not show autocorrelated behavior (Figure 7a), while temperature indicated some autocorrelated characteristics (Figure 7b). In order to account for this autocorrelation, we followed the procedure used by Laux et al. (2011): a first-order autoregressive model AR (1) was synthesized to represent time-dependent time series. Then the synthesized autocorrelated data was subtracted from the original temperature data set. We followed this procedure to create a temperature data set free from autocorrelation, which we validated by plotting the ACF of the residuals (Figure 7c). The new data set was time-independent and hence suitable to be used in the copula model.

Background
The assumption of stationary of a data series requires the mean and variance to not change significantly over time, because copulas are commonly established on a set of available data within a certain (historic) period to draw conclusions about the combined future behavior of two or more variables. For example, copulas can be used for multivariate frequency analyses to calculate joint return periods of extreme variables such as extreme rainfall intensity and extremely long duration of individual precipitation events (e.g., AghaKouchak et al., 2014;Salvadori & De Michele, 2010). In these cases, stationarity is of importance, because copula parameters remain constant even for future periods (i.e., assuming stationarity) and findings from such analyses are commonly extrapolated to analyze future risks.
But given a changing climate (IPCC, 2014), the assumption of stationarity is possibly not met in many analyses (Ehret et al., 2012;Maraun, 2012;Slater et al., 2021). Future extrapolations may then no longer be valid and can severely limit the credibility of future climate-change risk assessments (Cooley, 2013). The literature holds a large variety of articles that address the detection and discuss methods for the attribution and prediction of nonstationarity in hydroclimatic variables. Slater et al. (2021) provide a comprehensive overview of these methods and discuss related key challenges.
While detection of nonstationarity does not necessarily preclude a copula analysis, it requires the use of appropriate nonstationary models. For instance, studies by Jiang et al. (2015) and Ahn and Palmer (2016) took the nonstationarity in the marginal distributions into account by applying a time-varying (i.e., nonstationary) copula model to estimate the probability of low flows. In this approach, the parameter estimation can be carried out using a moving time window, which is large enough to provide a good fit of the copula function, but can offer stationarity in the data (Bender et al., 2014;Feng et al., 2020). The copula parameter can then be modified to be time-dependent, which changes in each time window (Ahn & Palmer, 2016).

Common practice
Although nonstationarity is a topic that receives much attention in theoretical hydroclimatic modeling, it seems to be a statistical feature that receives little devotion in applied impact studies, as 71% of the reviewed articles did not mention this particular issue in their copula analysis. Among others, 10% checked or assumed that their variables are stationary, while 13% mentioned that their data showed nonstationary behavior and dealt with it either with a nonstationary copula model (9%) or by removing the nonstationary behavior (4%). Four articles (6%) mentioned not dealing nonstationarity, as the limitation of their work.

Demo case study
To assess if precipitation and temperature exhibited stationary or nonstationary behavior, we divided the data into two samples of 15 years. We applied the Mann-Whitney U-test (Mann & Whitney, 1947) under the null hypothesis that the data in both 15-year periods came from random samples with equal medians, which was not rejected at a 5% significance level. This pointed to stationarity of our data in the period of observation.

| Copula selection and parameter estimation
Copula selection and parameter estimation is an important step when adopting a copula approach to study multivariate behavior, because a proper fitting ensures that both the dependence structure and the dependence strength are properly represented. The dependence structure is reflected by the chosen copula family, while the dependence strength is estimated by the copula parameters. Thus, both the selection of an appropriate copula family as well as the estimation of copulas parameters require special attention.

| Selection of copula families
Background Numerous theoretical copulas exist that model different features of the dependence structure differently, with some copulas being more appropriate to represent the overall dependence while others are better suited for capturing the F I G U R E 7 The autocorrelation function (ACF) for daily precipitation (a) and temperature (b) in June, and temperature residuals (c) up to 20th order of lag after removing autocorrelation from the data by AR (1) filter dependence at the tails of the distribution. Thus, it is crucial to re-assess if dependence was preserved, diminished, or enhanced following data pretreatment (e.g., by using traditional correlation measures), and to unravel the nature of tail dependence. Numerical measures should ideally be complemented by a graphical analysis. A simple graphical method is to create a scatter plot displaying the rank transformed pairs, which is also known as empirical copula (exemplified in Figure 8). Other graphical methods such as Chi-plots (Fisher & Switzer, 1985) or K-plots (or Kendall plot;Genest & Boies, 2003) can also be used. Chi-plots were introduced as an alternative to rank transformed scatter plots and they are connected to chi-square test statistics of independence (Genest & Boies, 2003;Poulin et al., 2007;Requena et al., 2013), but compared to other methods (e.g., scatter plot and K-plot) they are relatively difficult to interpret (Genest & Boies, 2003;Zhang & Singh, 2019) and are, thus, not visualized in this study. K-plots are based on the Kendall distribution function and are relatively easier to interpret compared to a Chi-plot. A K-plot relates the order statistics H i ð Þ estimated from the observed data to the expected value of these statistics W i:n ð Þ generated under the null hypothesis of independence between the marginal distributions (Genest & Rivest, 1993;Nelsen et al., 2003) (exemplified in Figure 9). After a visual assessment of tail dependence, a set of copula families should be selected to reflect the identified dependence structure between the given variables.

Common practice
Visual inspection enhances the assessment of tail dependence and consequently the selection of suitable copula families. However, more than half of the articles (53%) did not mention any visual tests (i.e., scatter plots, Chi-plot, or Kplot). Among the reviewed articles, 47% mentioned or showed scatter plots of the joint distribution of the variables. Finally, 5% (four articles) referred to K-plots, and 4% (three articles) used Chi-plots.
In the reviewed literature, five copula families of Gaussian, Student-t, Clayton, Frank, and Gumbel were selected most often as suitable candidates to generate identified dependence structures between variables.
Demo case study To graphically investigate shape and strength of the dependence structure of a given sample, we used scatter plots. Here, a month with weak Kendall correlation (i.e., December) is randomly selected and compared to June, which exhibited the strongest correlation. Then the empirical pairwise rank relationship (i.e., empirical copula) between variables was assessed (using Equation 3) and plotted for December ( Figure 8a) and June (Figure 8b).
The scatter plot for December seemed to be randomly distributed and, thus, did not reveal any clear dependence pattern. In June, the data showed negative dependence between variables. To assess dependence in more detail and to visualize the tail behavior, we also visualized the December and June data in K-plots (Figure 9).
The closer the pairs W i , H i ð Þare to the diagonal line ( Figure 9, dashed line), the more likely they are to be independent. In contrast, a larger distance represents stronger dependence (Genest & Boies, 2003). The K-plot of precipitation and temperature in December (Figure 9, pink line) was relatively close to the diagonal line, thus indicating only weak dependence (Figure 9). On the contrary, in June (Figure 9, turquoise line) the distance to the diagonal line was larger, providing evidence for a stronger dependence. The fact that the K-plot fell below the diagonal line implies a negative dependence. The K-plot also indicated a stronger dependence in the middle part of the distribution (i.e., a greater F I G U R E 8 Scatter plots of empirical rank transformed variables daily precipitation (U 1 ) and temperature (U 2 ) in (a) December representing random dependence and (b) June representing negative dependence distance) and weaker dependence in the upper tail (i.e., a shorter distance), while the lower tail featured independence (i.e., K-plot fell on the diagonal line). Thus, copula families, which are capable of preserving these features must be selected as suitable candidates for further assessment. Consequently, the Student-t, Gaussian, and Frank copulas were selected as potentially suitable copula families as they can all model negative dependence and are able to model lower dependencies at the tails (see Section 2.3).

| Parameter estimation
Background Application of copula model requires copula parameter estimation. This process entails (1) an estimate of marginal distributions, (2) drawing copula parameter either by means of theoretical relationships between the parameter and empirical dependence measures, or by maximizing a likelihood function. In the literature, methods for copula parameter estimation are grouped as parametric, nonparametric, and semi-parametric (Choro s et al., 2010;Weiß, 2011).
In parametric approaches, both marginal distributions and copula parameter are determined by parametric distributions and copula parameter is estimated by maximizing the likelihood function of composite functions of marginal distributions and dependence (Choro s et al., 2010;Joe & Xu, 1996). Maximum likelihood (ML) and inference function for marginal distributions (IFM) are two of the widely used parametric methods for copula parameter estimation. In ML, likelihood function including parameters of marginal distribution and dependence are maximized jointly (Joe, 2014). IFM has two steps: (1) a distribution is fitted to the marginal variables, and (2) parameters of marginal distributions are substituted in the likelihood function-now only dependent on the copula model parameters-which is maximized to infer copula parameter values. IFM is specifically advantageous if the marginal variables cannot be represented well by any theoretical probability distribution or if the likelihood function is hard to be maximized directly (Joe, 2005).
Methods based on parametric estimation of marginal distributions are numerically easier (than nonparametric methods) to implement and are also suitable for higher dimensions, while also being applicable to data with timevarying dependence (Joe, 2014). They can, however, only represent a given number of distribution shapes, which can limit their application (Kim et al., 2007) and naturally depend on correct assumptions about the distribution of marginal variables.  specifically advised against adopting parametric estimation of marginal distributions, stating: "dependence structure captured by a copula has nothing to do with the individual behavior of the variables. Any inference about the parameter indexing a family of copulas should thus rely only on the ranks of the F I G U R E 9 K-plot to assess dependence structure for December (pink line) and June (turquoise line), where H i ð Þ is the order statistics estimated from the observed data and W i:n ð Þis the expected value of these statistics observations, which are the best summary of the joint behavior of the random pairs." According to Genest et al. (2009), "an inappropriate choice of models for the margins may have detrimental effects on the estimation of the dependence parameter per se." To overcome the issue of parametric assumptions about marginal distributions, semi-parametric and nonparametric approaches are introduced which treat marginal variables as unknown functions (Kim et al., 2007), and solely rely on ranks of variables. Consequently, the dependence structure is not compromised by assumptions about the marginal distributions (Charpentier et al., 2007).
In the nonparametric methods, copula parameters are estimated based on their theoretical relationships with rank/ order correlation coefficients (i.e., Kendall's τ or Spearman's ρ) (Genest & Boies, 2003). The relationship between five widely used copula families and Kendall' s τ is shown in Figure 3. This method is quite straightforward and rather easy to apply, however is less suitable for copulas with more than one parameter or when it is not possible to draw an immediate relationship between rank/order correlation coefficients and copula parameter  as is the case for the ν in the Student-t copula.
Semi-parametric approaches, on the other hand, offer a compromise between fully parametric and nonparametric methods (Kim et al., 2007) as they are based on first estimating the univariate marginal distributions nonparametrically by their sample rank behavior and forming pseudo observations, and then estimating copula parameters by maximizing the pseudo likelihood function. Due to the flexibility of semi-parametric approaches, they have been used in several hydroclimatic studies (e.g., Pham et al., 2016;Ribeiro et al., 2019;Vergni et al., 2020) and is also adopted in this article.

Common practice
In the studied articles, 10% used the empirical copula to model dependence structure between variables; therefore, it was not necessary to discuss their method to estimate copula parameter. Twenty-two percent of all articles did not identify their method to estimate copula parameters. Among the 68% which mentioned their method for parameter estimation, nonparametric methods were used most often.

Demo case study
We estimated copula parameters by maximum pseudo likelihood method for three selected copula families, that is, Gaussian, Student-t, and Frank. These families are selected from the five widely used copula families previously introduced in Section 2.3, because they are capable of retaining negative dependence (as it is the case for dependence in June).
We specifically chose a semi-parametric method because it relies only on rank transformation of the variables and does not require heavy assumptions about the marginal distributions. The estimated parameters are reported in Table 1.

| Background
After estimating the copula parameters, a performance measure must be used to assess the goodness-of-fit obtained from the different copula families to eliminate nonadmissible copulas. To this end, one approach is to draw random samples from the copula family with the specific copula parameters that were estimated in the previous step. Then, a graphical goodness-of-fit test can be applied to visually assess the fit. For example, a scatter plot of the rank-transformed data (i.e., empirical copula) can be compared with the generated data points from the theoretical copula (Brunner et al., 2016;. Generated samples should either be of the same size or larger than observed sample size to avoid misleading results . Another option is to compare the K-plot of the empirical copula directly with the K-plot of a sample drawn from theoretical copula (Hofert & Mächler, 2014), or to analyze the quantile-quantile plots or level curves/isolines (e.g., Sadegh et al., 2017;Salvadori & De Michele, 2004;Sraj et al., 2015). For more details on graphical goodness-of-fit measures, especially in higher dimensions, refer to Hofert and Mächler (2014).
While visual inspections can provide a quick and approximate measure as to whether the theoretical copulas roughly fit the empirical copula, formal goodness-of-fit tests have received much more interest. The review articles by Berg (2009) or Genest et al. (2009) in particular provide intensive lists of possible goodness-of-fit tests. Here, we describe two formal approaches: the rank-based versions of the Cramér-von Mises S n ð Þ and the Kolmogorov-Smirnov T n ð Þ statistics (Genest et al., 2006): where  n u ð Þ ¼ ffiffiffi Þand sup is the supremum of the distances at point x. K n and K θ respectively refer to a sample taken from empirical (C n ) and theoretical C θ ð ) copula family. Both tests assess the null hypothesis that the empirical copula comes from a given parametric copula family. Thus, rejection of the null hypothesis states that the empirical copula under examination does not follow the selected parametric copula. Both tests are relatively easy to conduct as they require no additional parameters and no particular strategic decisions, however, Genest et al. (2009) found S n to be somewhat more powerful than T n .
To estimate a reliable S n or T n , and to calculate the corresponding p-value, a bootstrapping algorithm as described by Genest et al. (2009) can be implemented via "copula" package (https://cran.r-project.org/web/packages/copula/ index.html) or "gofCopula" package (https://cran.r-project.org/web/packages/gofCopula/index.html) in R. A numerical bootstrapping of Cramér-von Mises S n ð Þ is implemented in the Multivariate Copula Analysis Toolbox (MvCAT)  in MATLAB.
In the case of several suitable copula families, the literature suggests to either choose the model with the smallest test statistics (Wang & Wells, 2000) or to apply the Akaike or the Bayesian information criterion to identify the copula model that best balances model fit with simplicity/generalizability (Brunner et al., 2016;Huard et al., 2006).

| Common practice
In the reviewed literature, 10% of the articles used the empirical copula as the co-dependence structure and, thus, a formal validation of fit was not necessary. Twenty-five percent of the articles did not transparently describe any performance measures, while 65% checked the goodness-of-fit of different copulas using at least one performance measure. The Cramér-von Mises statistic S n ð Þ was used more often (26%) than the Kolmogorov-Smirnov statistic T n ð Þ (5%), while 34% of the studies used other methods such as rank scatter plots, in combination with Akaike information criterion.

| Demo case study
We initially performed a visual examination of rank-based scatter plots of the three used copulas (Figure 10a). In direct comparison to the scatterplot of the empirical copula (Figure 8b), all three families seemed reasonable. Additionally, we evaluated the contours for their joint CDF for 0:1, 0:2,…0:9 quantiles (Figure 10b).
The three copulas performed differently in the tails, which are more apparent in the lower tail. However, these differences are not easily visualized, which highlights the need for following the formal procedure to calculate S n and T n according to Equations (9) and (10) after this visual analysis. Both tests suggested that none of the three copula families were T A B L E 1 Copula parameters, Cramér-von Mises S n ð Þ, and the Kolmogorov-Smirnov T n ð Þ statistics of selected copula families their associated rejected at 5% significance level (Table 1). However, following the suggestion of Wang and Wells (2000), the Frank copula was selected as the most suitable model to represent the dependence structure between precipitation and temperature in June.

| DECISION SUPPORT FRAMEWORK FOR APPLYING COPULAS
To ensure an adequate, consistent, and transparent use of copulas in hydroclimatology, we highlighted a number of common steps and requirements that need careful handling. Below, we briefly outline a stepwise decision support framework ( Figure 11) for hydroclimatic copula applications with the goal of providing a didactic guideline for end users, and to prevent users from stepping into one of the statistical traps that copulas conceal. The proposed decision support framework has been developed to minimize possibilities of incorrect applications of copulas and to ensure reliable and comparable results. Figure 11 shows four steps that should typically be taken before and during implementation of the copula modeling, and elucidates different alternatives that can be undertaken. The four generic steps include (1) exploratory analysis of the dependence, (2) pretreatment of the data, (3) copula selection and parameter estimation, and finally, (4) goodness-of-fit assessment. For each step in the framework (Figure 11) comprehensive explanations, important mathematical equations and scientific references for the various associated tasks can be found in the previous sections. A short narrative of each step is provided here: 1. Prior to constructing copulas, it is valuable to initially explore the dependence between variables at spatio-temporal scales prescribed by the purpose of the application. One should critically reflect on whether the involved variables are controlled by related driving forces that could, at a relevant spatiotemporal scale, lead to a dependence structure between them. If so, the strength of their dependence can be calculated by correlation measures such as Kendall's τ F I G U R E 1 0 Copulas for daily precipitation (U 1 ) and temperature (U 2 ) in June in the Vattholma catchment. (a) Scatter plots of rank transformed pairs simulated from Student-t, Gaussian, and Frank copulas. (b) Level curves (isolines) of three fitted copulas and the empirical copulas. The numeric labels on the curves correspond to the cumulative joint probability of U 1 and U 2 or Spearman's ρ (see descriptions in Section 4.1), which can also serve as a reference for evaluating the effects of data pretreatment. If needed, dependence on other applicable scales can also be investigated. Without common F I G U R E 1 1 Decision support framework for adopting copula modeling. Each step (light blue) refers to a subheading of Section 4, which explains the statistical requirements and general assumptions that should be fulfilled 20 of 28 controlling factors or indications for potential dependencies, the added value of using a copula approach could be questionable and simpler methods might be favorable. 2. Thereafter, additional pretreatment steps should be taken to ensure a robust copula modeling. This implies a thorough assessment of the given data set in terms of stationarity, autocorrelation and potential ties (i.e., data with same ranks). Depending on the results, the data has to be treated (see descriptions in Section 4.2) to fulfill the statistical requirements of copulas. Ties, nonstationarities and autocorrelation add uncertainties to the analysis and limit the efficiency of copula modeling, thus reducing the reliability of results. 3. When selecting copula families and estimating parameters, the joint dependence structure (e.g., stronger dependence in the tails or entire distribution, negative or positive correlation) must be further examined. This strongly affects the selection of appropriate copula families, or may support choosing a simpler method. After selecting a set of suitable copula families, their parameters need to be estimated by adopting feasible parametric, nonparametric or semiparametric approaches (see description in Section 4.3). 4. The final step includes the selection of the best-fitting copula based on a goodness-of-fit assessment. Several visual (e.g., scatter plots or K-plots) or analytical tests (e.g., Cramér-von Mises, Kolmogorov-Smirnov statistics in addition to Akaike information criterion) can be performed to identify improper copulas (Section 4.4).
Copula modeling, similar to any other modeling effort, is associated with uncertainties (e.g., due to insufficient observations or model structural uncertainty), which propagate into joint probability estimates and other derived inferences . Additional challenges for the practical application of copulas, apart from uncertainty, have also stimulated currently trending research focusing, for instance, on (1) Vine copulas to construct higher-dimensional copulas, for example, to assess interplay between three or more hydroclimatic variables (Aas et al., 2009;Größer & Okhrin, 2021;Pham et al., 2016;, (2) dynamic copulas for retaining autocorrelation and nonstationarity in hydroclimatic time series (Ahn & Palmer, 2016;Feng et al., 2020;Rakonczai et al., 2012), and (3) copula modeling for hydroclimatic data with ties (Kojadinovic, 2017;. In light of the growing interest in copulas for practice-oriented hydroclimatic applications, our article provides an overview of the state-of-the-art of using copulas and highlights necessary requirements, statistical assumptions and limitations. Based on a systematic literature review, we found two general issues emerged: (1) important steps for deriving copulas are often not reported in the literature (limited reproducibility) and (2) key statistical assumptions required by copulas are often not tested (limited credibility). Thus, we identified four common steps for practice-oriented studies: 1. Exploratory analysis of spatiotemporal controls of dependence, 2. pretreatment of data, 3. copula model choice and parameter estimation, and 4. goodness-of-fit evaluation.
Failure to assess each of these steps could potentially compromise study conclusions, as demonstrated by the showcase study in this article. We further provide a visual decision support framework ( Figure 11) to foster the transparency and reproducibility of scientific publications, while at the same time increasing the credibility of copula applications in hydroclimatic research. By breaking down the process into four generic steps and providing short and simple narratives, the framework can easily be adopted for any set of variables to construct copulas, thereby supporting researchers and decision makers in addressing hydroclimatic hazards.