A generalized definition of the tetrachoric correlation coefficient
(English) Manuscript (Other academic)
We generalize the tetrachoric correlation coefficient to a large class of parametric families of bivariate distributions, and show that the generalized definition agrees with the conventional definition on the family of bivariate normal distributions. We provide a necessary and sufficient condition for the generalized tetrachoric correlation coefficient to be well defined for a given family of distributions, together with some sufficient criteria that can be useful in practice. We illustrate with examples how the distributional assumption can have a profound impact on the conclusions of an association analysis. Using S&P 100 stock data, we show that a correct distributional assumption is vitally important for the analysis. We conclude that the tetrachoric correlation coefficient is not robust to changes of the distributional assumption.
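For orientation, the conventional (bivariate normal) tetrachoric correlation that the abstract takes as its starting point can be sketched as follows. This is a minimal illustration, not the manuscript's generalized definition: the function names `bvn_cdf` and `tetrachoric`, the cell layout `a, b, c, d`, and the numerical choices (Simpson's rule, bisection) are all assumptions made for this example. The idea is to place thresholds on a latent standard bivariate normal pair so that the margins of the 2 x 2 table are matched, and then solve for the correlation that reproduces the upper-left cell probability.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for thresholds and margins


def bvn_cdf(h, k, rho, n=200):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho.

    Uses the identity
        Phi2(h, k; rho) = Phi(h) Phi(k)
            + (1 / 2 pi) * integral_0^{asin rho} exp(-(h^2 - 2hk sin t + k^2) / (2 cos^2 t)) dt
    and evaluates the integral with composite Simpson's rule (n must be even).
    """
    upper = math.asin(rho)

    def f(t):
        c = math.cos(t)
        return math.exp(-(h * h - 2 * h * k * math.sin(t) + k * k) / (2 * c * c))

    step = upper / n
    total = f(0.0) + f(upper)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(i * step)
    integral = total * step / 3
    return _STD.cdf(h) * _STD.cdf(k) + integral / (2 * math.pi)


def tetrachoric(a, b, c, d, iterations=60):
    """Conventional tetrachoric correlation for the 2 x 2 table [[a, b], [c, d]].

    Assumes a latent standard bivariate normal pair dichotomized at thresholds
    chosen to match the row and column margins (hypothetical helper).
    """
    n = a + b + c + d
    p_row, p_col, p00 = (a + b) / n, (a + c) / n, a / n
    if not (0 < p_row < 1 and 0 < p_col < 1):
        raise ValueError("both margins must lie strictly between 0 and 1")
    h, k = _STD.inv_cdf(p_row), _STD.inv_cdf(p_col)

    # The bivariate normal CDF is increasing in rho, so bisection applies.
    lo, hi = -1 + 1e-9, 1 - 1e-9
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if bvn_cdf(h, k, mid) < p00:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


if __name__ == "__main__":
    print(round(tetrachoric(40, 20, 20, 40), 6))
```

With both margins equal to 1/2 the thresholds are zero and the cell probability reduces to 1/4 + arcsin(rho) / (2 pi), so the table above (upper-left probability 1/3) corresponds to rho = 0.5. Replacing `bvn_cdf` with the joint distribution function of another parametric family is the kind of change whose effect the abstract describes; whether a unique solution exists in that case is exactly what the manuscript's well-definedness condition addresses.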
tetrachoric correlation, generalization, 2 x 2 contingency table, dichotomous variables, measure of association, robustness
Probability Theory and Statistics
Research subject Statistics
Identifiers
URN: urn:nbn:se:uu:diva-100694
OAI: oai:DiVA.org:uu-100694
DiVA: diva2:210882