Application of tetrachoric and polychoric correlation coefficients to forecast verification

Authors

  • Josip Juras, Department of Geophysics, Faculty of Science, University of Zagreb, Zagreb, Croatia
  • Zoran Pasarić, Department of Geophysics, Faculty of Science, University of Zagreb, Zagreb, Croatia

Keywords:

tetrachoric correlation coefficient, contingency table, forecast evaluation

Abstract

The measure of association in 2 × 2 (K × K) contingency tables known as the tetrachoric (polychoric) correlation coefficient is recalled. These measures rely on two assumptions: 1) continuous latent variables underlie the contingency table, and 2) the joint distribution of the corresponding standard normal deviates is bivariate normal. It is shown that, in practice, the tetrachoric (polychoric) correlation coefficient is an estimate of the Pearson correlation coefficient between the latent variables. Consequently, these measures depend neither on bias nor on the marginal frequencies of the table. This implies a natural and convenient partition of the information carried by the contingency table into association, bias and probability of the event, and it subsequently enables the analysis of how other scores depend on bias and marginal frequencies. Extending the results to K × K tables leads to a possible reduction in dimensionality from K² to 2K. The theoretical findings are illustrated through an analysis of real-life 6 × 6 contingency tables from the verification of quantitative precipitation forecasts.
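As an illustration of the 2 × 2 case, the following is a minimal sketch of how a tetrachoric correlation coefficient could be estimated from the four cell counts. It is not the authors' procedure: the function names `bvn_cdf` and `tetrachoric`, the cell labels `a`, `b`, `c`, `d` (hits, false alarms, misses, correct negatives), the Simpson integration and the bisection search are all assumptions chosen for brevity. The sketch places the two latent thresholds at the normal quantiles of the marginal frequencies and then searches for the correlation at which the bivariate normal probability of the "yes/yes" quadrant matches the observed relative frequency.

```python
import math
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for cdf and inverse cdf


def bvn_cdf(h, k, rho, steps=200):
    """P(X <= h, Y <= k) for a standard bivariate normal with correlation rho.

    Uses the identity Phi2(h, k; rho) = Phi(h) * Phi(k)
    + integral from 0 to rho of the bivariate density at (h, k),
    evaluated with Simpson's rule (steps must be even).
    """
    def density(r):
        s = 1.0 - r * r
        expo = -(h * h - 2.0 * r * h * k + k * k) / (2.0 * s)
        return math.exp(expo) / (2.0 * math.pi * math.sqrt(s))

    width = rho / steps  # signed, so negative rho integrates backwards
    total = density(0.0) + density(rho)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * density(i * width)
    return _STD.cdf(h) * _STD.cdf(k) + total * width / 3.0


def tetrachoric(a, b, c, d, tol=1e-10):
    """Estimate the tetrachoric correlation of a 2 x 2 table.

    Assumed layout: a = forecast yes / observed yes, b = forecast yes /
    observed no, c = forecast no / observed yes, d = forecast no / observed no.
    Both marginal frequencies must lie strictly between 0 and 1,
    otherwise the inverse normal cdf is undefined.
    """
    n = a + b + c + d
    h = _STD.inv_cdf((a + b) / n)  # threshold from the forecast marginal
    k = _STD.inv_cdf((a + c) / n)  # threshold from the observed marginal
    target = a / n                 # observed "yes/yes" relative frequency

    # bvn_cdf increases monotonically with rho, so bisection is sufficient.
    lo, hi = -0.999, 0.999
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if bvn_cdf(h, k, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print(tetrachoric(40, 10, 10, 40))
```

When both thresholds sit at the median (h = k = 0), the bivariate normal probability has the closed form 1/4 + arcsin(ρ)/(2π), which offers a convenient check of the numerical result. Tables with an empty cell push the estimate towards the search bounds and would need separate treatment.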

Published

2006-01-31

Section

Original scientific paper