Categorical Neighbour Correlation Coefficient (CnCor) for Detecting Relationships between Categorical Variables

Authors

  • Lifeng Zhang, School of Information, Renmin University of China
  • Shimo Yang, School of Information, Renmin University of China
  • Hongxun Jiang, School of Information, Renmin University of China

DOI:

https://doi.org/10.1609/aaai.v36i8.20889

Keywords:

Machine Learning (ML)

Abstract

Categorical data is common but special in that its possible values exist only on a nominal scale, so many statistical operations such as mean, variance, and covariance are not applicable. Following the basic idea of the neighbour correlation coefficient (nCor), in this study we propose a new measure, the categorical nCor (CnCor), to examine the association between categorical variables by using indicator functions to reformulate the distance metric and the product-moment correlation coefficient. The proposed measure is easy to compute and enables a direct test of statistical dependence without the need to convert qualitative variables to quantitative ones. Compared with previous approaches, it is much more robust and effective in dealing with multi-categorical target variables, especially when highly nonlinear relationships occur in the multivariate case. We also apply CnCor to feature selection using a backward elimination scheme. Finally, extensive experiments on both synthetic and real-world datasets demonstrate the outstanding performance of the proposed methods and compare them with state-of-the-art association measures and feature selection algorithms.
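
The abstract does not give the CnCor formula, so the following is only an illustrative sketch of the two ideas it names: a distance built from indicator functions, and feature selection by backward elimination. The score used here, neighbour_agreement (the fraction of samples whose nearest neighbour carries the same label), is a hypothetical stand-in and is not the CnCor statistic from the paper. All function names are assumptions made for this example.

```python
from typing import Callable, List, Sequence, Tuple

Row = Sequence[str]


def indicator_distance(a: Row, b: Row) -> int:
    """Sum of indicators 1[a_j != b_j]: the number of positions that differ."""
    if len(a) != len(b):
        raise ValueError("rows must have the same length")
    return sum(1 for x, y in zip(a, b) if x != y)


def neighbour_agreement(X: Sequence[Row], y: Sequence[str]) -> float:
    """Hypothetical stand-in score (NOT the paper's CnCor).

    For each sample, find its nearest other sample under indicator_distance
    (ties go to the lowest index) and check whether the labels match.
    """
    n = len(X)
    if n < 2:
        raise ValueError("need at least two samples")
    hits = 0
    for i in range(n):
        others = (k for k in range(n) if k != i)
        j = min(others, key=lambda k: indicator_distance(X[i], X[k]))
        hits += int(y[i] == y[j])
    return hits / n


def backward_eliminate(
    X: Sequence[Row],
    y: Sequence[str],
    score: Callable[[Sequence[Row], Sequence[str]], float] = neighbour_agreement,
) -> Tuple[List[int], float]:
    """Drop one feature at a time while the score does not get worse."""
    kept = list(range(len(X[0])))

    def project(cols: List[int]) -> List[List[str]]:
        return [[row[c] for c in cols] for row in X]

    best = score(project(kept), y)
    while len(kept) > 1:
        # Score every candidate subset obtained by removing a single feature.
        trials = [(score(project([c for c in kept if c != d]), y), d) for d in kept]
        trial_score, drop = max(trials, key=lambda t: t[0])
        if trial_score < best:
            break  # every removal hurts, so stop here
        best = trial_score
        kept.remove(drop)
    return kept, best
```

In this sketch the score is a pluggable argument, so an actual association measure such as CnCor could be passed in place of the stand-in once its definition is taken from the paper.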

Published

2022-06-28

How to Cite

Zhang, L., Yang, S., & Jiang, H. (2022). Categorical Neighbour Correlation Coefficient (CnCor) for Detecting Relationships between Categorical Variables. Proceedings of the AAAI Conference on Artificial Intelligence, 36(8), 9048-9056. https://doi.org/10.1609/aaai.v36i8.20889

Section

AAAI Technical Track on Machine Learning III