Volume 30 Issue 6
Nov. 2021
LIU Chuanlu, WANG Shuliang, YUAN Hanning, et al., “Detecting Three-Dimensional Associations in Large Data Set,” Chinese Journal of Electronics, vol. 30, no. 6, pp. 1131-1140, 2021, doi: 10.1049/cje.2021.08.008

Detecting Three-Dimensional Associations in Large Data Set

doi: 10.1049/cje.2021.08.008

This work is supported by the Science and Technology Innovation Research Project of the Ministry of Science and Technology of China (No. ZLY201970, No. ZLY201976-02).

  • Received Date: 2020-01-20
  • Revision Received Date: 2020-07-08
  • Available Online: 2021-09-23
  • Publish Date: 2021-11-05
  • Detecting associations among variables in large datasets has become increasingly important with the rapid growth of data. Associations of interest can serve as references for problems such as dimension reduction and feature selection. Many methods have been developed for detecting associations between pairs of variables, but multi-dimensional variables, especially three-dimensional variables, are rarely studied, and the relationships among them cannot be revealed by pairwise detection methods. A new method, the Maximal three-dimensional information coefficient (MTDIC), is proposed to indicate associations among three-dimensional variables. The correlation coefficient is calculated from the three-dimensional mutual information. World Health Organization (WHO) data and Tara data are selected to evaluate associations. The experiment is verified by comparing the coefficient results with those of Distance correlation (Dcor). The accurate association strength is obtained by an iterative optimization procedure that sorts the coefficients in descending order. MTDIC performs better than Dcor in terms of generality and equitability.
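
As a rough illustration of a coefficient derived from three-dimensional mutual information, the Python sketch below estimates a normalized dependence score for three variables on a grid. It is not the MTDIC algorithm itself, whose formula is not given here. The equal-width grid, the use of total correlation H(X)+H(Y)+H(Z)-H(X,Y,Z) as the three-variable mutual information, the normalization by 2 log b, the search over a few grid sizes, and the function names are all assumptions made for illustration.

```python
import numpy as np


def entropy(counts):
    """Shannon entropy (in nats) of a histogram given as an array of counts."""
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log(p))


def grid_total_correlation(x, y, z, bins):
    """Total correlation H(X)+H(Y)+H(Z)-H(X,Y,Z) on an equal-width grid.

    Assumption: total correlation stands in for the "three-dimensional
    mutual information"; the source does not specify the exact form.
    """
    data = np.column_stack([x, y, z])
    joint, _ = np.histogramdd(data, bins=bins)  # 3-D array of cell counts
    hx = entropy(joint.sum(axis=(1, 2)))        # marginal histogram of x
    hy = entropy(joint.sum(axis=(0, 2)))        # marginal histogram of y
    hz = entropy(joint.sum(axis=(0, 1)))        # marginal histogram of z
    return hx + hy + hz - entropy(joint.ravel())


def three_dim_coefficient(x, y, z, max_bins=6):
    """Hypothetical normalized score in [0, 1], maximized over grid sizes.

    With b bins per axis the total correlation is at most 2*log(b),
    so dividing by that bound keeps each candidate score in [0, 1].
    """
    best = 0.0
    for b in range(2, max_bins + 1):
        tc = grid_total_correlation(x, y, z, b)
        best = max(best, tc / (2.0 * np.log(b)))
    return best
```

Under these assumptions, three identical and evenly spread variables score close to 1, while three independent samples score close to 0. A grid-based estimate of this kind is biased upward for small samples and fine grids, so `max_bins` should stay small relative to the sample size.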