LI Jinfeng, WANG Hongxia, JING Yi, “Audio Perceptual Hashing Based on NMF and MDCT Coefficients,” Chinese Journal of Electronics, vol. 24, no. 3, pp. 579-583, 2015, doi: 10.1049/cje.2015.07.024

Audio Perceptual Hashing Based on NMF and MDCT Coefficients

doi: 10.1049/cje.2015.07.024
Funds:  This work is supported in part by the National Natural Science Foundation of China (NSFC) (No.61170226, No.61373180), the Fundamental Research Funds for the Central Universities (No.SWJTU12ZT02), the Young Innovative Research Team of Sichuan Province (No.2011JTD0007), and Chengdu Science and Technology program (No.12DXYB214JH-002).
  • Received Date: 2014-08-29
  • Rev Recd Date: 2014-10-11
  • Publish Date: 2015-07-10
  • An audio perceptual hash is a compact digest of audio content that is invariant to content-preserving manipulations such as MP3 compression, amplitude scaling, and noise addition. It provides a fast and reliable tool for the identification, retrieval, and authentication of audio signals. A new audio hashing scheme based on Non-negative matrix factorization (NMF) of Modified discrete cosine transform (MDCT) coefficients is proposed. MDCT coefficients, which are widely used in audio coding, discriminate well between different audio contents and are highly robust against content-preserving manipulations, especially MDCT-based compression such as MP3 and AAC. MDCT coefficients are first extracted from the audio frames, and NMF is then applied to them to construct the hash bits. Experimental results demonstrate that, compared with methods reported in the literature, the proposed scheme achieves high performance in terms of discrimination, perceptual robustness, identification rate, and time consumption.

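The pipeline summarized in the abstract (frame-wise MDCT coefficients, then NMF, then hash bits) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the frame length, NMF rank, or binarization rule, so `frame_len=512`, `rank=4`, the use of MDCT magnitudes, the multiplicative-update NMF, and the row-median threshold are all assumptions chosen for illustration, as are the function names.

```python
import numpy as np

def mdct_frames(signal, frame_len=512):
    """Magnitudes of MDCT coefficients for 50%-overlapped, sine-windowed frames."""
    n_half = frame_len // 2
    n = np.arange(frame_len)
    k = np.arange(n_half)
    window = np.sin(np.pi * (n + 0.5) / frame_len)
    # Direct MDCT basis: cos[(2*pi/N) * (n + 1/2 + N/4) * (k + 1/2)], N = frame_len
    basis = np.cos(2 * np.pi / frame_len * np.outer(k + 0.5, n + 0.5 + n_half / 2))
    n_frames = (len(signal) - frame_len) // n_half + 1
    frames = np.stack([signal[i * n_half: i * n_half + frame_len]
                       for i in range(n_frames)])
    coeffs = (frames * window) @ basis.T          # shape (n_frames, n_half)
    # NMF needs nonnegative input; taking magnitudes is an assumption here.
    return np.abs(coeffs).T                       # shape (n_half, n_frames)

def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """NMF V ~ W @ H via multiplicative updates for the Euclidean cost."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

def perceptual_hash(signal, frame_len=512, rank=4):
    """Hash bits from the NMF activation matrix H (binarization rule is assumed)."""
    V = mdct_frames(np.asarray(signal, dtype=float), frame_len)
    _, H = nmf(V, rank)
    # Assumed rule: bit is 1 where an activation exceeds the median of its row.
    bits = H > np.median(H, axis=1, keepdims=True)
    return bits.astype(np.uint8).ravel()

def bit_error_rate(h1, h2):
    """Fraction of differing bits between two equal-length hashes."""
    return float(np.mean(h1 != h2))

# Usage: compare a synthetic signal with an amplitude-scaled copy.
rng = np.random.default_rng(1)
audio = rng.standard_normal(8192)                 # stand-in for real audio samples
h_ref = perceptual_hash(audio)
h_scaled = perceptual_hash(0.5 * audio)
print("hash length:", h_ref.size)
print("BER after amplitude scaling:", bit_error_rate(h_ref, h_scaled))
```

Two hashes are compared by their bit error rate: a low value suggests the same perceptual content, a value near 0.5 suggests unrelated content. The decision threshold would have to be tuned on real data and is not taken from the paper.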