Volume 31 Issue 6
Nov.  2022
YE Zhaoda, HE Xiangteng, PENG Yuxin, “Unsupervised Cross-Media Hashing Learning via Knowledge Graph,” Chinese Journal of Electronics, vol. 31, no. 6, pp. 1081-1091, 2022, doi: 10.1049/cje.2021.00.455

Unsupervised Cross-Media Hashing Learning via Knowledge Graph

doi: 10.1049/cje.2021.00.455
Funds: This work was supported by the National Natural Science Foundation of China (61925201, 62132001, U21B2025) and the National Key R&D Program of China (2021YFF0901502).
  • Author Bio:

    Zhaoda YE received the B.S. degree in computer science and technology from Peking University in 2018. He is currently pursuing the Ph.D. degree at the Wangxuan Institute of Computer Technology, Peking University. His current research interests include text-to-image generation, cross-media retrieval, and machine learning.

    Xiangteng HE received the Ph.D. degree in computer application technology from Peking University, Beijing, China, in 2020. He is currently a Research Assistant Professor with the Wangxuan Institute of Computer Technology, Peking University. He has authored more than 20 papers in refereed international journals and conference proceedings, including IJCV, IEEE TIP, IEEE TCSVT, CVPR, ACM MM, ACM SIGIR, IJCAI and AAAI. His research interests include multi-modal content analysis, fine-grained visual analysis, image and video recognition and understanding, and computer vision. He was one of the recipients of the 2020 CCF (China Computer Federation) Outstanding Doctoral Dissertation Award and the 2018 Baidu Scholarship.

    Yuxin PENG (corresponding author) received the Ph.D. degree in computer applied technology from Peking University, Beijing, China, in 2003. He is the Boya Distinguished Professor with the Wangxuan Institute of Computer Technology, Peking University. He has authored over 170 papers, including more than 80 papers in top-tier journals and conference proceedings. He has submitted 48 patent applications and been granted 37 of them. His current research interests mainly include cross-media analysis and reasoning, image and video recognition and understanding, and computer vision. He led his team to win First Place in the video semantic search evaluation of TRECVID ten times in recent years. He won the First Prize of the Beijing Technological Invention Award in 2016 (ranking first) and the First Prize of the Scientific and Technological Progress Award of the Chinese Institute of Electronics in 2020 (ranking first). He was a recipient of the National Science Fund for Distinguished Young Scholars of China in 2019, and of the best paper awards at MMM 2019 and NCIG 2018. He serves as an Associate Editor of IEEE TMM, TCSVT, etc. (Email: pengyuxin@pku.edu.cn)

  • Received Date: 2021-12-28
  • Accepted Date: 2022-07-07
  • Available Online: 2022-10-31
  • Publish Date: 2022-11-05
  • Abstract: With the rapid growth of multimedia data, cross-media hashing has become an important technology for fast cross-media retrieval. Because manual annotations are difficult to obtain in real-world applications, unsupervised cross-media hashing is studied to learn hash functions without them. Existing unsupervised cross-media hashing methods generally compute similarities from the features of the multimedia data, so the learned hash codes cannot reflect the semantic relationships among the data, which limits cross-media retrieval accuracy. When humans try to understand multimedia data, knowledge of concept relations plays an important role in obtaining high-level semantics. Inspired by this, we propose a knowledge guided unsupervised cross-media hashing (KGUCH) approach, which applies a knowledge graph to construct high-level semantic correlations for unsupervised cross-media hash learning. Our contributions can be summarized as follows: 1) The knowledge graph is introduced as auxiliary knowledge to construct a semantic graph for the concepts in each image and text instance, which bridges the multimedia data with high-level semantic correlations and improves the accuracy of the learned hash codes for cross-media retrieval. 2) The proposed KGUCH approach constructs correlations among the multimedia data from both the semantic and the feature aspects, exploiting complementary information to promote unsupervised cross-media hash learning. Experiments on three widely used datasets verify the effectiveness of the proposed KGUCH approach.
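The idea of combining feature-level and semantic-level correlation can be illustrated with a small sketch. This is not the KGUCH implementation: the toy graph, the one-hop concept expansion, the Jaccard and cosine measures, the weight `alpha`, and all function names are assumptions chosen only to make the idea concrete. The last function shows how binary hash codes are typically compared at retrieval time.

```python
import math

# Toy knowledge graph: concept -> related concepts (hypothetical data).
TOY_GRAPH = {
    "dog": {"animal", "pet"},
    "cat": {"animal", "pet"},
    "car": {"vehicle"},
}

def expand_concepts(concepts, graph=TOY_GRAPH):
    """Add one-hop neighbours from the graph to a set of concepts."""
    expanded = set(concepts)
    for concept in concepts:
        expanded |= graph.get(concept, set())
    return expanded

def semantic_similarity(concepts_a, concepts_b, graph=TOY_GRAPH):
    """Jaccard overlap of graph-expanded concept sets (an assumed measure)."""
    a = expand_concepts(concepts_a, graph)
    b = expand_concepts(concepts_b, graph)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm = math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v))
    return dot / norm if norm else 0.0

def fused_similarity(feat_a, feat_b, concepts_a, concepts_b, alpha=0.5):
    """Weighted mix of feature and semantic similarity (alpha is hypothetical)."""
    return (alpha * cosine_similarity(feat_a, feat_b)
            + (1 - alpha) * semantic_similarity(concepts_a, concepts_b))

def hamming_distance(code_a, code_b):
    """Number of positions where two equal-length hash codes differ."""
    return sum(a != b for a, b in zip(code_a, code_b))
```

In this sketch, "dog" and "cat" share no surface concept, yet their expanded sets overlap through "animal" and "pet", so the semantic term is non-zero even when feature similarity alone would miss the relation. A fused similarity of this kind could serve as the target that learned hash codes are trained to preserve.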
