Volume 30 Issue 4
Jul. 2021
WANG Zhuohao, WANG Dong, LI Qing, “Keyword Extraction from Scientific Research Projects Based on SRP-TF-IDF,” Chinese Journal of Electronics, vol. 30, no. 4, pp. 652-657, 2021, doi: 10.1049/cje.2021.05.007

Keyword Extraction from Scientific Research Projects Based on SRP-TF-IDF

doi: 10.1049/cje.2021.05.007
  • Received Date: 2021-03-03
    Available Online: 2021-07-19
  • Publish Date: 2021-07-05
  • Keyword extraction by term frequency-inverse document frequency (TF-IDF) is used for text information retrieval and mining in many domains, such as news text, social media text, and medical text. However, keyword extraction in specialized domains, particularly the scientific research field, still needs improvement. The traditional TF-IDF algorithm considers only word frequency in documents and ignores domain characteristics. Therefore, we propose the Scientific Research Project TF-IDF (SRP-TF-IDF) model, which combines TF-IDF with a weight balance algorithm designed to recalculate the weights of candidate keywords. We implemented the SRP-TF-IDF model and verified that it achieves better precision, recall, and F1 score than the traditional TF-IDF and TextRank methods. In addition, we investigated the parameter of the weight balance algorithm to find an optimal value for keyword extraction from scientific research projects.
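For readers unfamiliar with the baseline that the abstract compares against, the following is a minimal sketch of traditional TF-IDF keyword ranking only. It is not the SRP-TF-IDF model: the weight balance algorithm is not described on this page and is not reproduced here. The function names `tf_idf` and `top_keywords`, the unsmoothed `log(N / df)` form of the inverse document frequency, and the toy documents are all assumptions made for illustration.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumed formulation: tf = count / document length,
    idf = log(N / df), with no smoothing.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        total = len(doc)
        scores.append({
            term: (count / total) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


def top_keywords(weights, k=3):
    """Return the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(weights.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


# Hypothetical toy corpus, already tokenized.
docs = [
    ["blockchain", "research", "project", "blockchain"],
    ["research", "project", "management"],
    ["news", "text", "research"],
]
weights = tf_idf(docs)
print(top_keywords(weights[0], k=2))  # ['blockchain', 'project']
```

In this toy corpus "research" occurs in every document, so its weight is zero regardless of how often it appears. This illustrates the limitation noted in the abstract: the score depends only on frequency statistics, with no notion of which terms matter in a given domain.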
