Volume 31 Issue 2
Mar. 2022
HUANG Kaiyu, CAO Jingxiang, LIU Zhuang, et al., "Word-Based Method for Chinese Part-of-Speech via Parallel and Adversarial Network," Chinese Journal of Electronics, vol. 31, no. 2, pp. 337-344, 2022

Word-Based Method for Chinese Part-of-Speech via Parallel and Adversarial Network

doi: 10.1049/cje.2020.00.411
Funds: This work was supported by the National Key Research and Development Program of China (2020AAA0108004) and the National Natural Science Foundation of China (U1936109, 61672127).
    HUANG Kaiyu obtained the Ph.D. degree in computer application technology from Dalian University of Technology, China, in 2021. He received the B.S. degree in computer science and the B.A. degree in Japanese from Dalian University of Technology, China, in 2016. He has published at highly ranked conferences and journals, such as ACL, EMNLP, IJCAI, and ACM TALLIP. His research interests lie in natural language processing, including pre-trained language models, document-level (discourse) neural machine translation, conversational question answering, and text sequence labeling (i.e., CWS, POS, NER). He has also participated in multiple funded research projects. (Email: kaiyuhuang@mail.dlut.edu.cn)

    CAO Jingxiang received the B.A. degree in English for science and technology from Dalian University of Technology (DUT), China, in 1995, the M.A. degree in linguistics and applied linguistics from Dalian Maritime University, China, in 2000, and the Ph.D. degree in computer application technology from DUT in 2013. She made a one-year visit to the University Centre for Computer Corpus Research on Language (UCREL), Lancaster University, U.K., in 2011. She is currently an Associate Professor of linguistics at Dalian University of Technology. Her research interests include corpus linguistics, natural language processing, and machine translation. She is a Member of CCF and ACL. (Email: caojx@dlut.edu.cn)

    LIU Zhuang received the Ph.D. degree in computer science from Dalian University of Technology. He is currently a Lecturer at the School of Applied Finance, Dongbei University of Finance and Economics. His research covers natural language processing (e.g., machine comprehension, question answering, and dialogue generation), graph neural networks, financial text mining, and blockchain. He has published in highly ranked journals such as ACM TIST and at leading international conferences such as IJCAI, ECAI, CIKM, and EMNLP. Before returning to university, he served as the Architect and Deputy Technical Director of financial technology companies including Alipay. He served as Session Chair for data mining at IJCAI 2020 and as Guest Editor of Information Extraction and NLP, and he regularly serves as a Program Committee member and reviewer for journals and conferences such as IEEE TNNLS, IEEE Access, IEEE Signal Processing Letters, AAAI, ACL, NAACL, and EMNLP. (Email: liuzhuang@dufe.edu.cn)

    The corresponding author was born in 1965. He received the Ph.D. degree in computer science from Dalian University of Technology, China, in 2004. He is currently a Professor with the School of Computer Science, Dalian University of Technology. His research interests include machine translation and knowledge graphs. He is a Senior Member of CIPS, ACM, and CAAI. (Email: huangdg@dlut.edu.cn)

  • Received Date: 2020-12-11
  • Accepted Date: 2021-02-24
  • Available Online: 2021-10-18
  • Publish Date: 2022-03-05
Abstract: Chinese part-of-speech (POS) tagging is an essential task for downstream Chinese natural language processing. Word-based methods suffer a sharp drop in accuracy because of segmentation errors and word sparsity. In addition, several Chinese POS tag sets exist with different annotation criteria, and some of them have only a small annotated corpus, which makes models hard to train. To address these problems, we propose a modified word-based transformer neural network architecture and apply an adversarial transfer learning method that splits the architecture into shared and private parts. This work directly improves the word-based model instead of adopting a joint character-based method. Extensive experiments show that our method achieves state-of-the-art performance on all datasets and, more importantly, effectively improves performance on the word-based Chinese sequence labeling task.
