Volume 32 Issue 3
May 2023
Citation: LIU Gongshen, DU Wei, ZHOU Jie, et al., “A Semi-shared Hierarchical Joint Model for Sequence Labeling,” Chinese Journal of Electronics, vol. 32, no. 3, pp. 519-530, 2023, doi: 10.23919/cje.2020.00.363

A Semi-shared Hierarchical Joint Model for Sequence Labeling

doi: 10.23919/cje.2020.00.363
Funds: This work was supported by the Joint Funds of the National Natural Science Foundation of China (U1636112)
  • Author Bio:

    Gongshen LIU received the Ph.D. degree from the Department of Computer Science, Shanghai Jiao Tong University (SJTU), China, in 2003. He is currently a Professor at SJTU. His research interests include natural language processing, machine learning, and artificial intelligence security. (Email: lgshen@sjtu.edu.cn)

    Wei DU received the B.E. degree from Xidian University, Xi’an, China, in 2020. He is currently working toward the Ph.D. degree at the School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai, China. His research interests include natural language processing and artificial intelligence security. (Email: dddddw@sjtu.edu.cn)

    Jie ZHOU received the M.E. degree from the School of Cyber Science and Engineering, Shanghai Jiao Tong University, Shanghai, China, in 2020. She has focused on natural language processing (NLP) since 2018. Her current research interests include machine learning, fundamental NLP tasks, graph neural networks, and recommender systems. (Email: sanny02@sjtu.edu.cn)

    Jing LI received the M.S. degree in computer science from Beijing University of Posts and Telecommunications, China, in 2003. She holds the Professor-Level Engineer certification at State Grid. Her research interests include computer networks and information security. (Email: 1713615427@qq.com)

    Jie CHENG received the M.S. degree in computer application technology from Beijing University of Posts and Telecommunications, China, in 2010. He joined the State Grid Information and Telecommunication Branch in the same year and obtained the CISSP certification in 2020. His main research interests include enterprise-class cybersecurity, threat hunting, and XDR. (Email: 108916685@qq.com)

  • Received Date: 2020-11-01
  • Accepted Date: 2022-02-20
  • Available Online: 2022-04-19
  • Publish Date: 2023-05-05
  • Abstract: Multi-task learning is an essential and practical mechanism for improving overall performance in various machine learning fields. Because linguistic tasks form a natural hierarchy, hierarchical joint models are a common architecture in natural language processing. However, in state-of-the-art hierarchical joint models, higher-level tasks share only bottom layers or latent representations with lower-level tasks, ignoring the correlations between tasks at different levels; that is, lower-level tasks cannot be guided by higher-level features. This paper investigates how to strengthen the correlations among tasks supervised at different layers in an end-to-end hierarchical joint learning model. We propose a semi-shared hierarchical model that contains cross-layer shared modules and layer-specific modules. To fully leverage the mutual information between tasks at different levels, we design four different dataflows of latent representations between the shared and layer-specific modules. Extensive experiments on CTB-7 and CoNLL-2009 show that our semi-shared approach outperforms basic hierarchical joint models on sequence tagging while using far fewer parameters. These results suggest that a proper combination of a cross-layer sharing mechanism and residual shortcuts can improve the performance of hierarchical joint natural language processing models while reducing model complexity. (For illustration, a hypothetical code sketch of such an architecture follows the resource links below.)
  • Resources:
    https://github.com/strubell/LISA
    https://dumps.wikimedia.org/zhwiki/
    https://cl.lingfil.uu.se/nivre/research/Penn2Malt.html
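  • Architecture sketch: As referenced in the abstract, the following is a minimal, hypothetical Python (PyTorch) sketch of one way to combine a cross-layer shared module, layer-specific modules, and residual shortcuts in a semi-shared hierarchical tagger. It is not the authors' released implementation; the class name, the use of BiLSTM encoders, and all dimensions are illustrative assumptions.

    # Minimal illustrative sketch (not the authors' code): one module is
    # shared across all task levels, each level adds its own layer-specific
    # module, and residual shortcuts carry lower-level features upward.
    import torch
    import torch.nn as nn

    class SemiSharedTagger(nn.Module):
        def __init__(self, vocab_size, d_model, tag_sizes):
            super().__init__()
            assert d_model % 2 == 0  # each BiLSTM direction outputs d_model // 2
            self.embed = nn.Embedding(vocab_size, d_model)
            # Cross-layer shared module: reused at every task level.
            self.shared = nn.LSTM(d_model, d_model // 2,
                                  batch_first=True, bidirectional=True)
            # Layer-specific modules: one per task level
            # (e.g., word segmentation, POS tagging, parsing).
            self.specific = nn.ModuleList(
                nn.LSTM(d_model, d_model // 2,
                        batch_first=True, bidirectional=True)
                for _ in tag_sizes)
            self.heads = nn.ModuleList(nn.Linear(d_model, n) for n in tag_sizes)

        def forward(self, tokens):
            h = self.embed(tokens)                  # (batch, seq, d_model)
            logits = []
            for specific, head in zip(self.specific, self.heads):
                shared_out, _ = self.shared(h)      # cross-layer shared dataflow
                private_out, _ = specific(h)        # layer-specific dataflow
                h = h + shared_out + private_out    # residual shortcut upward
                logits.append(head(h))              # supervision at this level
            return logits

    # Example: three levels with 4, 32, and 20 tags respectively.
    model = SemiSharedTagger(vocab_size=10000, d_model=256, tag_sizes=[4, 32, 20])
    per_level_logits = model(torch.randint(0, 10000, (2, 7)))

    Because the shared module appears in every level's forward pass, it receives gradients from every level's loss, so lower-level representations are also shaped by higher-level supervision, which is the correlation effect the abstract describes.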
