Volume 31 Issue 5
Sep.  2022
WEN Liang, SHI Haibo, ZHANG Xiaodong, et al., “Learning to Combine Answer Boundary Detection and Answer Re-ranking for Phrase-Indexed Question Answering,” Chinese Journal of Electronics, vol. 31, no. 5, pp. 938-948, 2022, doi: 10.1049/cje.2021.00.079

Learning to Combine Answer Boundary Detection and Answer Re-ranking for Phrase-Indexed Question Answering

doi: 10.1049/cje.2021.00.079
Funds: This work was supported by the National Natural Science Foundation of China (62036001, 62032001) and the PKU-Baidu Fund (2020BD021).
  • Author Bio:

    was born in 1990. He is working toward the Ph.D. degree at the School of Electronics Engineering and Computer Science, Peking University. His interests include natural language processing and machine learning. Currently, his research areas include question answering and information retrieval. (Email: yuco@pku.edu.cn)

    is working as a Senior R&D Engineer in the ranking group at Baidu Inc. He graduated from EECS, Peking University, supervised by Prof. Chao Xu. His main research interests include machine learning, natural language processing, and question answering. (Email: shihaibo@baidu.com)

    was born in 1990. He received the Ph.D. degree in computer science from Peking University, Beijing, China. He is currently an R&D Engineer at Baidu Inc. His research interests include question answering, dialogue systems, and machine learning. (Email: zhangxiaodong11@baidu.com)

    was born in 1995. He received the B.E. degree in computer science from Sun Yat-sen University, China. He is currently a Ph.D. candidate at MOE Key Laboratory of Computational Linguistics, Peking University. His main research interests include grammatical error correction and sentence rewriting. (Email: sunx5@pku.edu.cn)

    (corresponding author) is a Professor with the School of Electronics Engineering and Computer Science, Peking University (PKU). He is currently the Director of the Institute of Computational Linguistics of PKU. His research interests include natural language processing and machine learning. (Email: wanghf@pku.edu.cn)

  • Received Date: 2021-03-01
  • Accepted Date: 2021-08-31
  • Rev Recd Date: 2021-08-31
  • Available Online: 2021-11-09
  • Publish Date: 2022-09-05
  • Phrase-indexed question answering (PIQA) seeks to improve the inference speed of question answering (QA) models by enforcing complete independence of the document encoder from the question encoder. It shows that a model constrained in this way gains significant efficiency at the cost of accuracy. In this paper, we aim to build a model under the PIQA constraint while narrowing its accuracy gap with unconstrained QA models. We propose a novel framework, AnsDR, which consists of an answer boundary detector (AnsD) and an answer candidate ranker (AnsR). More specifically, AnsD is a QA model under the PIQA architecture that identifies rough answer boundaries, and AnsR is a lightweight ranking model that re-ranks the potential candidates at a finer granularity without losing efficiency. We perform extensive experiments on public datasets. The experimental results show that the proposed method achieves state-of-the-art performance on the PIQA task.
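The independence constraint described in the abstract can be illustrated with a minimal sketch. It assumes, as in the PIQA formulation, that a question is answered by an inner-product search over phrase vectors computed from the document alone. The function names, the toy phrases, and all vector values below are hypothetical and are not taken from the paper; this is not the AnsDR model itself.

```python
from typing import Dict, List

Vector = List[float]


def inner_product(u: Vector, v: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def answer_question(question_vector: Vector, phrase_index: Dict[str, Vector]) -> str:
    """Return the indexed phrase whose vector has the largest inner product
    with the question vector."""
    return max(
        phrase_index,
        key=lambda phrase: inner_product(question_vector, phrase_index[phrase]),
    )


# Offline step: phrase vectors depend on the document only (toy values, assumed).
phrase_index = {
    "Peking University": [0.9, 0.1, 0.0],
    "2022": [0.0, 0.2, 0.9],
    "question answering": [0.1, 0.9, 0.1],
}

# Online step: only the question is encoded; the index is reused unchanged.
where_question = [1.0, 0.0, 0.0]  # hypothetical encoding of a "where" question
when_question = [0.0, 0.0, 1.0]   # hypothetical encoding of a "when" question

print(answer_question(where_question, phrase_index))
print(answer_question(when_question, phrase_index))
```

Because the index never sees a question, it can be built once and shared across all queries, which is where the efficiency gain comes from; the price is that question and document cannot attend to each other.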
  • Note that the accuracy of both boundaries is different from the Exact Match metric[2]. Exact Match checks whether the predicted answer span is literally the same string as the target answer span, and it does not take the position of the answer span into consideration.
    Theoretically, for a document with $ m $ words, the number of all possible answer phrases is $ O\left(m^{2}\right) $. In practice, to efficiently compute and store the answer phrase representations, mainstream approaches represent each answer phrase as the concatenation of the corresponding start-word and end-word representations.
    For convenience and simplicity, we do not emphasize the difference between sub-word tokens and word tokens.
    As in AnsDR, we first run our reimplemented “DENSPI” model to obtain rough answer boundaries and adopt the large candidate expansion strategy to expand candidates. Then, we use our answer re-ranker, which was jointly trained with “DENSPI”, to select the best candidate.
    So far, there are no public evaluation results on the NewsQA dataset that follow the independence restrictions from PIQA.
    For some questions, more than one answer is correct.
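The note above on phrase representations can be made concrete with a small sketch. It counts the contiguous spans of an $ m $-word document and shows that, when a phrase vector is the concatenation of its start-word and end-word vectors, the inner-product score splits into a start term and an end term, so only the $ m $ word vectors need to be stored. The split of the question vector into `q_start` and `q_end`, and all numeric values, are assumptions made for illustration.

```python
from typing import List

Vector = List[float]


def count_spans(m: int) -> int:
    """Number of contiguous spans (start <= end) in a document of m words."""
    return m * (m + 1) // 2


def dot(u: Vector, v: Vector) -> float:
    """Plain dot product between two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def phrase_vector(word_vectors: List[Vector], start: int, end: int) -> Vector:
    """Represent span [start, end] as [start-word vector ; end-word vector]."""
    return word_vectors[start] + word_vectors[end]  # list concatenation


def span_score(q_start: Vector, q_end: Vector,
               word_vectors: List[Vector], start: int, end: int) -> float:
    """Score a span directly from the stored word vectors, without
    materializing its concatenated phrase vector."""
    return dot(q_start, word_vectors[start]) + dot(q_end, word_vectors[end])


word_vectors = [[1.0, 0.0], [0.0, 2.0], [3.0, 1.0]]  # toy document, m = 3
q_start, q_end = [1.0, 1.0], [0.0, 2.0]              # hypothetical question halves

full = dot(q_start + q_end, phrase_vector(word_vectors, 0, 2))
split = span_score(q_start, q_end, word_vectors, 0, 2)
print(count_spans(len(word_vectors)), full, split)
```

The two scores agree because the inner product of two concatenated vectors is the sum of the inner products of their halves.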
  • [1]
    F. Hill, A. Bordes, S. Chopra, et al., “The Goldilocks principle: Reading children’s books with explicit memory representations,” in Proceedings of the 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, USA, pp.1–13, 2016.
    [2]
    P. Rajpurkar, J. Zhang, K. Lopyrev, et al., “SQuAD: 100,000+ questions for machine comprehension of text,” in Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing, Austin, Texas, USA, pp.2383–2392, 2016.
    [3]
    D. Chen, J. Bolton, and C. D. Manning, “A thorough examination of the CNN/Daily Mail reading comprehension task,” in Proceedings of the 54th Annual Meeting of the Association for Computational Linguistics, Berlin, Germany, pp.2358–2367, 2016.
    [4]
    S. Wang and J. Jiang, “Machine comprehension using Match-LSTM and answer pointer,” in Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, pp.1–11, 2017.
    [5]
    M. Joon Seo, A. Kembhavi, A. Farhadi, et al., “Bidirectional attention flow for machine comprehension,” in Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, pp.1–13, 2017.
    [6]
    C. Xiong, V. Zhong, and R. Socher, “Dynamic coattention networks for question answering,” in Proceedings of the 5th International Conference on Learning Representations, ICLR 2017, Toulon, France, pp.1–14, 2017.
    [7]
    Y. Cui, Z. Chen, S. Wei, et al., “Attention-over-attention neural networks for reading comprehension,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, pp.593–602, 2017.
    [8]
    W. Wang, N. Yang, F. Wei, et al., “Gated self-matching networks for reading comprehension and question answering,” in Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics, Vancouver, Canada, pp.189–198, 2017.
    [9]
    A. Wei Yu, D. Dohan, M. Luong, et al., “QANet: Combining local convolution with global self-attention for reading comprehension,” in Proceedings of the 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, pp.1–16, 2018.
    [10]
    M. Hu, Y. Peng, Z. Huang, et al., “Reinforced mnemonic reader for machine reading comprehension,” in Proceedings of the 27th International Joint Conference on Artificial Intelligence, Stockholm, Sweden, pp.4099–4106, 2018.
    [11]
    J. Devlin, M. Chang, K. Lee, et al., “BERT: Pre-training of deep bidirectional transformers for language understanding,” in Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, MN, USA, pp.4171–4186, 2019.
    [12]
    M. Joon Seo, T. Kwiatkowski, A. P. Parikh, et al., “Phrase-indexed question answering: A new challenge for scalable document comprehension,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, pp.559–564, 2018.
    [13]
    M. Joon Seo, J. Lee, T. Kwiatkowski, et al., “Real-time open-domain question answering with dense-sparse phrase index,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, pp.4430–4441, 2019.
    [14]
    M. Joshi, D. Chen, Y. Liu, et al., “SpanBERT: Improving pre-training by representing and predicting spans,” Transactions of the Association for Computational Linguistics, vol.8, pp.64–77, 2020. doi: 10.1162/tacl_a_00300
    [15]
    A. Trischler, T. Wang, X. Yuan, et al., “NewsQA: A machine comprehension dataset,” in Proceedings of the 2nd Workshop on Representation Learning for NLP, Vancouver, Canada, pp.191–200, 2017.
    [16]
    A. Fisch, A. Talmor, R. Jia, et al., “MRQA 2019 shared task: Evaluating generalization in reading comprehension,” in Proceedings of the 2nd Workshop on Machine Reading for Question Answering, Hong Kong, China, pp.1–13, 2019.
    [17]
    M. E. Peters, M. Neumann, M. Iyyer, et al., “Deep contextualized word representations,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, pp.2227–2237, 2018.
    [18]
    G. Lai, Q. Xie, H. Liu, et al., “RACE: Large-scale reading comprehension dataset from examinations,” in Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, Copenhagen, Denmark, pp.785–794, 2017.
    [19]
    P. Rajpurkar, R. Jia, and P. Liang, “Know what you don’t know: Unanswerable questions for SQuAD,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, pp.784–789, 2018.
    [20]
    S. Reddy, D. Chen, and C. D. Manning, “CoQA: A conversational question answering challenge,” Transactions of the Association for Computational Linguistics, vol.7, pp.249–266, 2019. doi: 10.1162/tacl_a_00266
    [21]
    Z. Yang, P. Qi, S. Zhang, et al., “HotpotQA: A dataset for diverse, explainable multi-hop question answering,” in Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, Brussels, Belgium, pp.2369–2380, 2018.
    [22]
    S. Salant and J. Berant, “Contextualized word representations for reading comprehension,” in Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT, New Orleans, Louisiana, USA, pp.554–559, 2018.
    [23]
    Z. Yang, Z. Dai, Y. Yang, et al., “XLNet: Generalized autoregressive pretraining for language understanding,” in Advances in Neural Information Processing Systems, Vancouver, BC, Canada, pp.5753–5763, 2019.
    [24]
    S. Wang, M. Yu, X. Guo, et al., “R3: Reinforced reader-ranker for open-domain question answering,” in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, AAAI-18, New Orleans, Louisiana, USA, pp.5981–5988, 2018.
    [25]
    S. Wang, M. Yu, J. Jiang, et al., “Evidence aggregation for answer re-ranking in open-domain question answering,” in Proceedings of the 6th International Conference on Learning Representations, ICLR 2018, Vancouver, BC, Canada, pp.1–14, 2018.
    [26]
    Z. Wang, J. Liu, X. Xiao, et al., “Joint training of candidate extraction and answer selection for reading comprehension,” in Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, pp.1715–1724, 2018.
    [27]
    B. Kratzwald, A. Eigenmann, and S. Feuerriegel, “RankQA: Neural question answering with answer re-ranking,” in Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, pp.6076–6085, 2019.