Pre-trained language models have been widely adopted as backbones for various natural language processing tasks. However, existing pre-trained language models ignore descriptive meta-information in the text, such as the distinction between the title and the main body, leading to over-weighted attention to insignificant text. In this paper, we propose a hypernetwork-based architecture that models this descriptive meta-information and integrates it into pre-trained language models. Evaluations on three natural language processing tasks show that our method notably improves the performance of pre-trained language models and achieves state-of-the-art results on keyphrase extraction.
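To make the core idea concrete: a hypernetwork is a network that generates the parameters of another network from some conditioning input, here an embedding of the descriptive meta-information (e.g. whether a token belongs to the title or the main body). The sketch below is a minimal illustration of that mechanism, not the paper's actual architecture; all names, dimensions, and the choice of a simple linear adapter are assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
meta_dim, hidden = 4, 8  # illustrative sizes, not from the paper

# Hypernetwork parameters: linear maps from a meta-information embedding
# to the weights and bias of a small per-token adapter layer.
W_gen = rng.standard_normal((meta_dim, hidden * hidden)) * 0.05
b_gen = rng.standard_normal((meta_dim, hidden)) * 0.05

def adapter_from_meta(meta):
    """Generate adapter weights (W, b) from a meta-information embedding."""
    W = (meta @ W_gen).reshape(hidden, hidden)
    b = meta @ b_gen
    return W, b

# Two kinds of descriptive meta-information, e.g. "title" vs. "main body",
# each represented by its own (here random, hypothetical) embedding.
meta_title = rng.standard_normal(meta_dim)
meta_body = rng.standard_normal(meta_dim)

token_states = rng.standard_normal((5, hidden))  # 5 token hidden states

W_t, b_t = adapter_from_meta(meta_title)
W_b, b_b = adapter_from_meta(meta_body)

# The same token representations are transformed differently depending on
# which meta-information conditions the generated adapter.
out_title = token_states @ W_t + b_t
out_body = token_states @ W_b + b_b
print(out_title.shape)  # (5, 8)
```

In this way the backbone can weight text differently by its role (title vs. main body) without maintaining separate copies of the adapter for every meta-information type: the hypernetwork produces the appropriate parameters on the fly.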
Cite as: Duan, W., He, X., Zhou, Z., Rao, H., Thiele, L. (2021) Injecting Descriptive Meta-Information into Pre-Trained Language Models with Hypernetworks. Proc. Interspeech 2021, 3216-3220, doi: 10.21437/Interspeech.2021-229
@inproceedings{duan21_interspeech,
  author={Wenying Duan and Xiaoxi He and Zimu Zhou and Hong Rao and Lothar Thiele},
  title={{Injecting Descriptive Meta-Information into Pre-Trained Language Models with Hypernetworks}},
  year={2021},
  booktitle={Proc. Interspeech 2021},
  pages={3216--3220},
  doi={10.21437/Interspeech.2021-229}
}