
Continual Learning of Long Topic Sequences in Neural Information Retrieval

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13185)

Abstract

In information retrieval (IR) systems, trends and users’ interests may change over time, altering either the distribution of requests or the content to be recommended. Since neural ranking approaches heavily depend on their training data, it is crucial to understand the transfer capacity of recent IR approaches to address new domains in the long term. In this paper, we first propose a dataset based upon the MSMarco corpus aiming at modeling a long stream of topics as well as IR property-driven controlled settings. We then analyze in depth the ability of recent neural IR models to continually learn from those streams. Our empirical study highlights the particular cases in which catastrophic forgetting occurs (e.g., level of similarity between tasks, peculiarities of text length, and ways of learning models) in order to provide future directions in terms of model design.
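
As a purely illustrative sketch of how such a long stream of topics could be derived from query text (the paper’s own construction is only summarized on this page), the snippet below groups a handful of MSMarco-style queries with the fast-clustering utility of sentence-transformers, the library referenced in the notes below, and treats each resulting cluster as one task of the stream. The encoder name, similarity threshold, and minimum cluster size are assumptions for the example, not values reported in the paper.

    # Illustrative sketch only: cluster queries into topics to form a task stream.
    # Encoder choice, threshold, and minimum cluster size are assumptions.
    from sentence_transformers import SentenceTransformer, util

    queries = [
        "what is the capital of france",
        "paris population 2020",
        "symptoms of the flu",
        "how long does influenza last",
        "python read csv file",
        "pandas load csv example",
    ]

    encoder = SentenceTransformer("all-MiniLM-L6-v2")  # hypothetical encoder
    embeddings = encoder.encode(queries, convert_to_tensor=True)

    # Fast clustering ("community detection") from sentence-transformers:
    # returns a list of clusters, each a list of query indices.
    clusters = util.community_detection(embeddings, threshold=0.6, min_community_size=2)

    # Order the clusters into a long stream of topic-specific tasks.
    topic_stream = [[queries[i] for i in cluster] for cluster in clusters]
    for task_id, task_queries in enumerate(topic_stream):
        print(f"Task {task_id}: {task_queries}")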


Notes

  1. https://huggingface.co/transformers/model_doc/t5.html.

  2. Using the bert-base-uncased pretrained model.

  3. https://github.com/tgeral68/continual_learning_of_long_topic.

  4. https://www.sbert.net/examples/applications/clustering (fast clustering).

  5. Implemented in pyserini: https://github.com/castorini/pyserini (see the sketch after these notes).

  6. https://pypi.org/project/k-means-constrained/.

  7. If this is not the case, we sample one document to build the query-relevant document pairs.
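
Note 5 above points to pyserini for the BM25 component. As a hedged illustration of how candidate passages could be retrieved per query with that library, the sketch below searches a prebuilt MS MARCO passage index; the index name and retrieval depth are assumptions for the example, not settings taken from the paper.

    # Illustrative sketch only: BM25 retrieval with pyserini to collect candidate
    # passages for a query. Index name and depth are assumptions.
    from pyserini.search.lucene import LuceneSearcher

    # Prebuilt MS MARCO passage index shipped with pyserini (downloaded on first use).
    searcher = LuceneSearcher.from_prebuilt_index("msmarco-v1-passage")

    query = "what is the capital of france"
    hits = searcher.search(query, k=10)  # top-10 BM25 candidates

    for rank, hit in enumerate(hits, start=1):
        print(f"{rank:2d}. docid={hit.docid} score={hit.score:.3f}")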


Acknowledgements

We thank the ANR JCJC SESAMS project (ANR-18-CE23-0001) for supporting this work. This work was performed using HPC resources from GENCI-IDRIS (Grant 2021-101681).

Author information


Corresponding author

Correspondence to Thomas Gerald.



Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper


Cite this paper

Gerald, T., Soulier, L. (2022). Continual Learning of Long Topic Sequences in Neural Information Retrieval. In: Hagen, M., et al. Advances in Information Retrieval. ECIR 2022. Lecture Notes in Computer Science, vol 13185. Springer, Cham. https://doi.org/10.1007/978-3-030-99736-6_17


  • DOI: https://doi.org/10.1007/978-3-030-99736-6_17


  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-99735-9

  • Online ISBN: 978-3-030-99736-6

  • eBook Packages: Computer Science, Computer Science (R0)
