Dataset Creation Framework for Personalized Type-Based Facet Ranking Tasks Evaluation

  • Conference paper
Experimental IR Meets Multilinguality, Multimodality, and Interaction (CLEF 2021)

Abstract

Faceted Search Systems (FSS) have gained prominence in many vertical search systems. They provide facets that help users locate their desired search target quickly. In this paper, we present a framework to generate datasets appropriate for simulation-based evaluation of these systems. We focus on the task of personalized type-based facet ranking. Type-based facets (t-facets) represent the categories of the resources being searched in the FSS, and they are usually organized in a large multilevel taxonomy. Personalized t-facet ranking methods aim to identify and rank the parts of the taxonomy that reflect both query relevance and user interests. While evaluation protocols have been developed for facet ranking, the problem of personalising facet ranking based on user profiles has lagged behind due to the lack of appropriate datasets. To fill this gap, this paper introduces a framework to reuse and customise existing real-life data collections. The framework outlines the eligibility criteria and the data structure requirements needed for this task. It also details the process of transforming the data into a ground-truth dataset. We apply this framework to two existing data collections in the domain of Point-of-Interest (POI) suggestion. The generated datasets are analysed with respect to taxonomy richness (variety of types) and user profile diversity and length. To experiment with the generated datasets, we combine this framework with a widely adopted simulated user-facet interaction model to evaluate a number of existing personalized t-facet ranking baselines.
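To make the task concrete, the following minimal Python sketch ranks t-facets by mixing a query-relevance signal with a user-interest signal. It is an illustration only and is not the method or any baseline from the paper: the function name `rank_t_facets`, the linear mixing weight `alpha`, the max-normalisation, and the toy data are all assumptions, and the multilevel taxonomy is flattened to a single level for brevity.

```python
from collections import defaultdict


def rank_t_facets(candidates, user_history, alpha=0.5):
    """Rank type-based facets for one query and one user (hypothetical sketch).

    candidates:   list of (relevance_score, [types]) for the retrieved documents
    user_history: list of (rating, [types]) for documents the user rated before
    alpha:        assumed weight of query relevance versus user interest
    """
    # Query relevance of a type: summed relevance of retrieved documents with that type.
    relevance = defaultdict(float)
    for score, types in candidates:
        for t in types:
            relevance[t] += score

    # User interest in a type: summed ratings the user gave to documents of that type.
    interest = defaultdict(float)
    for rating, types in user_history:
        for t in types:
            interest[t] += rating

    def normalise(scores):
        # Scale scores to [0, 1] by dividing by the largest value.
        top = max(scores.values(), default=0.0)
        return {t: (s / top if top > 0 else 0.0) for t, s in scores.items()}

    relevance = normalise(relevance)
    interest = normalise(interest)

    # Only types attached to retrieved documents can be shown as facets.
    combined = {
        t: alpha * relevance[t] + (1 - alpha) * interest.get(t, 0.0)
        for t in relevance
    }
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(combined, key=lambda t: (-combined[t], t))


# Toy data (invented for illustration).
candidates = [(0.9, ["Cafe"]), (0.8, ["Museum"]), (0.4, ["Cafe", "Bakery"])]
user_history = [(5, ["Museum"]), (4, ["Museum"]), (2, ["Cafe"])]

print(rank_t_facets(candidates, user_history))
print(rank_t_facets(candidates, user_history, alpha=1.0))
```

With the toy data, the personalized call places "Museum" ahead of "Cafe" because of the user's past ratings, while `alpha=1.0` ignores the profile and ranks by query relevance alone.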

Notes

  1. In the scope of this work, the term ‘documents’ refers to the information objects being searched. Depending on the FSS domain, documents can be places, web pages, products, books, images, etc.

  2. How the document ranking is performed is outside the scope of this research.

  3. User picks are the user’s interactions with the system that express a preference, such as a rating, review, or feedback.

  4. https://www.yelp.com/dataset, accessed June 2021.

  5. https://developer.foursquare.com/docs/resources/categories, version: 20180323.

  6. https://www.yelp.com/developers/documentation/v3/all_category_list/categories.json.

  7. https://github.com/csurfer/rake-nltk.

Acknowledgements

This work was supported by the ADAPT Centre, funded by Science Foundation Ireland Research Centres Programme (Grant 13/RC/2106; 13/RC/2106_P2) and co-funded by the European Regional Development Fund.

Author information

Corresponding authors

Correspondence to Esraa Ali or Annalina Caputo.

Copyright information

© 2021 Springer Nature Switzerland AG

About this paper

Cite this paper

Ali, E., Caputo, A., Lawless, S., Conlan, O. (2021). Dataset Creation Framework for Personalized Type-Based Facet Ranking Tasks Evaluation. In: Candan, K.S., et al. Experimental IR Meets Multilinguality, Multimodality, and Interaction. CLEF 2021. Lecture Notes in Computer Science, vol 12880. Springer, Cham. https://doi.org/10.1007/978-3-030-85251-1_3

  • DOI: https://doi.org/10.1007/978-3-030-85251-1_3

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-85250-4

  • Online ISBN: 978-3-030-85251-1

  • eBook Packages: Computer Science, Computer Science (R0)
