short-paper

Booking.com Multi-Destination Trips Dataset

Authors:
Dmitri Goldenberg (Booking.com, Tel Aviv, Israel)
Pavel Levin (Booking.com, Amsterdam, Netherlands)

SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval, July 2021, Pages 2457–2462. https://doi.org/10.1145/3404835.3463240

Published: 11 July 2021

ABSTRACT

We introduce a novel dataset of real multi-destination trips booked through Booking.com's online travel platform. The dataset consists of 1.5 million reservations representing 359,000 unique journeys made across 39,000 destinations. As such, the data is particularly well suited to modeling sequential recommendation and retrieval problems in a high-cardinality target space. To preserve user privacy and protect business-sensitive statistics, the data is fully anonymized, sampled, and limited to five user origin markets. Even so, the dataset is representative of general travel purchase behavior and therefore presents a uniquely valuable resource for machine learning and information retrieval researchers. This work provides an overview of the dataset and reports several benchmark results for relevant recommendation problems, obtained as part of the recently held Booking.com data challenge during the WSDM WebTour workshop.
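To make the sequential recommendation framing concrete, the following is a minimal sketch of a next-destination baseline. It is not the authors' method or any of the benchmark systems from the challenge. It assumes each trip is already available as an ordered list of integer destination IDs; the toy trips, the function names, and the cutoff `k` are all hypothetical and chosen for illustration only.

```python
from collections import Counter, defaultdict

def fit_transitions(trips):
    """Count how often each destination is directly followed by another."""
    transitions = defaultdict(Counter)
    for cities in trips:
        for prev_city, next_city in zip(cities, cities[1:]):
            transitions[prev_city][next_city] += 1
    return transitions

def recommend(transitions, popularity, history, k=4):
    """Rank candidates by transition count from the last visited destination,
    then pad with globally popular destinations until k items are returned."""
    last_city = history[-1]
    ranked = [c for c, _ in transitions.get(last_city, Counter()).most_common(k)]
    for city, _ in popularity.most_common():
        if len(ranked) >= k:
            break
        if city not in ranked:
            ranked.append(city)
    return ranked

def hit_rate_at_k(transitions, popularity, trips, k=4):
    """Fraction of trips whose final destination appears in the top-k list
    predicted from the preceding destinations."""
    hits = 0
    for cities in trips:
        history, target = cities[:-1], cities[-1]
        hits += target in recommend(transitions, popularity, history, k)
    return hits / len(trips)

# Toy data (hypothetical): each inner list is one trip, in visiting order.
train_trips = [[1, 2, 3], [1, 2, 4], [5, 2, 3], [6, 7, 1]]
test_trips = [[5, 2, 3], [1, 2, 4], [6, 7, 9]]

transitions = fit_transitions(train_trips)
popularity = Counter(city for trip in train_trips for city in trip)

top2 = recommend(transitions, popularity, [5, 2], k=2)
score = hit_rate_at_k(transitions, popularity, test_trips, k=2)
print(top2, round(score, 3))
```

A first-order count model like this ignores everything except the last destination. With tens of thousands of candidate destinations, most transition counts will be sparse, which is one reason the task is usually approached with learned sequence models.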

Index Terms

1. Information systems
  1. Information retrieval
    1. Retrieval tasks and goals
      1. Recommender systems
  2. World Wide Web
    1. Web searching and information discovery
      1. Personalization

Published in
SIGIR '21: Proceedings of the 44th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2021
2998 pages
ISBN:9781450380379
DOI:10.1145/3404835
Author Tags
dataset
personalization
recommender systems
travel