poster

Scheduling queries across replicas

Published: 12 August 2012

ABSTRACT

For increased efficiency, an information retrieval system can split its index into multiple shards and then replicate these shards across many query servers. For each new query, an appropriate replica of each shard must be selected so that the query is answered as quickly as possible. Typically, the replica with the fewest queued queries is selected. However, not every query takes the same time to execute, particularly if each query server applies a dynamic pruning strategy. Hence, a replica's queue length is an inaccurate indicator of its workload, and relying on it can result in inefficient use of the replicas. In this work, we propose that improved replica selection can be obtained by using query efficiency prediction to measure the expected workload of a replica. Experiments are conducted using 2.2k queries over various numbers of shards and replicas for the large GOV2 collection. Our results show that query waiting and completion times can be markedly reduced, demonstrating that accurate response time predictions improve scheduling and attesting to the benefit of the proposed scheduling algorithm.
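
The contrast drawn in the abstract between selecting by queue length and selecting by expected workload can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Replica` class, the function names, and the idea of summing per-query predicted times (in milliseconds) as the workload measure are all assumptions made for illustration, and the predicted times are supplied by hand rather than produced by a query efficiency predictor.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Replica:
    """Hypothetical replica of one shard (names are illustrative only)."""
    name: str
    # Predicted execution times, in ms, of the queries waiting in the queue.
    queued_predictions: List[float] = field(default_factory=list)

    def queue_length(self) -> int:
        return len(self.queued_predictions)

    def predicted_workload(self) -> float:
        # Assumption: expected workload is the sum of the predicted times.
        return sum(self.queued_predictions)


def select_by_queue_length(replicas: List[Replica]) -> Replica:
    """Baseline: pick the replica with the fewest queued queries."""
    return min(replicas, key=lambda r: r.queue_length())


def select_by_predicted_workload(replicas: List[Replica]) -> Replica:
    """Pick the replica whose queued queries are predicted to finish soonest."""
    return min(replicas, key=lambda r: r.predicted_workload())


def dispatch(replicas: List[Replica], predicted_ms: float,
             select: Callable[[List[Replica]], Replica]) -> Replica:
    """Choose a replica with `select` and enqueue the new query on it."""
    chosen = select(replicas)
    chosen.queued_predictions.append(predicted_ms)
    return chosen


if __name__ == "__main__":
    # Replica "a" holds one slow query; replica "b" holds two fast ones.
    replicas = [Replica("a", [500.0]), Replica("b", [20.0, 30.0])]
    print(select_by_queue_length(replicas).name)
    print(select_by_predicted_workload(replicas).name)
```

In this toy setup the queue-length baseline chooses replica `a` because it holds only one query, while the workload-based rule chooses replica `b` because its two queued queries are predicted to take 50 ms in total against 500 ms for `a`.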

    • Published in

      SIGIR '12: Proceedings of the 35th international ACM SIGIR conference on Research and development in information retrieval
      August 2012
      1236 pages
      ISBN:9781450314725
      DOI:10.1145/2348283

      Copyright © 2012 Authors

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Acceptance Rates

      Overall Acceptance Rate: 792 of 3,983 submissions, 20%
