Approximate counting of inversions in a data stream

Authors:
Miklós Ajtai, IBM Almaden Research Center, San Jose, CA
T. S. Jayram, IBM Almaden Research Center, San Jose, CA
Ravi Kumar, IBM Almaden Research Center, San Jose, CA
D. Sivakumar, IBM Almaden Research Center, San Jose, CA

STOC '02: Proceedings of the thirty-fourth annual ACM symposium on Theory of computing, May 2002, Pages 370–379
https://doi.org/10.1145/509907.509964

Published: 19 May 2002

ABSTRACT

Inversions are used as a fundamental quantity to measure the sortedness of data, to evaluate different ranking methods for databases, and in the context of rank aggregation. Considering the volume of the data sets in these applications, the data stream model [14, 2] is a natural setting to design efficient algorithms.

We obtain a suite of space-efficient streaming algorithms for approximating the number of inversions in a permutation. The best space bound we achieve is $O(\log n \log \log n)$ through a deterministic algorithm. In contrast, we derive an $\Omega(n)$ lower bound for randomized exact computation for this problem; thus approximation is essential.

We also consider two generalizations of this problem: (1) approximating the number of inversions between two permutations, for which we obtain a randomized $O(\sqrt{n} \log n)$-space algorithm, and (2) approximating the number of inversions in a general list, for which we obtain a randomized $O(\sqrt{n} \log^2 n)$-space two-pass algorithm. In contrast, we derive $\Omega(n)$-space lower bounds for deterministic approximate computation for these problems; thus both randomization and approximation are essential.

All our algorithms use only $O(\log n)$ time per data item.
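To make the quantity concrete, the sketch below computes the exact number of inversions, that is, pairs of positions i < j whose values appear in decreasing order. It is not one of the streaming algorithms from the paper: it is the standard merge-sort method, which stores the whole input and therefore uses linear space. The function names `count_inversions` and `inversions_between` are illustrative assumptions, not names taken from the paper.

```python
from typing import Hashable, List, Sequence, Tuple


def count_inversions(a: Sequence[int]) -> int:
    """Exact number of pairs i < j with a[i] > a[j], via merge sort.

    Offline baseline only: it keeps the entire input in memory.
    """
    def sort_count(xs: List[int]) -> Tuple[List[int], int]:
        if len(xs) <= 1:
            return xs, 0
        mid = len(xs) // 2
        left, inv_left = sort_count(xs[:mid])
        right, inv_right = sort_count(xs[mid:])
        merged: List[int] = []
        inv = inv_left + inv_right
        i = j = 0
        while i < len(left) and j < len(right):
            if left[i] <= right[j]:
                merged.append(left[i])
                i += 1
            else:
                # right[j] is smaller than every remaining element of left,
                # so it forms an inversion with each of them.
                merged.append(right[j])
                j += 1
                inv += len(left) - i
        merged.extend(left[i:])
        merged.extend(right[j:])
        return merged, inv

    return sort_count(list(a))[1]


def inversions_between(sigma: Sequence[Hashable], pi: Sequence[Hashable]) -> int:
    """Number of item pairs that sigma and pi order differently.

    Assumes sigma and pi are permutations of the same distinct items.
    """
    position_in_pi = {item: k for k, item in enumerate(pi)}
    # Relabel sigma by positions in pi; inversions of the relabeled
    # sequence are exactly the pairs on which the two orders disagree.
    return count_inversions([position_in_pi[item] for item in sigma])
```

For example, `count_inversions([3, 1, 2])` counts the pairs (3, 1) and (3, 2), and `inversions_between` reduces the two-permutation generalization to the single-sequence case by relabeling. Both routines are exact and need space linear in n, which is the regime the abstract's $\Omega(n)$ lower bound says cannot be avoided without approximation.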

References

N. Alon, L. Babai, and A. Itai. A fast and simple randomized parallel algorithm for the maximal independent set problem. J. Algorithms, 7(4):567--583, 1986.
N. Alon, Y. Matias, and M. Szegedy. The space complexity of approximating the frequency moments. JCSS, 58(1):137--147, 1999.
A. Andersson and O. Petersson. Approximate indexed lists. J. Algorithms, 29(2):256--276, 1998.
S. Brin and L. Page. The anatomy of a large-scale hypertextual Web search engine. WWW7/Computer Networks, 30(1-7):107--117, 1998.
A. Caprara. Sorting permutations by reversals and Eulerian cycle decompositions. SIAM J. Discrete Math., 12:91--110, 1999.
P. Diaconis. Group Representation in Probability and Statistics. IMS Lecture Series 11, IMS, 1988.
P. Diaconis and R. Graham. Spearman's footrule as a measure of disarray. J. of the Royal Statistical Society, Series B, 39(2):262--268, 1977.
P. F. Dietz. Optimal algorithms for list indexing and subset rank. Proc. WADS, Springer LNCS 382:39--46, 1989.
C. Dwork, R. Kumar, M. Naor, and D. Sivakumar. Ranking aggregation methods for the Web. Proc. 10th WWW, pp. 613--622, 2001.
F. Ergun, S. Kannan, R. Kumar, R. Rubinfeld, and M. Viswanathan. Spot-checkers. JCSS, 60(3):717--751, 2000.
V. Estivill-Castro and D. Wood. A survey of adaptive sorting algorithms. ACM Computing Surveys, 24(4):441--476, 1992.
J. Feigenbaum, S. Kannan, M. Strauss, and M. Viswanathan. An approximate L1-difference algorithm for massive data streams. Proc. 40th FOCS, pp. 501--511, 1999.
P. Flajolet. Approximate counting: A detailed analysis. BIT, 25:113--134, 1985.
A. C. Gilbert, S. Guha, P. Indyk, Y. Kotidis, S. Muthukrishnan, and M. J. Strauss. A few good terms: Efficient streaming computation of wavelet decompositions. Manuscript, 2001.
S. Guha, N. Koudas, and K. Shim. Data streams and histograms. Proc. 33rd STOC, pp. 471--475, 2001.
M. Henzinger, P. Raghavan, and S. Rajagopalan. Computing on data streams. DIMACS series in Discr. Math. & Theor. Comp. Sc., 50:107--118, 1999.
B. Kalyanasundaram and G. Schnitger. The probabilistic communication complexity of set intersection. SIAM J. Discrete Math., 5(4):545--557, 1992.
D. E. Knuth. The Art of Computer Programming III: Sorting and Searching. Addison-Wesley, 1998.
E. Kushilevitz and N. Nisan. Communication complexity. Cambridge University Press, 1997.
C. Levcopoulos and O. Petersson. Exploiting few inversions when sorting: sequential and parallel algorithms. TCS, 163(1&2):211--238, 1996.
R. Morris. Counting large number of events in small registers. C. ACM, 21(10):840--842, 1978.

Published in
STOC '02: Proceedings of the thirty-fourth annual ACM symposium on Theory of computing
May 2002
840 pages
ISBN: 1581134959
DOI: 10.1145/509907
Conference Chair: John Reif, Duke University
Copyright © 2002 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Acceptance Rates
STOC '02 Paper Acceptance Rate: 91 of 287 submissions, 32%
Overall Acceptance Rate: 1,469 of 4,586 submissions, 32%