ABSTRACT
Data is increasingly bought and sold over the web. As online marketplaces that bring buyers and sellers together proliferate, data brokers must answer user queries over the underlying datasets and price each answer fairly, according to the information the query discloses. In this work, we present QIRANA, a novel system that performs query-based data pricing for a large class of SQL queries (including aggregation) in real time. QIRANA provides prices with formal guarantees: for example, it avoids prices that create arbitrage opportunities. The framework also supports flexible pricing: the data seller can choose from a variety of pricing functions and can specify relation-level and attribute-level parameters that control query prices and assign different value to different portions of the data. We test QIRANA on a variety of real-world datasets and query workloads, and show that it computes prices efficiently for queries over large-scale data.
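The abstract does not spell out how prices are computed, so the following is only an illustrative sketch of what "pricing by disclosed information" and "no arbitrage" can mean, not QIRANA's actual implementation. It assumes a toy pricing rule: the seller fixes a set of alternative databases (here called a support set) and charges, for each query, the total weight of the alternatives that the query's answer rules out. All names (`coverage_price`, `DB`, `SUPPORT`, `WEIGHTS`, and the example queries) are hypothetical.

```python
from typing import Callable, Hashable, Sequence, Tuple

Database = Tuple[tuple, ...]            # a relation as a tuple of rows
Query = Callable[[Database], Hashable]  # a query maps a database to an answer


def coverage_price(query: Query, db: Database,
                   support: Sequence[Database],
                   weights: Sequence[float]) -> float:
    """Sum the weights of the alternative databases whose answer to
    `query` differs from the answer on the true database `db`.
    (Assumed toy rule, not taken from the paper.)"""
    answer = query(db)
    return sum(w for alt, w in zip(support, weights) if query(alt) != answer)


# Toy relation with schema (name, age).
DB: Database = (("ann", 30), ("bob", 40))

# Hypothetical support set: each alternative changes exactly one cell.
SUPPORT = [
    (("ann", 31), ("bob", 40)),
    (("ann", 30), ("bob", 41)),
    (("amy", 30), ("bob", 40)),
    (("ann", 30), ("ben", 40)),
]
WEIGHTS = [25.0, 25.0, 25.0, 25.0]  # the whole dataset is worth 100


def all_rows(db: Database) -> Hashable:
    return db                           # SELECT * : discloses everything


def total_age(db: Database) -> Hashable:
    return sum(age for _, age in db)    # SELECT SUM(age)


def names(db: Database) -> Hashable:
    return tuple(sorted(name for name, _ in db))  # SELECT name


def count_rows(db: Database) -> Hashable:
    return len(db)                      # SELECT COUNT(*)


def bundle(db: Database) -> Hashable:
    return (total_age(db), names(db))   # both queries asked together


if __name__ == "__main__":
    for q in (all_rows, total_age, names, count_rows, bundle):
        print(q.__name__, coverage_price(q, DB, SUPPORT, WEIGHTS))
```

Under this assumed rule, a query never costs more than a query that determines its answer (`total_age` is at most the price of `all_rows`), and a bundle never costs more than the sum of its parts, because the alternatives ruled out by the bundle are the union of those ruled out by each part. Those two properties are the usual informal reading of "no arbitrage opportunities"; `count_rows` is free here because no alternative in the support set changes the row count.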
Index Terms
- QIRANA: A Framework for Scalable Query Pricing