The ORCHESTRA Collaborative Data Sharing System

Published: 30 September 2008

Abstract

Sharing structured data today requires standardizing upon a single schema, then mapping and cleaning all of the data. This results in a single queriable mediated data instance. However, for settings in which structured data is being collaboratively authored by a large community, e.g., in the sciences, there is often a lack of consensus about how it should be represented, what is correct, and which sources are authoritative. Moreover, such data is seldom static: it is frequently updated, cleaned, and annotated. The ORCHESTRA collaborative data sharing system develops a new architecture and consistency model for such settings, based on the needs of data sharing in the life sciences. In this paper we describe the basic architecture and implementation of the ORCHESTRA system, and summarize some of the open challenges that arise in this setting.

Published in

ACM SIGMOD Record, Volume 37, Issue 3
September 2008
44 pages
ISSN: 0163-5808
DOI: 10.1145/1462571

Copyright © 2008. Copyright is held by the owner/author(s).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery, New York, NY, United States
