Abstract
We examine data definition languages (DDLs) from computing eras spanning almost 50 years. Using Zipf distributions of words, Zipf distributions of meanings, and information theory, we prove that contemporary DDLs are indistinguishable from older ones. None of them addresses the Law of Requisite Variety, which is necessary for automatic data integration from autonomous, heterogeneous data sources and for the realization of the Semantic Web. The lack of progress in developing DDLs suitable for these two goals hampers the growth of the entire computing industry. Our findings set the stage for the future development of a mathematically sound DDL better suited to these purposes.
Cite this article
Rohn, E. Generational analysis of variety in data structures: impact on automatic data integration and on the semantic web. Knowl Inf Syst 24, 283–304 (2010). https://doi.org/10.1007/s10115-009-0246-7
DOI: https://doi.org/10.1007/s10115-009-0246-7