Algorithmic Data Analytics, Small Data Matters and Correlation versus Causation

Berechenbarkeit der Welt?

Abstract

This chapter reviews aspects of the theory of algorithmic information that may contribute to a framework for formulating questions about complex, highly unpredictable systems. We start by contrasting Shannon entropy and Kolmogorov-Chaitin complexity, which epitomize correlation and causation respectively, and then survey classical results from algorithmic complexity and algorithmic probability, highlighting their deep connection to the study of automata frequency distributions. We end by showing that, although long-range algorithmic prediction models for economic and biological systems may require infinite computation, locally approximated short-range estimations are possible, thereby demonstrating how small data can deliver insights into important features of complex “Big Data”.
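The contrast between the two measures can be illustrated with a minimal Python sketch. It is not the chapter's method: lossless compression with `zlib` stands in here only as a crude, computable upper-bound proxy for Kolmogorov-Chaitin complexity, and the helper names `shannon_entropy` and `compressed_length` are hypothetical. Two strings with identical symbol frequencies receive the same Shannon entropy, yet the periodic one compresses far better than its shuffled counterpart.

```python
import math
import random
import zlib
from collections import Counter


def shannon_entropy(s: str) -> float:
    """Empirical Shannon entropy in bits per symbol, from symbol frequencies only."""
    counts = Counter(s)
    n = len(s)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def compressed_length(s: str) -> int:
    """Length in bytes of the zlib-compressed string.

    Assumption: used only as a rough upper-bound proxy for algorithmic
    complexity, which is itself uncomputable.
    """
    return len(zlib.compress(s.encode("ascii"), 9))


# A periodic string and a shuffled string with exactly the same symbol counts.
periodic = "01" * 500
rng = random.Random(0)  # fixed seed so the example is reproducible
chars = list(periodic)
rng.shuffle(chars)
shuffled = "".join(chars)

if __name__ == "__main__":
    for name, s in [("periodic", periodic), ("shuffled", shuffled)]:
        print(name, shannon_entropy(s), compressed_length(s))
```

Because the entropy estimate sees only symbol frequencies, it cannot tell the two strings apart; the compressed length, which is sensitive to the short rule generating the periodic string, can.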

The chapter is based on an invited talk delivered to UNAM-CEIICH via videoconference from The University of Sheffield in the U.K. for the Alan Turing colloquium “From computers to life” (http://www.complexitycalculator.com/TuringUNAM.pdf) in June 2012.

Hector Zenil is a Principal Investigator and Assistant Professor affiliated with the Department of Computer Science, University of Oxford in the UK, and the Unit of Computational Medicine and SciLifeLab of the Karolinska Institute in Sweden. After a PhD in Theoretical Computer Science from the University of Lille 1 in France and a PhD in Philosophy and Epistemology awarded by the Sorbonne (Paris 1), he joined the Behavioural and Evolutionary Lab, University of Sheffield in the UK. He is also the head of the Algorithmic Nature Group and has been a visiting scholar and professor at MIT/NASA, Carnegie Mellon University and the National University of Singapore.



Corresponding author

Correspondence to Hector Zenil.

Copyright information

© 2017 Springer Fachmedien Wiesbaden GmbH

About this chapter

Cite this chapter

Zenil, H. (2017). Algorithmic Data Analytics, Small Data Matters and Correlation versus Causation. In: Pietsch, W., Wernecke, J., Ott, M. (eds) Berechenbarkeit der Welt?. Springer VS, Wiesbaden. https://doi.org/10.1007/978-3-658-12153-2_22

  • DOI: https://doi.org/10.1007/978-3-658-12153-2_22

  • Publisher Name: Springer VS, Wiesbaden

  • Print ISBN: 978-3-658-12152-5

  • Online ISBN: 978-3-658-12153-2

  • eBook Packages: Social Science and Law (German Language)
