We inhabit a very particular place in the universe. The planet we find ourselves residing on is unlike any other patch of cosmic space containing matter. Every day we witness the interaction of a myriad of structures creating a vast richness of intricate behavior. We are surrounded by, and embedded in, a microcosm seething with complexity. Specifically, we are exposed to chemical, biological, and, foremost, technological and socio-economical complexity.

Until very recently in the history of human thought, the adjective “complex” was taken to be synonymous with “complicated”—in other words, intractable. While the universe unveiled its fundamental mysteries through the Book of Nature—the age-old metaphor for the fact that the regularities of the physical world can be explained mathematically by the human mind—the complexity surrounding us seemed incomprehensible. However, one specific cosmic coincidence allowed the human mind to also tackle and decode the behavior of complex systems. Before disentangling complexity itself, the next section will briefly review the notions introduced throughout the narrative of Part I: the two volumes of the Book of Nature.

Some general references on complexity are Holland (1995), Gladwell (2000), Johnson (2001, 2009), Strogatz (2004), Fisher (2009), Green (2014), Hidalgo (2015).

1 Reviewing the Book of Nature

Chapter 2 opened with the search for the Book of Nature. The belief that the human mind can read the universe like a book and extract knowledge has echoed throughout the ages. Over 300 years ago this belief materialized with the development of Newtonian mechanics. After this initial spark, mathematics reigned supreme as the most resourceful and efficient system of human knowledge generation. Chapter 3 told a stunning tale of this success: how the notion of symmetry underlies most of theoretical physics. This then allows very disparate phenomena to be described by overarching and unified theories, as discussed in Chap. 4.

Analyzing this “unreasonable effectiveness of mathematics in the natural sciences” (Wigner 1960) leads to the following observation. The reality domain that is decoded by mathematics (or, more generally speaking, formal thought systems) excludes the complexity surrounding us and contained within us. Consequently, only the fundamental aspects of nature—ranging from the quantum foam comprising reality to the incomprehensible vastness of the cosmic fabric—are captured by analytical mathematical representations. This defines a paradigm of knowledge generation, called the fundamental-analytical classification here (Sect. 5.1).

It is unfortunate that the understanding of complex phenomena is not contained within this knowledge paradigm. Complexity, characterized by self-organization, structure formation, and emergence, giving rise to adaptive, resilient, and sustainable behavior, defies our mathematical tools. Complex systems transcend equations.Footnote 1 Then, a few decades ago, some scientists started to see the first hints of something unexpected. Behind the mask of intimidating complexity lurked benign simplicity. More precisely, macroscopic complexity showed itself to be the result of simple rules of interaction at the micro level. In the words of Stephen Wolfram, a theoretical physicist, computer scientist, and entrepreneur (Wolfram 2002, see also Sect. 5.2.2):

And I realized that I had seen a sign of a quite remarkable and unexpected phenomenon: that even from very simple programs behavior of great complexity could emerge.

\([\dots ]\)

It took me more than a decade to come to terms with this result, and to realize just how fundamental and far-reaching its consequences are.

By the turn of the millennium, a new paradigm was born: the complex-algorithmic classification of knowledge generation (Sect. 5.2). The simple rules of interaction driving complex systems allow for a computational approach to understanding them. In other words, the mathematical tools are exchanged for algorithms and simulations running on computers.

In the context of the metaphor of the Book of Nature, the two paradigms of knowledge—the fundamental-analytical and the complex-algorithmic classification—represent two volumes. In effect, the Book of Nature is an expanded series comprising Volume I and Volume II (Sects. 5.3.3 and 5.4). This evolution in the structure of knowledge was saliently highlighted by the eminent theoretical physicist and cosmologist Stephen Hawking (quoted in Chui 2000, p. 29A):

I think the next century [the 21st Century] will be the century of complexity.

This assertion is remarkable in the face of Hawking’s previous stance: in the 1980s and 1990s he predicted the end of theoretical physics (see Sects. 4.3.2 and 9.2.2). He then believed that a unified theory of quantum gravity would soon be found, explaining everything. Even after all the excitement surrounding string/M-theory, we appear today no closer to this goal (see Sect. 10.2.2).

2 A Brief History of Complexity Thinking

The term complexity science is not rigorously defined. The study of complex phenomena is not a single discipline, but represents an approach taken by various fields to study diverse complex behavior. In its historical roots one finds a diversity of intellectual traditions, from cybernetics (1940s and 1950s; Wiener 1948), systems theory (1950s and 1960s; Von Bertalanffy 1969), and early artificial intelligence research (1950s and 1960s; Turing 1950) to non-linear dynamics, fractal geometry, and chaos theory (1960s–1980s; Sects. 5.1.3 and 5.2.2). There exists a plethora of themes being investigated, for instance:

  • cellular automata (Sect. 5.2.2);

  • agent-based modeling (Sect. 7.3.1);

  • algorithmic complexity theory (Chaitin 1977);

  • computational complexity theory (Papadimitriou 2003);

  • systems biology (Kitano 2002);

  • data science (James et al. 2013).

For an illustration visualizing the rich and intertwined history of complexity thinking, see The Map of Complexity Sciences and the links within, first published in Castellani and Hafferty (2009) and updated since.Footnote 2

2.1 Complex Systems Theory

In the remainder of this chapter, the focus lies on complex systems theory (Haken 1977, 1983; Simon 1977; Prigogine 1980; Bar-Yam 1997; Eigen 2013; Ladyman et al. 2013), a field emerging from cybernetics and systems theory at the beginning of the 1970s. The theory of complex systems can be understood as an interdisciplinary field of research utilizing a formal framework for studying interconnected dynamical systems (Bar-Yam 1997). Two central themes are self-organization (Prigogine and Nicolis 1977; Prigogine et al. 1984; Kauffman 1993) and emergence (Darley 1994; Holland 1998). The former notion is related to the question of how order emerges spontaneously from chaos in systems which are not in a thermodynamic equilibrium. The latter concept is concerned with the question of how the macro behavior of a system emerges from the interactions of the elements at a micro level. The notion of emergence has a long and muddied history in the philosophy of science (Goldstein 1999). Other themes relating to complex systems theory include the study of complex adaptive systems (Holland 2006) and swarming behavior, i.e., swarm intelligence (Bonabeau et al. 1999). The domains complex systems originate from are mostly socio-economical, biological, or physio-chemical. Some examples of successfully decoding complex systems include earthquake correlations (Sornette and Sornette 1989), crowd dynamics (Helbing et al. 2000), traffic dynamics (Treiber et al. 2000), pedestrian dynamics (Moussaïd et al. 2010), population dynamics (Turchin 2003), urban dynamics (Bettencourt et al. 2008), social cooperation (Helbing and Yu 2009), molecule formation (Simon 1977), and weather formation (Cilliers and Spurrett 1999).

Complex systems are characterized by feedback loops (Bar-Yam 1997; Cilliers and Spurrett 1999; Ladyman et al. 2013), where both damping and amplifying feedback is found. Moreover, linear and non-linear behavior can be observed in complex systems. The term “at the edge of chaos” denotes the transition zone between the regimes of order and disorder (Langton 1990). This is a region of bounded instability that enables a constant dynamic interplay between order and disorder. The edge of chaos is where complexity resides. Furthermore, complex systems can also be characterized by the way they process or exchange information (Haken 2006; Quax et al. 2013; Ladyman et al. 2013). Information is the core theme of Chap. 13.

The study of complex systems represents a new way of approaching nature. Put in the simplest terms, a major focus of science lay on things in isolation—on the tangible, the tractable, the malleable. Through the lens of complexity this focus has shifted to a subtler dimension of reality, where the isolation is overcome. Seemingly single and independent entities are always components of larger units of organization and hence influence each other. Indeed, our modern world, while still being comprised of many of the same “things” as in the past, has become highly networked and interdependent—and, therefore, much more complex. From the interaction of independent entities, the notion of a system emerges. The choice of which components are seen as fundamental in a system is arbitrary and depends on the chosen level of abstraction.

A tentative definition of a complex system is the following:

A complex system is composed of an ensemble of many interacting (or interconnected) elements.

In other words, there exist many parts, or agents, which interact in a disordered manner, resulting in an emergent property or structure. The whole exhibits features not found in the structure or behavior of the individual parts comprising it. This is a literal example of the adage that the whole is more than the sum of its parts. The emphasis of this definition lies on the notions of components, multiplicity, and interactions (Ladyman et al. 2013). As mentioned, the observed macroscopic complexity of a system is a result of simple rules of interaction of the agents comprising it at the micro level (Sect. 5.2.2). This unexpected universal modus operandi allows complex systems to be reduced to:

  • a set of objects (representing the agents);

  • a set of functions between the objects (representing the interactions among agents).

A natural formal representation of this abstraction is a network (Sect. 5.2.1). Now, the agents are characterized by featureless nodes and the interactions are given by the links connecting the nodes. The mathematical structure describing networks is a graph (Sect. 5.3.2). In essence, complex networks (being the main theme of Sect. 6.3) mirror the organizational properties of real-world complex systems.

This insight gives rise to a new interaction-based worldview and marks a departure from a top-down to a bottom-up approach to the understanding of reality. Traditional problem-solving methods have been strongly influenced by the success of the centuries-old reductionist approach taken in science (Volume I of the Book of Nature). However, the unprecedented success of reductionism cannot be replicated in the realm of complexity. In the words of the theoretical physicist Hermann Haken, founder of synergetics, the interdisciplinary approach describing the self-organization of non-equilibrium systems (Haken 2006, p. 6):

But the more we are dealing with complex systems, the more we realize that reductionism has its own limitations. For example, knowing chemistry does not mean that we understand life.

In the same vein, a quote taken from an early and much-noticed publication by the physics Nobel laureate Philip Warren Anderson (Anderson 1972, see also Sect. 5.2.1):

At each stage [of complexity] entirely new laws, concepts, and generalizations are necessary [. . . ]. Psychology is not applied biology, nor is biology applied chemistry.

Driven by the desire to comprehend complexity, reductionist methods are replaced or augmented by an embracing of a systems-based and holistic outlook (Kauffman 2008). A revolution in understanding is ignited and a “new science of networks” born (Sect. 5.2.3). In other words, Volume II of the Book of Nature is unearthed.

2.2 The Philosophy of Complexity: From Structural Realism to Poststructuralism

Realism is a central philosophical theme that is closely intertwined with Volume I of the Book of Nature. By ignoring the pragmatic advice “Shut up and calculate!” given in Sect. 2.2.1—the invitation to focus one’s mental capacities on the mathematical machinery driving science instead of grappling with meaning and context—one can wander into the philosophical undergrowth. Here one finds the notion of scientific realism (Ladyman 2016):

Scientific realism is the view that we ought to believe in the unobservable entities posited by our most successful scientific theories. It is widely held that the most powerful argument in favor of scientific realism is the no-miracles argument, according to which the success of science would be miraculous if scientific theories were not at least approximately true descriptions of the world.

One specific form of scientific realism is structural realism, a commitment to the mathematical or structural content of scientific theories. It is the “belief in the existence of structures in the world to which the laws of mathematical physics may approximately correspond” (Falkenburg 2007, p. 2). In a general sense, structural realism only admits a reality to the way things are related to one another, invoking the metaphor of a network (Wittgenstein 1922, 6.35):

Laws [...] are about the net and not about what the net describes.

In a similar vein, “the universe is made of processes, not things” (Smolin 2001, Chap. 4). See also Sect. 2.2 for more details.

Structural realism can take on two forms, as the epistemic or ontic versions (Ladyman 1998). Epistemic structural realism is the view that scientific theories tell us only about the form or structure of the unobservable world and not about its true nature. In other words, one cannot know anything about the real nature of things but only how they relate to one another. In contrast, and more radically, ontic structural realism assumes that relations are all that exist, without assuming the existence of tangible entities. In essence, the world is made up solely of structures, a network of relations without relata (Morganti 2011; Esfeld and Lam 2010). While it might seem outlandish to suppose relations without relata, the ideas of symmetry and invariance, discussed in Chap. 2, lend support to ontic structural realism. Symmetry transformations that exchange the individual things that make up a system but leave their relations unchanged become important. Indeed, ontic structural realism has resonated with the intuition of some eminent physicists: “[...] only the relationship of objects to each other can have significance.” (Roger Penrose quoted by Lee Smolin in Huggett et al. 1998, p. 291). Furthermore, the philosophy has been proposed as an ontology for quantum field theory (French and Ladyman 2003; Cao 2003; Kuhlmann 2015). Explanations relating to the reality and nature of elementary particles and fields have been found lacking (Kuhlmann 2013):

Clearly, then, the standard picture of elementary particles and mediating force fields is not a satisfactory ontology of the physical world. It is not at all clear what a particle or a field even is.

Alternatively, “ontic structural realism has become the most fashionable ontological framework for modern physics” (Kuhlmann 2015). See also Kuhlmann (2010) and Sects. 2.2.1 and 10.4.1.

Structural realism is a form of structuralism, the notion that all aspects of reality are best understood in terms of (scientific) constructs of entities, rather than in terms of concrete entities in themselves. Poststructuralism is defined by the rejection of the self-sufficiency of the structures that structuralism posits (Derrida 1993, based on a 1966 lecture). Specifically, knowledge and truths about structures are always subjective. Poststructuralism does not simply represent the polar opposite of structuralism; it has also been interpreted as anti-scientific, as it “stresses the proliferation of meaning, the breaking down of existing hierarchies, the shortcomings of logic, and the failures of analytical approaches” (Cilliers 1998, p. 22). It is a philosophical stance which is difficult to define, as it represents a rich tapestry of thinking (Belsey 2002). Poststructuralism is an intellectual stream closely related to postmodernism, which is discussed in detail in Sect. 9.1.4. Notably, poststructuralism and postmodernism have been proposed as philosophies of complexity (Cilliers 1998; Cilliers and Spurrett 1999; Woermann 2016). For instance, the philosopher and complexity researcher Paul Cilliers observes (Cilliers 1998, p. ix):

The most obvious conclusion drawn from this [poststructural/postmodern] perspective is that there is no overarching theory of complexity that allows us to ignore the contingent aspects of complex systems. If something is really complex, it cannot be adequately described by means of a simple theory.

This outlook implies the following (Woermann 2016, p. 3):

Along with Edgar Morin, Cilliers argues that complexity cannot be resolved through means of a reductive strategy, which is the preferred methodology of those who understand complexity merely as a theory of causation.

While Volume I of the Book of Nature is rooted in structural realism, Volume II invites a philosophy that transcends the borders of clear-cut and orderly interpretations and opens up to inquisitive exploration (Woermann 2016, p. 1):

To my mind, the hallmark of a successful philosophy is thus related to the degree to which it resonates with our views on, and experiences in, the world.

A philosophy grappling with complex systems needs to address the following (Cilliers 1998):

  • complex systems consist of a large number of elements;

  • a large number of elements is necessary, but not sufficient;

  • interactions are rich, non-linear, and short-ranged;

  • there exist loops in the interactions;

  • complex systems tend to be open systems and operate under conditions far from equilibrium;

  • complex systems have a rich history in that the past is co-responsible for the present behavior;

  • each element in the system is ignorant of the behavior of the system as a whole.

In a nutshell (Cilliers 1998, p. 5):

Complexity is the result of a rich interaction of simple elements that only respond to the limited information each of the elements are presented with. When we look at the behavior of a complex system as a whole, our focus shifts from the individual element in the system to the complex structure of the system. The complexity emerges as a result of the patterns of interaction between the elements.

Finally, in Chap. 2, the notion of Platonism Footnote 3 was introduced. Platonic realism posits the existence of mathematical objects that are independent of the mind and language and it is a philosophical stance adopted by many notable mathematicians. Although there also exists a structuralist interpretation of mathematics (Colyvan 2012), other scholars have argued that postmodern thought should be seen as the continuation of debates on the foundations of mathematics (Tasić 2001).

3 Complex Network Theory

The key to the success of complex network theory lies in the courage to ignore the complexity of the components of a system while only quantifying their structure of interactions. In other words, the individual components fade out of focus while their network of interdependence comes into the spotlight. Technically speaking, the analysis focuses on the structure, function, dynamics, and topology of the network. Hence the neurons in a brain, the chemicals interacting in metabolic systems, the ants foraging, the animals in swarms, the humans in a market, etc., can all be understood as being represented by featureless nodes in a network of interactions. Only their relational aspects are decoded for information content.

3.1 The Ubiquity of Complex Networks

Complex networks are ubiquitous in nature, resulting in an abundance of scientific literature (Strogatz 2001; Albert and Barabási 2002; Dorogovtsev and Mendes 2002, 2003a; Newman 2003; Buchanan 2003; Newman et al. 2006; Caldarelli 2007; Costa et al. 2007; Vega-Redondo 2007; Caldarelli and Catanzaro 2012; Barabási 2016). A great variety of processes are best understood if formally described by complex networks. For instance, the following phenomena found in

  • the physical world, e.g.,

    • computer-related systems (Albert et al. 1999; Barabási et al. 2000; Tadić 2001; Pastor-Satorras et al. 2001; Capocci et al. 2006),

    • various transportation structures (Banavar et al. 1999; Guimera et al. 2005; Kühnert et al. 2006),

    • power grids (Albert et al. 2004),

    • spontaneous synchronization of systems (Gómez-Gardenes et al. 2007),

  • biological systems, e.g.,

    • neural networks (Ripley 2008; Bullmore and Sporns 2009),

    • epidemiology (Meyers et al. 2005) ,

    • food chains (Garlaschelli et al. 2003; McKane and Drossel 2005),

    • gene regulation (Brazhnik et al. 2002; Bennett et al. 2008),

    • spontaneous synchronization in biological systems (Gonze et al. 2005),

  • socialFootnote 4 and economic realms, e.g.,

    • diffusion of innovation (Schilling and Phelps 2007; König et al. 2009),

    • trust-based interactions (Walter et al. 2008),

    • various collaborations (Newman 2001a, b),

    • social affiliation (Brown et al. 2007),

    • trade relations (Serrano and Boguñá 2003; Garlaschelli and Loffredo 2004a, c; Reichardt and White 2007; Fagiolo et al. 2008, 2009),

    • shared board directors (Strogatz 2001; Battiston and Catanzaro 2004),

    • similarity of products (Hidalgo et al. 2007),

    • credit relations (Boss et al. 2004; Iori et al. 2008),

    • price correlation (Bonanno et al. 2003; Onnela et al. 2003),

    • corporate ownership structures (Glattfelder and Battiston 2009; Vitali et al. 2011; Glattfelder 2013, 2016; Garcia-Bernardo et al. 2017; Fichtner et al. 2017; Glattfelder and Battiston 2019).

The explosion of research in the field of complex networks has been driven by two changes that have ushered in this new era of comprehending the complex and interdependent world surrounding us. The first is the mentioned departure from reductionist thinking to a systemic and holistic paradigm. The other change is related to the increased influx of data, furnishing the raw material for this revolution. The buzzword “big data” has been replaced by the now established field of data science. The cost of computer storage is continually falling, while storage capacity is increasing at an exponential rate. Seemingly endless streams of data—originating, for instance, from dynamic natural processes anywhere on the globe, a vast spectrum of observed biological interplay, or countless human endeavors—are continually flowing along global information highways and are being stored in server farms, the cloud, and, importantly, the researchers’ local databases.

Fig. 6.1 Visualization example of the simple network defined in (6.1)

3.2 Three Levels of Network Analysis

Complex systems are characterized by complex networks, and graphs are the mathematical entities representing the networks. An adjacency matrix is an object that encodes the graph’s topology. As an example, imagine a graph comprised of four nodes \(n_1, n_2, n_3, n_4\) connected by three links \(l_1, l_2, l_3\) such that

$$\begin{aligned} n_1 \xrightarrow []{l_1} n_2, \,\; n_1 \xrightarrow []{l_2} n_3 \,\; \text {and} \,\; n_2 \xrightarrow []{l_3} n_3. \end{aligned}$$
(6.1)

A network layout is shown in Fig. 6.1. The corresponding adjacency matrix is

$$\begin{aligned} A= \begin{pmatrix} 0 &{}\quad 1 &{}\quad 1 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 1 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0\\ \end{pmatrix}. \end{aligned}$$
(6.2)

Imagine each row and column of the matrix corresponding to a node (ordered by the label). The element \(A(1,3)=1\) encodes the directed link from node \(n_1\) to node \(n_3\), i.e., \(l_2\). Self-links would be represented by \(A(i,i)=1\) (for \(i=1,\dots ,4\)). In essence, the physical notion of a network has been translated into a matrix, a mathematical object obeying the powerful rules of linear algebra. More information on graph theory can be found in Sect. 5.3.2.
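To make this encoding concrete, the following minimal Python sketch (not part of the original text; it assumes only NumPy) builds the adjacency matrix (6.2) and reads off some basic structural information via linear algebra:

```python
import numpy as np

# Adjacency matrix of the directed example network (6.1):
# rows and columns are ordered as n_1, n_2, n_3, n_4 (n_4 is isolated).
A = np.array([
    [0, 1, 1, 0],   # n_1 -> n_2 (l_1), n_1 -> n_3 (l_2)
    [0, 0, 1, 0],   # n_2 -> n_3 (l_3)
    [0, 0, 0, 0],
    [0, 0, 0, 0],
])

# With 1-based node labels, A(1,3) = 1 encodes the directed link l_2.
assert A[0, 2] == 1

out_degree = A.sum(axis=1)   # [2, 1, 0, 0]
in_degree = A.sum(axis=0)    # [0, 1, 2, 0]
two_step = A @ A             # entry (i, j) counts directed paths of length 2
print(out_degree, in_degree)
print(two_step[0, 2])        # one two-step path: n_1 -> n_2 -> n_3
```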

The study of real-world complex networks can be performed at three levels of abstraction. Level 1 represents the purely topological approach, where the network is encoded as a binary adjacency matrix and links exist (1) or do not (0). The simple example above is already a directed network. Removing the direction of the links yields the following Level 1 adjacency matrix

$$\begin{aligned} A= \begin{pmatrix} 0 &{}\quad 1 &{}\quad 1 &{}\quad 0\\ 1 &{}\quad 0 &{}\quad 1 &{}\quad 0\\ 1 &{}\quad 1 &{}\quad 0 &{}\quad 0\\ 0 &{}\quad 0 &{}\quad 0 &{}\quad 0\\ \end{pmatrix}. \end{aligned}$$
(6.3)

All the links remain, but the symmetry of the matrix, e.g., \(A(1,3)=A(3,1)\), removes all traces of directedness. Allowing the links to carry information, i.e., have directions and weights, defines Level 2 (Newman 2004; Barrat et al. 2004; Barthelemy et al. 2004; Onnela et al. 2005; Ahnert et al. 2007). In the example of Fig. 6.1, the directedness can be augmented by weighted links. Formally, \(l_i \in ]0,1]\) for \(i=1,2,3\) or, equivalently, \(A(i,j) \in ]0,1]\) for \(i,j=1,2,3\). Finally, at the highest level of detail, the nodes themselves are assigned a degree of freedom, in the guise of non-topological state variables that shape the topology of the network (Garlaschelli and Loffredo 2004b; Garlaschelli et al. 2005). These variables are sometimes also called fitness (Caldarelli et al. 2002; Servedio et al. 2004; Garlaschelli and Loffredo 2004a; De Masi et al. 2006). See Fig. 6.2 for a visualization of the 3-level approach to complex networks.
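As a rough illustration of the three levels (a sketch added here, not the author's code; the weights and fitness values are made up for demonstration), the same example network can be dressed up step by step:

```python
import numpy as np

# Level 1 (directed, binary), cf. (6.2).
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

# Level 1, undirected: symmetrizing removes all traces of directedness, cf. (6.3).
A_undirected = np.maximum(A, A.T)

# Level 2: the existing links carry weights in ]0, 1] (illustrative values).
W = A * np.array([
    [0.0, 0.7, 0.2, 0.0],
    [0.0, 0.0, 0.9, 0.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
])

# Level 3: each node gets a non-topological state variable ("fitness").
fitness = np.array([1.3, 0.4, 2.1, 0.8])   # illustrative values

print(A_undirected)
print(W.sum(axis=1))   # weighted out-strength per node
print(fitness)
```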

However, the Level 3 type of analysis of real-world complex networks is very specific. Simply incorporating all three levels of detail into the analysis does not necessarily yield new insights. The specific domain the network originates from has to be considered, and the employed network measures require appropriate adaptation and tailoring. Only by accounting for the specific nature of the network under investigation can new insights be gained. The example of corporate ownership networks is discussed in Sects. 7.3.2.1 and 7.3.2.2.

Fig. 6.2 Visualization examples of the same underlying network. (Left) a directed layout. (Right) the full-fledged 3-level layout, where the thickness of the links represents their weight and the nodes are scaled by some non-topological state variable. The graph layouts are taken from Glattfelder (2013)

4 Laws of Nature in Complex Systems

Laws of nature can be understood as regularities and structures in a highly complex universe. They critically depend on only a small set of conditions and are independent of many other conditions which could also possibly have an effect (Wigner 1960). Science is the quest to capture fundamental regularities of nature within formal analytical representations (Volume I of the Book of Nature). So then, are there laws of nature to be found for complex systems (Volume II)?

The quest to discover universal laws in complex systems has taken many turns. For instance, the macroscopic theory of thermodynamics allows arbitrary complex systems to be described from a universal point of view. Its foundations lie in statistical physics, explaining the phenomena of irreversible thermodynamics. A different approach, striving for universality, is synergetics (Haken 1977, 1983). In contrast to thermodynamics, this field deals with systems far away from thermal equilibrium. See Haken (2006) for a brief overview of the aforementioned approaches.

In the following, the focus of universality will lie on a purely empirical and descriptive phenomenological investigation. In this context the question “What are the laws of nature for complex systems?” has a clear answer.

4.1 Universal Scaling Laws

The empirical analysis of real-world complex systems has revealed an unsuspected regularity which is robust across a great variety of domains. This regularity is captured by what is known as scaling laws, also called power laws (Müller et al. 1990; Mantegna and Stanley 1995; Ghashghaie et al. 1996; West et al. 1997; Gabaix et al. 2003; Guillaume et al. 1997; Galluccio et al. 1997; Amaral et al. 1998; Barabási and Albert 1999; Ballocchi et al. 1999; Albert et al. 1999; Sornette 2000b; Pastor-Satorras et al. 2001; Dacorogna et al. 2001; Corsi et al. 2001; Newman et al. 2002; Garlaschelli et al. 2003; Newman 2005; Di Matteo et al. 2005; Lux 2006; Kühnert et al. 2006; Di Matteo 2007; Bettencourt et al. 2008; Bettencourt and West 2010; Glattfelder et al. 2011; West 2017). This distinct pattern of organization suggests that universal mechanisms are at work in the structure formation and evolution of many complex systems. Varying origins for these scaling laws have been proposed and insights have been gained from the study of critical phenomena and phase transitions, stochastic processes, rich-get-richer mechanisms and so-called self-organized criticality (Bouchaud 2001; Barndorff-Nielsen and Prause 2001; Farmer and Lillo 2004; Newman 2005; Joulin et al. 2008; Lux and Alfarano 2016). Tools and concepts from statistical physics have played a crucial role in discovering and describing these laws (Dorogovtsev and Mendes 2003b; Caldarelli 2007). In essence:

Scaling laws can be understood as laws of nature describing complex systems.

Put in the simplest terms, a scaling law is a basic polynomial functional relationship

$$\begin{aligned} y = f(x) = C x^{\alpha }, \end{aligned}$$
(6.4)

characterized by a (positive or negative) scaling parameter \(\alpha \) and a constant C. In other words, a relative change in the quantity x results in a proportional relative change in the quantity y, independent of the initial size of those quantities: y always varies as a power of x. A simple property of scaling laws can easily be shown. By varying the value of the function’s argument (x), the shape of the function (y) is preserved. As this is true for all scales, the property is called scale invariance. In mathematical terms

$$\begin{aligned} f(a x) = C(a x)^{\alpha } = a^{\alpha } f(x) \sim f(x). \end{aligned}$$
(6.5)

Another defining property of scaling laws is the trivial form which emerges when the function is plotted. Specifically, a logarithmic mapping yields a linear relationship. Taking the logarithm of (6.4) yields

$$\begin{aligned} Y = \alpha X + B, \end{aligned}$$
(6.6)

where \(X = \log (x)\) and \(B=\log (C)\). See Fig. 6.3 for an illustration and, for instance, Newman (2005), Sornette (2000a) for further details.

Fig. 6.3 A scaling-law relation. (Left) Graph of the function (6.4) with \(\alpha =-0.75\) and \(C=2.0\). (Right) Log-log scale plot of the same function, i.e., (6.6)
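The linear form (6.6) suggests a simple estimation recipe. The following sketch (an illustration added here, assuming NumPy; the data are synthetic) recovers \(\alpha \) and C from noisy samples of (6.4) by linear regression in log-log space:

```python
import numpy as np

# Synthetic data following y = C * x^alpha with alpha = -0.75, C = 2.0,
# plus mild multiplicative noise (illustrative values, as in Fig. 6.3).
rng = np.random.default_rng(42)
x = np.logspace(0, 3, 50)                  # x spanning three orders of magnitude
y = 2.0 * x ** (-0.75) * rng.lognormal(sigma=0.05, size=x.size)

# A logarithmic mapping turns (6.4) into the linear relation (6.6):
# Y = alpha * X + B, with X = log(x) and B = log(C).
X, Y = np.log(x), np.log(y)
alpha_hat, B_hat = np.polyfit(X, Y, 1)

print(f"alpha ~ {alpha_hat:.3f}, C ~ {np.exp(B_hat):.3f}")   # close to -0.75 and 2.0
```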

Scaling-law relations characterize an immense number of natural processes, prominently in the form of

  1. allometric scaling laws;

  2. scaling-law distributions;

  3. scale-free networks;

  4. cumulative relations of stochastic processes.

Before presenting these four types of universal scaling, some historical context is given in the following section.

4.2 Historical Background: Pareto, Zipf, and Benford

The first study of scaling laws and scaling effects can be traced back to Galileo Galilei. He investigated how ships and animals cannot be naively scaled up, as different physical attributes obey different scaling properties, such as the weight, area, and perimeter (Ghosh 2011). Over 250 years later, the economist and sociologist Vilfredo Pareto brought the concept of scaling laws to prominence (Pareto 1964, originally published in 1896). While investigating the probability distribution of the allocation of wealth among individuals, he discovered the first signs of universal scaling. Put simply, a large portion of the wealth of any society is owned only by a small percentage of the people in that society. Specifically, the Pareto principle says that 20% of the population controls 80% of the wealth. Hence this observation has also been called the 80-20 rule. To this day, the Pareto distribution is detected in the distribution of income or wealth. A more detailed treatment of Pareto’s observed inequality was given by the Lorenz curve (Lorenz 1905). This is a graph representing the ranked cumulative income distribution. At any point on the x-axis, corresponding to the bottom \(x\% \) of households, it shows what percentage (\(y\%\)) of the total income they have. A further refinement was the introduction of the Gini coefficient G (Gini 1921). In effect, G is a statistical measure of inequality, capturing how much an observed Lorenz curve deviates from perfect equality \(G=0.0\). Perfect inequality, or \(G=1.0\), corresponds to a step-function representing a single household earning all the available income. For the United States, in 1979 \(G=0.346\) and in 2013 \(G=0.41\). The interplay between this rise of the Gini coefficient and the rise of the share of total income going to the top earners, seen beginning at the end of the 1970s, is discussed in Atkinson et al. (2011). In 2011, South Africa saw a maximal \(G=0.634\) and in 2014, Ukraine a minimal \(G=0.241\). The data is available from the World Bank.Footnote 5
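To make the Lorenz curve and the Gini coefficient concrete, here is a small illustrative Python sketch (added for this exposition, not taken from the cited sources); it uses a simple discrete trapezoidal approximation of the area under the Lorenz curve:

```python
import numpy as np

def gini(incomes):
    """Discrete approximation of the Gini coefficient via the Lorenz curve."""
    x = np.sort(np.asarray(incomes, dtype=float))
    n = x.size
    cum = np.cumsum(x) / x.sum()            # Lorenz curve: cumulative income share
    lorenz = np.concatenate(([0.0], cum))
    # G = 1 - 2 * (area under the Lorenz curve), trapezoidal approximation
    area = np.trapz(lorenz, dx=1.0 / n)
    return 1.0 - 2.0 * area

# Perfect equality -> G close to 0; one household earning everything -> G close to 1.
print(gini(np.ones(1000)))                             # ~0.0
print(gini(np.concatenate((np.zeros(999), [1.0]))))    # ~1.0
# Heavy-tailed (Pareto-type) incomes give an intermediate value.
rng = np.random.default_rng(0)
print(gini(rng.pareto(2.1, size=100_000)))
```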

Another popularizer of the universal scaling patterns found in many types of data, analyzed in the physical and social sciences, was the linguist and philologist George Kingsley Zipf. He studied rank-frequency distributions (Zipf 1949), which order distributions of size by rank. In other words, the x-axis shows the ordered ranks, while the y-axis shows the frequency of observations. For instance, the frequency of the use of words in any human language follows a Zipf distribution. For English, unsurprisingly, the most common words are “the”, “of”, and “and”, while all remaining ones follow Zipf’s law of diminishing frequency. This law, characterized by a scaling-law probability distribution, is the discrete counterpart of the continuous Pareto probability distribution. See also Newman (2005).

A final pattern emerging in seemingly random data samples was discovered by the electrical engineer Frank Benford. He found an unexpected regularity which was only recently shown to be related to Zipf’s law (Pietronero et al. 2001; Altamirano and Robledo 2011). In 1881, a seemingly bizarre result was published, based on the observation that the first pages of logarithm books, used at that time to perform calculations, were much more worn than the other pages (Newcomb 1881). In other words, people were mostly computing the logarithms of numbers for which the first digit was a one: \(d_1=1\). The phenomenon was rediscovered in 1938 by Benford, who confirmed the pattern for a large number of random variables drawn from geographical, biological, physical, demographic, economic, and sociological data sets. The pattern even holds for randomly compiled numbers taken from newspaper articles. Benford’s law is an observation about the frequency distribution of leading digits (Benford 1938). If \(d_1 \in \{1,\dots ,9\}\) denotes the first digit of a number, then the probability of its occurrence is equal to

$$\begin{aligned} {p}(d_1)=\log _{10} \left( \frac{d_1+1}{d_1}\right) . \end{aligned}$$
(6.7)

Specifically, \(p(1) = 30.1\%, p(2) = 17.6\%, p(3) = 12.5\%, p(4) = 9.7\%, p(5) = 7.9\%, p(6) = 6.7\%, p(7) = 5.8\%, p(8) = 5.1\%, p(9) = 4.6\%\). In effect, seeing a one as a leading digit in a number is over six times more likely than observing a nine. The law also holds for the second digit \(d_2\) and so on. In general terms, for a base \(B \ge 2\) and \(d_i \in \{1,\dots ,B-1\}\)

$$\begin{aligned} {p}(d_i)=\log _{B} \left( \frac{d_i+1}{d_i}\right) . \end{aligned}$$
(6.8)

First explanations of this phenomenon, which appears to suspend the notions of probability, focused on the law’s logarithmic nature, which implies a scale-invariant distribution. If the first digits universally obey a specific pattern of distribution, this property must be independent of the measuring system. In other words, conversions from one system of units to another—for instance, moving from metric to imperial units—do not affect the pattern. This requirement, that physical quantities are independent of a chosen representation, is called covariance and is one of the cornerstones of general relativity (Sects. 4.1 and 10.1.2). In essence, the common-sense requirement that the dimensions of arbitrary measurement systems should not affect the measured physical quantities is encoded in Benford’s law. In addition, the fact that many processes in nature show exponential growth is also captured by the law, which assumes that the logarithms of numbers are uniformly distributed. In 1995, the law was rigorously proved mathematically. It was shown that, if one repeatedly chooses different probability distributions and then randomly chooses a number according to each distribution, the resulting list of numbers will obey Benford’s law (Hill 1995). Hence the law reflects the behavior of distributions of distributions. Benford’s law is also present in very distinct phenomena, such as the statistical distribution of leading digits in the prime number sequence (Luque and Lacasa 2009), quantum phase transitions (Sen De and Sen 2011), and earthquake detection (Díaz et al. 2014). The law has also been utilized to detect fraud in insurance, accounting, expenses, or election data, where people forging numbers tend to distribute their digits uniformly.
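A quick numerical check of (6.7) can be made on an exponentially growing sequence, a classic Benford-conforming process. The following sketch (illustrative, not from the original text) compares the empirical leading-digit frequencies of the powers of two with the Benford prediction:

```python
import numpy as np
from collections import Counter

def leading_digit(x):
    s = f"{x:e}"          # scientific notation, e.g. '1.024000e+03'
    return int(s[0])

# Exponential growth is a classic Benford-conforming process.
values = [2.0 ** k for k in range(1, 1001)]
counts = Counter(leading_digit(v) for v in values)
total = sum(counts.values())

for d in range(1, 10):
    empirical = counts[d] / total
    benford = np.log10((d + 1) / d)
    print(f"d={d}: empirical {empirical:.3f}  Benford {benford:.3f}")
```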

4.3 The Types of Universal Scaling

Notwithstanding the spectacular number of occurrences of scaling-law relations in a vast diversity of complex systems, there are four basic types of scaling laws to be distinguished.

4.3.1 Allometric Scaling Laws

Allometric scaling describes how various properties of living organisms change with size. This was first observed by Galileo, when he was analyzing the skeletal structures of mammals of varying size. In 1932, the biologist Max Kleiber discovered that, for the vast majority of animals, the animal’s metabolic rate B scales to the \(\frac{3}{4}\) power of the animal’s mass M (Kleiber 1932). Mathematically

$$\begin{aligned} B \sim M^\frac{3}{4}. \end{aligned}$$
(6.9)

Thus, a male African bush elephant, weighing an average of 6 tons, is over 300,000 times heavier than a house mouse, weighing 19 grams, yet its metabolic rate is only roughly 13,320 times higher. In Fig. 6.4 an overview is shown, ranging from mice to humans to elephants. Moreover, the metabolic rate in animals and the power consumption in computers have been found to scale similarly with size, and an energy-time minimization principle has been postulated which governs the design of many complex systems that process energy, materials, and information (Moses et al. 2016).

Fig. 6.4 Empirical validation of Kleiber’s \(\frac{3}{4}\)-power law, relating body mass to metabolic rate. Figure taken from West (2017). This scaling relation has been extended to span 27 orders of magnitude to include cell structures and unicellular organisms (West and Brown 2005)
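As a back-of-the-envelope check of the comparison of elephant and mouse quoted above (a sketch added here, using the masses given in the text):

```python
# Kleiber's law: metabolic rate B ~ M^(3/4).
M_elephant = 6000.0     # kg (6 tons, as quoted above)
M_mouse = 0.019         # kg (19 grams)

mass_ratio = M_elephant / M_mouse
metabolic_ratio = mass_ratio ** 0.75

print(f"mass ratio      ~ {mass_ratio:,.0f}")        # ~316,000
print(f"metabolic ratio ~ {metabolic_ratio:,.0f}")   # ~13,300
```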

Other allometric scaling laws relate the lifespan L and the heart rate H of mammals to their weight M:

$$\begin{aligned} \begin{aligned} L&\sim M^\frac{1}{4}, \\ H&\sim M^{-\frac{1}{4}}. \end{aligned} \end{aligned}$$
(6.10)

Consequently, heavier animals live longer and have slower heart rates. As both of these scaling laws have exponents of the same absolute value but opposite sign, a fundamental invariant of life emerges: the number of heartbeats per lifetime is constant (approximately \(1.5 \times 10^9\)). The existence of these biological scaling laws, and the fact that the exponents are always simple multiples of \(\frac{1}{4}\), suggest the workings of general underlying mechanisms which are independent of the specific nature of the individual organisms. Hidden behind the mystifying diversity of life lies an organizational process which becomes visible through self-similar scaling laws. This implies the existence of average, idealized biological systems at various scales. More details can be found in West et al. (1997), West and Brown (2005).
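To spell out this invariance (a one-line consequence of (6.10), added here for clarity), multiplying the two relations gives

$$\begin{aligned} H \cdot L \sim M^{-\frac{1}{4}} \cdot M^{\frac{1}{4}} = M^{0} = \text {const}, \end{aligned}$$

i.e., the total number of heartbeats per lifetime does not depend on the animal’s mass.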

Allometric scaling also has medical implications, namely related to drug administration and weight. An example taken from a websiteFootnote 6 offering a calculator which estimates interspecies dosage scaling between animals of different weights: “If the dosage for a 0.25 kg rat is 0.1 mg, then using an exponent of 0.75, the estimated dosage for a 70 kg human would be 6.8 mg. While the dose to weight ratio for the rat is 0.4 mg/kg, the value for the human is only about 0.1 mg/kg.” In 1962, two psychiatrists decided to test the effects of the psychedelic substance lysergic acid diethylamide (LSD) on an elephant. The animal weighed about 3,000 kg and the researchers estimated that a dose of about 300 mg would be appropriate. Not taking the non-linear scaling behavior into account, this turned out to be a fatal dose. The elephant died and the ordeal was reported in the prestigious journal Science (West et al. 1962). Knowledge of allometric scaling would have revealed the following. For humans, a standard amount of LSD is 100 micrograms.Footnote 7 Assuming a body-weight of 70 kg, this dose translates into roughly 1.6 milligrams for an elephant. The administered 300 mg correspond to about 17 mg of LSD for a human. However, there are no verified cases of death by means of an LSD overdose in humans (Passie et al. 2008). See also West (2017).

Allometric scaling laws are also found in the plant kingdom (Niklas 1994). Vascular plants vary in size by about twelve orders of magnitude and scaling laws explain many features. For instance, the self-similar and fractal branching architecture follows a scaling relation. There also exist parallels in the characteristics of plants and animals which are described by allometric scaling with respect to mass: the metabolic rate (\(M^{\frac{3}{4}}\)) and the radius of trunk and aorta (\(M^{\frac{3}{8}}\)). See West et al. (1999).

More recently, a biological scaling law was discovered, describing a universal mathematical relation for the folding of mammalian brains (Mota and Herculano-Houzel 2015).

4.3.2 Scaling-Law Distributions

In 1809, Carl Friedrich Gauss published a monograph in which he introduced fundamental statistical concepts (Gauss 1809). A key insight was the description of random data by means of what is today known as normal (or Gaussian) distributions. To this day, many phenomena are approximated by this type of probability distribution, for instance, observations related to intelligence (IQ), blood pressure, test results, and height (see Fig. 6.5). Moreover, measurement errors in a variety of physical experiments, under general conditions, will follow a normal distribution. In a nutshell, any random phenomenon related to a large number of small and independent causes can be approximated by a normal distribution. This statement is made mathematically rigorous by the central limit theorem, to which the ubiquity of normal distributions in nature is linked (Voit 2005). In detail, data which is normally distributed is characterized by the mean (\(\mu \)), around which most of the observations cluster. The standard deviation (\(\sigma \)) captures the amount of variation in the data. The functional form is given by

$$\begin{aligned} \mathcal {N}(x,\mu , \sigma ^2) = \frac{1}{\sqrt{2\pi \sigma ^2} } \, e^{ -\frac{(x-\mu )^2}{2\sigma ^2}}. \end{aligned}$$
(6.11)

A precise prediction of this equation is that values less than three standard deviations away from the mean (related to what are called 3-sigma events) account for 99.73% of observations. In other words, it is extremely rare to observe outliers in data following a normal distribution.

Fig. 6.5 Comparing probability distributions. (Left) A normal distribution showing the heights of 5,647 male individuals, age 20 and over, from the US, with \(\mu =175.9\) [cm] and \(\sigma =7.5147\) [cm] in (6.11) (Fryar et al. 2012). (Right) The scaling-law distribution of city sizes in log-log scale, approximated from Newman (2005), with \(\alpha =2.0\) and \(C=800,000\) in (6.12)

In contrast, data following scaling-law distributions represent the other extreme, where very large occurrences are expected and observations span many orders of magnitude. In Fig. 6.5 two examples are shown, in which a normal distribution of height is compared to a scaling-law distribution of city sizes. Scaling-law distributions, quantifying the probability distributions of complex systems, are the most ubiquitous scaling relation found in nature. Expressed in mathematical terms

$$\begin{aligned} {p}(x) = C x^{-\alpha }, \end{aligned}$$
(6.12)

for \(\alpha > 0\). Note that p is a probability density function. The corresponding cumulative distribution function is defined as

$$\begin{aligned} \mathcal {P}(x) = \int _x^{\infty } p(x^\prime ) \text {d}x^\prime . \end{aligned}$$
(6.13)

If p(x) follows a scaling law with (positive) exponent \(\alpha \), then the cumulative distribution function \(\mathcal {P} ( x )\) also follows a power law, with an exponent \(\alpha - 1\). Pareto’s 80-20 rule can be derived using \(\mathcal {P}\). In detail, the question, above what point \(x_F\) the fraction F of the distribution lies, can be formalized as

$$\begin{aligned} \mathcal {P}(x_F) = \int _{x_F}^{\infty } p(x^\prime ) \text {d}x^\prime = F \int _{x_{\text {min}}}^{\infty } p(x^\prime ) \text {d}x^\prime . \end{aligned}$$
(6.14)

The solution is given by

$$\begin{aligned} x_F = F^{\frac{1}{-\alpha +1}} x_{\text {min}}. \end{aligned}$$
(6.15)

From this, the fraction W of wealth in the hands of the richest P percent of the population can be derived as

$$\begin{aligned} W = P^{\frac{-\alpha +2}{-\alpha +1}}. \end{aligned}$$
(6.16)

As an example, for the US, the empirical wealth distribution exponent is \(\alpha = 2.1\) (Newman 2005). Hence, 86.38% (i.e., \(0.8638=0.2^{-0.1/-1.1}\)) of the wealth is held by the richest 20% (0.2). Or, about 64.5% of the wealth is held by the richest 0.8%. Statistically speaking, cumulative distribution functions perform better, because the tail of the distribution is not affected by the diminishing number of observations, as is the case for the density function, where outliers can skew the results.
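The two numbers quoted above follow directly from (6.16); a two-line check (added here as an illustration) reads:

```python
def wealth_share(P, alpha):
    """Fraction of total wealth held by the richest fraction P, cf. (6.16)."""
    return P ** ((alpha - 2.0) / (alpha - 1.0))

alpha = 2.1   # empirical US wealth distribution exponent quoted above (Newman 2005)
print(f"richest 20%  hold ~{wealth_share(0.20, alpha):.1%}")    # ~86.4%
print(f"richest 0.8% hold ~{wealth_share(0.008, alpha):.1%}")   # ~64.5%
```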

Scaling-law distributions have been observed in an extraordinarily wide range of natural phenomena: from physics, biology, earth and planetary sciences, economics and finance, computer science, and demography to the social sciences (Amaral et al. 1998; Albert et al. 1999; Sornette 2000a; Pastor-Satorras et al. 2001; Bouchaud 2001; Newman et al. 2002; Caldarelli et al. 2002; Garlaschelli et al. 2003; Gabaix et al. 2003; Newman 2005; Lux 2005; Di Matteo 2007; Bettencourt et al. 2007; Bettencourt and West 2010; West 2017). It is truly astounding that such diverse topics as

  • the size of cities, earthquakes, moon craters, solar flares, computer files, sand particles, wars, and price moves in financial markets;

  • the number of scientific papers written, citations received by publications, hits on webpages and species in biological taxa;

  • the sales of music, books and other commodities;

  • the population of cities;

  • the income of people;

  • the frequency of words used in human languages and of occurrences of personal names;

  • the areas burnt in forest fires;

are all characterized by scaling-law distributions.

As mentioned, processes following normal distributions have a characteristic scale given by the mean (\(\mu \)) of the distribution. In contrast, scaling-law distributions lack such a preferred scale, as measurements of scaling-law processes can yield values distributed across a vast dynamic range, spanning many orders of magnitude. Indeed, for \(\alpha \le 2\) the mean of the scaling-law distribution can be shown to diverge (Newman 2005). Moreover, analyzing any section of a scaling-law distribution yields similar proportions of small to large events. In other words, scaling-law distributions are characterized by scale-free and self-similar behavior. Historically, Benoît Mandelbrot observed these properties in the changes of cotton prices, which represented the starting point for his research leading to the discovery of fractal geometry (Mandelbrot 1963, see also Sects. 5.2.2, and 5.1.3). Finally, for normal distributions, events that deviate from the mean by, e.g., 10 standard deviations (10-sigma events) are practically impossible to observe. Scaling laws, in contrast, imply that small occurrences are extremely common, whereas large instances become rarer. However, these large events occur nevertheless much more frequently compared to a normal distribution: for scaling-law distributions, extreme events have a small but very real probability of occurring. This fact is summed up by saying that the distribution has a “fat tail” (Anderson 2004). In the terminology of probability theory and statistics, distributions with fat tails are said to be leptokurtic or to display positive kurtosis. The presence of fat tails greatly impacts risk assessments: although most earthquakes, price moves in financial markets, intensities of solar flares , etc., will be very small most of the time, the possibility that a catastrophic event will happen cannot be neglected.
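The difference between the two tail behaviors can be quantified with a short numerical comparison (an illustration added here; the power-law parameters are arbitrary but representative):

```python
from math import erfc, sqrt

# Probability of exceeding 10 standard deviations under a normal distribution.
p_normal = 0.5 * erfc(10 / sqrt(2))
print(f"normal tail beyond 10 sigma: {p_normal:.2e}")    # ~7.6e-24, never in practice

# Tail of a scaling-law (Pareto) distribution with alpha = 2.1 and x_min = 1:
# P(X > x) = (x / x_min)^(-(alpha - 1)), cf. (6.12) and (6.13).
alpha, x_min = 2.1, 1.0
x = 1000.0 * x_min   # an event three orders of magnitude above the typical scale
p_power = (x / x_min) ** (-(alpha - 1.0))
print(f"power-law tail beyond x = {x:g}: {p_power:.2e}")   # ~5e-4: rare but very real
```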

4.3.3 Scale-Free Networks

The turn of the millennium brought about a revolution in the fundamental understanding of the structure and dynamics of real-world complex networks (Barabási and Albert 1999; Watts and Strogatz 1998; Albert and Barabási 2002; Dorogovtsev and Mendes 2002; Newman et al. 2006). The discoveries of small-world (Watts and Strogatz 1998) and scale-free (Barabási and Albert 1999; Albert and Barabási 2002) networks were the initial sparks, bringing about a new science of networks (Sect. 6.3.1). A historical outline of events can be found in Sect. 5.2.3. Figure 6.6 shows an example of a scale-free network.

Fig. 6.6 Layout of a scale-free directed network showing two hubs. See Sect. 5.2.3

In a nutshell, scale-free networks have a degree distribution following a scaling law. Let \(\mathcal {P} (k_i)\) describe the probability of finding a node i in the network with a degree of \(k_i\). In other words, i has \(k_i\) direct neighbors and \(\mathcal {P} (k_i)\) expresses how frequently nodes with this degree occur throughout the network. In general, for scale-free networks, the following relation holds

$$\begin{aligned} \mathcal P (k) \sim k^{-\alpha }. \end{aligned}$$
(6.17)

The detailed expression is given in (5.18) on p. 168 of Sect. 5.3.2, introducing elements of graph theory. For directed networks, one needs to distinguish between the in-degree (\(k^{\text {in}}_i\)) and out-degree (\(k^{\text {out}}_i\)).

A prototypical example of a scale-free network is given by the World Wide Web (WWW) , the set of all online documents interlinked by hypertext links (Barabási et al. 2000). This should not be confused with the Internet, the global network of interconnected computers that utilize the TCP/IP protocol to link devices. In essence, the vast majority of webpages in the WWW are irrelevant and there exist only a few extremely interconnected hubs. Ranked by page views, the most popular five webpages are Google, YouTube, Facebook, Baidu, and Wikipedia.Footnote 8 To illustrate, Google’s search engine processes an average of 3.5 billion searches per day.Footnote 9 Indeed, the success of Google initially depended on the development of a network measure. In detail, the founders Larry Page and Sergey Brin introduced the PageRank search algorithm, which is based on network centralityFootnote 10 (Katz 1953; Hubbell 1965; Bonacich 1987; Borgatti 2005; Glattfelder 2019). In a nutshell, a webpage is important if important webpages link to it. Technically, PageRank assigns a numerical weighting to each element of a hyperlinked set of documents, with the purpose of measuring its relative importance within the set (Brin and Page 1998).

PageRank is formally defined by an iterative equation \(PR_i\) for each node i:

$$\begin{aligned} PR_i(t+1) = \alpha \sum _{j \in \Gamma (i)} \frac{PR_j (t)}{k^{\text {out}}_j} + \frac{1-\alpha }{N}, \end{aligned}$$
(6.18)

where \(\Gamma (i)\) is the set of labels of the nodes linking to i, \(\alpha \) is a damping factor usually set to 0.85, and N is the number of nodes in the network. In matrix notation

$$\begin{aligned} PR(t+1) = \alpha \mathcal {M} PR(t) + \frac{1-\alpha }{N} \mathbf {1}, \end{aligned}$$
(6.19)

where \(\mathbf {1}\) is the unit column-vector and the matrix \(\mathcal {M}\) is

$$\begin{aligned} \mathcal {M}_{ij} = {\left\{ \begin{array}{ll} 1 /k_j^{\text {out}}, &{} \text{ if } j \text{ links } \text{ to } i;\ \\ 0, &{} \text{ otherwise. } \end{array}\right. } \end{aligned}$$
(6.20)

Alternatively, \(\mathcal {M} = (K^{-1} A)^t\), if K is the diagonal matrix with the out-degrees in the diagonal and A is the adjacency matrix of the network. The solution is given, in the steady state, by

$$\begin{aligned} {PR} = (\mathbbm {1}- \alpha \mathcal {M})^{-1} \frac{1-\alpha }{N} \mathbf {1}, \end{aligned}$$
(6.21)

with the identity matrix \(\mathbb {1}\). A solution exists and is unique for \(0< \alpha < 1\).

Conceptually, the PageRank formula reflects a model of a random surfer in the WWW who gets bored after several clicks and switches to a random page. The PageRank value of a page measures the chance that the random surfer will land on that page by clicking on a link. If a page has no links to other pages, it becomes a sink and therefore terminates the random surfing process, unless \(\alpha < 1\). In this case, the random surfer arriving at a sink page jumps to a webpage chosen uniformly at random. Hence \((1-\alpha )/N\) in Eqs. (6.18) and (6.19) is interpreted as a teleportation term.
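A compact way to see (6.18) and (6.19) at work is to iterate them on the small example network of Sect. 6.3.2. The following sketch is added here as an illustration; treating dangling nodes as linking uniformly to all nodes is one common convention, not spelled out in the text:

```python
import numpy as np

# Directed example network from (6.2): n_1 -> n_2, n_1 -> n_3, n_2 -> n_3; n_4 isolated.
A = np.array([
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
], dtype=float)

def pagerank(A, alpha=0.85, tol=1e-12, max_iter=1000):
    N = A.shape[0]
    k_out = A.sum(axis=1)
    # Column-stochastic matrix M, cf. (6.20); dangling nodes (k_out = 0) are
    # treated as linking uniformly to every node (one common convention).
    M = np.zeros((N, N))
    for j in range(N):
        M[:, j] = A[j, :] / k_out[j] if k_out[j] > 0 else 1.0 / N
    pr = np.full(N, 1.0 / N)
    for _ in range(max_iter):                          # iterate (6.19)
        pr_next = alpha * M @ pr + (1 - alpha) / N
        if np.abs(pr_next - pr).sum() < tol:
            return pr_next
        pr = pr_next
    return pr

pr = pagerank(A)
print(pr, pr.sum())   # n_3, linked to by both n_1 and n_2, receives the highest PageRank
```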

Scale-free networks are characterized by high robustness against the random failure of nodes, but are susceptible to coordinated attacks on the hubs. Theoretically, they are thought to arise from a dynamical growth process, called preferential attachment, in which new nodes favor linking to existing nodes with high degree (Barabási and Albert 1999). Albert-László Barabási was highly influential in popularizing the study of complex networks by explaining the ubiquity of scale-free networks with preferential attachment models of network growth. However, the statistician Udny Yule already introduced the notion of preferential attachment in 1925, when he analyzed the power-law distribution of the number of species per genus of flowering plants (Udny Yule 1925). Alternative formation mechanisms for scale-free networks have been proposed, such as fitness-based models (Caldarelli et al. 2002).
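For completeness, here is a minimal sketch of such a preferential-attachment growth process (an illustration added here, following the general Barabási–Albert recipe; the parameters are arbitrary):

```python
import random
from collections import Counter

def preferential_attachment(n, m=2, seed=1):
    """Grow an undirected network of n nodes; each new node links to m
    existing nodes picked with probability proportional to their degree."""
    random.seed(seed)
    # Start from a small fully connected seed of m + 1 nodes.
    edges = [(i, j) for i in range(m + 1) for j in range(i)]
    # Repeated node labels act as a degree-proportional sampling urn.
    urn = [node for edge in edges for node in edge]
    for new in range(m + 1, n):
        targets = set()
        while len(targets) < m:
            targets.add(random.choice(urn))
        for t in targets:
            edges.append((new, t))
            urn.extend([new, t])
    return edges

edges = preferential_attachment(10_000, m=2)
degrees = Counter(node for edge in edges for node in edge)
print(max(degrees.values()))   # a few highly connected hubs emerge
```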

4.3.4 Cumulative Scaling-Law Relations

A final type of scaling-law relation appears in collections of random variables, called stochastic processes (see Sect. 7.1.1.1). Prominent examples are financial time series, where one finds empirical scaling laws governing the relationship between various observed quantities. Time series are simple series of data points ordered by time. For instance, the price of a security or asset at time t is given by x(t). The collection of prices at different times during some time horizon, i.e., \(t \in [t_{\text {S}}, t_{\text {E}}]\), constitutes a series, which can be plotted as a chart. In practice, financial instruments are associated with a spread s(t), quantifying the difference between the ask \(x^{\text {ask}}(t)\) and the bid \(x^{\text {bid}}(t)\) prices of the security or asset, quoted by sellers and buyers, respectively. The mid price is defined as

$$\begin{aligned} x (t) = \frac{ x^{\text {ask}}(t) + x^{\text {bid}}(t) }{2}. \end{aligned}$$
(6.22)

As prices are quoted discretely, the parameter t can be replaced by a set of time instances \(t_i\). Consequently, \(x (t_i)\) can be denoted by \(x_i\), yielding a simpler notation. Price moves are defined as percentages

$$\begin{aligned} \varDelta x_i = \frac{x_i-x_{i-1}}{x_{i-1}}. \end{aligned}$$
(6.23)

The average price move over a time horizon is thus

$$\begin{aligned} \langle \varDelta x \rangle = \sqrt{\frac{1}{n} \sum _{j=1}^n (\varDelta x_{j})^2}. \end{aligned}$$
(6.24)

In Glattfelder et al. (2011) 18 empirical scaling-law relations were uncovered in the foreign exchange market, 12 of them being independent of each other. The foreign exchange market can be characterized as a complex network consisting of interacting agents: corporations, institutional and retail traders, and brokers trading through market makers, who themselves form an intricate web of interdependence. With an average daily turnover of approximately USD five trillion (Bank of International Settlement 2016) and with price changes nearly every second, the foreign exchange market offers a unique opportunity to analyze the functioning of a highly liquid, over-the-counter market that is not constrained by specific exchange-based rules. This market is an order of magnitude bigger than the futures or equity markets (ISDA 2014). An example of such an emerging scaling law in the foreign exchange market is the following: The average time interval \(\langle \varDelta t \rangle \) for a price change of size \(\varDelta x\) to occur follows a scaling-law relation

$$\begin{aligned} \langle \varDelta t\rangle \sim \varDelta x^\alpha . \end{aligned}$$
(6.25)

Figure 6.7 shows an illustration of this scaling law for the Euro to US Dollar exchange rate. Another cumulative scaling-law relation counts the average yearly number of price moves of size \(\varDelta x\)

$$\begin{aligned} \textsf {N} (\varDelta x) \sim \varDelta x^\alpha . \end{aligned}$$
(6.26)

In Fig. 6.8 this scaling law is plotted for 13 currency pairs and a benchmark Gaussian random walk. Finally, a salient novel scaling-law relation reveals that after any directional change of a price move, measured by a threshold \(\delta \), the price will continue, on average, to move by a percentage \(\omega \). This overshoot scaling law has a trivial form

$$\begin{aligned} \langle \omega \rangle \sim \delta . \end{aligned}$$
(6.27)

In other words, any directional change in a time series will be followed by an overshoot of the same size, on all scales. See Glattfelder et al. (2011) for more details and the remaining 15 scaling laws.
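To illustrate the mechanics of the event-based dissection underlying (6.27), the sketch below (added here; it is not the code of Glattfelder et al. (2011), and the synthetic random walk only demonstrates the bookkeeping, not the empirical result) decomposes a price series into directional changes of threshold \(\delta \) and the overshoots that follow them:

```python
import numpy as np

def dc_overshoots(prices, delta):
    """Dissect a price series into directional changes (DC) of relative size
    delta and the overshoots (OS) that follow each DC."""
    overshoots = []
    mode = 'up'            # direction of the next expected directional change
    ext = prices[0]        # running extreme: low while waiting for an up DC, high otherwise
    conf = None            # price at which the previous DC was confirmed
    for p in prices:
        if mode == 'up':
            if p >= ext * (1 + delta):         # upward DC confirmed
                if conf is not None:           # overshoot of the preceding down trend
                    overshoots.append((conf - ext) / conf)
                conf, ext, mode = p, p, 'down'
            else:
                ext = min(ext, p)
        else:
            if p <= ext * (1 - delta):         # downward DC confirmed
                if conf is not None:           # overshoot of the preceding up trend
                    overshoots.append((ext - conf) / conf)
                conf, ext, mode = p, p, 'up'
            else:
                ext = max(ext, p)
    return overshoots

# Mechanics demo on a synthetic random walk (illustrative only, not market data).
rng = np.random.default_rng(3)
prices = 100.0 * np.exp(np.cumsum(rng.normal(0.0, 0.001, 100_000)))
for delta in (0.002, 0.005, 0.01):
    ovs = dc_overshoots(prices, delta)
    print(f"delta = {delta:.3%}: {len(ovs)} events, mean overshoot = {np.mean(ovs):.3%}")
```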

Fig. 6.7 Financial scaling-law relation. The log-log plot shows the empirical results for the EUR/USD currency pair over a five year data sample. The time (\(\varDelta t\)) during which a price move of \(\varDelta x \in [0.045\%, \dots , 4.0\%]\) happens is related to the magnitude of these moves. The exponent is estimated at \(\alpha = 1.9608 \pm 0.0025\). The y-axis data range is approximately from 16 min to 69 days

Fig. 6.8 Number of yearly price moves scaling law. Plots based on five years of tick-by-tick data for 13 exchange rates and a Gaussian random walk model. See Glattfelder et al. (2011) for details

These laws represent the foundation of a new generation of tools for studying volatility, measuring risk, and producing better forecasts (Golub et al. 2016). They also substantially extend the catalog of stylized facts found in financial time series (Guillaume et al. 1997; Dacorogna et al. 2001) and sharply constrain the space of possible theoretical explanations of the market mechanisms. The laws can be used to define an event-based framework, substituting the passage of physical time with market activity (Guillaume et al. 1997; Glattfelder et al. 2011; Aloud et al. 2011). Consolidating all these building blocks culminates in a new generation of automated trading algorithms Footnote 11 which not only generate profits, but also provide liquidity and stability to financial markets (Golub et al. 2018). See also Müller et al. (1990), Mantegna and Stanley (1995), Galluccio et al. (1997) for early accounts of the scaling properties in foreign exchange markets.

4.3.5 A Word of Caution

Finding laws of nature for complex systems is a challenging task. By design, there exist many levels of organization which interact with each other in these systems. Moreover, the laws represent idealizations lurking in the murky depths hidden beneath layers of messy data. While the laws of nature relating to Volume I of the Book of Nature are clear-cut and orderly, Volume II struggles with this clarity. The importance of the four types of universal scaling laws previously discussed has been challenged by some.

The debate boils down to the following question: How far can one deviate from statistical rigor to detect an approximation of an organizational principle in nature? In the early days of scaling laws, some physicists have been accused of simply plotting their data in a log-log plot and squinting at the screen to declare a scaling law. However, the statistical criteria for a true scaling law to be found in empirical data are quite involved (Clauset et al. 2009). Recently, the ubiquity of scale-free networks has been questioned (Broido and Clauset 2018).
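For reference, the maximum-likelihood estimator at the core of the statistical machinery of Clauset et al. (2009) is simple to state. The sketch below is an illustration added here, with synthetic data and a known x_min; real applications also require estimating x_min and running goodness-of-fit tests, which are omitted:

```python
import numpy as np

def alpha_mle(data, x_min):
    """Maximum-likelihood estimate of the scaling exponent alpha for data
    following p(x) ~ x^(-alpha) above x_min (cf. Clauset et al. 2009)."""
    x = np.asarray(data, dtype=float)
    x = x[x >= x_min]
    return 1.0 + x.size / np.sum(np.log(x / x_min))

# Synthetic power-law sample with true alpha = 2.5 and x_min = 1 (inverse-CDF sampling).
rng = np.random.default_rng(7)
alpha_true, x_min = 2.5, 1.0
u = rng.random(100_000)
data = x_min * (1.0 - u) ** (-1.0 / (alpha_true - 1.0))

print(f"MLE estimate: alpha ~ {alpha_mle(data, x_min):.3f}")   # close to 2.5
```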

Network scientists have adapted to such challenges. For one, the strict adherence to a precise power law is relaxed. As an example, a broader class called “heavy-tailed” networks is invoked. Many real-world networks show such heavy tails in their degree distribution. In other words, they share the characteristics of a scale-free network (such as robustness and vulnerability) without actually obeying a strict power law. In essence, real-world networks are determined by many different mechanisms and processes which nudge the network away from pure scale-freeness, making them heavy-tailed. However, the biggest and simplest factor responsible for spoiling any neat idealized scaling-law behavior in all real applications could be, in the words of physicist-turned-network scientist Alessandro Vespignani, that (quoted in Klarreich 2018):

In the real world, there is dirt and dust, and this dirt and dust will be on your data. You will never see the perfect power law.

Nonetheless, perhaps the real impediment that network researchers face is far deeper, echoing the poststructural and postmodern sentiment from Sect. 6.2.2. Again Vespignani (Klarreich 2018):

There is no general theory of networks.

Barabási replied to the accusations in a blog post.Footnote 12 In essence, Broido and Clauset (2018) utilize a “fictional criterion of scale-free networks,” which “fails the most elementary tests.”