The following article is Open access

The meta book and size-dependent properties of written language


Published 10 December 2009. Published under licence by IOP Publishing Ltd.
Citation: Sebastian Bernhardsson et al 2009 New J. Phys. 11 123015. DOI: 10.1088/1367-2630/11/12/123015


Abstract

Evidence is presented for a systematic text-length dependence of the power-law index γ of a single book. The estimated γ values are consistent with a monotonic decrease from 2 to 1 with increasing text length. A direct connection to an extended Heaps' law is explored. As a consequence, the infinite-book limit is proposed to be given by γ=1 instead of the value γ=2 expected if Zipf's law is universally applicable. In addition, we explore the idea that the systematic text-length dependence can be described by a meta book concept, which is an abstract representation reflecting the word-frequency structure of a text. According to this concept, the word-frequency distribution of a text of a certain length written by a single author has the same characteristics as a text of the same length extracted from an imaginary complete infinite corpus written by the same author.
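
As an illustration of the quantities the abstract refers to, the following minimal Python sketch measures two things for a text: how the number of distinct words grows with text length (the kind of relation that Heaps-type laws describe), and how many distinct words occur exactly k times (the word-frequency distribution whose power-law index is denoted γ here). The tokenizer, the function names and the sample sentence are assumptions made for this example and are not taken from the article; the sketch computes only the raw counts and does not attempt to estimate γ.

```python
from collections import Counter
import re


def tokenize(text):
    """Lowercase the text and split it into word tokens (letters and apostrophes)."""
    return re.findall(r"[a-z']+", text.lower())


def vocabulary_growth(words, step):
    """Return (M, N) pairs, where N is the number of distinct words
    among the first M words, sampled every `step` words and at the end."""
    seen = set()
    growth = []
    for i, word in enumerate(words, start=1):
        seen.add(word)
        if i % step == 0 or i == len(words):
            growth.append((i, len(seen)))
    return growth


def frequency_histogram(words):
    """Map each occurrence count k to the number of distinct words
    that occur exactly k times in the text."""
    counts = Counter(words)
    return Counter(counts.values())


# Hypothetical toy input; a real analysis would use a full book.
sample = "the cat sat on the mat and the dog sat on the log"
words = tokenize(sample)
growth = vocabulary_growth(words, step=4)
hist = frequency_histogram(words)
```

Applying the same two functions to the first M words of a book for increasing M gives one way to look at a text-length dependence of this kind. Fitting a power law to the histogram requires care with binning and the tail, which this sketch leaves out.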
