netrevindex.zip (75.73 kB)

Word Frequency Count from Network Review

dataset
posted on 2012-12-12, 16:47, authored by Tim Evans

These files contain the data used in one of the plots in Figure 10 of my basic overview of Complex Networks (a review for Contemporary Physics, see below). Please cite the source if you use this data. You could also reproduce it yourself from the LaTeX file available via arXiv. The data was produced by using various UNIX tools to strip the LaTeX commands, giving a list of words (one per line), and then counting the number of times each line was repeated. As far as I can see, no stop-word removal or stemming was applied, e.g. "The" and "the" appear separately, and "vertex" and "vertices" are counted separately.
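For anyone who wants to redo the count from the LaTeX source, the procedure described above can be sketched roughly as follows. This is not the original UNIX pipeline, which is not recorded here; it is a minimal Python approximation. The function names (`strip_latex`, `count_words`, `ranked`) and the rule used to pick out words are assumptions for illustration only, so the counts it gives will not match the posted files exactly.

```python
import re
from collections import Counter


def strip_latex(text):
    """Crudely remove LaTeX comments and command names (assumed rules)."""
    # Drop comments: an unescaped % up to the end of the line.
    text = re.sub(r"(?<!\\)%.*", "", text)
    # Drop command names such as \section or \cite, keeping their arguments.
    text = re.sub(r"\\[A-Za-z]+\*?", " ", text)
    return text


def count_words(text):
    """Count each distinct word; no case folding, stop list or stemming."""
    words = re.findall(r"[A-Za-z]+", strip_latex(text))
    return Counter(words)


def ranked(counts):
    """Return (rank, word, count) tuples, most frequent word first."""
    # Ties are broken alphabetically so the ordering is reproducible.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(rank, word, n) for rank, (word, n) in enumerate(ordered, start=1)]
```

Because nothing is lower-cased or stemmed, "The" and "the" stay separate entries, as in the posted data. Single-letter 'words' such as "x" from mathematical expressions also survive this kind of crude stripping.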

Files:-

netrevcountrawdata.xls = rank and count for each word, along with plots

netrevcountTabSeparated.txt = rank and count for each word in simple text format

netrevindex.txt = raw data, unsorted (note there are some silly 'words' like "x")

 

Original Text:-

T.S.Evans
Complex Networks
Contemporary Physics 45 (2004) 455-475
DOI: 10.1080/00107510412331283531
arXiv:cond-mat/0405123
http://arxiv.org/abs/cond-mat/0405123

 
