Intra-Firm Information Flow: A Content-Structure Perspective

Berchenko, Yakir; Daliot, Or; Brueller, Nir N.

doi:10.1007/978-3-642-24800-9_6

Conference paper

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 7014)

Abstract

This paper endeavors to bring together two largely disparate areas of research. On the one hand, text mining methods treat each document as an independent instance, even though in many text domains documents are linked and their topics are correlated. For example, web pages on related topics are often connected by hyperlinks, and scientific papers from related fields are typically linked by citations. On the other hand, Social Network Analysis (SNA) typically treats edges between nodes according to "flat" attributes in binary form alone. This paper proposes a simple approach that addresses both of these issues in data mining scenarios involving corpora of linked documents. In this approach, weights are first assigned to the edges between documents, based on the content of the documents associated with each edge; standard SNA and network theory tools are then applied to the resulting weighted network. The method is tested on the Enron email corpus and successfully discovers the central people in the organization and the relevant communications between them. Furthermore, our findings suggest that, due to the non-conservative nature of information, conservative centrality measures (such as PageRank) are less adequate here than non-conservative centrality measures (such as eigenvector centrality).
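The two-step approach described in the abstract (content-based edge weights, then standard centrality measures) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy messages, the `KEYWORDS` vocabulary, and the keyword-fraction score in `content_weight` are all hypothetical stand-ins, since the abstract does not specify how content is scored. Eigenvector centrality is computed here by a shifted power iteration, and PageRank with a conventional damping factor of 0.85, which is likewise an assumption.

```python
from collections import defaultdict

# Hypothetical toy corpus: (sender, recipient, message text).
EMAILS = [
    ("alice", "bob", "energy trading contract pricing"),
    ("bob", "alice", "trading contract approved"),
    ("carol", "alice", "contract pricing update"),
    ("alice", "carol", "lunch on friday"),
    ("bob", "carol", "energy pricing memo"),
    ("carol", "bob", "holiday party invitation"),
]

# Hypothetical topic vocabulary; an assumption, not the paper's content score.
KEYWORDS = {"energy", "trading", "contract", "pricing"}


def content_weight(text, keywords=KEYWORDS):
    """Fraction of tokens in the message that belong to the topic vocabulary."""
    tokens = text.lower().split()
    if not tokens:
        return 0.0
    return sum(token in keywords for token in tokens) / len(tokens)


def build_weighted_graph(emails):
    """Sum content weights per directed (sender, recipient) pair."""
    weights = defaultdict(float)
    nodes = set()
    for sender, recipient, text in emails:
        nodes.update((sender, recipient))
        w = content_weight(text)
        if w > 0:  # off-topic messages contribute no edge
            weights[(sender, recipient)] += w
    return sorted(nodes), dict(weights)


def eigenvector_centrality(nodes, weights, iterations=200):
    """Non-conservative measure: a node passes its full score along every
    outgoing edge. Power iteration on (I + A^T); the identity shift leaves
    the leading eigenvector unchanged and avoids oscillation."""
    x = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        nxt = dict(x)
        for (src, dst), w in weights.items():
            nxt[dst] += w * x[src]
        norm = max(nxt.values())
        x = {n: v / norm for n, v in nxt.items()}
    return x


def pagerank(nodes, weights, damping=0.85, iterations=200):
    """Conservative measure: a node splits its score among its outgoing
    edges in proportion to edge weight, so total score is preserved."""
    out_total = defaultdict(float)
    for (src, _), w in weights.items():
        out_total[src] += w
    n = len(nodes)
    p = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Score held by nodes without outgoing edges is spread uniformly.
        dangling = sum(p[v] for v in nodes if out_total[v] == 0)
        nxt = {v: (1 - damping) / n + damping * dangling / n for v in nodes}
        for (src, dst), w in weights.items():
            nxt[dst] += damping * p[src] * w / out_total[src]
        p = nxt
    return p


nodes, weights = build_weighted_graph(EMAILS)
ev = eigenvector_centrality(nodes, weights)
pr = pagerank(nodes, weights)
for name in sorted(nodes, key=ev.get, reverse=True):
    print(f"{name}: eigenvector={ev[name]:.3f} pagerank={pr[name]:.3f}")
```

The difference between the two functions mirrors the conservative versus non-conservative distinction drawn in the abstract: `pagerank` divides a sender's score among recipients, whereas `eigenvector_centrality` lets each recipient receive the sender's full weighted score, which is closer to how information is copied rather than spent. The toy graph is far too small to show which measure is more adequate.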

Author information

Authors and Affiliations

The Leslie and Susan Gonda Multidisciplinary Brain Research Center, Bar Ilan University, Ramat Gan, 52900, Israel
Yakir Berchenko
Department of Veterinary Medicine, University of Cambridge, Madingley Road, Cambridge, CB3 0ES, United Kingdom
Yakir Berchenko
Faculty of Law, Hebrew University of Jerusalem, Mt Scopus, Jerusalem, 91905, Israel
Or Daliot
Faculty of Management, Tel-Aviv University, Ramat-Aviv, Tel-Aviv, 69978, Israel
Nir N. Brueller

Editor information

Editors and Affiliations

Faculty of Economics, LIAAD-INESC Porto, L.A., University of Porto, Rua de Ceuta, 118, 6, 4050-190, Porto, Portugal
João Gama
Department of Computer Science, University of Colorado, 80309-0430, Boulder, CO, USA
Elizabeth Bradley
Department of Information and Computer Science, Aalto University School of Science, P.O. Box 15400, 00076, Aalto, Finland
Jaakko Hollmén

Cite this paper

Berchenko, Y., Daliot, O., Brueller, N.N. (2011). Intra-Firm Information Flow: A Content-Structure Perspective. In: Gama, J., Bradley, E., Hollmén, J. (eds) Advances in Intelligent Data Analysis X. IDA 2011. Lecture Notes in Computer Science, vol 7014. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24800-9_6

DOI: https://doi.org/10.1007/978-3-642-24800-9_6
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24799-6
Online ISBN: 978-3-642-24800-9
eBook Packages: Computer Science, Computer Science (R0)
