Software Tool Article
Revised

Contextual Hub Analysis Tool (CHAT): A Cytoscape app for identifying contextually relevant hubs in biological networks

[version 2; peer review: 2 approved]
PUBLISHED 30 Aug 2016

This article is included in the Cytoscape gateway.

Abstract

Highly connected nodes (hubs) in biological networks are topologically important to the structure of the network and have also been shown to be preferentially associated with a range of phenotypes of interest. The relative importance of a hub node, however, can change depending on the biological context. Here, we report a Cytoscape app, the Contextual Hub Analysis Tool (CHAT), which enables users to easily construct and visualize a network of interactions from a gene or protein list of interest, integrate contextual information, such as gene expression or mass spectrometry data, and identify hub nodes that are more highly connected to contextual nodes (e.g. genes or proteins that are differentially expressed) than expected by chance. In a case study, we use CHAT to construct a network of genes that are differentially expressed in Dengue fever, a viral infection. CHAT was used to identify and compare contextual and degree-based hubs in this network. The top 20 degree-based hubs were enriched in pathways related to the cell cycle and cancer, which is likely due to the fact that proteins involved in these processes tend to be highly connected in general. In comparison, the top 20 contextual hubs were enriched in pathways commonly observed in a viral infection including pathways related to the immune response to viral infection. This analysis shows that such contextual hubs are considerably more biologically relevant than degree-based hubs and that analyses which rely on the identification of hubs solely based on their connectivity may be biased towards nodes that are highly connected in general rather than in the specific context of interest.
 
Availability: CHAT is available for Cytoscape 3.0+ and can be installed via the Cytoscape App Store (http://apps.cytoscape.org/apps/chat).

Keywords

Network analysis, hypergeometric test, hubs, gene expression data, contextual hub analysis, CHAT

Amendments from Version 1

Incorporated Sandra Orchard's and Pablo Porras's suggestions to add a reference, to clarify that interactions between input list genes or proteins are considered in the network creation and calculations and to elucidate that CHAT can not only be applied to gene lists but also works for protein lists.

See the authors' detailed response to the review by Sandra Orchard and Pablo Porras Millán

Introduction

Network analysis has emerged as a powerful approach to elucidate biological and disease processes1. Biological networks (and many other types of networks) have been shown to have a power law distribution of node connectivity, with most nodes having few connections and a few nodes being highly connected2. The identification of such highly connected nodes, termed hubs, is often of interest as hubs have been shown to be topologically and functionally important. The deletion of genes encoding hub proteins, for example, has been shown to correlate with lethality in yeast (the centrality-lethality rule)3. Hubs have also been found to be preferentially targeted by both bacterial and viral pathogens4 and may be master regulators of biological processes5. Biological networks, such as the human interactome, however, are not static entities6, and the extent to which a node acts as a hub can change depending on the biological context, e.g. the network present in a specific cell type at a particular point in time7,8. Integrating contextual information, such as gene or protein expression data, with standard network analysis can provide insight into which network features are most relevant in a particular study or context9-11.

Cytoscape has a number of applications to identify hubs in networks, including cytoHubba12, APID2Net13, PinnacleZ14, NetworkAnalyzer15,16 and CentiScaPe17; however, only the latter two are compatible with Cytoscape 3+. All of the applications available to date identify hubs based on node connectivity (degree) in a network of interest. To construct a network, users frequently query interaction databases to identify the interactors of a list of genes of interest, e.g. differentially expressed genes, and then identify the high degree nodes in this network. This approach to constructing a network is useful because it identifies a more fully connected network for analysis than would be the case if one restricted interactions to only those that occur between nodes in the gene list. Analysis of these networks can, for example, identify subnetworks that are enriched in (but do not exclusively consist of) differentially expressed genes, or identify non-differentially expressed nodes that are topologically important in the network, both of which would otherwise not be identified. Identifying hubs in these networks, however, is biased towards nodes that are highly connected in general, such as promiscuous, ubiquitous or well-studied nodes, because nodes with many interactions in the query database have a higher probability of being included in the network by chance alone. Analysis of these degree-based hubs, for example identifying which biological processes or pathways these nodes are enriched in, tells us little about the experimental context of interest and more about the properties of highly connected nodes in general. A more appropriate analysis is to determine which nodes interact with relevant nodes in the network (which we term contextual nodes) more than is statistically expected.

Here, we introduce the Contextual Hub Analysis Tool (CHAT), a Cytoscape App that identifies hub nodes that interact with more "contextual" nodes (e.g. differentially expressed genes or proteins) than statistically expected in networks integrated with user-supplied contextual data (e.g. gene expression data). We term these nodes contextual hubs. We show that such contextual hubs are considerably more relevant than degree-based hubs to the specific experimental context under investigation. As such, these nodes are promising candidates for further functional validation studies and potentially represent important points in the network for drug targeting.

Methods

Implementation

CHAT was written in Java 8 as an Open Services Gateway Initiative (OSGi) bundle for Cytoscape 3.0+18. It adds a “CHAT” option in the “Apps” menu that launches a popup window, which allows users to adjust different network initialization parameters. CHAT prompts users to input a list of gene identifiers (the supported ID types depend on the database selected by the user) and any associated contextual data, e.g. gene expression data associated with the genes. While the focus of this paper is on genes, CHAT can equally be applied to proteins. The OK button triggers Cytoscape’s TaskManager to run a task that initiates the network construction and adds a tab to the results panel that provides functionality to further modify and analyze the network. To create the network, CHAT finds all the first neighbor interactors of the user-provided genes (or their encoded products). Interaction data is retrieved from one of the databases included in the PSICQUIC registry19, which the user can select. Note that interactions between the first neighbors are considered by CHAT, but for clarity these are not included in the network visualization. Once the network has been constructed, CHAT performs a hypergeometric test on each node in the network to identify nodes that interact with contextual nodes more than expected by chance. The probability that a given hub has k or more contextual interactors among its n interactors is given by the hypergeometric distribution:

$$P(X \ge k) = \sum_{x=k}^{n} \frac{\binom{K}{x}\binom{N-K}{n-x}}{\binom{N}{n}}$$

Here, N is the number of genes with at least one interaction in the database queried and K is the number of contextually relevant nodes provided by the user (with at least one interaction in the database queried). Over-representation analysis depends heavily on the choice of background dataset used to determine N. To estimate the background frequency K/N, CHAT provides access to interaction data from databases available in the PSICQUIC registry. Databases with fewer than 10,000 interactions are excluded. The number of genes in the user-selected database that have at least one interaction (of the specified type) in which both interactors match the user-selected criteria for constructing the network (species, interaction type and ID type) determines the node population size N. Self-interactions are disregarded. Interactions between input genes and between their first neighbors are considered in the CHAT analysis. P-values calculated by CHAT are automatically corrected for multiple testing using the Benjamini-Hochberg procedure20, a method widely used in bioinformatics to control the false discovery rate; the Bonferroni approach is widely considered to be too strict21.
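The test and the multiple-testing correction described above can be sketched in a few lines of Python. This is a minimal illustration and not CHAT's Java implementation: the function names `hypergeom_upper_tail` and `benjamini_hochberg` are hypothetical, and the numbers in the usage lines are made up purely to show the call pattern. The symbols k, n, K and N follow the definitions in the text.

```python
from math import comb


def hypergeom_upper_tail(k: int, n: int, K: int, N: int) -> float:
    """P(X >= k): probability of k or more contextual interactors
    among a node's n interactors, given K contextual nodes in a
    background population of N nodes."""
    if not (0 <= K <= N and 0 <= n <= N):
        raise ValueError("require 0 <= K <= N and 0 <= n <= N")
    upper = min(n, K)  # X cannot exceed either n or K
    # math.comb(a, b) is 0 when b > a, so impossible outcomes add nothing.
    numerator = sum(
        comb(K, x) * comb(N - K, n - x) for x in range(max(k, 0), upper + 1)
    )
    return numerator / comb(N, n)


def benjamini_hochberg(p_values: list[float]) -> list[float]:
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    # Hypothetical node: 4 interactors, 2 of them contextual,
    # with K = 3 contextual nodes in a population of N = 10.
    print(hypergeom_upper_tail(k=2, n=4, K=3, N=10))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Exact integer binomials keep the sketch dependency-free; for database-sized N a statistics library's hypergeometric survival function would likely be a more practical choice.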

A right click on a node brings up an option to activate the “Node Analyzer” mode, which allows the user to analyze the connectivity pattern of individual hubs of interest. Using this function will display the node analyzer table on the results panel and all nodes except the selected node and its interactors will be hidden in the network visualization. The execution time of CHAT varies between a few seconds and a few minutes based on the number of user-supplied (contextual) genes, the size of the chosen database and its connection speed as well as the user-selected network layout. These factors also influence memory consumption.
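The network construction and the Node Analyzer view described in this section can be illustrated with a small in-memory sketch. It assumes the interaction data have already been retrieved as a list of identifier pairs; the names `build_network` and `node_analyzer` are hypothetical and do not correspond to CHAT's Java classes, and treating every node with at least one non-self interaction as the background population is a simplifying assumption that ignores the species, ID type and interaction type filters.

```python
from collections import defaultdict


def build_network(seed_genes, interactions):
    """Return (network, population_size).

    network maps each seed gene and each of its first neighbors to the
    set of its interactors in the supplied interaction list.
    population_size counts nodes with at least one non-self interaction.
    """
    adjacency = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # self-interactions are disregarded
        adjacency[a].add(b)
        adjacency[b].add(a)
    population_size = len(adjacency)

    nodes = set()
    for gene in seed_genes:
        if gene in adjacency:  # seeds without interactions are left out
            nodes.add(gene)
            nodes.update(adjacency[gene])
    network = {node: set(adjacency[node]) for node in nodes}
    return network, population_size


def node_analyzer(network, node, contextual):
    """List a node's interactors, flagging those that are contextual."""
    return [(partner, partner in contextual) for partner in sorted(network[node])]


if __name__ == "__main__":
    # Hypothetical identifiers, for illustration only.
    edges = [("A", "B"), ("B", "C"), ("C", "D"), ("E", "E")]
    network, n_population = build_network(["A", "E"], edges)
    print(sorted(network), n_population)
    print(node_analyzer(network, "B", contextual={"A", "E"}))
```

Each node keeps its full interactor set from the interaction list, so interactions between first neighbors remain available for the statistics even if a visualization chooses to hide them.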

Operation

The identification of the top contextual hubs consists of three primary steps: 1) input of a user-supplied gene list and contextual data, 2) network construction and statistical analysis to identify nodes that preferentially interact with contextual nodes and 3) visualization of the top contextual hubs and their interactions and comparison to the top degree-based hubs. To construct a network using CHAT, the user must provide a list of gene identifiers and associated numerical or categorical attributes in the text box in tab-delimited format, or upload the data as a comma-separated or tab-delimited file (.csv or .txt) via the upload button (Figure 1). The user can then specify which genes in the uploaded list are contextually important based on the user-provided contextual data (e.g. genes with > 2 fold-change in expression). The user then selects one of the databases in the PSICQUIC registry to query, and specifies the relevant species, ID type and interaction type for the query. The user can then choose to visualize the network using any of the layout algorithms available in Cytoscape. Clicking the OK button creates the network and a new tab in the results panel, which allows the user to visualize the network and to analyze the results further (Figure 2). The results panel is split into several parts. In the first part, the parameters used to generate the network (database, species, ID type and interaction type(s)) are displayed. The second part allows the user to compare the top contextual hubs and the top degree-based hubs at the click of a button. By default, node size and node color reflect the node’s corrected p-value calculated by CHAT, such that the smaller the p-value (i.e. the more statistically significant), the larger the node and the darker its red coloring. The user can, however, customize the color scheme.
In contrast, if the user selects “Show degree hubs”, the visualization changes and the node size and coloring become proportional to each node’s degree in the selected database. By default, CHAT displays the top 20 contextual hubs, but the user can adjust this number using the slider provided. To investigate a single node in detail, the user can employ CHAT’s “Node Analyzer” by right-clicking on a node. This limits the network view to the selected node and its interactors and displays a table at the bottom of the results panel tab with information on the node’s name, p-value and interactors.
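The input and ranking steps above can be sketched as follows. This is an illustrative Python outline under stated assumptions, not CHAT's code: it handles only numerical attributes in a two-column tab-delimited input, the function names are hypothetical, and the gene names, values, p-values and degrees in the usage lines are invented for demonstration.

```python
import csv
import io


def parse_gene_table(text, threshold):
    """Parse 'identifier<TAB>value' lines and flag contextual genes.

    Returns (values, contextual), where contextual holds the identifiers
    whose value is strictly greater than threshold.
    """
    values = {}
    for row in csv.reader(io.StringIO(text), delimiter="\t"):
        if len(row) < 2 or not row[0].strip():
            continue  # skip blank or incomplete lines
        try:
            values[row[0].strip()] = float(row[1])
        except ValueError:
            continue  # skip header rows or non-numeric attributes
    contextual = {gene for gene, value in values.items() if value > threshold}
    return values, contextual


def top_contextual_hubs(adjusted_p, n=20):
    """Nodes with the smallest corrected p-values first."""
    return sorted(adjusted_p, key=lambda g: (adjusted_p[g], g))[:n]


def top_degree_hubs(degree, n=20):
    """Nodes with the highest degree first."""
    return sorted(degree, key=lambda g: (-degree[g], g))[:n]


if __name__ == "__main__":
    # Hypothetical input: genes with more than 2-fold change are contextual.
    table = "gene\tfold_change\nG1\t3.5\nG2\t1.2\nG3\t2.0\n"
    values, contextual = parse_gene_table(table, threshold=2.0)
    print(values, contextual)
    print(top_contextual_hubs({"X": 0.01, "Y": 0.2, "Z": 0.001}, n=2))
    print(top_degree_hubs({"X": 5, "Y": 50, "Z": 7}, n=2))
```

Ranking the same nodes by the two criteria side by side mirrors the comparison between contextual and degree-based hubs offered in the results panel.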


Figure 1. CHAT network analysis.

To construct a network using CHAT, the user provides a list of gene identifiers and associated numerical or categorical attributes relevant in the context of interest.


Figure 2. Network visualization.

CHAT provides a number of options to customize the network visualization.

Use case

ENSG00000154099	57.81999969
ENSG00000092295	27.93000031
ENSG00000108691	27.23999977
ENSG00000164825	25.62999916
ENSG00000163666	23.94000053
ENSG00000108700	22.17000008
ENSG00000133101	22.04999924
ENSG00000104951	21.42000008
ENSG00000178965	21.29000092
ENSG00000107593	20.45000076
ENSG00000169245	18.17000008
ENSG00000125355	17.62999916
ENSG00000149564	16.77000046
ENSG00000151364	11.81999969
ENSG00000243649	11.35000038
ENSG00000185339	11.31999969
ENSG00000166278	10.89999962
ENSG00000255221	10.56000042
ENSG00000088827	10.44999981
ENSG00000166920	10.23999977
ENSG00000167601	10.17000008
ENSG00000196141	9.140000343
ENSG00000078098	8.840000153
ENSG00000117266	8.729999542
ENSG00000185745	8.56000042
ENSG00000198785	8.050000191
ENSG00000159189	7.880000114
ENSG00000162772	7.519999981
ENSG00000173369	7.349999905
ENSG00000108387	7.170000076
ENSG00000078081	7.150000095
ENSG00000149131	7.099999905
ENSG00000162614	7.019999981
ENSG00000134326	6.96999979
ENSG00000020577	6.960000038
ENSG00000169248	6.889999866
ENSG00000006075	6.809999943
ENSG00000197272	6.789999962
ENSG00000136689	6.78000021
ENSG00000137757	6.730000019
ENSG00000187608	6.610000134
ENSG00000184270	6.599999905
ENSG00000110492	6.480000019
ENSG00000142687	6.460000038
ENSG00000173801	6.420000076
ENSG00000106785	6.320000172
ENSG00000161640	6.300000191
ENSG00000108771	6.289999962
ENSG00000137726	6.050000191
ENSG00000079385	6.039999962
ENSG00000115155	5.96999979
ENSG00000185338	5.96999979
ENSG00000145555	5.789999962
ENSG00000119917	5.760000229
ENSG00000100342	5.730000019
ENSG00000198829	5.71999979
ENSG00000165997	5.699999809
ENSG00000205362	5.590000153
ENSG00000108679	5.579999924
ENSG00000165949	5.550000191
ENSG00000125148	5.519999981
ENSG00000173372	5.480000019
ENSG00000174600	5.429999828
ENSG00000168062	5.429999828
ENSG00000188290	5.360000134
ENSG00000213533	5.349999905
ENSG00000152766	5.349999905
ENSG00000171729	5.309999943
ENSG00000054598	5.289999962
ENSG00000136960	5.28000021
ENSG00000120217	5.260000229
ENSG00000131203	5.210000038
ENSG00000139572	5.199999809
ENSG00000038945	5.179999828
ENSG00000111912	5.150000095
ENSG00000185507	5.050000191
ENSG00000111335	5.03000021
ENSG00000125144	5.019999981
ENSG00000115159	5
ENSG00000132669	5
ENSG00000136514	4.940000057
ENSG00000134321	4.889999866
ENSG00000143344	4.820000172
ENSG00000123610	4.789999962
ENSG00000135114	4.75
ENSG00000184260	4.670000076
ENSG00000158373	4.630000114
ENSG00000178175	4.53000021
ENSG00000134809	4.5
ENSG00000135047	4.46999979
ENSG00000171631	4.429999828
ENSG00000121858	4.380000114
ENSG00000143891	4.380000114
ENSG00000115267	4.360000134
ENSG00000170866	4.269999981
ENSG00000122643	4.269999981
ENSG00000111331	4.230000019
ENSG00000137959	4.179999828
ENSG00000244617	4.159999847
ENSG00000185880	4.150000095
ENSG00000163568	4.059999943
ENSG00000138642	4.050000191
ENSG00000165682	4.050000191
ENSG00000121236	4.050000191
ENSG00000168306	4.050000191
ENSG00000101986	4.039999962
ENSG00000155363	3.960000038
ENSG00000178685	3.940000057
ENSG00000136231	3.920000076
ENSG00000188313	3.900000095
ENSG00000119915	3.900000095
ENSG00000130489	3.900000095
ENSG00000184678	3.880000114
ENSG00000139410	3.869999886
ENSG00000168026	3.859999895
ENSG00000068079	3.839999914
ENSG00000134627	3.839999914
ENSG00000196664	3.809999943
ENSG00000120539	3.789999962
ENSG00000172159	3.779999971
ENSG00000165029	3.769999981
ENSG00000168961	3.75
ENSG00000107201	3.74000001
ENSG00000159228	3.730000019
ENSG00000124762	3.720000029
ENSG00000117228	3.710000038
ENSG00000116691	3.710000038
ENSG00000175518	3.700000048
ENSG00000095739	3.660000086
ENSG00000146859	3.630000114
ENSG00000198848	3.630000114
ENSG00000089127	3.619999886
ENSG00000171860	3.609999895
ENSG00000143367	3.599999905
ENSG00000112053	3.599999905
ENSG00000204103	3.569999933
ENSG00000246705	3.559999943
ENSG00000144035	3.529999971
ENSG00000188820	3.529999971
ENSG00000141837	3.519999981
ENSG00000183347	3.50999999
ENSG00000138646	3.49000001
ENSG00000076067	3.49000001
ENSG00000179921	3.460000038
ENSG00000163823	3.420000076
ENSG00000124256	3.420000076
ENSG00000254521	3.359999895
ENSG00000177409	3.359999895
ENSG00000197249	3.349999905
ENSG00000117010	3.339999914
ENSG00000119922	3.329999924
ENSG00000132530	3.289999962
ENSG00000129538	3.289999962
ENSG00000107317	3.24000001
ENSG00000187116	3.210000038
ENSG00000002549	3.200000048
ENSG00000138119	3.190000057
ENSG00000129757	3.180000067
ENSG00000116663	3.160000086
ENSG00000136816	3.160000086
ENSG00000221963	3.150000095
ENSG00000144655	3.140000105
ENSG00000253958	3.130000114
ENSG00000055332	3.119999886
ENSG00000198719	3.109999895
ENSG00000140464	3.079999924
ENSG00000059378	3.069999933
ENSG00000258227	3.069999933
ENSG00000106605	3.069999933
ENSG00000183486	3.049999952
ENSG00000134470	3.039999962
ENSG00000111911	3.029999971
ENSG00000146425	3.029999971
ENSG00000112773	3.029999971
ENSG00000111181	3.019999981
ENSG00000116016	3.019999981
ENSG00000145287	3.00999999
ENSG00000100298	3
ENSG00000010030	2.99000001
ENSG00000126262	2.99000001
ENSG00000035720	2.980000019
ENSG00000113070	2.980000019
ENSG00000158104	2.970000029
ENSG00000137628	2.970000029
ENSG00000112137	2.960000038
ENSG00000131979	2.960000038
ENSG00000137965	2.960000038
ENSG00000130589	2.950000048
ENSG00000139832	2.940000057
ENSG00000116514	2.930000067
ENSG00000205413	2.920000076
ENSG00000067066	2.900000095
ENSG00000115415	2.890000105
ENSG00000168394	2.880000114
ENSG00000148926	2.880000114
ENSG00000152061	2.869999886
ENSG00000170439	2.859999895
ENSG00000170835	2.839999914
ENSG00000104147	2.829999924
ENSG00000168003	2.819999933
This is a portion of the data; to view all the data, please download the file.
Dataset 1. Use case data.
462 genes that have been reported to be up-regulated during Dengue fever infection.

As a demonstration of its potential utility and as validation, CHAT was used to construct a network using a dataset of 462 genes that have been reported to be up-regulated during Dengue fever, a mosquito-borne viral infection22 (Ensembl gene IDs for these 462 genes are provided in Dataset 1). These 462 genes represent the contextual data for this case study. CHAT was used to construct a network of these genes and their first neighbor interactors using interaction data that was sourced from InnateDB23,24 via the PSICQUIC web service (InnateDB-All). A network of 4,910 nodes was generated. CHAT was then used to identify the top 20 conventional hub nodes (based solely on degree) and the top 20 contextual hub nodes in the network (Figure 3). The two top 20 lists had no nodes in common. InnateDB pathway analysis23,24 revealed that the top 20 degree-based hubs were enriched in pathways related to the cell cycle and cancer (Supplementary Table 1), which is likely due to the fact that proteins involved in these processes tend to be highly connected in general. In contrast, the top 20 contextual hubs were statistically enriched in pathways related to the immune response to viral infection, such as the interferon signaling pathway; the Retinoic acid inducible gene-I (RIG-I) pathway; the Toll-like receptor (TLR) pathway; and the Janus kinase (JAK) - Signal Transducer and Activator of Transcription (STAT) pathway (Supplementary Table 2). All of these pathways have been shown to play key roles in the host response to Dengue infection25,26. Indeed, many of the top 20 contextual hubs (but not degree-based hubs) were well-known transcription factors involved in the host interferon response, including STAT1, STAT2 and the interferon regulatory factors (IRFs) IRF1, IRF3, IRF8 and IRF9. The interferon response is a key cellular response to viral infection, including Dengue27,28.
Another gene identified in the contextual hub analysis but not the degree-based analysis was interferon-stimulated gene 15 (ISG15). Cells in which ISG15 has been silenced have been shown to have significantly higher Dengue viral loads29. The results of the pathway analysis were reinforced by a Gene Ontology analysis using innatedb.com23,24, which identified terms including cytokine-mediated signaling pathway, type I interferon signaling pathway, and innate immune response among the top 10 enriched terms (FDR < 0.05) for the contextual hubs but not the degree-based hubs (Supplementary Table 3 and Supplementary Table 4).


Figure 3. Visualization of a Dengue gene expression dataset.

A CHAT network visualization comparing contextual hubs (A) to degree-based hubs (B) in a network constructed using InnateDB23,24.

Conclusion

Through the integration of contextual information, such as gene or protein expression, contextual hub analysis as implemented in CHAT can identify context-specific hubs more relevant to the biological context under study, such as disease, treatment or cellular state. As shown in the above case study, these hubs are of more functional relevance than genes found through analysis based on degree only. Given the current emphasis on the importance of considering the network model of biological pathways and the ever-increasing abundance of high-throughput data, CHAT provides a valuable addition to the biologists’ computational toolkit in using a network-based approach to help prioritize genes of interest for further investigation or drug discovery. In the future, CHAT can be extended to include the contextual analysis of other network features such as network bottlenecks.

Data availability

F1000Research: Dataset 1. Use case data: 462 genes that have been reported to be up-regulated during Dengue fever infection, 10.5256/f1000research.9118.d128126 (ref. 30)

Software availability

Software available from: http://apps.cytoscape.org/apps/chat

Latest source code: https://bitbucket.org/dynetteam/chat

Archived source code at time of publication: http://www.dx.doi.org/10.5281/zenodo.56496 (ref. 31)

Manual/Tutorial: https://bitbucket.org/dynetteam/chat/downloads

License: GNU Lesser General Public License 3.0

how to cite this article
Muetze T, Goenawan IH, Wiencko HL et al. Contextual Hub Analysis Tool (CHAT): A Cytoscape app for identifying contextually relevant hubs in biological networks [version 2; peer review: 2 approved] F1000Research 2016, 5:1745 (https://doi.org/10.12688/f1000research.9118.2)
NOTE: it is important to ensure the information in square brackets after the title is included in all citations of this article.

Open Peer Review

Version 2
PUBLISHED 30 Aug 2016
Reviewer Report 01 Nov 2016
Christopher K. Tuggle, Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, USA 
Haibo Liu, Bioinformatics and Computational Biology Program, Iowa State University, Ames, IA, USA 
Approved
This Cytoscape app “CHAT” is valuable given its improvement over conventional biological network analysis methods by considering the context of network analysis. It is a good addition to the Cytoscape toolkits. However, we suggest some modifications that might make the ...
HOW TO CITE THIS REPORT
Tuggle CK and Liu H. Reviewer Report For: Contextual Hub Analysis Tool (CHAT): A Cytoscape app for identifying contextually relevant hubs in biological networks [version 2; peer review: 2 approved]. F1000Research 2016, 5:1745 (https://doi.org/10.5256/f1000research.10240.r16856)
Version 1
PUBLISHED 19 Jul 2016
Reviewer Report 09 Aug 2016
Sandra Orchard, Wellcome Trust Genome Campus, European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK 
Pablo Porras Millán, Wellcome Trust Genome Campus, European Molecular Biology Laboratory-European Bioinformatics Institute, Hinxton, UK 
Approved
This is a well written technical paper, clearly outlining a new Cytoscape App in terms that would make it easy for a new user, with some familiarity with Cytoscape, to download, install and use. The ability to generate contextual hubs ...
HOW TO CITE THIS REPORT
Orchard S and Porras Millán P. Reviewer Report For: Contextual Hub Analysis Tool (CHAT): A Cytoscape app for identifying contextually relevant hubs in biological networks [version 2; peer review: 2 approved]. F1000Research 2016, 5:1745 (https://doi.org/10.5256/f1000research.9812.r15065)
  • Author Response 30 Aug 2016
    Tanja Muetze, EMBL Australia Biomedical Informatics Group, Infection & Immunity Theme, South Australian Medical and Health Research Institute, Adelaide, Australia
    Thank you very much for your thoughtful review. Below we have addressed each of the points raised.
    The application searches for first-neighbour interactions of molecules in the list presented ...
