SNOW stands for “Studying Networks in the Omic World”. SNOW extracts and evaluates the cooperative behavior of lists of proteins/genes in terms of protein-protein interactions. Thus, SNOW complements other Babelomics tools such as FatiGO, introducing a new dimension into the functional profiling of the results of high-throughput experiments, namely protein-protein interaction data.
Protein-protein interactions are central to almost every level of cell function. A great effort is being made to provide a high-quality map of the complete set of protein-protein interactions in the cell: the interactome. Introducing this kind of data into functional genomics may give important clues for understanding cell activity.
We used data from the main public protein-protein interaction databases (HPRD, IntAct, BIND, DIP and MINT) to generate two interactomes: a non-filtered interactome that takes all the protein-protein interactions from the databases, and a filtered interactome that contains only the protein-protein interactions detected by at least two different methodologies. Users may also submit their own interactions.
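The difference between the two interactomes can be illustrated with a minimal sketch. It assumes that each database record has been reduced to a hypothetical `(protein_a, protein_b, method)` tuple; the record layout, the function name and the method labels are illustrative and are not taken from SNOW itself.

```python
from collections import defaultdict

def build_interactomes(records):
    """Return (non_filtered, filtered) sets of undirected interactions.

    `records` is an iterable of (protein_a, protein_b, method) tuples;
    this layout is an assumption made for the example.
    """
    methods_by_pair = defaultdict(set)
    for protein_a, protein_b, method in records:
        # Interactions are undirected, so (A, B) and (B, A) are the same pair.
        methods_by_pair[frozenset((protein_a, protein_b))].add(method)

    # Non-filtered: every interaction reported at least once.
    non_filtered = set(methods_by_pair)
    # Filtered: interactions supported by at least two different methods.
    filtered = {pair for pair, methods in methods_by_pair.items()
                if len(methods) >= 2}
    return non_filtered, filtered

records = [
    ("P1", "P2", "two hybrid"),
    ("P2", "P1", "coimmunoprecipitation"),
    ("P2", "P3", "two hybrid"),
    ("P2", "P3", "two hybrid"),  # same method twice: still one methodology
]
non_filtered, filtered = build_interactomes(records)
```

Here the P2-P3 interaction stays in the non-filtered interactome only, because it is reported twice but always by the same method.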
SNOW identifies hubs in the list of proteins/genes (nodes) and evaluates the global degree of connection, centrality and neighborhood aggregation of the list by comparing the distributions of node connection degree, betweenness centrality and clustering coefficient, respectively, against the complete distributions of these parameters in the reference interactome. In addition, SNOW extracts the minimum network that connects the proteins/genes in the list. A user-defined number of external proteins is allowed to connect the nodes in the list. The topology of this network is evaluated by comparing the distributions of its node, edge and graph parameters against pre-calculated distributions from a set of 10,000 random lists in the same size range. In this way, SNOW reports whether the network represented by the list has more hubs, is more connected or has a more regular distribution of connections than a random network.
SNOW also provides an interactive visualization of the network and a complete description of the interactome and local network parameters of each protein/gene in the list, as well as of the external nodes introduced by the program. This information, together with the functional annotation provided, will guide the user in identifying the important nodes within, or even outside, the list and in evaluating the modular functionality of the list as an entity.
A comparison of two lists is also implemented in the same terms.
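As an illustration of two of the node parameters mentioned above, the following sketch computes the connection degree and the clustering coefficient on a toy network, and then compares a list against random lists of the same size. The empirical p-value on the mean score is an assumption made for the example; the text does not specify which statistical test SNOW applies, and all function names are hypothetical.

```python
import random

def build_graph(edges):
    """Undirected graph as a dict mapping each node to its set of neighbours."""
    graph = {}
    for a, b in edges:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    return graph

def degree(graph, node):
    """Number of interaction partners of a node."""
    return len(graph[node])

def clustering_coefficient(graph, node):
    """Fraction of pairs of neighbours that are themselves connected."""
    neighbours = list(graph[node])
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for i in range(k) for j in range(i + 1, k)
                if neighbours[j] in graph[neighbours[i]])
    return 2.0 * links / (k * (k - 1))

def empirical_p_value(graph, gene_list, score, n_random=1000, seed=0):
    """Share of random same-size lists whose mean score reaches the observed one."""
    rng = random.Random(seed)
    nodes = sorted(graph)
    observed = sum(score(graph, n) for n in gene_list) / len(gene_list)
    hits = 0
    for _ in range(n_random):
        sample = rng.sample(nodes, len(gene_list))
        if sum(score(graph, n) for n in sample) / len(sample) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)

graph = build_graph([("A", "B"), ("A", "C"), ("B", "C"), ("A", "D"), ("D", "E")])
degree_a = degree(graph, "A")                      # A has neighbours B, C and D
clustering_a = clustering_coefficient(graph, "A")  # only B-C is linked: 1/3
p_hubs = empirical_p_value(graph, ["A", "B"], degree)
```

A small `p_hubs` would suggest that the list contains more highly connected nodes than expected for random lists drawn from the same network.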
SNOW is a web-based tool that introduces protein-protein interaction data into the functional profiling of genome-scale experiments. From a list of pre-selected proteins or genes, it extracts the minimal connected network (the smallest network that connects all the elements of the list) that they form in terms of physical interactions, and then evaluates its topological parameters by comparing them against same-size networks generated from random lists of genes/proteins.
SNOW can perform the following analyses:
In this tutorial we show the different uses of SNOW. There are also several examples that include the lists of genes used to perform the analyses as well as the pre-calculated results.
The following sections are recipes to follow according to the data you have and the type of analysis you want to perform.
1.1. Scenario: You have a list of proteins or genes that you have selected for a particular reason: for instance, they are the result of a differential expression analysis of a microarray experiment, they are the spots identified as differentiating two samples in a two-dimensional gel electrophoresis, or they have an interesting pattern of expression along several samples in a transcriptomic analysis. From this list of proteins/genes you want to find out whether they have something in common in terms of functionality. The list has been selected for some particular reason that a priori suggests that its members might be functionally related. There are several programs/tools that try to extract the functionalities behind a set of genes/proteins; FatiGO and Marmite are two examples (find them within Babelomics). SNOW is another program with this aim. Its particularity is that it uses protein-protein interaction data and evaluates protein/gene modules with a structural component.
1.2. What can you get using SNOW?: SNOW can map the genes/proteins in your list onto the human interactome and extract the important ones, as well as evaluate the list as a unit and tell you whether it is enriched in hubs or central proteins, or whether its members lie in highly interconnected areas. Furthermore, SNOW calculates the minimal connected network of the list (the smallest network that connects the elements in the list) and evaluates its topology by comparing it against same-size networks generated from random lists of proteins/genes.
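The idea of a minimal connected network can be sketched as follows. This is a simplified illustration, not SNOW's actual algorithm: it assumes that two list members are joined by one shortest path whenever that path needs no more than `max_external` intermediate proteins from outside the list. The function name and the parameter name are hypothetical.

```python
from collections import deque

def minimal_connected_network(graph, gene_list, max_external=1):
    """Edges joining list members through at most `max_external` external nodes.

    `graph` maps each node to the set of its neighbours. Only one shortest
    path per pair is kept, which is a simplification for this sketch.
    """
    targets = set(gene_list) & set(graph)
    mcn_edges = set()
    for source in targets:
        parents = {source: None}
        queue = deque([(source, 0)])
        while queue:
            node, dist = queue.popleft()
            if dist == max_external + 1:
                continue  # this external node is already too far away
            for neighbour in graph[node]:
                if neighbour in parents:
                    continue
                parents[neighbour] = node
                if neighbour in targets:
                    # Reached another list member: keep the whole path.
                    current = neighbour
                    while parents[current] is not None:
                        mcn_edges.add(frozenset((current, parents[current])))
                        current = parents[current]
                else:
                    # External node: it may serve as a connector.
                    queue.append((neighbour, dist + 1))
    return mcn_edges

# Toy interactome: A-X-B-C-Y-Z-D, where X, Y and Z are not in the list.
edges = [("A", "X"), ("X", "B"), ("B", "C"), ("C", "Y"), ("Y", "Z"), ("Z", "D")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

gene_list = ["A", "B", "C", "D"]
mcn_none = minimal_connected_network(graph, gene_list, max_external=0)
mcn_one = minimal_connected_network(graph, gene_list, max_external=1)
```

With no external proteins only the direct B-C interaction is kept. Allowing one external protein adds the A-X-B path, while D stays disconnected because reaching it would require two external proteins.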
1.3. Parameters to choose: In the SNOW web form you can choose between performing the analysis of one or two lists. Choose the one-list tab and you will find the following options:
1.4. Some constraints: Currently SNOW accepts lists in the range of 3 to 500 proteins/genes that can be mapped into the reference interactome.
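Before submitting a list, it can be useful to check how many of its identifiers are present in the reference interactome. The sketch below assumes that the interactome is available locally as a set of identifiers; the function names and the example identifiers are illustrative only.

```python
def mappable_ids(gene_list, interactome_nodes):
    """Identifiers from the list that exist in the interactome, without repeats."""
    mapped = []
    seen = set()
    for identifier in gene_list:
        identifier = identifier.strip()
        if identifier in interactome_nodes and identifier not in seen:
            seen.add(identifier)
            mapped.append(identifier)
    return mapped

def size_is_accepted(mapped, minimum=3, maximum=500):
    """True if the number of mapped identifiers is within the accepted range."""
    return minimum <= len(mapped) <= maximum

interactome_nodes = {"BRCA1", "TP53", "CDKN1A", "MDM2"}
gene_list = ["BRCA1", "TP53", "TP53", "NOT_IN_INTERACTOME", " CDKN1A "]
mapped = mappable_ids(gene_list, interactome_nodes)
accepted = size_is_accepted(mapped)
```

Note that the range applies to the identifiers that can be mapped, not to the raw length of the submitted list.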
1.5. Protein/Gene IDs supported:
1.6. Some specifications: If you submit your own interactions, the minimal connected network (MCN) will not be compared against same-size networks generated from random lists, because this process is highly time-consuming and the distributions of the network parameters must be pre-calculated.
1.7. Output:
Users may view the generated network in a user-friendly window that allows them to manipulate the network and obtain functional information interactively.
Nodes belonging to the same component are shown in the same color. The color intensity of the nodes within each component represents the centrality of the node within the complete MCN: higher intensity corresponds to higher betweenness centrality. The size of the nodes has no meaning; it simply varies with the length of the label.
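For readers unfamiliar with betweenness centrality, the following sketch computes it for an unweighted, undirected network using Brandes' algorithm. It is given only to make the parameter concrete; whether SNOW uses this algorithm or normalizes the values is not stated here, and the function name is hypothetical.

```python
from collections import deque

def betweenness_centrality(graph):
    """Unnormalized betweenness of every node in an undirected graph.

    `graph` maps each node to the set of its neighbours.
    """
    centrality = {node: 0.0 for node in graph}
    for source in graph:
        # Breadth-first search that counts shortest paths from `source`.
        visit_order = []
        predecessors = {node: [] for node in graph}
        n_paths = {node: 0 for node in graph}
        n_paths[source] = 1
        distance = {source: 0}
        queue = deque([source])
        while queue:
            v = queue.popleft()
            visit_order.append(v)
            for w in graph[v]:
                if w not in distance:
                    distance[w] = distance[v] + 1
                    queue.append(w)
                if distance[w] == distance[v] + 1:
                    n_paths[w] += n_paths[v]
                    predecessors[w].append(v)
        # Accumulate dependencies, starting from the farthest nodes.
        dependency = {node: 0.0 for node in graph}
        while visit_order:
            w = visit_order.pop()
            for v in predecessors[w]:
                dependency[v] += n_paths[v] / n_paths[w] * (1.0 + dependency[w])
            if w != source:
                centrality[w] += dependency[w]
    # Each pair of nodes was counted once in each direction.
    return {node: value / 2.0 for node, value in centrality.items()}

# Star-shaped toy network: every shortest path between leaves crosses HUB.
graph = {"HUB": {"L1", "L2", "L3"}, "L1": {"HUB"}, "L2": {"HUB"}, "L3": {"HUB"}}
centrality = betweenness_centrality(graph)
```

In this toy network the hub lies on the shortest path of all three leaf pairs, so it would be drawn with the highest color intensity.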
The applet has several options to facilitate the exploration of the MCN. For example, nodes or edges can be hidden and restored afterwards (show/hide nodes/edges option), gene/protein names can be shown or hidden, and the dynamic layout can be switched off so that the nodes can be moved freely. Here is the legend (under the info button) with some help on visualization.
Here are several examples of lists of genes selected to differentiate two samples in microarray experiments. The description of the experiment is given.
The SNOW parameters used to perform the analyses were:
Download the lists and perform your own SNOW analyses choosing the same or different parameters. For reference, we give the results pages as you will obtain them. Have a look at them and compare them with SNOW results obtained using different parameters, taking into account that the results shown here may have been run with a different version of the PPI data.
Example number | Dataset | Description |
---|---|---|
2.1 | brca1_overexp_up | Upregulated by induction of exogenous BRCA1 in EcR-293 cells |
2.2 | brca1_overexp_dn | Downregulated by induction of exogenous BRCA1 in EcR-293 cells |
2.3 | serum_fibroblast_cellcycle | Cell-cycle dependent genes regulated following exposure to serum in a variety of human fibroblast cell lines |
2.4 | ageing_brain_dn | Age-downregulated in the human frontal cortex |
2.5 | brca1_sw480_up | Up-regulated by infection of human colon adenocarcinoma cells (SW480) with Ad-BRCA1, versus Ad-LacZ control |
2.6 | et743_resist_dn | Down-regulated in two Et-743-resistant cell lines (chondrosarcoma and ovarian carcinoma) compared to sensitive parental lines |
2.7 | hematop_stem_all_up | Up-regulated in populations of human hematopoietic stem cells (CD34+/CD38-/Lin-) from bone marrow, umbilical cord blood, and peripheral blood stem-progenitor cells, compared to the stem cell-depleted population (CD34+/[CD38/Lin++]) |
2.8 | oldage_dn | Downregulated in fibroblasts from old individuals, compared to young |
2.9 | brca2_brca1_up | Genes up-regulated in BRCA2-linked breast tumors, relative to BRCA1-linked tumors |
2.10 | hdaci_colon_cur2hrs_up | Upregulated by curcumin at 2 hrs in SW260 colon carcinoma cells |
2.11 | p21_p53_any_dn | Down-regulated at any timepoint (4-24 hrs) following ectopic expression of p21 (CDKN1A) in OvCa cells, p53-dependent |
SNOW lets you use your own PPI dataset as the background interactome for the analysis. You may submit PPI data in .sif or tabulated format (see examples below).
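As a rough guide to the .sif format, the sketch below reads interactions written in the simple interaction format used by Cytoscape, where each line contains a source node, an interaction type and one or more target nodes. That SNOW accepts exactly this layout is an assumption, and the interaction label `pp` and the function name are illustrative.

```python
def parse_sif(text):
    """Return the set of undirected interactions described in SIF text."""
    interactions = set()
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # blank lines and isolated nodes carry no interaction
        source, _interaction_type, *targets = fields
        for target in targets:
            interactions.add(frozenset((source, target)))
    return interactions

sif_text = """P1 pp P2
P2 pp P3 P4
P5
"""
interactions = parse_sif(sif_text)
```

The second line of the example declares two interactions at once (P2 with P3 and P2 with P4), and the isolated node P5 is ignored.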
When you use your own PPI data, SNOW calculates the topological parameters of the complete dataset of PPIs given by the user. The list of proteins/genes submitted by the user is tested to check whether it is enriched in hubs, central nodes or well-interconnected areas in comparison to the whole dataset.
SNOW generates the MCN of the submitted list of proteins/genes and presents its functional annotation. The comparison of the MCN topological parameters against a set of random lists is not done when using your own interactions, due to computational constraints: the generation of 10,000 MCNs can take one or two days.
To show an example of a SNOW analysis using your own interactions, we have generated a .sif file with all protein-protein interactions from the complete collection of KEGG signalling pathways. This dataset represents a subset of the interactome concentrated in proteins associated with the signalling machinery of the cell.
The lists used for these examples were extracted from a study that identifies essential genes in different types of cancer (Luo B. et al., 2008). A SNOW analysis of this set of lists using the signalling pathways as the interactome determines the role of these lists within the signalling machinery of the cell.
Example number | ppi dataset (.sif) | List | Description |
---|---|---|---|
3.1 | kegg_signalling_pathways.sif | UL2 | 200 essential genes in Glioblastoma (UL2 cell line) |
3.2 | kegg_signalling_pathways.sif | H1975 | 200 essential genes in Non-Small-cell Lung cancer (H1975 cell line) |
4.1. Scenario: You have two lists of proteins or genes that you have selected for a particular reason: for instance, they are the result of a differential expression analysis of a microarray experiment (over- and under-expressed), they are the spots identified as differentiating two samples in a two-dimensional gel electrophoresis, or they have an interesting pattern of expression along several samples in a transcriptomic analysis.
Now you want to compare both lists to see how different they are in terms of the internal structure that their physical interactions form.
4.2. What can you get using SNOW?: SNOW can map the genes/proteins in your lists onto the human interactome and extract the important ones, as well as compare the lists in terms of connectivity, centrality and clustering coefficient within the whole interactome.
Furthermore, SNOW will calculate a minimal connected network from both lists and then compare their topology to check which one is more structured and in which terms.
4.3. Parameters to choose: In the SNOW web form you can choose between performing the analysis of one or two lists. Choose the two-lists tab and you will find the following options (see image below):
4.4. Two lists analysis output
A set of tables, one set per list, gives as much information as possible about the functionality of the minimal connected networks: functional information about the proteins/genes in the list and about those introduced by SNOW; shortest paths within the network; functional information on components, bicomponents and articulation points; and topological and functional information.
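To make the terms in these tables concrete, the sketch below finds the components of a toy network and its articulation points, that is, the nodes whose removal would split a component. It uses a standard depth-first search; it is an illustration only and is not taken from SNOW's implementation. The function names are hypothetical.

```python
def connected_components(graph):
    """List of node sets, one per connected component."""
    seen = set()
    components = []
    for start in graph:
        if start in seen:
            continue
        component = {start}
        seen.add(start)
        stack = [start]
        while stack:
            node = stack.pop()
            for neighbour in graph[node]:
                if neighbour not in seen:
                    seen.add(neighbour)
                    component.add(neighbour)
                    stack.append(neighbour)
        components.append(component)
    return components

def articulation_points(graph):
    """Nodes whose removal disconnects their component (recursive DFS)."""
    order = {}  # discovery index of each node
    low = {}    # lowest discovery index reachable from the node's subtree
    points = set()

    def visit(node, parent):
        order[node] = low[node] = len(order)
        children = 0
        for neighbour in graph[node]:
            if neighbour == parent:
                continue
            if neighbour in order:
                low[node] = min(low[node], order[neighbour])
            else:
                children += 1
                visit(neighbour, node)
                low[node] = min(low[node], low[neighbour])
                if parent is not None and low[neighbour] >= order[node]:
                    points.add(node)
        if parent is None and children > 1:
            points.add(node)

    for node in graph:
        if node not in order:
            visit(node, None)
    return points

# Toy MCN: a triangle A-B-C with D hanging from C, plus a separate pair E-F.
edges = [("A", "B"), ("B", "C"), ("C", "A"), ("C", "D"), ("E", "F")]
graph = {}
for a, b in edges:
    graph.setdefault(a, set()).add(b)
    graph.setdefault(b, set()).add(a)

components = connected_components(graph)
cut_nodes = articulation_points(graph)
```

In this toy network there are two components, and C is the only articulation point because D is attached to the rest of the network through C alone.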
Users may view the generated networks in a user-friendly window that allows them to manipulate the networks and obtain functional information interactively.
Nodes belonging to the same component are shown in the same color. The color intensity of the nodes within each component represents the centrality of the node within the complete MCN: higher intensity corresponds to higher betweenness centrality. The size of the nodes has no meaning; it simply varies with the length of the label.
The applet has several options to facilitate the exploration of the MCN. For example, nodes or edges can be hidden and restored afterwards (show/hide nodes/edges option), gene/protein names can be shown or hidden, and the dynamic layout can be switched off so that the nodes can be moved freely. Here is the legend (under the info button) with some help on visualization.