Exercise 2. Functional characterization of up- and down-expressed genes in an RNA-seq experiment


Objectives: To functionally characterize two gene sets that are up- or down-expressed according to an RNA-seq data analysis comparing the expression levels of two experimental groups: motor vs. apoptosis.

Data: There are 105 genes up-expressed in the apoptosis group (and down-expressed in the motor group) and 124 genes up-expressed in the motor group. They are provided in the files “apoptosis.txt” and “motor.txt”.

Workflow:

  1. Open both files in a text editor and inspect their content.
  2. There should be no genes in common between the two files. Check this quickly using http://bioinfogp.cnb.csic.es/tools/venny/ or bash scripting.
  3. Why don’t the numbers reported by Venny match the previous counts (105 and 124)? Do repeated genes alter the functional enrichment results? Does Babelomics process these repeated IDs?
  4. Upload both files into Babelomics through the Upload button. We have to specify the data type: “Id list (Id)”.
  5. Select the functional enrichment analysis tool in the menu “Functional / Single Enrichment: FatiGO”.
  6. We are interested in several scenarios. For each one, use molecular functions from Gene Ontology and the following threshold: adjusted p-value < 0.1
    • a. Functional characterization of up-expressed genes in the apoptosis group.
    • b. Functional characterization of up-expressed genes in the motor group.
    • c. Functional characterization of both groups, checking which functions are enriched in one group and not in the other. Which analysis should we use? Be careful with repeated genes!
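
As an alternative to Venny for steps 2 and 3, the check for shared and repeated IDs can be sketched in a few lines of Python. This is only a minimal sketch: it assumes each file holds one gene ID per line (the file layout is not described here), and the helper names read_ids, duplicated and compare_lists are hypothetical.

```python
from collections import Counter
from pathlib import Path


def read_ids(path):
    """Read gene IDs from a file, assuming one ID per line (an assumption)."""
    text = Path(path).read_text(encoding="utf-8")
    return [line.strip() for line in text.splitlines() if line.strip()]


def duplicated(ids):
    """Return the IDs that occur more than once, sorted alphabetically."""
    return sorted(gene for gene, n in Counter(ids).items() if n > 1)


def compare_lists(ids_a, ids_b):
    """Summarize raw sizes, unique sizes, repeated IDs and shared IDs."""
    set_a, set_b = set(ids_a), set(ids_b)
    return {
        "raw": (len(ids_a), len(ids_b)),
        "unique": (len(set_a), len(set_b)),
        "duplicated": (duplicated(ids_a), duplicated(ids_b)),
        "common": sorted(set_a & set_b),
    }


if __name__ == "__main__":
    files = [Path("apoptosis.txt"), Path("motor.txt")]
    # Only run on the exercise files if they are in the working directory.
    if all(f.exists() for f in files):
        summary = compare_lists(*(read_ids(f) for f in files))
        for key, value in summary.items():
            print(key, value)
```

Comparing the "raw" and "unique" sizes shows how many entries in each list are repeats, which is the quantity to keep in mind when reading the Venny counts.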

Questions:

  1. How many GO terms are significant in each scenario?
  2. We are interested in the GO term “Regulation of protein modification process” (molecular function). Which gene set has this function over-represented?
  3. What do the sign and the value of the logarithm of the odds ratio mean?
  4. How do p-values and adjusted p-values differ when assessing significance?
  5. How do we interpret the graphic that appears under the significant results?
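
To make questions 3 and 4 more concrete, the two quantities can be computed by hand on a small example. The sketch below is illustrative only: the log base and the multiple-testing correction that Babelomics applies are not stated in this exercise, so the natural logarithm and the Benjamini-Hochberg procedure are assumptions, and the function names are hypothetical.

```python
import math


def log_odds_ratio(a, b, c, d):
    """Natural log of the odds ratio for a 2x2 table (log base is an assumption).

    a: list-1 genes annotated with the term, b: list-1 genes not annotated,
    c: list-2 genes annotated with the term, d: list-2 genes not annotated.
    All four counts are assumed to be non-zero.
    """
    return math.log((a * d) / (b * c))


def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that the adjusted
    # values never increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With made-up counts, log_odds_ratio(10, 90, 5, 95) is positive and log_odds_ratio(5, 95, 10, 90) is the same number with the opposite sign, so swapping the two lists flips the sign. Adjusted p-values are never smaller than the raw ones, so a term can pass a raw threshold and still fail the adjusted one.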