Activity 3
Objective: Cluster analysis of zebrafish embryogenesis data
Data: We will perform a hierarchical clustering analysis of the genes in “zebrafish_embryo.txt”. This example file contains the first 999 of the 3,657 genes that showed significant differential expression in the study by Mathavan et al. (2005).
Workflow + questions:
- Open the file and explore its structure.
- Upload your file to Babelomics 5.0, then go to the Expression > Clustering section.
- Cluster samples for different scenarios:
  - UPGMA + Euclidean
    - Do you see any patterns of gene expression across the different developmental stages?
    - Can you download the resulting trees in Newick format? Are you familiar with this format?
  - UPGMA + Correlation coeff. (Pearson)
    - Which distance measure gives the better clustering?
- Repeat the analysis using the same distance measures and the SOTA method:
  - SOTA + Euclidean
  - SOTA + Correlation coeff. (Pearson)
  - Do the results change depending on the method or the distance measure?
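
To reason about the UPGMA questions away from the web tool, the two distance measures can be compared offline on a tiny matrix. The sketch below is only an approximation of the workflow, not a description of how Babelomics computes its results. It assumes that NumPy and SciPy are installed, that UPGMA corresponds to SciPy's average linkage, and that the Pearson option corresponds to SciPy's "correlation" distance (1 - r). The sample names and expression values are invented for illustration and are not taken from “zebrafish_embryo.txt”. The helper names `upgma` and `to_newick` are hypothetical.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage, to_tree
from scipy.spatial.distance import pdist

# Made-up toy data, NOT values from zebrafish_embryo.txt.
# Rows are samples (hypothetical names), columns are genes.
samples = ["low_up", "high_up", "low_down", "high_down"]
expression = np.array([
    [1.0, 2.0, 3.0, 4.0],      # low level, rising profile
    [11.0, 12.0, 13.0, 14.0],  # high level, rising profile
    [4.0, 3.0, 2.0, 1.0],      # low level, falling profile
    [14.0, 13.0, 12.0, 11.0],  # high level, falling profile
])


def upgma(matrix, metric):
    """Average-linkage (UPGMA) clustering of the rows of `matrix`."""
    distances = pdist(matrix, metric=metric)  # condensed pairwise distances
    return linkage(distances, method="average")


def to_newick(node, labels, parent_height):
    """Recursively write a SciPy ClusterNode as a Newick subtree string."""
    branch = parent_height - node.dist  # branch length up to the parent node
    if node.is_leaf():
        return f"{labels[node.id]}:{branch:.2f}"
    left = to_newick(node.get_left(), labels, node.dist)
    right = to_newick(node.get_right(), labels, node.dist)
    return f"({left},{right}):{branch:.2f}"


results = {}
for metric in ("euclidean", "correlation"):
    tree = upgma(expression, metric)
    # Cut the tree into (at most) two clusters to compare the groupings.
    membership = fcluster(tree, t=2, criterion="maxclust")
    groups = {name: int(label) for name, label in zip(samples, membership)}
    root = to_tree(tree)
    newick = to_newick(root, samples, root.dist) + ";"
    results[metric] = (groups, newick)
    print(metric, groups)
    print(newick)
```

On this toy matrix, Euclidean distance groups samples by overall expression level, while the correlation-based distance groups them by profile shape (rising versus falling). The printed Newick strings show the nested-parentheses layout of that format, with a branch length after each colon and a closing semicolon.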