Exercise 6

Data description




Working plan

We have several candidate profiles of mutations to explore:

How many variants do you detect for each scenario?

A. Individual filters

  1. Recessive inheritance
  2. Dominant inheritance (father is affected)
  3. For this region: 1:10000-500000
  4. Evaluate these regions at the same time: 1:10000-500000, 11:1000-500000, 13:500-100000
  5. For these genes: CELA2A,CELA2B,CEP104
  6. Are there variants for these SNPs: rs6682385,rs75478250,rs147502335,rs368140013,rs7951297?
  7. Description of variants. How many SNVs, INDELs, MNVs, SVs, CNVs?
  8. Variants with MAF (Minor Allele Frequency) < 0.001 for all populations in 1000 Genomes phase 1
  9. Variants with MAF (Minor Allele Frequency) < 0.001 for all populations in 1000 Genomes phase 3
  10. Variants with MAF (Minor Allele Frequency) < 0.001 for the European American population in ESP 6500
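
The exercise itself is solved in the analysis tool, but it can help to see what a region filter and a frequency filter compute. The following is a minimal Python sketch, not the tool's implementation. The names Region, parse_regions, in_any_region and passes_maf are hypothetical, and the sketch assumes that a variant with no recorded frequency in a population passes the MAF filter; the tool you use may treat missing frequencies differently.

```python
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


def parse_regions(text: str) -> list[Region]:
    """Parse comma-separated 'chrom:start-end' items, as written in the exercise."""
    regions = []
    for item in text.split(","):
        chrom, span = item.strip().split(":")
        start, end = span.split("-")
        regions.append(Region(chrom, int(start), int(end)))
    return regions


def in_any_region(chrom: str, pos: int, regions: list[Region]) -> bool:
    """True if the position lies inside at least one region (bounds inclusive)."""
    return any(r.chrom == chrom and r.start <= pos <= r.end for r in regions)


def passes_maf(pop_freqs: dict[str, float], threshold: float) -> bool:
    """True if the frequency is below the threshold in every listed population.

    Assumption: populations with no recorded frequency are simply absent from
    the dict, so a variant with no frequencies at all passes the filter.
    """
    return all(freq < threshold for freq in pop_freqs.values())


regions = parse_regions("1:10000-500000, 11:1000-500000, 13:500-100000")
print(in_any_region("11", 2500, regions))
print(passes_maf({"EUR": 0.0005, "AFR": 0.02}, 0.001))
```

Evaluating several regions at once (filter 4) is a logical OR across regions, whereas "for all populations" (filters 8 and 9) is a logical AND across populations.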


B. Progressive selection

  1. We have several clues about our candidate variants. Besides following a recessive inheritance pattern, the variants should have MAF < 0.1 (for all populations in 1000 Genomes phase 3) because it is a rare disease. The consequence type must be “missense_variant”
    • How many variants remain after applying all of these criteria?
    • Download the final results as a CSV file
  2. We have several clues about our candidate variants. Besides following a dominant inheritance pattern (father is affected), the variants should have MAF < 0.1 (for all populations in 1000 Genomes phase 3) because it is a rare disease. The consequence type must be “missense_variant”
    • How many variants remain after applying all of these criteria?
    • Download the final results as a CSV file
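
Progressive selection means applying the filters one after another and noting how many variants survive each step. The sketch below illustrates the idea in Python under stated assumptions: the toy records and their keys (id, recessive, max_maf, consequence) are invented for illustration and are not the exercise data, and progressive_filter and to_csv are hypothetical helper names, not functions of the analysis tool.

```python
import csv
import io


def progressive_filter(variants, steps):
    """Apply each (label, predicate) in order; return survivors and per-step counts."""
    counts = []
    remaining = list(variants)
    for label, keep in steps:
        remaining = [v for v in remaining if keep(v)]
        counts.append((label, len(remaining)))
    return remaining, counts


def to_csv(variants, fields):
    """Serialize the selected fields of each variant as CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fields, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(variants)
    return buffer.getvalue()


# Toy records, invented for illustration only.
variants = [
    {"id": "v1", "recessive": True, "max_maf": 0.02, "consequence": "missense_variant"},
    {"id": "v2", "recessive": True, "max_maf": 0.30, "consequence": "missense_variant"},
    {"id": "v3", "recessive": True, "max_maf": 0.01, "consequence": "synonymous_variant"},
    {"id": "v4", "recessive": False, "max_maf": 0.01, "consequence": "missense_variant"},
]

# Same order as in the exercise: inheritance pattern, then MAF, then consequence type.
steps = [
    ("recessive", lambda v: v["recessive"]),
    ("MAF < 0.1", lambda v: v["max_maf"] < 0.1),
    ("missense_variant", lambda v: v["consequence"] == "missense_variant"),
]

selected, counts = progressive_filter(variants, steps)
print(" → ".join(str(n) for _, n in counts))  # prints: 3 → 2 → 1
print(to_csv(selected, ["id", "consequence"]))
```

Because each filter only removes variants, the counts can never increase from one step to the next. A chain written as three numbers separated by arrows records the count after each of the three filters.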



Solutions

A. Individual filters

  1. Candidate variants: 917
  2. Candidate variants: 1085
  3. Candidate variants: 5
  4. Candidate variants: 62
  5. Candidate variants: 20
  6. Candidate variants: 5
  7. 36320 SNVs, 3620 INDELs, 17 MNVs, 3 SVs, 0 CNVs
  8. Candidate variants in 1000 Genomes, phase 1: 11275
  9. Candidate variants in 1000 Genomes, phase 3: 12212
  10. Candidate variants in European American population: 21330

B. Progressive selection

  1. 917 → 299 → 29
  2. 1085 → 528 → 88