pedigree inference

Brent Pedersen edited this page Feb 6, 2020 · 6 revisions

As of somalier version 0.2.8, the relate sub-command infers some pedigree structure for high-quality sample pairs. It outputs this as a pedigree file with parent samples filled in where they could be inferred. Only first-degree relationships are inferred. The rules for inference are described below.

trio inference

We can infer a kid, mom, dad trio as follows: find a $sample that has exactly 2 samples ($m, $d) that:

  • have relatedness value between 0.4 and 0.6 to $sample
  • have IBS0 / IBS2 < 0.005 to $sample
  • have a relatedness of < 0.06 to each other
  • consist of a male (dad) and a female (mom)

This ensures that the 3 samples form a trio. If mom and dad have multiple kids, each trio is discovered separately, and the kids are implicitly indicated as siblings because they share the same paternal and maternal ids.
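The trio rule above can be sketched in Python as follows. This is only an illustration of the thresholds listed on this page, not somalier's actual code: the `Pair` record, the `pairs` and `sex` dictionaries, and the function names are all hypothetical, and treating the bounds as strict inequalities is an assumption.

```python
from typing import NamedTuple


class Pair(NamedTuple):
    """Hypothetical per-pair summary; field names are assumptions."""
    relatedness: float
    ibs0: int
    ibs2: int


def looks_like_parent_child(p: Pair) -> bool:
    # Relatedness between 0.4 and 0.6 and IBS0 / IBS2 < 0.005.
    if p.ibs2 <= 0:
        return False
    return 0.4 < p.relatedness < 0.6 and p.ibs0 / p.ibs2 < 0.005


def infer_trios(pairs, sex):
    """pairs: frozenset({a, b}) -> Pair; sex: sample -> "male" or "female".

    Returns a dict mapping each kid to a (dad, mom) tuple.
    """
    trios = {}
    for kid in sex:
        candidates = []
        for other in sex:
            if other == kid:
                continue
            p = pairs.get(frozenset((kid, other)))
            if p is not None and looks_like_parent_child(p):
                candidates.append(other)
        # The sample must have exactly 2 parent-like matches.
        if len(candidates) != 2:
            continue
        a, b = candidates
        between = pairs.get(frozenset((a, b)))
        # The two candidates must be unrelated to each other (< 0.06).
        if between is None or between.relatedness >= 0.06:
            continue
        # One candidate must be male (dad) and the other female (mom).
        if {sex[a], sex[b]} != {"male", "female"}:
            continue
        dad, mom = (a, b) if sex[a] == "male" else (b, a)
        trios[kid] = (dad, mom)
    return trios
```

Note that a parent with two kids also has exactly 2 parent-like matches (the kids), but those kids are related to each other, so the "< 0.06 to each other" check prevents the parent from being reported as a kid.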

sibling inference

Siblings can be inferred (with or without parents) when a pair of samples has a relatedness between 0.38 and 0.62 and an IBS0 / IBS2 ratio > 0.015 and < 0.052. The relationship is indicated in the pedigree file by creating family-specific paternal and maternal ids.
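As a rough sketch of the sibling rule, the snippet below tests a pair against the thresholds and then gives every sibship a shared pair of placeholder parent ids. The function names and the `<sample>_dad` / `<sample>_mom` naming scheme are made up for illustration; the ids somalier actually writes may look different.

```python
def is_sibling_pair(relatedness, ibs0, ibs2):
    # Relatedness between 0.38 and 0.62, and 0.015 < IBS0 / IBS2 < 0.052.
    if ibs2 <= 0:
        return False
    ratio = ibs0 / ibs2
    return 0.38 < relatedness < 0.62 and 0.015 < ratio < 0.052


def placeholder_parents(sibling_pairs):
    """Group sibling pairs into sibships and assign shared placeholder ids.

    sibling_pairs: iterable of (a, b) tuples that passed is_sibling_pair.
    Returns sample -> (paternal_id, maternal_id).
    """
    group_of = {}
    for a, b in sibling_pairs:
        ga = group_of.setdefault(a, {a})
        gb = group_of.setdefault(b, {b})
        if ga is not gb:
            # Merge the two sibships and repoint members of the second one.
            ga |= gb
            for s in gb:
                group_of[s] = ga
    parents = {}
    # Deduplicate groups by identity, since many samples share one set.
    for group in {id(g): g for g in group_of.values()}.values():
        key = min(group)  # hypothetical naming scheme
        for s in group:
            parents[s] = (f"{key}_dad", f"{key}_mom")
    return parents
```

Because siblings end up with identical paternal and maternal ids, a reader of the pedigree file can recover the sibship without any explicit "sibling" column.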

inference with only 1 parent

Since siblings are inferred above, the sibling status is already known. If all siblings have IBS0 / IBS2 < 0.005 to the same $sample, then that $sample can be assigned as their parent (if both parents were present, they would already have been assigned by the trio rule above). The relationship is indicated by updating the maternal (or paternal) id for each kid.
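A minimal sketch of the single-parent rule, assuming the sibship has already been found and each kid carries a mutable `[paternal_id, maternal_id]` pair. The function name, the `ibs_ratio` lookup keyed by `frozenset` pairs, and the `"male"` / `"female"` labels are hypothetical; only the 0.005 threshold comes from the text above.

```python
def assign_single_parent(siblings, others, ibs_ratio, sex, parent_ids):
    """Assign the first sample that is parent-like to *all* siblings.

    siblings:   list of sample ids in one inferred sibship
    others:     candidate sample ids to test as the parent
    ibs_ratio:  frozenset({a, b}) -> IBS0 / IBS2 for that pair
    sex:        sample -> "male" or "female"
    parent_ids: kid -> [paternal_id, maternal_id], updated in place
    Returns the assigned parent id, or None if no candidate qualifies.
    """
    for cand in others:
        if cand in siblings:
            continue
        ratios = [ibs_ratio.get(frozenset((kid, cand))) for kid in siblings]
        # Every sibling must have IBS0 / IBS2 < 0.005 to the candidate.
        if all(r is not None and r < 0.005 for r in ratios):
            slot = 0 if sex[cand] == "male" else 1  # 0: paternal, 1: maternal
            for kid in siblings:
                parent_ids[kid][slot] = cand
            return cand
    return None
```

Only the id for the observed parent is overwritten; the other slot keeps whatever placeholder id the sibling step assigned.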

family inference

Given 2 families A and B, if any pair of high-quality samples a, b (one from each family) has a relatedness > 0.2, the families will be merged. This
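The merging step can be illustrated with a small union-find over family ids. This is a sketch under assumptions rather than somalier's implementation: `merge_families`, its arguments, and the choice to keep the alphabetically smaller family id are hypothetical, and the input is assumed to contain only high-quality pairs.

```python
def merge_families(family_of, pair_relatedness, threshold=0.2):
    """Merge families linked by any pair with relatedness > threshold.

    family_of:        sample -> family id
    pair_relatedness: (a, b) -> relatedness, for high-quality pairs only
    Returns sample -> merged family id.
    """
    root = {fam: fam for fam in set(family_of.values())}

    def find(f):
        # Follow links to the representative, compressing the path as we go.
        while root[f] != f:
            root[f] = root[root[f]]
            f = root[f]
        return f

    for (a, b), rel in pair_relatedness.items():
        if rel > threshold:
            ra, rb = find(family_of[a]), find(family_of[b])
            if ra != rb:
                # Arbitrary choice: keep the smaller id as the merged name.
                root[max(ra, rb)] = min(ra, rb)
    return {sample: find(fam) for sample, fam in family_of.items()}
```

Merging is transitive: if A links to B and B links to C, all three end up in one family even when no pair directly connects A and C.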

limitations

Other than simply joining families, this only works for first-degree relatives among high-quality samples. It will likely only work for exome and whole-genome data. In addition, somalier will not try to infer families between consanguineous parents.
