Genome-wide variant calling in reanalysis of exome sequencing data uncovered a pathogenic TUBB3 variant

Publication at Second Faculty of Medicine


Almost half of all individuals affected by intellectual disability (ID) remain undiagnosed. In the Solve-RD project, exome sequencing (ES) datasets from unresolved individuals with (syndromic) ID (n = 1,472 probands) are systematically reanalyzed, starting from raw sequencing files, followed by genome-wide variant calling and new data interpretation.

This strategy led to the identification of a disease-causing de novo missense variant in TUBB3 in a girl with severe developmental delay, secondary microcephaly, brain imaging abnormalities, high hypermetropia, strabismus and short stature. Interestingly, the TUBB3 variant could only be identified through reanalysis of ES data using a genome-wide variant calling approach, despite being located in protein coding sequence.

More detailed analysis revealed that the position of the variant within exon 5 of TUBB3 was not targeted by the enrichment kit. Consistent high-quality coverage was nevertheless obtained at this position, because nearby targets provide off-target coverage. In the initial analysis, variant calling was restricted to the exon targets ±200 bases, so the variant fell outside the calling regions and escaped detection by the variant calling algorithm.
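The effect of restricting calling to padded targets can be illustrated with a minimal Python sketch. It assumes half-open intervals on a single chromosome; the function names and all coordinates are hypothetical and are not taken from the study or from any enrichment kit design.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def pad_targets(targets: List[Interval], padding: int = 200) -> List[Interval]:
    """Extend each target by `padding` bases on both sides and merge overlaps."""
    padded = sorted((max(0, s - padding), e + padding) for s, e in targets)
    merged: List[Interval] = []
    for start, end in padded:
        if merged and start <= merged[-1][1]:
            # Overlaps or touches the previous interval: extend it.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def is_callable(pos: int, targets: List[Interval], padding: int = 200) -> bool:
    """True if `pos` lies inside any padded target, i.e. target-based calling sees it."""
    return any(s <= pos < e for s, e in pad_targets(targets, padding))


# Hypothetical kit targets and a hypothetical variant position between them.
targets = [(1000, 1200), (2500, 2700)]
variant_pos = 1650

# The variant is more than 200 bases from either target, so target-based
# calling skips it regardless of how well the position is covered.
print(is_callable(variant_pos, targets))
```

Genome-wide calling corresponds to dropping this filter altogether, so any position with sufficient read support is evaluated.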

This phenomenon may occur more often, as we determined that 36 established ID genes have robust off-target coverage in coding sequence. Moreover, within these regions, (likely) pathogenic variants have previously been identified in 17 of these genes.
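As a rough sketch of how such regions could be located, the following Python snippet scans a coding interval for stretches that lie outside the padded targets yet still reach a minimum read depth. The depth threshold of 20, the function name, and all coordinates and depth values are assumptions made for illustration; the abstract does not state how "robust" coverage was defined.

```python
from typing import Dict, List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) on one chromosome


def off_target_covered(
    coding: Interval,
    padded_targets: List[Interval],
    depth: Dict[int, int],
    min_depth: int = 20,  # assumed threshold, not taken from the study
) -> List[Interval]:
    """Return runs of coding positions outside the padded targets with depth >= min_depth."""
    runs: List[Interval] = []
    run_start = None
    for pos in range(coding[0], coding[1]):
        outside = not any(s <= pos < e for s, e in padded_targets)
        ok = outside and depth.get(pos, 0) >= min_depth
        if ok and run_start is None:
            run_start = pos  # a new qualifying run begins
        elif not ok and run_start is not None:
            runs.append((run_start, pos))  # the run ended at the previous base
            run_start = None
    if run_start is not None:
        runs.append((run_start, coding[1]))
    return runs


# Hypothetical exon spanning 1300-1800, with one padded target ending at 1400.
coding_exon = (1300, 1800)
padded = [(800, 1400)]
# Hypothetical per-base depth: well covered up to 1700, poorly covered after.
depth = {p: 35 for p in range(1300, 1700)}
depth.update({p: 5 for p in range(1700, 1800)})

regions = off_target_covered(coding_exon, padded, depth)
print(regions)
```

In practice the per-base depth would come from the aligned reads rather than a dictionary, but the interval logic stays the same.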

Therefore, this clinical report highlights that, although compute-intensive, performing genome-wide variant calling instead of target-based calling may lead to the detection of diagnostically relevant variants that would otherwise remain unnoticed.