• Media type: E-Article
  • Title: Human-specific tandem repeat expansion and differential gene expression during primate evolution
  • Contributor: Sulovari, Arvis; Li, Ruiyang; Audano, Peter A.; Porubsky, David; Vollger, Mitchell R.; Logsdon, Glennis A.; Warren, Wesley C.; Pollen, Alex A.; Chaisson, Mark J. P.; Eichler, Evan E.; Sanders, Ashley D.; Zhao, Xuefang; Malhotra, Ankit; Rausch, Tobias; Gardner, Eugene J.; Rodriguez, Oscar L.; Guo, Li; Collins, Ryan L.; Fan, Xian; Wen, Jia; Handsaker, Robert E.; Fairley, Susan; [...]
  • Imprint: Proceedings of the National Academy of Sciences, 2019
  • Published in: Proceedings of the National Academy of Sciences
  • Language: English
  • DOI: 10.1073/pnas.1912175116
  • ISSN: 0027-8424; 1091-6490
  • Keywords: Multidisciplinary
  • Description: Short tandem repeats (STRs) and variable number tandem repeats (VNTRs) are important sources of natural and disease-causing variation, yet they have been problematic to resolve in reference genomes and genotype with short-read technology. We created a framework to model the evolution and instability of STRs and VNTRs in apes. We phased and assembled 3 ape genomes (chimpanzee, gorilla, and orangutan) using long-read and 10x Genomics linked-read sequence data for 21,442 human tandem repeats discovered in 6 haplotype-resolved assemblies of Yoruban, Chinese, and Puerto Rican origin. We define a set of 1,584 STRs/VNTRs expanded specifically in humans, including large tandem repeats affecting coding and noncoding portions of genes (e.g., MUC3A, CACNA1C). We show that short interspersed nuclear element–VNTR–Alu (SVA) retrotransposition is the main mechanism for distributing GC-rich human-specific tandem repeat expansions throughout the genome but with a bias against genes. In contrast, we observe that VNTRs not originating from retrotransposons have a propensity to cluster near genes, especially in the subtelomere. Using tissue-specific expression from human and chimpanzee brains, we identify genes where transcript isoform usage differs significantly, likely caused by cryptic splicing variation within VNTRs. Using single-cell expression from cerebral organoids, we observe a strong effect for genes associated with transcription profiles analogous to intermediate progenitor cells. Finally, we compare the sequence composition of some of the largest human-specific repeat expansions and identify 52 STRs/VNTRs with at least 40 uninterrupted pure tracts as candidates for genetically unstable regions associated with disease.
  • Access State: Open Access