Pathogenicity and selective constraint on variation near splice sites

  1. on behalf of the Deciphering Developmental Disorders study
  1. Wellcome Sanger Institute, Wellcome Genome Campus, Hinxton, Cambridge CB10 1SA, United Kingdom;
  2. Manchester Centre for Genomic Medicine, St. Mary's Hospital, Manchester University Hospitals NHS Foundation Trust, Manchester Academic Health Sciences Centre, Manchester M13 9WL, United Kingdom;
  3. Division of Evolution and Genomic Sciences, School of Biological Sciences, University of Manchester, Manchester M13 9NT, United Kingdom;
  4. Sheffield Clinical Genetics Service, Sheffield Children's Hospital, OPD2, Northern General Hospital, Sheffield S5 7AU, United Kingdom;
  5. Liverpool Women's Hospital Foundation Trust, Liverpool L8 7SS, United Kingdom;
  6. West of Scotland Regional Genetics Service, NHS Greater Glasgow and Clyde, Institute of Medical Genetics, Yorkhill Hospital, Glasgow G3 8SJ, United Kingdom;
  7. South East Thames Regional Genetics Centre, Guy's and St Thomas’ NHS Foundation Trust, Guy's Hospital, London SE1 9RT, United Kingdom;
  8. Faculty of Medicine, University of Southampton, Institute of Developmental Sciences, Southampton SO16 6YD, United Kingdom;
  9. Wessex Clinical Genetics Service, University Hospital Southampton, Princess Anne Hospital, Southampton SO16 5YA, United Kingdom;
  10. South West Thames Regional Genetics Centre, St. George's Healthcare NHS Trust, St. George's, University of London, London SW17 0RE, United Kingdom;
  11. Temple Street Children's Hospital, Dublin 1, Ireland;
  12. West of Scotland Regional Genetics Service, NHS Greater Glasgow and Clyde, Queen Elizabeth University Hospital, Glasgow G51 4TF, United Kingdom;
  13. Northern Ireland Regional Genetics Centre, Belfast Health and Social Care Trust, Belfast City Hospital, Belfast BT9 7AB, United Kingdom;
  14. North West Thames Regional Genetics Service, London North West University Healthcare NHS Trust, Northwick Park and St. Mark's Hospitals, Harrow HA1 3UJ, United Kingdom;
  15. MRC Human Genetics Unit, MRC IGMM, University of Edinburgh, Western General Hospital, Edinburgh EH4 2XU, United Kingdom;
  16. Institute of Biomedical and Clinical Science, University of Exeter Medical School, Exeter EX2 5DW, United Kingdom;
  17. East Anglian Medical Genetics Service, Cambridge University Hospitals NHS Foundation Trust, Cambridge CB2 0QQ, United Kingdom
  • Corresponding author: meh{at}sanger.ac.uk
  • Abstract

    Mutations that perturb normal pre-mRNA splicing are significant contributors to human disease. We used exome sequencing data from 7833 probands with developmental disorders (DDs) and their unaffected parents, as well as more than 60,000 aggregated exomes from the Exome Aggregation Consortium, to investigate selection around the splice sites and quantify the contribution of splicing mutations to DDs. Patterns of purifying selection, a deficit of variants in highly constrained genes in healthy subjects, and excess de novo mutations in patients highlighted particular positions within and around the consensus splice site of greater functional relevance. By using mutational burden analyses in this large cohort of proband–parent trios, we could estimate in an unbiased manner the relative contributions of mutations at canonical dinucleotides (73%) and flanking noncanonical positions (27%), and calculate the positive predictive value of pathogenicity for different classes of mutations. We identified 18 patients with likely diagnostic de novo mutations in dominant DD-associated genes at noncanonical positions in splice sites. We estimate 35%–40% of pathogenic variants in noncanonical splice site positions are missing from public databases.
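
The burden-based quantities named in the abstract can be illustrated with a minimal sketch. It assumes one common convention that the abstract itself does not spell out: the excess of de novo mutations in a class is the observed count minus the count expected under a null mutation model, the positive predictive value (PPV) of that class is the excess divided by the observed count, and the relative contribution of a class is its share of the total excess. The function names and every count below are hypothetical and are not taken from the study.

```python
from typing import Dict, Tuple


def excess_and_ppv(observed: int, expected: float) -> Tuple[float, float]:
    """Return (excess, PPV) for one class of de novo mutations.

    Assumed convention (not stated in the abstract):
      excess = observed - expected, floored at zero
      PPV    = excess / observed
    """
    if observed <= 0:
        raise ValueError("observed count must be positive")
    excess = max(observed - expected, 0.0)
    return excess, excess / observed


def relative_contributions(excess_by_class: Dict[str, float]) -> Dict[str, float]:
    """Return each class's share of the total excess across all classes."""
    total = sum(excess_by_class.values())
    if total <= 0:
        raise ValueError("total excess must be positive")
    return {name: value / total for name, value in excess_by_class.items()}


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
canonical_excess, canonical_ppv = excess_and_ppv(observed=40, expected=10.0)
flanking_excess, flanking_ppv = excess_and_ppv(observed=50, expected=40.0)

shares = relative_contributions(
    {"canonical": canonical_excess, "noncanonical": flanking_excess}
)
```

With these made-up inputs, the canonical class has an excess of 30 and a PPV of 0.75, the flanking class has an excess of 10 and a PPV of 0.2, and the shares of the total excess are 0.75 and 0.25. The point of the sketch is that a class with many observed mutations can still have a low PPV when most of them are expected by chance.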

    Footnotes

    • [Supplemental material is available for this article.]

    • Article published online before print. Article, supplemental material, and publication date are at http://www.genome.org/cgi/doi/10.1101/gr.238444.118.

    • Freely available online through the Genome Research Open Access option.

    • Received April 13, 2018.
    • Accepted December 20, 2018.

    This article, published in Genome Research, is available under a Creative Commons License (Attribution 4.0 International), as described at http://creativecommons.org/licenses/by/4.0/.
