Abstract
Autism is widely believed to be a heterogeneous disorder; diagnosis is currently based solely on clinical criteria, although genetic, as well as environmental, influences are thought to be prominent factors in the etiology of most forms of autism. Our goal is to determine whether a predictive model based on single-nucleotide polymorphisms (SNPs) can predict symptom severity of autism spectrum disorder (ASD). We divided 118 ASD children into a mild/moderate autism group (n = 65) and a severe autism group (n = 53), based on the Childhood Autism Rating Scale (CARS). For each child, we obtained 29 SNPs of 9 ASD-related genes. To generate predictive models, we employed three machine-learning techniques: decision stumps (DSs), alternating decision trees (ADTrees), and FlexTrees. DS and FlexTree generated modestly better classifiers, with accuracy = 67%, sensitivity = 0.88 and specificity = 0.42. The SNP rs878960 in GABRB3 was selected by all models, and was associated with CARS assessment. Our results suggest that SNPs have the potential to offer accurate classification of ASD symptom severity.
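As a quick consistency check on the figures quoted in the abstract, overall accuracy can be recomputed from sensitivity, specificity, and the two group sizes. The function name and the choice of which group counts as the "positive" class are assumptions for illustration; the abstract does not say which group was treated as positive. Under these assumptions, the reported 67% is reproduced when the mild/moderate group (n = 65) is taken as the positive class.

```python
def accuracy_from_rates(sensitivity, specificity, n_positive, n_negative):
    """Overall accuracy implied by sensitivity, specificity and class sizes."""
    # Correctly classified cases = true positives + true negatives.
    correct = sensitivity * n_positive + specificity * n_negative
    return correct / (n_positive + n_negative)


# Assumption: mild/moderate (n = 65) is the positive class.
mild_positive = accuracy_from_rates(0.88, 0.42, n_positive=65, n_negative=53)

# Alternative assumption: severe (n = 53) is the positive class.
severe_positive = accuracy_from_rates(0.88, 0.42, n_positive=53, n_negative=65)

print(f"mild/moderate as positive: {mild_positive:.3f}")  # about 0.673
print(f"severe as positive:        {severe_positive:.3f}")  # about 0.627
```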
References
Aldred, S., Moore, K. M., Fitzgerald, M., & Waring, R. H. (2003). Plasma amino acid levels in children with autism and their families. Journal of Autism and Developmental Disorders, 33, 93–97.
Ashley-Koch, A. E., Mei, H., Jaworski, J., Ma, D. Q., Ritchie, M. D., Menold, M. M., et al. (2006). An analysis paradigm for investigating multi-locus effects in complex disease: Examination of three GABA receptor subunit genes on 15q11–q13 as risk factors for autistic disorder. Annals of Human Genetics, 70(Pt 3), 281–292.
Belmonte, M. K., Cook, E. H., Anderson, G. M., Rubenstein, J. L. R., Greenough, W. T., Beckel-Mitchener, A., et al. (2004). Autism as a disorder of neural information processing: Directions for research and targets for therapy. Molecular Psychiatry, 9(7), 646–663.
Blatt, G. J., Fitzgerald, C. M., Guptill, J. T., Booker, A. B., Kemper, T. L., & Bauman, M. L. (2001). Density and distribution of hippocampal neurotransmitter receptors in autism: An autoradiographic study. Journal of Autism and Developmental Disorders, 31, 537–543.
Breiman, L., Friedman, J. H., Olshen, R. A., & Stone, C. J. (1984). Classification and regression trees. Monterey, CA: Chapman and Hall/CRC.
Bureau, A., Dupuis, J., Falls, K., Lunetta, K. L., Hayward, B., Keith, T. P., et al. (2005). Identifying SNPs predictive of phenotype using random forests. Genetic Epidemiology, 28(2), 171–182.
Buxbaum, J. D., Silverman, J. M., Smith, C. J., Greenberg, D. A., Kilifarski, M., Reichert, J., et al. (2002). Association between a GABRB3 polymorphism and autism. Molecular Psychiatry, 7(3), 311–316.
Cheng, L., Ge, Q., Xiao, P., Sun, B., Ke, X., Bai, Y., et al. (2009). Association study between BDNF gene polymorphisms and Autism by three-dimensional gel-based microarray. International Journal of Molecular Sciences, 10(6), 2487–2500.
Collins, A. L., Ma, D. Q., Whitehead, P. L., Martin, E. R., Wright, H. H., Abramson, R. K., et al. (2006). Investigation of autism and GABA receptor subunit genes in multiple ethnic groups. Neurogenetics, 7(3), 167–174.
Cook, E. H., Jr., Courchesne, R. Y., Cox, N. J., Lord, C., Gonen, D., Guter, S. J., et al. (1998). Linkage-disequilibrium mapping of autistic disorder, with 15q11–13 markers. American Journal of Human Genetics, 62(5), 1077–1083.
Delahanty, R. J., Kang, J. Q., Brune, C. W., Kistner, E. O., Courchesne, E., Cox, N. J., et al. (2011). Maternal transmission of a rare GABRB3 signal peptide variant is associated with autism. Molecular Psychiatry, 16(1), 86–96.
Fatemi, S. H., Reutiman, T. J., Folsom, T. D., & Thuras, P. D. (2009). GABA(A) receptor downregulation in brains of subjects with Autism. Journal of Autism and Developmental Disorders, 39(2), 223–230.
Freitag, C. M. (2007). The genetics of autistic disorders and its clinical relevance: A review of the literature. Molecular Psychiatry, 12(1), 2–22.
Freitag, C. M., Staal, W., Klauck, S. M., Duketis, E., & Waltes, R. (2010). Genetics of autistic disorders: Review and clinical implications. European Child and Adolescent Psychiatry, 19(3), 169–178.
Freund, Y., & Mason, L. (1999). The alternating decision tree learning algorithm. In Proceedings of the sixteenth international conference on machine learning (pp. 124–133). Morgan Kaufmann Publishers Inc.
Geschwind, D. H. (2009). Advances in Autism. Annual Review of Medicine, 60, 367–380.
Gibbs, R. A., Belmont, J. W., Hardenbol, P., Willis, T. D., Yu, F. L., Yang, H. M., et al. (2003). The international HapMap project. Nature, 426(6968), 789–796.
Gmitrowicz, A., & Kucharska, A. (1994). Developmental disorders in the fourth edition of the American classification: Diagnostic and statistical manual of mental disorders (DSM IV–optional book). Psychiatria Polska, 28(5), 509–521.
Holmes, G., Pfahringer, B., Kirkby, R., Frank, E., & Hall, M. (2002). Multiclass alternating decision trees. In Proceedings of the 13th European conference on machine learning (pp. 161–172). Springer.
Hou, P., Ji, M., Li, S., & Lu, Z. (2004). Microarray-based approach for high-throughput genotyping of single-nucleotide polymorphisms with layer-by-layer dual-color fluorescence hybridization. Clinical Chemistry, 50(10), 1955–1957.
Huang, J., Lin, A., Narasimhan, B., Quertermous, T., Hsiung, C. A., Ho, L. T., et al. (2004). Tree-structured supervised learning and the genetics of hypertension. Proceedings of the National Academy of Sciences of the United States of America, 101(29), 10529–10534.
Iba, W., & Langley, P. (1992). Induction of one-level decision trees. In Proceedings of the ninth international workshop on machine learning (pp. 233–240). Aberdeen, Scotland, United Kingdom: Morgan Kaufmann Publishers Inc.
Ji, M., Hou, P., Li, S., He, N., & Lu, Z. (2004). Microarray-based method for genotyping of functional single nucleotide polymorphisms using dual-color fluorescence hybridization. Mutation Research, 548(1–2), 97–105.
Kim, S. J., Brune, C. W., Kistner, E. O., Christian, S. L., Courchesne, E. H., Cox, N. J., et al. (2008). Transmission disequilibrium testing of the chromosome 15q11–q13 region in autism. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 147B(7), 1116–1125.
Kim, S. A., Kim, J. H., Park, M., Cho, I. H., & Yoo, H. J. (2006). Association of GABRB3 polymorphisms with autism spectrum disorders in Korean trios. Neuropsychobiology, 54(3), 160–165.
Kooperberg, C., Ruczinski, I., LeBlanc, M. L., & Hsu, L. (2001). Sequence analysis using logic regression. Genetic Epidemiology, 21, S626–S631.
Lerer, E., Levi, S., Salomon, S., Darvasi, A., Yirmiya, N., & Ebstein, R. P. (2008). Association between the oxytocin receptor (OXTR) gene and autism: Relationship to vineland adaptive behavior scales and cognition. Molecular Psychiatry, 13(10), 980–988.
Li, H., Yamagata, T., Mori, M., & Momoi, M. Y. (2005). Absence of causative mutations and presence of autism-related allele in FOXP2 in Japanese autistic patients. Brain and Development, 27(3), 207–210.
Li, Z., Zhang, Z., He, Z., Tang, W., Li, T., Zeng, Z., et al. (2009). A partition-ligation-combination-subdivision EM algorithm for haplotype inference with multiallelic markers: Update of the SHEsis (http://analysis.bio-x.cn). Cell Research, 19(4), 519–523.
Loh, W. Y., & Shih, Y. S. (1997). Split selection methods for classification trees. Statistica Sinica, 7(4), 815–840.
Lord, C., Rutter, M., & Le Couteur, A. (1994). Autism diagnostic interview-revised: A revised version of a diagnostic interview for caregivers of individuals with possible pervasive developmental disorders. Journal of Autism and Developmental Disorders, 24(5), 659–685.
Ma, D. Q., Whitehead, P. L., Menold, M. M., Martin, E. R., Ashley-Koch, A. E., Mei, H., et al. (2005). Identification of significant association and gene–gene interaction of GABA receptor subunit genes in autism. American Journal of Human Genetics, 77(3), 377–388.
McCauley, J. L. (2005). Genetic and phenotypic dissection of autism susceptibility.
McCauley, J. L., Olson, L. M., Delahanty, R., Amin, T., Nurmi, E. L., Organ, E. L., et al. (2004). A linkage disequilibrium map of the 1-Mb 15q12 GABA(A) receptor subunit cluster and association to autism. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 131B(1), 51–59.
Miller, R. G. (1981). Simultaneous statistical inference (2nd ed.). New York: Springer.
Moreno-Fuenmayor, H., Borjas, L., Arrieta, A., Valera, V., & Socorro-Candanoza, L. (1996). Plasma excitatory amino acids in autism. Investigacion Clinica, 37, 113–128.
Nabi, R., Serajee, F. J., Chugani, D. C., Zhong, H., & Huq, A. H. M. M. (2004). Association of tryptophan 2, 3 dioxygenase gene polymorphism with autism. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 125B(1), 63–68.
Nunkesser, R., Bernholt, T., Schwender, H., Ickstadt, K., & Wegener, I. (2007). Detecting high-order interactions of single nucleotide polymorphisms using genetic programming. Bioinformatics, 23(24), 3280–3288.
Nurmi, E. L., Amin, T., Olson, L. M., Jacobs, M. M., McCauley, J. L., Lam, A. Y., et al. (2003a). Dense linkage disequilibrium mapping in the 15q11–q13 maternal expression domain yields evidence for association in autism. Molecular Psychiatry, 8(6), 624–634.
Nurmi, E. L., Dowd, M., Tadevosyan-Leyfer, O., Haines, J. L., Folstein, S. E., & Sutcliffe, J. S. (2003b). Exploratory subsetting of autism families based on savant skills improves evidence of genetic linkage to 15q11–q13. Journal of the American Academy of Child and Adolescent Psychiatry, 42(7), 856–863.
Olsen, R. W., & Macdonald, R. L. (2002). GABAA receptor complex: Structure and function. In J. Egebjerg, A. Schousboe, & P. Krogsgaard-Larsen (Eds.), Glutamate and GABA receptors and transporters (pp. 202–235). London: Taylor & Francis.
Park, M. Y., & Hastie, T. (2008). Penalized logistic regression for detecting gene interactions. Biostatistics, 9(1), 30–50.
Parks, L. K., Hill, D. E., Thoma, R. J., Euler, M. J., Lewine, J. D., & Yeo, R. A. (2009). Neural correlates of communication skill and symptom severity in autism: A voxel-based morphometry study. Research in Autism Spectrum Disorders, 3(2), 444–454.
Ramoz, N., Reichert, J. G., Smith, C. J., Silverman, J. M., Bespalova, I. N., Davis, K. L., et al. (2004). Linkage and association of the mitochondrial aspartate/glutamate carrier SLC25A12 gene with autism. American Journal of Psychiatry, 161, 662–669.
Samaco, R. C., Hogart, A., & LaSalle, J. M. (2005). Epigenetic overlap in autism-spectrum neurodevelopmental disorders: MECP2 deficiency causes reduced expression of UBE3A and GABRB3. Human Molecular Genetics, 14(4), 483–492.
Samaco, R. C., Nagarajan, R. P., Braunschweig, D., & LaSalle, J. M. (2004). Multiple pathways regulate MeCP2 expression in normal brain development and exhibit defects in autism-spectrum disorders. Human Molecular Genetics, 13(6), 629–639.
Schopler, E., Reichler, R. J., DeVellis, R. F., & Daly, K. (1980). Toward objective classification of childhood autism: Childhood Autism Rating Scale (CARS). Journal of Autism and Developmental Disorders, 10(1), 91–103.
Schwender, H., Ickstadt, K., & Rahnenfuhrer, J. (2008). Classification with high-dimensional genetic data: Assigning patients and genetic features to known classes. Biometrical Journal, 50(6), 911–926.
Segurado, R., Conroy, J., Meally, E., Fitzgerald, M., Gill, M., & Gallagher, L. (2005). Confirmation of association between autism and the mitochondrial aspartate/glutamate carrier SLC25A12 gene on chromosome 2q31. American Journal of Psychiatry, 162, 2182–2184.
Shi, Y. Y., & He, L. (2005). SHEsis, a powerful software platform for analyses of linkage disequilibrium, haplotype construction, and genetic association at polymorphism loci. Cell Research, 15(2), 97–98.
Sparrow, S. S., & Cicchetti, D. V. (1985). Diagnostic uses of the vineland adaptive-behavior scales. Journal of Pediatric Psychology, 10(2), 215–225.
Sutcliffe, J. S., & Nurmi, E. L. (2003). Genetics of childhood disorders: XLVII. Autism, Part 6: Duplication and inherited susceptibility of chromosome 15q11–q13 genes in autism. Journal of the American Academy of Child and Adolescent Psychiatry, 42(2), 253–256.
Wermter, A. K., Kamp-Becker, I., Hesse, P., Schulte-Korne, G., Strauch, K., & Remschmidt, H. (2010). Evidence for the involvement of genetic variation in the Oxytocin Receptor Gene (OXTR) in the etiology of autistic disorders on high-functioning level. American Journal of Medical Genetics Part B: Neuropsychiatric Genetics, 153B(2), 629–639.
Witten, I. H., & Frank, E. (2005). Data mining: Practical machine learning tools and techniques (2nd ed.). San Francisco: Morgan Kaufmann.
Xiao, P. F., Cheng, L., Wan, Y., Sun, B. L., Chen, Z. Z., Zhang, S. Y., et al. (2006). An improved gel-based DNA microarray method for detecting single nucleotide mismatch. Electrophoresis, 27(19), 3904–3915.
Acknowledgments
Yun Jiao was supported by the China Scholarship Council (No. 2008101370), the National Natural Science Foundation of China (No. 30570655), and the Scientific Research Foundation of Graduate School of Southeast University (No. YBJJ1011). Drs. Chen and Herskovits are supported by National Institutes of Health grant R01 AG13743, which is funded by the National Institute on Aging, the National Institute of Mental Health, and the National Cancer Institute. They are also supported by NIH R03 EB009310. Drs. Ke and Chu were supported by the Natural Science Foundation of Jiangsu, China (No. BK2008082). Drs. Lu and Cheng were supported by the National Natural Science Foundation of China (No. 30570655). The authors also thank the International HapMap Project for the data on rs878960 diversity in GABRB3 in normal Asian populations, and Richard Olshen and Jing Huang for the FlexTree code.
Appendices
Appendix 1: Histogram of CARS Scores
Please see Fig. 3.
Appendix 2: Tree Model Generated by ADTree
Please see Fig. 4.
Appendix 3: Data-Mining Methods
In this section, we provide an overview of DS, ADTree, and FlexTree.
DS is a single-level decision-tree model with a categorical or numeric class label (Iba and Langley 1992). It tends to find the main predictor variable in one step. It is widely used when researchers seek the single most significant feature with respect to classification (Iba and Langley 1992).
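The idea of a one-level tree can be illustrated with a minimal sketch. It assumes categorical genotype features and picks the single feature whose per-value majority vote gives the fewest training errors; this is a simplification for illustration, not necessarily the split criterion used by the software in this study. The SNP names (`snp_a`, `snp_b`), the toy genotypes, and the function names are hypothetical.

```python
from collections import Counter


def fit_stump(rows, labels):
    """Return (feature, {value: predicted_label}, default_label).

    rows: list of dicts mapping feature name -> categorical value.
    labels: list of class labels, same length as rows.
    """
    default = Counter(labels).most_common(1)[0][0]
    best = None
    for feature in sorted(rows[0]):
        # Tally class labels separately for each observed feature value.
        by_value = {}
        for row, label in zip(rows, labels):
            by_value.setdefault(row[feature], Counter())[label] += 1
        # One-level rule: predict the majority class for each value.
        rule = {v: c.most_common(1)[0][0] for v, c in by_value.items()}
        errors = sum(rule[r[feature]] != y for r, y in zip(rows, labels))
        if best is None or errors < best[0]:
            best = (errors, feature, rule)
    return best[1], best[2], default


def predict_stump(stump, row):
    feature, rule, default = stump
    # Fall back to the overall majority class for unseen values.
    return rule.get(row[feature], default)


# Hypothetical toy data, not taken from the study.
rows = [
    {"snp_a": "AA", "snp_b": "CT"},
    {"snp_a": "AA", "snp_b": "CC"},
    {"snp_a": "AG", "snp_b": "CT"},
    {"snp_a": "GG", "snp_b": "CC"},
    {"snp_a": "GG", "snp_b": "CT"},
    {"snp_a": "GG", "snp_b": "CC"},
]
labels = ["mild", "mild", "mild", "severe", "severe", "severe"]

stump = fit_stump(rows, labels)
print(stump[0], stump[1])
```

In this toy data the stump selects `snp_a`, because splitting on it alone separates the two classes, whereas `snp_b` misclassifies two cases.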
An ADTree is a method based on combining weak hypotheses generated during boosting into a single interpretable representation (Freund and Mason 1999). An ADTree model is more compact than standard boosting-based decision-tree models, which generate more than one tree (Freund and Mason 1999). As a result, an ADTree model is relatively straightforward to interpret. The application of boosting procedures may improve classification performance for ADTrees. The structure of an ADTree has three characteristics: (1) the root node is a prediction node, and has a numeric score only, which is based on the total weights of the positive and negative instances that satisfy the conditions in the training data (Holmes et al. 2002); (2) the nodes in the next layer are decision nodes, and are essentially a collection of decision-tree stumps; (3) the subsequent layers alternate layers of prediction nodes and decision nodes. To classify a new instance with an ADTree model, all paths for which all decision nodes are true are followed, summing any prediction nodes that are traversed by these paths.
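The classification rule for an ADTree can be sketched in a few lines. The tree below is hypothetical: the SNP names (`snp_a`, `snp_b`, `snp_c`), the prediction values, and the dictionary layout are illustrative assumptions and do not reproduce the model in Fig. 4. Following Freund and Mason (1999), the sign of the summed score is taken as the predicted class; which class is treated as positive is also an assumption here.

```python
def adtree_score(node, instance):
    """Sum the prediction values on every path the instance can follow."""
    total = node["value"]
    # Every decision node under a reached prediction node is evaluated,
    # so an instance may follow several paths at once.
    for decision in node.get("decisions", []):
        branch = "yes" if decision["test"](instance) else "no"
        total += adtree_score(decision[branch], instance)
    return total


# Hypothetical tree: a root prediction node with two decision nodes,
# one of which has a further decision node beneath its "yes" branch.
tree = {
    "value": 0.5,
    "decisions": [
        {
            "test": lambda x: x["snp_a"] == "GG",
            "yes": {
                "value": -1.0,
                "decisions": [
                    {
                        "test": lambda x: x["snp_b"] == "CC",
                        "yes": {"value": -0.25},
                        "no": {"value": 0.5},
                    }
                ],
            },
            "no": {"value": 0.75},
        },
        {
            "test": lambda x: x["snp_c"] == "TT",
            "yes": {"value": 0.25},
            "no": {"value": -0.5},
        },
    ],
}

child_1 = {"snp_a": "GG", "snp_b": "CC", "snp_c": "TT"}
child_2 = {"snp_a": "AA", "snp_b": "CC", "snp_c": "AT"}

score_1 = adtree_score(tree, child_1)  # 0.5 - 1.0 - 0.25 + 0.25
score_2 = adtree_score(tree, child_2)  # 0.5 + 0.75 - 0.5
print(score_1, score_2)
```

The magnitude of the summed score can also be read as a rough measure of confidence in the prediction, which is part of what makes the single-tree representation easy to interpret.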
FlexTree, a general supervised-learning method, extends the binary tree-structured approach [Classification and Regression Trees, CART (Breiman et al. 1984)], although it differs greatly in its selection and combination of predictors (Huang et al. 2004). It is particularly applicable for assessing gene–gene and gene–environment interactions as they bear on complex diseases. FlexTree creates a simple rooted binary tree with each split defined by a linear combination of selected variables. The linear combination is determined by regression with optimal scoring; the variables are selected by a backward pruning procedure. Using a selected variable subset to define each split increases interpretability, improves predictive robustness, and prevents overfitting. FlexTree deals with additive and interactive effects simultaneously. Sampling units can be families or individuals, depending on the application. In the comparisons reported by Huang et al. (2004), FlexTree generally performed better than many of the alternatives considered, particularly when only a small fraction of candidate genes is useful for classification.
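To make the notion of a split "defined by a linear combination of selected variables" concrete, the following minimal sketch routes a case through a small binary tree. It assumes genotypes coded as minor-allele counts (0, 1, 2) and uses hand-picked weights and thresholds; FlexTree itself estimates the weights by regression with optimal scoring and chooses variables by backward pruning, neither of which is implemented here. All names, weights, and thresholds are hypothetical.

```python
def route(node, x):
    """Follow linear-combination splits from the root down to a leaf label."""
    while "label" not in node:
        # Each internal node scores the case with a weighted sum over
        # only the variables selected for that split.
        score = sum(w * x[name] for name, w in node["weights"].items())
        node = node["left"] if score <= node["threshold"] else node["right"]
    return node["label"]


# Hypothetical two-split tree over three SNPs.
tree = {
    "weights": {"snp_a": 1.0, "snp_b": -0.5},
    "threshold": 0.5,
    "left": {"label": "mild/moderate"},
    "right": {
        "weights": {"snp_c": 1.0},
        "threshold": 1.0,
        "left": {"label": "mild/moderate"},
        "right": {"label": "severe"},
    },
}

case_1 = {"snp_a": 0, "snp_b": 1, "snp_c": 2}  # root score -0.5, goes left
case_2 = {"snp_a": 2, "snp_b": 1, "snp_c": 2}  # root score 1.5, then 2.0
case_3 = {"snp_a": 2, "snp_b": 0, "snp_c": 1}  # root score 2.0, then 1.0

print(route(tree, case_1), route(tree, case_2), route(tree, case_3))
```

Because each split names only a few variables and their weights, the tree stays readable even when the splits combine several SNPs.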
Appendix 4: Rationale for SNP Selection
GABA is the major inhibitory neurotransmitter in the adult brain, although it mediates excitatory transmission during development. For this reason, we included many genes encoding GABAA receptor subunits in our study. Previous studies of autism pathophysiology reported that (1) the number of GABAA receptors was significantly decreased in the brains of children with autism (Blatt et al. 2001); (2) plasma GABA and its essential precursor glutamate were elevated in children with autism (Moreno-Fuenmayor et al. 1996; Dhossche et al. 2002; Aldred et al. 2003); (3) benzodiazepines, which are effective in treating the seizures, anxiety, and social phobia that occur in the setting of autism, bind to, and act on, GABAA receptors (Olsen and Macdonald 2002); and (4) GABAergic transmission has important trophic actions during development. Based on these data, the GABAA receptor subunit genes, particularly those in 15q11–q13 (Cook et al. 1998; Buxbaum et al. 2002; McCauley et al. 2004; Ashley-Koch et al. 2006), represent excellent candidates, allelic variants of which could confer genetic susceptibility to the development of autism (McCauley 2005).
TDO2 (Nabi et al. 2004), SLC25A12 (Ramoz et al. 2004; Segurado et al. 2005), and BDNF (Cheng et al. 2009) were also found to be associated with ASD, so we included these genes in our studies.
Appendix 5: Genotyping
The first step in genotyping was PCR; primers were designed using Primer Premier 5.0 software, based on published DNA sequences. The primers were synthesized and HPLC purified by the TaKaRa Company (P.R. China). All reverse primers were modified with an acrylamide group at the 5′ terminus, so that they could bond covalently to the polyacrylamide gel. After several cycles of PCR amplification, we used ethanol to precipitate the PCR products.
Step 2 was immobilization of PCR products. We dissolved acrylamide-modified PCR products, and spotted them on 3-methacryloxypropyltrimethoxy silane-modified glass slides, using a microarrayer (Capital Biochip Corporation, P.R. China). Each slide was placed into a humid, 1,000 Pascal (Pa) pressure-sealed chamber full of tetramethylethylenediamine, to induce copolymerization between acrylamide groups and acryl groups. We then used electrophoresis to obtain single-stranded DNA (ssDNA) for hybridization.
Step 3 was hybridization. We designed a pair of probes for every SNP locus, such that the probes could be matched with the polymorphic portion of the targets, and labeled with Cy3 or Cy5. For every SNP genotyped, we mixed the labeled probes in equimolar amounts, and suspended them in unihybridization solution (3:1 dilution) to obtain a final concentration of 2 μM. We achieved hybridization in a humid chamber at 37°C for 2–4 h.
The fourth and fifth steps were post-hybridization and scanning, respectively. We rinsed the slide in water and air dried it, after which we completed electrophoresis at 2 V/cm for 8 min in 1X TBE buffer at 4°C. We scanned the hybridization slides at 70% laser power and 65% photomultiplier tube gain with a confocal scanner (Luxscan-10 K/A, CapitalBio Company, P.R. China) that had been fitted with filters for Cy3 and Cy5. We used QuantArray software (Packard BioChip Technologies, Billerica, MA) to analyze these images.
Appendix 6: rs878960 Genotype and Allele Distributions Between Our Study and Other Asian Cohorts
The genotype and allele distributions of the rs878960 polymorphism in our study (n = 118) and in the HapMap Han Chinese group (n = 43; 45 subjects, 2 of whom had missing values) are presented in Table 5. There is no significant difference in genotype or allele distribution between these two groups. Similarly, there is no significant difference in genotype or allele distribution between our Chinese subjects and the HapMap Japanese subjects (n = 86).
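A comparison of this kind can be sketched as follows, assuming a Pearson chi-square test on the genotype table (2 groups × 3 genotypes, 2 degrees of freedom) and on the derived allele table (2 × 2, 1 degree of freedom); the test actually used is not named in this appendix. The genotype counts below are hypothetical placeholders chosen only to sum to the cohort sizes (118 and 43); they are not the values in Table 5. The function names are likewise illustrative.

```python
import math


def pearson_chi2(table):
    """Pearson chi-square statistic and degrees of freedom for a count table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df


def chi2_pvalue(stat, df):
    """Upper-tail p-value using the closed forms for df = 1 and df = 2."""
    if df == 1:
        return math.erfc(math.sqrt(stat / 2))
    if df == 2:
        return math.exp(-stat / 2)
    raise ValueError("this sketch handles only df = 1 or df = 2")


def allele_counts(genotype_counts):
    """Convert (hom1, het, hom2) genotype counts to (allele1, allele2) counts."""
    hom1, het, hom2 = genotype_counts
    return (2 * hom1 + het, het + 2 * hom2)


# Hypothetical genotype counts (NOT Table 5): homozygote, heterozygote, homozygote.
study_genotypes = (30, 60, 28)   # sums to 118
hapmap_genotypes = (11, 22, 10)  # sums to 43

geno_stat, geno_df = pearson_chi2([study_genotypes, hapmap_genotypes])
geno_p = chi2_pvalue(geno_stat, geno_df)

allele_table = [allele_counts(study_genotypes), allele_counts(hapmap_genotypes)]
allele_stat, allele_df = pearson_chi2(allele_table)
allele_p = chi2_pvalue(allele_stat, allele_df)

print(f"genotypes: chi2 = {geno_stat:.3f}, df = {geno_df}, p = {geno_p:.3f}")
print(f"alleles:   chi2 = {allele_stat:.3f}, df = {allele_df}, p = {allele_p:.3f}")
```

Each subject contributes two alleles, so the allele table has twice as many observations as the genotype table; this is why the two tests can give different p-values for the same cohort.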
Appendix 7: Haplotype Analysis for Each Gene
Table 6 shows the results of haplotype analysis using the methods described by Li et al. (2009).
We found that haplotypes of GABRA4, but not of GABRB3, are significantly associated with ASD symptom severity. A likely reason for the lack of association with GABRB3 is that we tested 6 SNPs in this gene, some of which may contribute noise to the analysis. In particular, when we remove one SNP (rs1432007) and repeat the analysis (see Table 7), we obtain χ² = 19.5, p = 0.007. That is, haplotypes of GABRB3 are significantly associated with ASD symptom severity once SNP rs1432007 is excluded from the analysis.
Cite this article
Jiao, Y., Chen, R., Ke, X. et al. Single Nucleotide Polymorphisms Predict Symptom Severity of Autism Spectrum Disorder. J Autism Dev Disord 42, 971–983 (2012). https://doi.org/10.1007/s10803-011-1327-5
DOI: https://doi.org/10.1007/s10803-011-1327-5