A quantitative association study of SLC25A12 and restricted repetitive behavior traits in autism spectrum disorders

Background SLC25A12 was previously identified by a linkage-directed association analysis in autism. In this study, we investigated the relationship between three SLC25A12 single nucleotide polymorphisms (SNPs) (rs2056202, rs908670 and rs2292813) and restricted repetitive behavior (RRB) traits in autism spectrum disorders (ASDs), based on a positive correlation between the G allele of rs2056202 and an RRB subdomain score on the Autism Diagnostic Interview-Revised (ADI-R).
Methods We used the Repetitive Behavior Scale-Revised (RBS-R) as a quantitative RRB measure, and conducted linear regression analyses for individual SNPs and a previously identified haplotype (rs2056202-rs2292813). We examined associations in our University of Illinois at Chicago-University of Florida (UIC-UF) sample (179 unrelated individuals with an ASD), and then attempted to replicate our findings in the Simons Simplex Collection (SSC) sample (720 ASD families).
Results In the UIC-UF sample, three RBS-R scores (ritualistic, sameness, sum) had positive associations with the A allele of rs2292813 (p = 0.006-0.012) and with the rs2056202-rs2292813 haplotype (omnibus test, p = 0.025-0.040). The SSC sample had positive associations between the A allele of rs2056202 and four RBS-R scores (stereotyped, sameness, restricted, sum) (p = 0.006-0.010), between the A allele of rs908670 and three RBS-R scores (stereotyped, self-injurious, sum) (p = 0.003-0.015), and between the rs2056202-rs2292813 haplotype and six RBS-R scores (stereotyped, self-injurious, compulsive, sameness, restricted, sum) (omnibus test, p = 0.002-0.028). Taken together, the A alleles of rs2056202 and rs2292813 were consistently and positively associated with RRB traits in both the UIC-UF and SSC samples, but the most significant SNP with phenotype association varied in each dataset.
Conclusions This study confirmed an association between SLC25A12 and RRB traits in ASDs, but the direction of the association was different from that in the initial study. This could be due to the examined SLC25A12 SNPs being in linkage disequilibrium with another risk allele, and/or genetic/phenotypic heterogeneity of the ASD samples across studies.


Background
Autism spectrum disorders (ASDs) are characterized by qualitative impairments in reciprocal social interaction and communication, and by the presence of restricted repetitive behavior (RRB) [1]. ASDs are highly heritable complex genetic disorders with rare variants, oligogenic inheritance, and interactions between susceptibility alleles [2][3][4][5][6][7][8]. The heterogeneity of ASDs makes it difficult to identify risk alleles, but also supports the validity of a model that requires more than one genetic variant to contribute to the full syndrome of autism [9,10]. SLC25A12 (solute carrier family 25 member 12; OMIM *603667) on chromosome 2q24 encodes aralar, a mitochondrial aspartate-glutamate carrier isoform 1 (AGC1) protein. SLC25A12 spans about 110 kb. SLC25A12 was initially identified as an autism-susceptibility gene through a linkage-directed association study and replication [11][12][13][14]. For instance, two independent groups reported overtransmission of the G alleles of two SLC25A12 SNPs in intron 3 (rs2056202) and intron 16 (rs2292813) in autism families [12,13]. Other groups also reported overtransmission of the G allele of either rs2056202 [11] or rs2292813 [14], or undertransmission of the A-A haplotype of rs2056202-rs2292813 [14] in autism families. Most recently, the G allele of rs908670, another SLC25A12 SNP in intron 8, showed evidence of overtransmission in a genome-wide association study (GWAS) by the Autism Genome Project (AGP) Consortium (p = 0.0006 in combined AGP, Autism Genetic Resource Exchange (AGRE), and Study on Addiction: Genetics and Environment (SAGE) samples) [15]. However, not all studies have found evidence for association between SLC25A12 and autism [16][17][18][19]. These conflicting data may be explained by differences in phenotypic characteristics and/or genetic heterogeneity across study samples.
Interestingly, Silverman et al. (2008) examined the correlation between SLC25A12 and phenotypic data obtained from the Autism Diagnostic Interview-Revised (ADI-R), and found a positive correlation between the G allele of rs2056202 and an RRB-related subdomain, the 'routines and rituals' score [20]. This subdomain consists of two ADI-R items: 'verbal rituals' and 'compulsion/ritualistic behavior'. However, apart from the Silverman study, no other studies have examined the association between SLC25A12 and quantitative RRB traits.
In the present study, we hypothesized that SLC25A12 may confer risk for quantitative RRB traits in ASDs. We tested this hypothesis by examining associations between three SLC25A12 SNPs (rs2056202, rs908670 and rs2292813) and the Repetitive Behavior Scale-Revised (RBS-R), a quantitative measure of RRB. We examined associations first in our University of Illinois at Chicago-University of Florida (UIC-UF) sample (179 unrelated people with an ASD), and then attempted to replicate our findings in the Simons Simplex Collection (SSC) sample (720 ASD families). Because the SSC sample has parental genotype data available for these SNPs, we also examined transmission disequilibrium using family-based association tests.

Subjects and assessment

UIC-UF sample
This study was approved by the UIC and UF Institutional review boards (IRBs). All participants were provided with a description of the study before informed consent was obtained. The study participants (179 unrelated people with an ASD) were recruited mainly from two geographical regions (UIC sample from the Chicago, Illinois area; UF sample from north central Florida).
All UIC participants were assessed with the ADI-R [21] and the Autism Diagnostic Observation Schedule (ADOS) [22]. For this report, we required all subjects to meet ASD or autism classification on both ADI-R and ADOS, along with a best-estimate diagnosis of an ASD (i.e., autistic disorder, Asperger disorder, or pervasive developmental disorder-not otherwise specified) by the criteria of the Diagnostic and Statistical Manual of Mental Disorders, Fourth Edition, Text Revision (DSM-IV-TR) [1]. We excluded probands who had an insufficient DNA sample for genotyping and/or who lacked RBS-R data. In total, 88 probands (75 male, 13 female; mean age 8.3 ± 4.8 years) were identified as meeting the above criteria as of the data freeze on 1 December 2009, coinciding with data submission to the National Database for Autism Research. In this group, 64.8% of participants were white; 6.8% were on concurrent psychotropic medications (these subjects were excluded from the neurochemical analyses of Autism Centers of Excellence (ACE) but were included for this report); and 72.7% were classified as 'strictly defined autism', as they met the autism classification on both ADI-R and ADOS. There were seven missing RBS-R data points (completion rate of 99.8%). These missing data points were treated as 'missing' for both affected subscale and sum scores.
For the UF sample, the inclusion criteria were chronological age between 6 and 18 years, clinical diagnosis of an ASD, sufficient DNA sample available for genotyping, and an absence of any specific genetic diagnosis. Because the UF sample was recruited primarily from a mail survey study, the clinical diagnoses were not independently validated. We therefore used the Social Communication Questionnaire (SCQ) [23] and excluded anyone with an SCQ total score less than 15 in this report, because a validation study had suggested that the SCQ discriminated well between patients with and without ASD, with a sensitivity of 0.85 and a specificity of 0.75 [24]. The UF sample consisted of 91 subjects with an ASD (75 male, 16 female; mean age 10.8 ± 3.6 years); 74.7% were white and 63.7% were on concurrent psychotropic medications. There were no missing RBS-R data points in the UF sample.

SSC sample
The phenotype and genotype data were obtained through the Simons Foundation Autism Research Initiative (SFARI) Base [25] with approval from the UF IRB and SSC. In total, 737 probands had available genotype data released on 27 July 2010. We excluded seventeen probands for this report: three probands who were twins of designated probands, and fourteen probands whose phenotypic data were not yet available from the SFARI collection version 10 released on 1 November 2010. Therefore, the SSC sample used in this report included 720 children (age 6.9 ± 2.8 years) who met the criteria for ASD or autism on both the ADI-R and ADOS (563 with 'strictly defined autism') and their biological parents (720 trio families). There were 620 male and 100 female children, and 81% of the children were white. There were 56 missing RBS-R data points (completion rate 99.8%), which were treated as 'missing' for both affected subscale and sum scores.

RRB assessment
We used the RBS-R as our primary RRB measure. The RBS-R is an empirically derived, standardized and psychometrically sound rating scale, targeting a variety of abnormal repetitive behaviors [26,27]. The RBS-R includes forty-three individual items grouped into six empirically derived subscales (stereotyped behavior, self-injurious behavior, compulsive behavior, ritualistic behavior, sameness behavior and restricted behavior), together with a sum score. A recent factor analysis by Lam and Aman (2007) produced a five-factor solution: stereotyped behavior, self-injurious behavior, compulsive behavior, ritualistic/sameness behavior and restricted interests [28].
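The completion rates reported for the UIC and SSC samples can be checked with simple arithmetic. The sketch below assumes that the denominator is the number of probands multiplied by the forty-three RBS-R items; this denominator is our assumption and is not stated explicitly in the text. The function name `completion_rate` is hypothetical.

```python
RBSR_ITEMS = 43  # number of individual RBS-R items given in the text


def completion_rate(n_subjects, n_missing, n_items=RBSR_ITEMS):
    """Fraction of item-level data points that are present.

    Assumes every subject contributes n_items data points.
    """
    total_points = n_subjects * n_items
    return 1.0 - n_missing / total_points


# UIC sample: 88 probands, 7 missing data points
print(f"UIC: {completion_rate(88, 7):.1%}")
# SSC sample: 720 probands, 56 missing data points
print(f"SSC: {completion_rate(720, 56):.1%}")
```

Under this assumption both samples round to the 99.8% completion rate reported above.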

SNP genotyping
Three SLC25A12 SNPs (rs2056202, rs908670, rs2292813) were selected for this study, because rs2056202 and rs2292813 were the original two SNPs associated with ASDs, and rs908670 was the most strongly associated SNP within SLC25A12 in the AGP GWAS [15]. The SNPs were genotyped using a commercial assay (TaqMan® SNP Genotyping Assay; Applied Biosystems, Foster City, CA, USA) for the UIC-UF sample. For the SSC sample, the genotype data for these SNPs were available from the Illumina® 1M/1Mduo Genechip data (https://sfari.org/sfari-base).

Statistical analyses
The associations between quantitative RRB measures (RBS-R subscale and sum scores) and specific SNPs or haplotypes were tested using linear regression analyses, as implemented in PLINK (http://pngu.mgh.harvard.edu/~purcell/plink/). The '--hap-omnibus' option was used to jointly estimate all haplotype effects at that position. Potential confounding factors, including age, gender, population (white versus non-white) and recruitment site (UIC versus UF for the UIC-UF sample), were treated as covariates. For the SSC sample, we used the DFAM and QFAM tests implemented in PLINK to examine the family-based association for qualitative traits (ASDs) and quantitative traits (RRB). In addition, for the SSC sample only, we added full scale IQ (FSIQ) into the linear regression model as an additional covariate to examine the effect of FSIQ. The '--mperm 10000' option was used to generate a single-point p value (EMP1) to correct for non-normal trait distributions, and a permutation p value to correct for multiple SNPs/haplotypes (but not multiple phenotypes) in this study. Furthermore, we transformed the mean and standard deviation of RBS-R subscale and sum scores to '0' and '1' to obtain a 'standardized' β (a regression coefficient), allowing β to be comparable across different samples and phenotypes. The criterion for significance was set at a permutation p value of <0.05. We also denoted any finding with a permutation p value of >0.05 but <0.1 as a trend in this report.
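The idea of a standardized β with an empirical p value can be illustrated with a minimal sketch. This is not the PLINK implementation: it assumes an additive coding of the tested allele (0, 1 or 2 copies), uses ordinary least squares from NumPy, and permutes the allele dosages rather than reproducing PLINK's own permutation scheme. The function names `standardized_beta` and `permutation_p` are hypothetical.

```python
import numpy as np


def standardized_beta(y, dosage, covars):
    """Regression coefficient of allele dosage on a z-scored trait.

    y      : trait scores (e.g., an RBS-R subscale score)
    dosage : copies of the tested allele per subject (assumed additive)
    covars : covariates, one column per covariate (1-D or 2-D array)
    """
    y = np.asarray(y, dtype=float)
    z = (y - y.mean()) / y.std(ddof=1)  # rescale trait to mean 0, SD 1
    design = np.column_stack([np.ones(len(z)), dosage, covars])
    coef, *_ = np.linalg.lstsq(design, z, rcond=None)
    return coef[1]  # coefficient on the dosage column


def permutation_p(y, dosage, covars, n_perm=10000, seed=0):
    """Empirical two-sided p value from shuffling dosages across subjects."""
    rng = np.random.default_rng(seed)
    dosage = np.asarray(dosage, dtype=float)
    observed = abs(standardized_beta(y, dosage, covars))
    hits = sum(
        abs(standardized_beta(y, rng.permutation(dosage), covars)) >= observed
        for _ in range(n_perm)
    )
    # (R + 1) / (N + 1) keeps the estimate away from exactly zero
    return (hits + 1) / (n_perm + 1)
```

Because the trait is z-scored before fitting, the coefficient is unchanged by any linear rescaling of the raw score, which is what makes β comparable across subscales and samples.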
The IBM® SPSS® Statistics software (version 19; SPSS Inc., Chicago, IL, USA) was used for descriptive analyses to compare the characteristics across the three samples (UIC versus UF versus SSC), including age, gender, population (white versus non-white), three adaptive behavior domains (communication, daily living, socialization) and composite scores on the Vineland Adaptive Behavior Scale-II (VABS-II) [29], the RBS-R subscales and the Aberrant Behavior Checklist (ABC) [30] factor scores (Table 1).
The effect of each individual covariate on RBS-R was examined using linear regression analyses in PLINK, while controlling for the effects of the other covariates and tested SNPs. The significance was set at an uncorrected p value of <0.05. The mean RBS-R subscale and sum scores for individual genotypes of the three SNPs were calculated using the PLINK option of '--qt-mean' (Table 2). Hardy-Weinberg equilibrium (HWE) and Mendelian errors were examined using the PLINK options of '--hwe' and '--me'. Quanto software (version 1.1; University of Southern California, CA, USA; quanto@icarus2.usc.edu) was used for a power calculation assuming an additive mode of inheritance and type 1 error rate of 0.05 [31].
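For readers unfamiliar with the HWE check, the following sketch shows the textbook chi-square version computed from genotype counts. It is an illustration only: it is not the test that PLINK runs, and the function name `hwe_chi_square` is hypothetical. The p value uses the identity that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x/2)).

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test for Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb : observed counts of the three genotypes.
    Returns (chi-square statistic, p value with 1 degree of freedom).
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of the first allele
    q = 1.0 - p
    observed = (n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Counts that match HWE expectations exactly give a statistic of 0
print(hwe_chi_square(25, 50, 25))
```

With small counts of the rare homozygote, as for the A/A genotypes here, an exact test is generally preferable to this approximation.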

Post hoc analyses
We also conducted analysis of covariance (ANCOVA) to examine the four ADI-R scores highlighted in the Silverman study [20] (i.e., age at phrase speech, overall level of language, circumscribed interests, and routines and rituals) by rs2056202 and rs2292813 genotype groups in both the UIC (n = 88 probands) and SSC (n = 720 probands) samples, using the General Linear Model, as implemented in the IBM® SPSS® Statistics software, with gender and age treated as covariates. These analyses were performed mainly because the direction of associated alleles in our study differed from the Silverman study [20]. Because the 'age at phrase speech' item included non-scale codes (e.g., 994, 996 and 997), we treated these codes as 'missing'. The numbers of non-scale codes were 16 in the UIC sample and 50 in the SSC sample. The three genotype groups were reduced to two (A/+ versus G/G) due to the low frequency of the A/A genotype.
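The two recoding steps described above can be sketched as follows. The set of non-scale codes contains only the three examples named in the text; whether other codes exist is not stated, so the set should be read as an assumption. The function names and the 'A/G' string format for genotypes are hypothetical.

```python
import math

# Only the example codes given in the text; the full list is an assumption.
NON_SCALE_CODES = {994, 996, 997}


def clean_age_at_phrase(value):
    """Return the age as a float, or NaN when the value is a non-scale code."""
    return math.nan if value in NON_SCALE_CODES else float(value)


def collapse_genotype(genotype):
    """Collapse three genotype groups into two: 'A/+' (A/A or A/G) and 'G/G'."""
    alleles = set(genotype.split("/"))
    return "A/+" if "A" in alleles else "G/G"


print([collapse_genotype(g) for g in ("A/A", "A/G", "G/G")])
print(clean_age_at_phrase(996), clean_age_at_phrase(30))
```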

Results
Descriptive analyses of sample characteristics (Table 1) showed that the levels of adaptive behaviors were much higher in the SSC sample, followed by the UIC and UF samples. In addition, the SSC participants had lower levels of maladaptive behaviors measured on the RBS-R and ABC, whereas there were no significant differences in the mean scores of RBS-R and ABC between the UIC and UF samples. Interestingly, the mean scores of the RBS-R appeared to be related to the individual SNP genotypes (Table 2); for instance, subjects with A/G genotypes of rs2056202 and rs2292813 had higher RBS-R scores than those with A/A or G/G genotypes in the UIC-UF sample. However, there were too few subjects with A/A genotypes of rs2056202 or rs2292813 for a valid interpretation. In the SSC sample, a trend toward reduction in RBS-R subscale and sum scores was seen across three genotype groups (A/A > A/G > G/G).
The linear regression analyses for individual SNPs revealed significant positive associations between the A allele of rs2292813 and three RBS-R scores (ritualistic, sameness, sum; permutation p = 0.017, 0.018, 0.034, respectively) in the UIC-UF sample. The SSC sample had significant positive associations between the A allele of rs2056202 and four RBS-R scores (stereotyped, sameness, restricted, sum; permutation p = 0.026, 0.021, 0.016, 0.027, respectively) and significant positive associations between the A allele of rs908670 and two RBS-R subscales (stereotyped, self-injurious; permutation p = 0.040, 0.009, respectively) (Table 3). For the SSC sample only, FSIQ was added into the linear regression model as a covariate along with age, gender and population (Table 4); the association results remained similar to the results shown in Table 3.
We also examined transmission disequilibrium (TD) of the three SLC25A12 SNPs in the SSC sample using the DFAM test as implemented in PLINK, because previous studies [11][12][13][14] suggested overtransmission of the G allele(s) of rs2056202 and/or rs2292813; however, we did not find any evidence of overtransmission of either allele (Table 7). In previous studies [11][12][13][14], the transmission rates of the G alleles were estimated at approximately 59 to 65% for rs2056202 and 57 to 65% for rs2292813. We examined whether the SSC sample had enough power to detect TD, assuming an additive model for a qualitative trait and type 1 error rate of 0.05, using Quanto software. The SSC sample (720 trios) had 80% power to detect a locus with a relative risk of 1.4, which roughly translates to a transmission rate of 58%. Hence, the SSC sample had adequate power to detect an effect size similar to that detected in previous studies. On the other hand, the QFAM test for the quantitative RRB traits revealed positive associations between the A alleles of all three SNPs and the RBS-R scores (Table 8), which was consistent with the results of the linear regression analyses.
Table 9 shows our post hoc analyses, the attempted replication of the Silverman study [20] for the examination of four ADI-R scores by rs2056202 and rs2292813 genotype groups in the samples from UIC (n = 88 probands) and SSC (n = 720 probands). The results from the Silverman study are included in the table for comparison. Neither the UIC nor the SSC sample replicated the findings of the Silverman study; however, the SSC sample showed a trend toward more severe 'overall level of language' score (p < 0.05) in the G/G genotype groups of both rs2056202 and rs2292813. We estimated that the Silverman study had a standardized β of 0.42 for 'routines and rituals' and rs2056202 (adjusted mean difference of 0.51 and standard deviation of 1.2).
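The two conversions used above can be reproduced with a short sketch. It assumes the common approximation that, for a heterozygous parent, the probability of transmitting the risk allele is RR / (1 + RR), where RR is the allelic relative risk; this formula is our assumption about how the 58% figure was obtained. The standardized effect is taken as the adjusted mean difference divided by the standard deviation. The function names are hypothetical.

```python
def transmission_rate(relative_risk):
    """Expected transmission rate of the risk allele from heterozygous parents."""
    return relative_risk / (1.0 + relative_risk)


def relative_risk_from_rate(rate):
    """Inverse of transmission_rate: allelic relative risk implied by a rate."""
    return rate / (1.0 - rate)


def standardized_effect(mean_difference, sd):
    """Mean difference expressed in standard deviation units."""
    return mean_difference / sd


print(f"{transmission_rate(1.4):.1%}")           # relative risk of 1.4
print(f"{relative_risk_from_rate(0.59):.2f}")    # lower end of earlier reports
print(f"{standardized_effect(0.51, 1.2):.3f}")   # Silverman estimate
```

Under these assumptions a relative risk of 1.4 corresponds to a transmission rate of about 58%, and the Silverman estimate corresponds to roughly 0.42 to 0.43 standard deviations.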
We calculated our study power to see whether our samples had adequate power to replicate Silverman's finding, using Quanto with an assumption of an additive model of a quantitative locus and type 1 error rate of 0.05. We
Analyses of the effect of an individual covariate on the RBS-R revealed negative correlations between age and stereotyped behavior (β = -0.004 to -0.005; p < 0.005 to 0.0001), and between female gender and restricted behavior (β = -0.228 to -0.242; p < 0.05). In addition, positive correlations were shown between age and sameness behavior (β = 0.002; p < 0.01), between female gender and self-injurious behavior (β = 0.217; p < 0.05), and between population other than white, and self-injurious, ritualistic, and restricted behaviors (β = -0.188 to 0.329; p < 0.05 to 0.005). Additionally, we found that FSIQ was a significant covariate for several RBS-R subscale scores in the SSC sample, which included stereotyped behavior (p = 0.00001 to 0.00003), self-injurious behavior (p = 0.019 to 0.027), compulsive behavior (p = 0.0002 to 0.0004) and RBS-R sum score (p = 0.0009 to 0.0017).

Discussion
SLC25A12 was implicated in ASD through a linkage-directed association study [12] and more than one independent replication association study [11,13,14], support from a recent GWAS [15], and its role in central nervous system development [32][33][34][35]. However, not all association studies between SLC25A12 and ASDs have been positive [17][18][19], indicating the need for further investigation of the basis of this inconsistency. In the present study, we examined SLC25A12 as a quantitative trait locus for RRB in people with ASDs, based on a positive correlation between the G allele of rs2056202 and an ADI-R 'routines and rituals' subdomain score [20]. We initially found evidence for positive associations between the A allele of rs2292813 and RBS-R scores (ritualistic, sameness, sum) and between the A-A haplotype of rs2056202-rs2292813 and the same RBS-R scores in our UIC-UF sample. Although our finding of association between SLC25A12 and quantitative RRB traits (ritualistic, sameness behaviors) in the UIC-UF sample appeared to be comparable with the previous association finding [20], the direction of the associated allele was different (the A alleles of rs2056202 and rs2292813 in our samples versus the G allele of rs2056202 in the Silverman study). This 'flip-flop' phenomenon raised an important question about whether this study provides confirmation of an association between SLC25A12 and RRB or a false positive finding [36]. To clarify this issue, we examined the association in a much larger sample consisting of 720 trio families available from the SFARI database. Although the significantly associated SNPs and RRB scores varied between these two samples (i.e., rs2292813 in the UIC-UF sample, rs2056202/rs908670 in the SSC sample), both datasets showed consistently positive associations with the A alleles of rs2056202 and rs2292813, as evidenced by the positive β values in Table 3 and Table 4.
The β values for the A-A and the G-G haplotypes were also consistent across the UIC-UF and SSC samples in the haplotype analyses (positive for the A-A haplotype and negative for the G-G haplotype in Table 5). The A-G haplotype was somewhat puzzling initially, as it was found to have a negative β in the UIC-UF sample but a positive β in the SSC sample. Interestingly, in the UIC-UF sample, the positively associated 'A' allele of rs2292813 was present only on the A-A haplotype, whereas the negatively associated 'G' allele of rs2292813 was present on both A-G and G-G haplotypes, creating a negative β value for the A-G haplotype. In the SSC sample, by contrast, the more positively associated A allele of rs2056202 was present on the A-A haplotype about 70% of the time, and on the A-G haplotype about 30% of the time, creating a positive β value for the A-G haplotype. Therefore, the individual haplotype associations were consistent with the allelic associations; that is, positive association with the A allele of rs2292813 in the UIC-UF sample, and positive association with the A allele of rs2056202 in the SSC sample. In addition, the significantly associated SNPs and phenotypes may vary between datasets even in a true association [37]. For example, varying patterns of LD across samples could lead to the susceptibility variant being associated with different variants in different samples. Taken together, these results argue against the probability of a false positive in these (UIC-UF and SSC) samples, despite the direction of the association being different from the Silverman study.
In this study, we also attempted to replicate the Silverman study directly, using the UIC and SSC samples, because it was not clear whether differences in the phenotype used (RBS-R in our study versus ADI-R in the Silverman study) or in the analytic methods (linear regression in our study versus ANCOVA in the Silverman study) might be contributing to the opposite direction of associated allele. However, even with the same phenotypes and comparable analytic methods, our samples did not replicate the Silverman findings. This suggests that genetic and phenotypic heterogeneity of ASD samples may at least partly account for the differences across the studies. For instance, we estimated that 51% of the A/+ group, 57% of the G/G group and 55% of the entire Silverman sample had an overall level of language score of 0. These numbers contrast with 82% of the UIC group and 93% of the SSC group having a score of 0 in the overall level of language. In addition, 'overall level of language' would have affected the ADI-R score of 'verbal rituals' and the subdomain score of 'routines and rituals,' which may have influenced the association findings. Furthermore, it is possible that we overestimated our study power based on an estimate of effect size from the Silverman study. This is often referred to as the 'winner's curse', in which the true effect size may have been much smaller than an estimate from the primary study [38]. Another point to consider is that the ADI-R is not as quantitative as the RBS-R. Therefore, its scores may not have provided sufficient variability to observe the same association.
Table 9 Four ADI-R trait scores highlighted by Silverman [20], grouped by the presence or absence of at least one A allele for rs2056202 and rs2292813.
Several family-based association studies previously identified the G alleles of rs2056202 or rs2292813 as risk alleles for autism [11][12][13][14]. These results should not be conflated with our finding (the A alleles of rs2056202 or rs2292813 associated with RRB), because our study examined associations with RRB, not with autism. Although RRB and autism are related, they are not the same phenotype, so the associated allele is not expected to be the same. In other words, there is variability of RRB in subjects diagnosed with ASDs. If there were not, then it would not be possible to detect an association with degree of RRB within an autism sample. If an allele is associated with increasing RRB within an autism sample, then whether that allele will show an association with autism depends on the distribution of RRB in the sample. The association with autism may be with the same allele, the opposite allele or neither allele in a sample in which RRB tends to be high, low or mixed within the autism sample, respectively.
In this study, we did not find evidence for TD between ASDs and SLC25A12 in the SSC sample. Because the SSC sample was estimated to have adequate power to detect a locus with a relative risk of 1.4, this result further emphasizes the genetic heterogeneity of ASD (making the effect size smaller or non-existent in the SSC sample). Of note, the SSC sample data were contributed from multiple groups in various regions, increasing the genetic heterogeneity even more in this specific sample. In addition, we would need to consider that the true effect size may have been much smaller than 1.4, an estimate from the previous studies. We also confirmed the effect of age on stereotyped behavior (i.e., older subjects with less severe stereotyped behavior), which is consistent with previous studies [39,40]. In addition, we found gender and population effects on the RBS-R subscale scores in the SSC sample. Moreover, we did not find any effect of study site in the UIC-UF sample, which is particularly interesting because the UIC and UF samples are different in terms of recruitment and assessment protocols, geographic locations, and the rate of concurrent psychotropic medications.

Conclusions
Our study confirmed an association between SLC25A12 and RRB. Genetic and phenotypic heterogeneity may account for the flip-flop phenomenon of associated alleles between our study and the Silverman study, and for the absence of association between SLC25A12 and ASDs in the SSC trio sample. In addition, the examined SLC25A12 SNPs may not be the risk alleles themselves but may be in LD with a real risk allele for RRB and/or ASDs. Because this study was designed to replicate previous findings, it did not fully tag the gene. Therefore, we identified seven tagging SLC25A12 SNPs with pair-wise r² < 80% and MAF > 10% from the International HapMap Project (http://hapmap.ncbi.nlm.nih.gov/). Future research efforts should include searching for risk alleles nearby using denser genetic markers including (but not limited to) all tagging SNPs, and dense resequencing of the interval to find genetic variants possibly more directly related to phenotype.