Deep exon resequencing of DLGAP2 as a candidate gene of autism spectrum disorders

Background We recently reported a terminal deletion of approximately 2.4 Mb at chromosome 8p23.2-pter in a boy with autism. The deleted region contained the DLGAP2 gene that encodes the neuronal post-synaptic density protein, discs, large (Drosophila) homolog-associated protein 2. The study aimed to investigate whether DLGAP2 is genetically associated with autism spectrum disorders (ASD) in general. Methods We re-sequenced all the exons of DLGAP2 in 515 patients with ASD and 596 control subjects from Taiwan. We also conducted bioinformatic analysis and family study of variants identified in this study. Results We detected nine common single nucleotide polymorphisms (SNPs) and sixteen novel missense rare variants in this sample. We found that AA homozygotes of rs2906569 (minor allele G, alternate allele A) at intron 1 (P = 0.003) and CC homozygotes of rs2301963 (minor allele A, alternate allele C) at exon 3 (P = 0.0003) were significantly over-represented in the patient group compared to the controls. We also found no differences in the combined frequency of rare missense variants between the two groups. Some of these rare variants were predicted to have an impact on the function of DLGAP2 using informatics analysis, and the family study revealed most of the rare missense mutations in patients were inherited from their unaffected parents. Conclusions We detected some common and rare genetic variants of DLGAP2 that might have implications in the pathogenesis of ASD, but they alone may not be sufficient to lead to clinical phenotypes. We suggest that further genetic or environmental factors in affected patients may be present and determine the clinical manifestations. Trial registration ClinicalTrials.gov, NCT00494754


Background
Autism spectrum disorders (ASD) are a group of childhood-onset neurodevelopmental disorders characterized by impaired verbal/nonverbal communication, abnormal reciprocal social interaction, and the presence of stereotyped behaviors and restricted interests. Due to increased awareness and clinical sensitivity to ASD, broadening of the diagnostic criteria of ASD by including Asperger's disorder and Pervasive Developmental Disorder, not otherwise specified in addition to autistic disorder, and other contributing factors, the prevalence of ASD has increased markedly in the past decade. Recent data showed that up to approximately 11 persons per thousand in the USA are affected with ASD, and males are more predominantly affected than females [1][2][3]. The literature documents strong evidence of a high degree of genetic influence in the etiology of autism with high heritability estimated to be more than 90% [4]. Genetic approaches such as cytogenetic analysis, genome-wide linkage and association scans, and candidate gene analysis, have been used to dissect the genetic complexity of ASD [5].
Traditional cytogenetic studies [6][7][8][9] and the recent array-based comparative genomic hybridization (array CGH) analysis have shown that chromosome abnormalities and rare copy number variants [10,11] are associated with ASD. Various chromosome abnormalities such as deletion, duplication, inversion, and translocation were identified in ASD patients. In particular, the advent of array CGH technology has greatly facilitated the detection of formerly undetectable submicroscopic copy number variants that are associated with ASD [10][11][12][13][14].
Our group previously identified two novel chromosome deletions in ASD using karyotyping analysis and array CGH [15]. One was a terminal deletion of approximately 2.4 Mb at 8p23.2-pter detected in a male patient with autistic disorder [15]. Several genes with neurobiological functions, such as discs, large (Drosophila) homolog-associated protein 2 (DLGAP2), ceroid-lipofuscinosis, neuronal 8 (CLN8), the Rho guanine nucleotide exchange factor 10 (ARHGEF10) and F-box protein 25 (FBXO25), were mapped to this region. It is likely that haploinsufficiency of one or several of these genes might result in the clinical phenotypes of the affected patients. Hence, these genes might be considered as candidate genes for ASD in general.
DLGAP2 (GeneID 9298) encodes the discs, large (Drosophila) homolog-associated protein 2, which is also called PSD-95/SAP90-binding protein 2 and SAP90/PSD-95-associated protein 2 (SAPAP2). The protein is one of the membrane-associated guanylate kinases localized at the post-synaptic density that plays a role in the molecular organization of synapses and in neuronal cell signaling [16]. These kinases are a family of signaling molecules expressed at various submembrane areas, and contain the PDZ, SH3 and the guanylate kinase domains. Several studies have suggested that the synapse associated proteins (SAPs) localized at postsynaptic density are involved in the pathophysiology of psychiatric disorders [17][18][19][20][21]. Marshall et al. detected 13 loci with recurrent CNV in autism cases; they found that several postsynaptic density genes and synapse complex genes were mapped in these CNVs [22]. They also found that a genomic DNA duplication intersected the DLGAP2 gene in a patient with autism [22]. In addition, Ozgen et al. reported a classical inv dup del(8p) in a female patient with autism whose DLGAP2 gene was located within the 6.9 Mb terminal deletion [23]. Together with the finding in our previous study, these studies suggest that the DLGAP2 gene is an important candidate gene of ASD. To test this hypothesis, we conducted a deep resequencing of all the exons of DLGAP2 in a sample of patients with ASD from Taiwan. Herein, we present our findings of the genetic analysis of DLGAP2.

Subjects and procedures
All subjects were Han Chinese from Taiwan. Patients meeting the diagnostic criteria of either autistic disorder or Asperger's disorder according to the DSM-IV and ICD-10 were enrolled in this study from the psychiatry department of a university hospital, a private medical center, and schools and early intervention centers in northern Taiwan. Subjects with ASD were aged 3 to 25 years, and their clinical diagnosis of ASD was confirmed by a structured interview using the Chinese version of the Autism Diagnostic Interview-Revised (ADI-R) [24,25]. They received clinical assessments and provided DNA data at the university hospital, whether they were recruited from this university hospital or referred from elsewhere. Patients with known chromosomal abnormalities and associated medical conditions were excluded from the study. Subjects receiving routine medical check-ups at the Department of Family Medicine of a medical center were recruited as controls. The mental status of the control subjects was screened by a senior psychiatrist. Individuals with major psychiatric disorders including ASD were excluded.
The Research Ethics Committee of the research sites approved this study. The patients and their parents were informed that participation in this study was completely voluntary and that nonparticipation would not influence their treatment. Written informed consent was obtained from the patients if they were able to understand the contents of the study, the parents of all the patients, and all the control subjects after the procedures were fully explained. A total of 515 patients with ASD were recruited into this study, including 449 male patients (mean age ± standard deviation (SD) = 8.9 ± 4.6 years) and 66 female patients (mean age ± SD = 8.5 ± 4.4 years). The control group comprised 596 individuals including 263 males (mean age ± SD = 42.5 ± 15.1 years) and 333 females (mean age ± SD = 45.2 ± 13.4 years).
The ADI-R data of the 515 patients revealed the scores as 21.19 ± 5.78 in the 'qualitative abnormalities in reciprocal social interaction' (cut-off = 10), 15.35 ± 4.19 in the 'qualitative abnormalities in communication, verbal' (cut-off = 8), 8.58 ± 3.28 in the 'qualitative abnormalities in communication, nonverbal' (cut-off = 7), and 7.14 ± 2.42 in the 'restricted, repetitive and stereotyped patterns of behaviors' (cut-off = 3). All patients with ASD were noted to have had abnormal development at or before 36 months.
Genomic DNA was prepared from peripheral blood using the Puregene DNA purification system (Gentra Systems Inc., Minneapolis, MN, USA) according to the manufacturer's instructions.

PCR amplification and sequencing
The genomic sequences of human DLGAP2 are available from the NCBI Reference Sequence: NM_004745.3. The human DLGAP2 comprises twelve exons that span approximately 207 kb on chromosome 8p23.2-23.3 [16]; the schematic genomic structure of the DLGAP2 is shown in Figure 1. Optimal PCR primer sequences were generated to amplify each exon of the DLGAP2 using Primer3 (http://bioinfo.ut.ee/primer3/). All the primer sequences, optimal annealing temperatures and size of each amplicon are listed in Additional file 1: Table S1. After PCR, aliquots of PCR products were processed using a PCR Pre-Sequencing Kit (USB, Cleveland, OH, USA) to remove residual primers and dNTPs following the manufacturer's protocol. The purified PCR products were subjected to direct sequencing using the ABI Prism™ BigDye™ Terminator Cycle Sequencing Ready Reaction Kit Version 3.1, and the ABI autosequencer 3730 (Perkin Elmer Applied Biosystems, Foster City, CA, USA), according to the manufacturer's protocol. The quality of the sequencing results was visualized using Chromas 2.4.1 software (Technelysium Pty Ltd, South Brisbane, Australia). For variant identification, sequencing results of each subject were aligned and compared with the reference sequences using BioEdit software (http://www.mbio.ncsu.edu/bioedit/bioedit.html). To verify the authenticity of mutations identified in this study, repeated PCR from genomic DNA and re-sequencing of the amplicon in both directions were performed. The nomenclature of genetic variants follows the rules of the 'Nomenclature for description of human sequence variations' [26].

Statistical analysis
For the common single nucleotide polymorphisms (SNPs), the differences in the allele and genotype frequencies between the patients and controls were assessed with the Chi-square test. Fisher's exact test was used to compare the combined frequency of rare variants between the patient and the control groups. It was also used to compare the frequency of damaging and functional rare missense variants between the ASD and control groups. Haplotype-based association analyses were performed using the SHEsis computer program [27]. A P value of less than 0.05 was considered statistically significant, and Bonferroni correction for multiple comparisons was performed when appropriate. We also calculated the Q value for each test at a false discovery rate of 5% using QVALUE software (http://genomics.princeton.edu/storeylab/qvalue/) [28].
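The two tests named above can be illustrated with a minimal Python sketch for a 2 x 2 table (for example, carriers versus non-carriers in patients and controls). The function names are hypothetical, the chi-square version omits any continuity correction, and the two-sided Fisher P value uses the common "sum of tables no more probable than the observed one" convention; the software actually used by the authors for these tests is not stated, so this is an illustration rather than a reproduction of their analysis.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table [[a, b], [c, d]].

    Returns (statistic, p_value) with 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return chi2, math.erfc(math.sqrt(chi2 / 2))

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact P value for the table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell with fixed margins.
        return math.comb(row1, x) * math.comb(row2, col1 - x) / math.comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum all tables that are as probable as, or less probable than, the observed one.
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1)) if p <= p_obs * (1 + 1e-9))
    return min(1.0, total)

if __name__ == "__main__":
    # Hypothetical counts, not taken from the study's tables.
    print(chi_square_2x2(20, 30, 30, 20))
    print(fisher_exact_two_sided(3, 1, 1, 3))
```

Fisher's exact test is the usual choice when expected cell counts are small, as is typical for rare variants, whereas the chi-square approximation is adequate for common SNPs with large counts in every cell.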

Results
After sequencing all the exons of DLGAP2 in 515 patients and 596 control subjects, we identified nine common SNPs (defined as with a minor allele frequency > 0.05), and a total of 49 rare variants (defined as with a frequency < 0.05) in this sample. The locations of these variants are listed in Figure 1.
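As a small illustration of the common/rare split used here, the following Python sketch derives allele frequencies from genotype counts and applies the 0.05 threshold. The function names and the example counts are hypothetical; the treatment of a frequency of exactly 0.05 is also an assumption, since the text only defines the two sides of the threshold.

```python
def allele_frequencies(n_ref_hom, n_het, n_alt_hom):
    """Return (ref_freq, alt_freq) for a biallelic variant from genotype counts."""
    total_alleles = 2 * (n_ref_hom + n_het + n_alt_hom)
    alt_freq = (2 * n_alt_hom + n_het) / total_alleles
    return 1 - alt_freq, alt_freq

def classify_variant(n_ref_hom, n_het, n_alt_hom, threshold=0.05):
    """Label a variant 'common' if its minor allele frequency exceeds the threshold.

    Assumption: a frequency exactly equal to the threshold is labelled 'rare'.
    """
    maf = min(allele_frequencies(n_ref_hom, n_het, n_alt_hom))
    return "common" if maf > threshold else "rare"

if __name__ == "__main__":
    # Hypothetical genotype counts, not taken from Table 1.
    print(allele_frequencies(250, 200, 65))
    print(classify_variant(250, 200, 65))   # many carriers
    print(classify_variant(1110, 1, 0))     # a single heterozygous carrier
```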

Case-control association study of common SNPs
The genotype and allele frequencies of nine common variants in the patients and control subjects are listed in Table 1. Two SNPs (rs2906569 and rs2301963) were found to have significant differences in genotype frequency distribution between the patient and control groups, even after correction for multiple comparisons. The rs2906569 (A > G) was located at intron 1. The genotype AA homozygotes (G: minor allele, A: alternate allele) were significantly over-represented in the patient group compared to the control group (odds ratio: 1.46; 95% confidence interval, 1.13 to 1.87; P = 0.003) (Table 2). When the subjects were sub-grouped by gender, the over-representation of genotype AA homozygotes was still observed in male patients but not in female patients (Table 2). The rs2301963 (C > A) was a missense variant (P384Q) located at exon 3 (A: minor allele, C: alternate allele). The CC homozygotes were significantly over-represented in the patient group compared to the control group (odds ratio: 1.30; 95% confidence interval, 0.99 to 1.70; P = 0.0003) (Table 2). When the subjects were sub-grouped by gender, the over-representation of CC homozygotes was observed in male patients but not in female patients (Table 2).
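For readers who want to see how an odds ratio and its 95% confidence interval of the kind quoted above are typically obtained from a 2 x 2 table, here is a minimal Python sketch using the standard log (Woolf) method. The function name and the example counts are hypothetical, and it is an assumption that the reported intervals were computed this way; the genotype counts from Table 2 are not reproduced here.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio and Woolf (log-based) confidence interval.

    a: patients with the genotype      b: patients without it
    c: controls with the genotype      d: controls without it
    z: normal quantile (1.96 gives an approximate 95% interval)
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) is the root of the summed reciprocal cell counts.
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper

if __name__ == "__main__":
    # Hypothetical counts for illustration only.
    print(odds_ratio_ci(20, 80, 10, 90))
```

An interval whose lower bound lies above 1 indicates over-representation of the genotype in patients at the chosen confidence level; the method requires all four cells to be non-zero.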
Further linkage disequilibrium (LD) analysis showed strong LD among rs6996621, rs2906568, rs2906569, and rs60089073. In addition, rs2235112 and rs2235113 also showed strong LD (Figure 1). In haplotype-based association analysis derived from nine SNPs, we found significant differences in the haplotype distribution of ACACAAGGT and CCACCAACT between the ASD patients and controls, but only haplotype CCACCAACT was sustained after correction for multiple comparisons (Table 3).
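The strength of LD between two biallelic SNPs is usually summarized by D' and r². The Python sketch below computes both from the four haplotype frequencies. It assumes the haplotype frequencies are already known (phased); programs such as SHEsis estimate them from unphased genotypes, a step not shown here. The function name and the example frequencies are hypothetical.

```python
def ld_measures(f_AB, f_Ab, f_aB, f_ab):
    """Return (D, D_prime, r_squared) for two biallelic loci.

    Inputs are the frequencies of the four haplotypes and should sum to 1.
    """
    p_A = f_AB + f_Ab          # frequency of allele A at locus 1
    p_B = f_AB + f_aB          # frequency of allele B at locus 2
    d = f_AB - p_A * p_B       # raw disequilibrium coefficient
    if d >= 0:
        d_max = min(p_A * (1 - p_B), (1 - p_A) * p_B)
    else:
        d_max = min(p_A * p_B, (1 - p_A) * (1 - p_B))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    denom = p_A * (1 - p_A) * p_B * (1 - p_B)
    r_squared = d * d / denom if denom > 0 else 0.0
    return d, d_prime, r_squared

if __name__ == "__main__":
    # Hypothetical haplotype frequencies for illustration only.
    print(ld_measures(0.4, 0.1, 0.1, 0.4))
```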

Identification of rare genetic variants of DLGAP2
In this study, we found a total of 16 rare missense mutations in our ASD patients and control subjects. The locations of these variants are illustrated in Figure 1. These missense variants are novel and have not been reported before in the literature. Distributions of these rare missense variants are listed in Table 4. There was no significant difference in the combined frequency of rare missense mutations between the two groups (P = 1.00). Those individuals who carried these rare missense mutations had only one missense mutation; we found no individual who carried two rare missense variants simultaneously.

Family study and functional prediction of rare variants
A total of 10 missense variants (S15F, R93S, R182Q, A192S, P281A, R324W, C506R, T712M, E798Q and P917L) were detected in 13 patients (Table 4). All the patients carrying these rare variants were heterozygotes. Four of the 13 patients (A, B, J, K) inherited the variant from their mothers and 6 (C, D, E, F, G, H) from their fathers (Figure 2). Three patients did not have enough genetic information from the parents. Among these ten missense variants, seven (S15F, R93S, R182Q, A192S, R324W, C506R and E798Q) were found in the patient group only, while the other three (P281A, T712M and P917L) were also detected in the control group. Aside from these three variants that overlapped with the patient group, six missense variants (V234L, A421V, P771A, H796D, D884N, and N892K) were unique to the control group. The inheritance mode of missense variants found in the control group cannot be traced because we did not collect their parents' DNA. In the analysis of functional prediction of these 16 rare missense variants, S15F, R93S, R182Q, P281A, R324W, C506R, H796D, E798Q, and N892K were predicted to have functional impact on the protein using the PolyPhen-2 or SIFT computer program (Table 4). The ADI-R data of these 13 patients revealed scores of 22.82 ± 7.15 (range, 8 to 28) in the 'qualitative abnormalities in reciprocal social interaction' (cut-off = 10), 15.00 ± 4.84 (range, 5 to 22) in the 'qualitative abnormalities in communication, verbal' (cut-off = 8), 9.55 ± 4.39 (range, 4 to 14) in the 'qualitative abnormalities in communication, nonverbal' (cut-off = 7), and 6.00 ± 2.68 (range, 1 to 12) in the 'restricted, repetitive and stereotyped patterns of behaviors' (cut-off = 3). The average age at which the 13 patients said their first word was 37.1 ± 17.4 (range, 15 to 66) months and at which they said their first phrase was 44.8 ± 18.5 (range, 19 to 78) months. Their current average intelligence quotients (IQ) were 82.8 ± 34.0 (range, 40 to 126) for full-scale IQ, 84.9 ± 32.1 (range, 41 to 123) for performance IQ, and 82.3 ± 33.1 (range, 44 to 122) for verbal IQ, as assessed by the Wechsler Intelligence Scale for Children-third edition. Due to the lack of psychometric data for the control subjects who carried the rare missense mutations, we were not able to determine their clinical characteristics.
Clinical assessments of the parents of the 10 patients carrying rare missense mutations found that none of them achieved the clinical diagnosis of ASD. Their reports on the Chinese version of the Autism Spectrum Quotient [29] also revealed no evidence of an overt autistic trait.

Discussion
In this study, we found that rs2906569 at intron 1 and rs2301963 (P384Q) at exon 3 of DLGAP2 were associated with ASD. These two SNPs did not form significant LD in our genetic analysis. The functional significance of rs2906569 was difficult to infer, because it was located at intron 1. As to rs2301963 (C > A, P384Q), the A allele (Q allele) was the minor allele in our population, and was predicted to be probably damaging using the PolyPhen-2 computer program, but tolerated by SIFT. Based on the finding of significant over-representation of the CC homozygotes in the patient group, we suggest that the Q384 variant of the DLGAP2 might confer an increased risk of ASD, but the real mechanism and meaning of this association remain to be clarified. The small sample size of this study may have led to a false positive, which is also a limitation of this study. In a recent review of the role of common variants in autism, Devlin and colleagues reported their study on three large genome-wide association (GWA) studies of autism, each of which showed a single, non-overlapping risk locus. They found that there was no significant finding when all the data were combined. In their analysis, they found no definitive, replicated results, and they could not be certain that there was a role for common variants in autism risk [30]. Hence, an independent replication study is needed to verify our finding. We also detected a total of 16 novel missense rare variants in the patient and control groups in this study. Although some of these missense mutations were predicted to have functional impact on DLGAP2, we found no differences in their combined frequency between the two groups. In addition, most of these rare missense mutations found in the patients were inherited from their unaffected parents. Hence, the clinical relevance of these rare missense mutations to ASD is not straightforward, and needs further elucidation.
We attempted to assess the interactions between rs2906569 and rs2301963 and the missense rare variants in this study, but could not detect interactions between them. The small sample size of this study might be a major limiting factor. There were eight AA homozygotes of rs2906569 carrying rare missense variants, including four patients and four controls, and there were five CC homozygotes of rs2301963 carrying rare missense variants, including two patients and three controls. The genotype of rs2906569 and rs2301963 in subjects carrying the rare missense variants are listed in the Table 4.
To assess the phenotypic significance of two common SNPs associated with ASD, we compared the three core symptoms of ASD measured by the ADI-R and the Chinese version of Social Communication Questionnaire (SCQ) between patients with A/A and patients with A/G + G/G of rs2906569 (Additional file 2: Table S2), and between patients with C/C and patients with C/A + A/A of rs2301963 (Additional file 3: Table S3). We also compared the three core symptoms of ASD measured by the ADI-R and SCQ between patients who carried rare missense mutations and those who did not (Additional file 4: Table S4). The Chinese SCQ is a screening tool based on the Autism Diagnostic Interview-Revised (ADI-R) algorithm which corresponds to DSM-IV diagnostic criteria. It examines the three functional domains of reciprocal social interaction, communication, and restricted, repetitive, and stereotyped patterns. The Chinese SCQ was translated under the approval of Western Psychological Services and was validated by the research group led by Gau and Wu [31]. The results showed no statistically significant differences in any of the comparisons (all P values > 0.1), regardless of gender, with the following two exceptions. Patients with A/A of rs2906569 had less severe social interaction impairment than patients with A/G + G/G of rs2906569 (P = 0.0219), and this difference was noted only in male patients (P = 0.0285). However, the statistical significance was not sustained after correction for multiple testing.
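To make the last step concrete, the following Python sketch applies a Bonferroni adjustment to the two nominal P values quoted above. The number of comparisons in the family is not stated in the text, so the value m = 3 (one test per core symptom domain) is purely an illustrative assumption, and the function name is hypothetical.

```python
def bonferroni_adjust(p_value, n_tests, alpha=0.05):
    """Return (adjusted_p, is_significant) under Bonferroni correction.

    The adjusted P value is the nominal value multiplied by the number of
    tests, capped at 1; it is compared against the family-wise alpha.
    """
    adjusted = min(1.0, p_value * n_tests)
    return adjusted, adjusted < alpha

if __name__ == "__main__":
    n_tests = 3  # assumption for illustration; the actual family size is not given
    for nominal_p in (0.0219, 0.0285):  # nominal P values quoted in the text
        print(nominal_p, bonferroni_adjust(nominal_p, n_tests))
```

With any family of three or more tests, neither nominal P value remains below 0.05 after adjustment, which is consistent with the statement that significance was not sustained.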
ASD is a complex disease with highly heterogeneous genetic underpinnings, and genotype-phenotype correlation remains a challenging task. According to the common variant hypothesis of complex psychiatric disorders, the effect size of common variants is usually considered small or modest. Moreover, it is not unusual to find inconsistent results among different studies, and the common variants can only explain a small proportion of the clinical variance. In a recent report on stage two of the autism genome project genome-wide association study, Anney and colleagues found that no single SNP showed a significant association with ASD or selected phenotypes at a genome-wide level after genotyping over a million SNPs, and the clinical variance explained by common variants en masse was small [32].
In contrast, the 'rare variant hypothesis' of complex psychiatric disorders suggests that rare variants are likely to have large effect sizes and to be de novo in their origin [33]. Given that several studies have provided evidence to support the large effect size of de novo rare mutations associated with ASD [34][35][36], emerging evidence suggests the multiple-hits model may be more appropriate to explain the incomplete penetrance and varied expressivity of genetic underpinnings of ASD [37,38]. Girirajan and colleagues proposed a 'two-hit' model in which a first hit may act in concert with some factors as a second hit, such as mutations in a single gene, micro-deletions/duplications, epigenetic factors, or environmental insults, and result in variable expressivity of phenotypes in complex neuropsychiatric disorders [38]. Furthermore, in a genetic study of high-functioning, idiopathic ASD, Schaaf and colleagues reported that in addition to de novo rare mutations, patients with ASD had a significantly higher proportion of multiple events of oligogenic heterozygosity than control subjects, suggesting oligogenic heterozygosity is a new potential mechanism in the pathogenesis of ASD [39]. In our previous study, we also reported a boy with ASD who carried two CNVs that were inherited respectively from his unaffected parents [40]. Our previous case report also lent support to the two-hit and compound heterozygosity models of ASD.
In a recent study examining the patterns and rates of exonic de novo mutations in ASD, Neale and colleagues found only a small increase in the rate of de novo events in ASD. They suggested an important but limited role for de novo point mutations in ASD, and supported polygenic models of ASD [41]. In the present study, we found that the putative deleterious missense variants occurred in both patient and control groups with equal chance, and that most of the rare missense mutations were inherited from unaffected parents. We suggest that the missense mutations of DLGAP2 alone may not be sufficient for the clinical presentations of ASD, and that additional hits such as environmental insults or further genetic mutations in the affected patients may be needed, which supports the view that ASD is likely a multifactorial disease. Hence, it is difficult to find a strong association with a single gene, such as DLGAP2 in this study.