This study sought to identify biologically meaningful associations among the top signals from autism GWAS that did not meet genome-wide significance thresholds. Traditional GWAS in autism have been burdened by heterogeneity and limited sample sizes. To date, multiple traditional autism GWAS have been published, although some of these studies use overlapping datasets [3–6]. We believe that, while these studies might be underpowered to identify signals that survive genome-wide correction, their reported results do not reflect the full value of the data.
There are excellent examples to date of novel methodologies providing additional insight into the role of common genetic variants in complex human disorders, including autism. Anney and colleagues in the AGP also conducted a novel analysis of the present genome-wide data using the SNP ratio test to detect over-transmitted SNPs in predefined pathways of interest; this analysis identified multiple pathways and/or Gene Ontology terms enriched for associated signals. Interestingly, two of these pathways, ribosomal components and methyltransferase activity, were also identified in the current study as pathways regulated by eQTL-associated SNPs. Our study complements these pathway-based approaches and demonstrates a method of using expression-based SNP annotations to mine existing analyses for disease associations with functional variants in relevant tissues.
Our study identified significant enrichment of parietal and cerebellum eQTL, but not LCL eQTL, among the top SNP associations in all four of the primary AGP GWAS reported by Anney et al. This pattern of significant enrichment in brain (the affected and relevant tissue) but not in tissues peripheral to the main pathology was also seen in a study of cis-regulatory SNPs in bipolar disorder. Similarly, Below et al. report enrichment of top signals from a type 2 diabetes GWAS in tissues involved in pathogenesis (muscle and adipose) but not in LCLs. Additionally, we found no significant difference in proximity to genes between our GWAS-implicated SNPs and the SNPs forming the null distribution. Taken together, the minor allele matching, the lack of a significant difference in distance to the nearest gene, and the fact that enrichment is found in some tissues and not in others provide strong evidence against possible sources of systematic bias.
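The logic of such an enrichment test can be illustrated with a minimal sketch. It is not the pipeline used in this study: it assumes that the "minor allele matching" refers to drawing null SNP sets matched to the top SNPs by minor allele frequency (MAF) bin, and the function names, the bin edges, and the choice to exclude the top SNPs from the null pools are all hypothetical choices made for illustration.

```python
import random
from bisect import bisect_right

# Hypothetical MAF bin edges; the study's actual matching scheme is not given here.
MAF_EDGES = (0.05, 0.1, 0.2, 0.3, 0.4)

def maf_bin(maf_value, edges=MAF_EDGES):
    """Return the index of the MAF bin that a frequency falls into."""
    return bisect_right(edges, maf_value)

def eqtl_enrichment(top_snps, all_snps, eqtl_snps, maf, n_perm=1000, seed=0):
    """Empirical enrichment of eQTL among top GWAS SNPs (illustrative only).

    top_snps  : list of SNP ids for the top GWAS associations
    all_snps  : list of every SNP id available for sampling
    eqtl_snps : set of SNP ids annotated as eQTL in one tissue
    maf       : dict mapping SNP id to minor allele frequency
    Returns (observed eQTL count, empirical one-sided P value).
    """
    rng = random.Random(seed)
    top_set = set(top_snps)

    # Candidate null SNPs per MAF bin (assumption: top SNPs are excluded).
    pools = {}
    for snp in all_snps:
        if snp not in top_set:
            pools.setdefault(maf_bin(maf[snp]), []).append(snp)

    # Number of top SNPs that must be matched in each MAF bin.
    needed = {}
    for snp in top_snps:
        b = maf_bin(maf[snp])
        needed[b] = needed.get(b, 0) + 1

    observed = sum(snp in eqtl_snps for snp in top_snps)

    # Null distribution: eQTL counts in MAF-matched random SNP sets.
    exceed = 0
    for _ in range(n_perm):
        count = 0
        for b, k in needed.items():
            # Raises ValueError if a bin holds fewer than k candidate SNPs.
            draw = rng.sample(pools.get(b, []), k)
            count += sum(snp in eqtl_snps for snp in draw)
        if count >= observed:
            exceed += 1

    # Add-one correction keeps the empirical P value above zero.
    return observed, (exceed + 1) / (n_perm + 1)
```

Running the function once per tissue annotation (for example, parietal, cerebellum and LCL eQTL sets) against the same list of top SNPs would give the kind of tissue-by-tissue comparison discussed above. Matching on MAF means that a difference in allele frequency spectrum between top SNPs and background SNPs cannot by itself produce apparent enrichment.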
There was very little overlap in the eQTL targets found in parietal, cerebellum and LCLs. Only three target genes, protein NIP-SNAP homologue 3A (NIPSNAP3A), TM2 domain containing protein 2 (TM2D2) and copine 1 (CPNE1), were present in all three tissues in the analysis of the broadest diagnostic and ancestry categories. These findings also strongly suggest that the most appropriate tissue for further functional work is the tissue of pathology. This poses clear challenges for difficult-to-access tissues, such as neurons. However, there is significant evidence that human induced pluripotent stem cells, once differentiated into neurons, provide an invaluable source of neuronal tissue from live patients in which to study cellular phenotypes [24–26]. Our study highlights the importance of furthering the development of methods that allow access to tissues involved in pathology.
While we expected, and observed, overlap in top signals among the primary GWAS, our findings showed an abundance of unique eQTL and implicated a total of 140 target genes. Additionally, three genes, SLC25A12, PANX1 and PANX2, were strongly implicated by our results. Two of these genes, SLC25A12 and PANX2, were also identified in a recent study of genes differentially expressed in the brains of individuals with autism compared to control brains. PANX2, which was implicated by trans eQTL, encodes a protein that modulates the timing of neuroprogenitor commitment to a neuronal lineage in the hippocampus. It is located 500 kb proximal to SH3 and multiple ankyrin repeat domains protein 3 (SHANK3) in the Phelan-McDermid Syndrome 22q13.33 terminal deletion region that has been implicated in ASD. PANX1 was implicated in cis by multiple SNPs in multiple analyses and across multiple tissues, and has recently been shown to play a role in N-methyl-D-aspartic acid-mediated epileptiform activity. In murine models, Panx1 and Panx2 are strongly co-expressed early in the developing brain, specifically in the hippocampus.
SLC25A12 is a 109 kb gene on chromosome 2q31.1 that encodes a calcium-binding mitochondrial protein integral to the exchange of aspartate for glutamate across the mitochondrial membrane. Several studies have investigated the role of SLC25A12 in the development of autism, and SLC25A12 SNPs have been implicated [31–34]. Due to the platform used in the original AGP GWAS and quality-based filtering of imputed eQTL data, only one of the previously published SLC25A12 risk variants (rs908670) was included in our study. This SNP, rs908670, was identified as an SLC25A12 eQTL in parietal tissue. Previous work has found SLC25A12 to be differentially expressed in autism brains compared to controls [22, 35]. One such recent study found SLC25A12 expression to be decreased in autism brains compared to controls. We found that the risk alleles of the SLC25A12 eQTL SNPs identified in the AGP GWAS are correlated with decreased expression of SLC25A12 in the parietal lobe. We add our findings to this growing body of evidence and suggest that common variation in SLC25A12 regulates expression of the gene itself and may contribute to autism susceptibility.
Our study is limited by the available analyses and the sample sizes used in the original GWAS. Comparisons of results between GWAS (for example, Strict|All versus Spec|All) must be interpreted with some caution. Enrichment analyses, such as those conducted here, include only the top SNP signals and are therefore sensitive to differences in power and sample size between the GWAS scans that supply the data. It is not possible to make direct comparisons between enrichment results from GWAS that differed in sample size, which ranged from 720 subjects (Strict|WestEur) to 1,385 subjects (Spec|All). However, within a single sample (for example, Spec|All or Strict|WestEur), comparisons can readily be made between all tissues, including cerebellum, parietal and LCLs.
Additional mining of available GWAS data may provide new insight into the biology of autism while allowing the genetics community to leverage data from smaller GWAS. Our study provides evidence for the hypothesis that SNPs that fall short of the genome-wide significance threshold (P = 10⁻⁸) are functionally relevant to the development of autism and may yet contribute to risk. Additionally, our results point specifically to SLC25A12 and PANX1/2 and to pathways (such as methyltransferase activity and ribosomal components) previously implicated in autism.