
Evidence against the “normalization” prediction of the early brain overgrowth hypothesis of autism



The frequently cited Early Overgrowth Hypothesis of autism spectrum disorder (ASD) postulates that there is overgrowth of the brain in the first 2 years of life, which is followed by a period of arrested growth leading to normalized brain volume in late childhood and beyond. While there is consistent evidence for early brain overgrowth, there is mixed evidence for normalization of brain volume by middle childhood. The outcome of this debate is important to understanding the etiology and neurodevelopmental trajectories of ASD.


Brain volume was examined in two very large single-site samples of children, adolescents, and adults. The primary sample comprised 456 6–25-year-olds (ASD n = 240, typically developing controls (TDC) n = 216), including a large number of females (n = 102) and spanning a wide IQ range (47–158). The replication sample included 175 males. High-resolution T1-weighted anatomical MRI images were examined for group differences in total brain, cerebellar, ventricular, gray, and white matter volumes.


The ASD group had significantly larger total brain, cerebellar, gray matter, white matter, and lateral ventricular volumes in both samples, indicating that brain volume remains enlarged through young adulthood, rather than normalizing. There were no significant age or sex interactions with diagnosis in these measures. However, a significant diagnosis-by-IQ interaction was detected in the larger sample, such that increased brain volume was related to higher IQ in the TDCs, but not in the ASD group. Regions-of-significance analysis indicated that total brain volume was larger in ASD than TDC for individuals with IQ less than 115, providing a potential explanation for prior inconsistent brain size results. No relationships were found between brain volume and measures of autism symptom severity within the ASD group.


Our cross-sectional sample may not reflect individual changes over time in brain volume and cannot quantify potential changes in volume prior to age 6.


These findings challenge the “normalization” prediction of the brain overgrowth hypothesis by demonstrating that brain enlargement persists across childhood into early adulthood. The findings raise questions about the clinical implications of brain enlargement, since we find that it neither confers cognitive benefits nor predicts increased symptom severity in ASD.


Autism spectrum disorder (ASD) is a neurobiologically-based, highly heritable condition [4, 81], but the brain bases of ASD have proven complex and difficult to characterize. An early neurobiological insight was Leo Kanner’s observation that the majority of his autistic patients had enlarged head sizes [43]. Numerous studies have since confirmed greater head circumference in ASD compared to typically developing control (TDC) samples, especially in studies with large sample sizes (e.g., [17]; for review see [68]). Piven and colleagues were the first to report increased brain volume in ASD compared to controls using MRI [61], which has been replicated in several studies [32, 33, 57]. However, not all imaging studies have found significant enlargement [3, 35, 37, 52, 70, 80].

One highly cited hypothesis—the Early Brain Overgrowth hypothesis [20]—accounts for inconsistencies in volumetric studies of ASD by postulating (1) average brain size at birth, followed by (2) periods of accelerated growth over the next 2 years, and then (3) deceleration of brain growth, equalizing volumes between groups by middle to late childhood. Support for the first two predictions (average size at birth followed by overgrowth) has been strong and consistent. Head circumference, which correlates highly with brain volume in newborns [49], has been observed to be normal at birth in most individuals who go on to develop ASD [21]. Moreover, brain volume appears to remain typical through 6 months of age in infants later diagnosed with ASD [38]. A recent longitudinal MRI study found a significantly greater rate of growth of total brain volume (TBV) between 12 and 24 months in ASD, resulting in significantly greater TBV in the ASD group at age 2 [39]. Consistent reports of larger TBV in young children with ASD [15, 22, 40, 75] also support the second prediction of the early overgrowth hypothesis: accelerated growth in the first 2 years.

The third prediction, arrested growth leading to normalization, has proven controversial. Several studies have found no difference in total brain volume between ASD and TDC groups in school-age children [3, 37, 52, 80] and/or adults [35, 70], and a 2005 meta-analysis supported overgrowth only in 2-to-5-year-old autistic children, not in children older than 6 years [63]. Nevertheless, many studies of school-age to adult samples have identified larger brain volumes in the ASD group [32, 33, 57, 61]. Two more recent meta-analyses (2008 and 2015) support continued enlargement from school age through adulthood [68, 76]. In a study of 1881 families of autistic 4–18-year-olds, affected probands had larger head circumference than their unaffected siblings, with an effect of 0.2 cm [17]. Although not a direct measure of brain volume, this study suggests that when appropriately controlling for sex, age, height, weight, and genetic ancestry, head size (an adequate predictor of brain volume [6]) remains enlarged in ASD.

If overgrowth does persist in ASD, it is not clear which tissue types drive these differences. Some research suggests an imbalance of gray matter (GM) and white matter (WM). However, both increased GM relative to WM [12] and increased WM relative to GM [41] have been noted. Differences in ventricle size have also been noted, including enlarged third ventricles [36], and ventricular enlargement in neonatal low-birth-weight babies has been related to a seven-fold increased risk of developing ASD [54].

Previous work investigating brain volume in school age and beyond is limited by small sample sizes. For example, in the most recent meta-analysis [68], ASD samples ranged from 6 to 121 individuals, with a median of 20. Given the well-known heterogeneity in ASD of core symptom severity, intellectual abilities, and co-occurring psychiatric conditions [5, 51, 72], small samples have decreased statistical power to detect true effects and increased chance of studying biased groups, producing results that are harder to replicate [14]. Meta-analytic efforts mitigate some concerns related to sample size, but sampling error in small original studies can result in biased meta-analytic estimates [48], and publication bias and selective reporting lead to biased effect size estimates in meta-analysis that are difficult to correct [47]. While multi-site network studies producing publicly available datasets, such as the Autism Brain Imaging Data Exchange (ABIDE; [24]), are pushing the field toward ever-larger datasets, inter-site and inter-scanner differences contribute significant noise to these data, which may be nonlinear [31]. This between-scanner noise, when random, limits the ability to find group differences [2, 78], and when systematic, biases observed effects.

Small sample sizes also limit the investigation of important individual differences, such as IQ, sex, and ASD symptom severity. Both sex and IQ are known correlates of brain volume (with larger brains in males and individuals with higher IQ [64, 91]) and have known clinical relationships with ASD. Approximately half of autistic children have IQ more than one standard deviation below the mean [5]. ASD is four times more prevalent in boys than girls, who are often disproportionately under-represented or excluded from imaging studies. Some studies of brain structure and white matter tracts [8, 10], and functional connectivity [92] have suggested interactions between sex and diagnosis, but more systematic study of sex differences is needed. Moreover, efforts to associate brain volume with core ASD symptom severity have produced mixed results [1, 68].

To assess the prediction of the Early Brain Overgrowth hypothesis that brain volume normalizes by school age and adolescence in ASD, we investigated the relationships of diagnosis, age, sex, IQ, and core ASD symptom severity with global brain volumes (i.e., total brain, gray matter, white matter, and ventricular volumes) in a large, diverse sample of children, adolescents, and adults. There are several important strengths to the samples examined in the present study. The samples are among the largest of their kind, including 456 individuals in the primary sample, and 175 in the replication. Crucially, within each sample, all individuals were characterized and imaged at the same site, using the same MRI scanner and scan sequence, eliminating sources of error variance that are present in large samples produced by combining data across research sites. The primary sample is particularly strong in terms of diversity of several key characteristics with potential etiological correlates in ASD, including a large number of females with ASD (43, which to our knowledge represents the largest single-site female structural MRI sample to date), an inclusive IQ range (47–158), and a wide age range (6 to 25 years).

Materials and methods


Participants in the primary sample were selected from the larger group of individuals who had participated in any imaging study at CHOP’s Center for Autism Research between 2009 and 2015, from whom a structural anatomical image was acquired. For individuals with ASD, final diagnosis was made by expert clinical judgment using DSM-IV criteria, informed by results from the Autism Diagnostic Observation Schedule [50] and the Autism Diagnostic Interview-Revised [67]. In keeping with DSM-5, all diagnostic subcategories (autism, Asperger’s, PDD-NOS) were pooled into a single ASD group in this study. Four hundred ninety-eight participants had structural scans. Nineteen of these were excluded due to poor scan quality, and 16 were excluded because they received a final diagnosis other than ASD. Seven more individuals were excluded for not having an IQ estimate, leaving a final sample of 456 individuals (see Table 1 and S1, Additional file 1 for demographic data, Figure S1, Additional file 1 for age distributions). Diagnostic groups did not differ significantly on mean age or height (in the subset of 281 individuals for whom height was available at the time of the MRI). Groups differed significantly on proportion of males, reflecting general population differences between ASD and TDC. Racial proportions differed significantly between groups, so sensitivity analyses entailed repeating all analyses within only the White participants.

Table 1 Demographic and clinical information for the primary and replication samples

Cognitive ability

Participants’ cognitive ability (“IQ”) was assessed with one of four standard instruments: the General Cognitive Ability score of the Differential Abilities Scale, Second Edition [26]; the Full Scale IQ of the Wechsler Intelligence Scale for Children, Fourth Edition [90]; or the Wechsler Abbreviated Scale of Intelligence, First or Second Edition [88, 89]. The distribution of IQ is shown in Figure S1, Additional file 1. IQ in the ASD group (M = 100.9, SD = 20.6) was significantly lower than in controls (M = 113.0, SD = 16.0, t = −7.0, p < 0.001).
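As an arithmetic check on the reported group difference in IQ, the t statistic can be recomputed from the summary statistics alone. The sketch below is illustrative, not the authors' code: the group sizes (240 ASD, 216 TDC) are taken from the sample description, the function names are hypothetical, and the source does not state which t-test variant was used, so both the Welch and the pooled-variance forms are shown.

```python
import math

def welch_t(m1, sd1, n1, m2, sd2, n2):
    """t statistic allowing unequal group variances (Welch)."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (m1 - m2) / se

def pooled_t(m1, sd1, n1, m2, sd2, n2):
    """t statistic assuming a common variance (Student)."""
    sp2 = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    se = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    return (m1 - m2) / se

# Summary statistics as reported: ASD first, TDC second.
asd = (100.9, 20.6, 240)
tdc = (113.0, 16.0, 216)

t_welch = welch_t(*asd, *tdc)
t_pooled = pooled_t(*asd, *tdc)
print(f"Welch t = {t_welch:.2f}, pooled t = {t_pooled:.2f}")
```

Both variants land close to the reported t = −7.0; small discrepancies are expected because the means and standard deviations are rounded.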

Clinical severity

Clinical severity was assessed with three measures: The Social Responsiveness Scale–2 (SRS-2), a parent questionnaire assessing current ASD traits [19]; the Autism Diagnostic Observation Schedule Calibrated Severity Score (ADOS CSS), an estimate of severity based on clinician ratings [34]; and the Social Communication Questionnaire (SCQ), a parent questionnaire assessing lifetime symptom severity [66].

Parental education

Socio-economic status (parental educational attainment, occupation, and income) is related to children’s neurocognitive functioning, mediated by brain structure, with increased educational attainment of parents predicting increased surface area and volume in children [55]. Because of the well-known relationships between parental education, brain structure, and cognitive functioning, we (1) tested whether these relationships were observed in our TDC sample, and (2) conducted exploratory analyses to examine these relationships in ASD. See Supplementary Methods, Additional file 1 for treatment of this variable.

Replication sample participants and characterization

The replication sample consisted of an all-male cohort collected at Yale University. From 215 available participants, 40 were excluded due to poor scan quality, leaving a final sample of 175. Distributions of age and IQ within the groups are shown in Figure S2, Additional file 1 and Table 1. Yale sample participants were evaluated with the Wechsler Intelligence Scale for Children, Third Edition [86], Wechsler Abbreviated Scale of Intelligence, First Edition [88], or the Wechsler Adult Intelligence Scale, Revised or Third Edition [85, 87]. Diagnostic groups differed significantly on age (t = − 2.92, p < 0.05) and IQ (t = − 4.60, p < 0.001).

Image acquisition

CHOP anatomical images were acquired on a Siemens 3T wide-bore Magnetom Verio Tim scanner with a 32-channel head coil and a Siemens MPRAGE sequence (0.9 × 0.41 × 0.41 mm, TR = 1900, TE = 2.54, flip angle = 9). Replication sample images were collected at Yale University on a GE Signa 1.5 T using a high resolution SPGR sequence (2 NEX, 1.2 mm3; TR = 24, TE = 5, flip angle = 45, matrix=192 × 256, FOV = 30 cm, 124 contiguous 1.2 mm thick sagittal images).

Image processing

CHOP images were N3 bias corrected with ANTS [83] and brain extracted with LABEL ([71], see Fig. 1). Brain extractions were visually inspected, and manually edited with ITK-SNAP [93] if cortex was removed by the automated extraction. Yale images were intensity normalized using a histogram normalization procedure in the BioImage Suite Package [59]. Brain extraction was performed using BET (Brain Extraction Tool [74]), and conservatively thresholded to remove non-brain pixels only. Manual editing was performed to remove remaining non-brain tissue. Raters demonstrated excellent inter-rater reliability for brain volume (ICC = .99, n = 25).

Fig. 1

Raw T1-weighted images (left) were N3 bias corrected and skull-stripped, with manual corrections to ensure cortex was not removed (middle). Skull-stripped images were processed with Freesurfer with manual corrections (right), producing volume estimates

For both datasets, segmentation of the volumes was performed by the Freesurfer image analysis suite [23, 27, 28, 29], producing total brain volume (TBV), gray matter volume (GMV), white matter volume (WMV), and ventricular volumes. To mitigate concerns that preprocessing techniques instantiated by different statistical packages show differential biases in comparing ASD and TDC volume [44], segmentations were visually inspected slice-by-slice (blind to all subject characteristics). Segmentation errors were manually edited using Freeview (e.g., dura labeled as gray matter, inaccurate identification of the gray/white or pial surface). Final segmentations were visually inspected and excluded if motion artifacts impacted segmentation quality, if a superior image was available for the participant (in the primary dataset), or if correspondence between Freesurfer’s total brain volume estimate and the total brain volume from gold standard manual tracing was exceptionally poor (in the replication dataset). See Supplementary Methods and Figure S3, Additional file 1 for reliability information and details about volume definitions.

For each volume measure, a regression model was tested including IQ, age, sex (CHOP only), diagnosis, the interaction of IQ and diagnosis, the interaction of age and diagnosis, and the interaction of sex and diagnosis (CHOP only). Nonsignificant interaction terms were removed to simplify the models and provide more precise estimates of the effect sizes of the main effects, with full models presented in Additional file 1 to illustrate null interaction findings. In the subset of individuals from the CHOP site for whom accurate height data was available, effects of height were also investigated. Effect sizes are reported as partial eta squared (partial η2) derived from equivalent ANOVAs. Partial η2 measures the proportion of variance in a dependent variable associated with each independent variable, with the effects of other independent variables and interactions partialled out, with suggested interpretive benchmarks of .01, .06, and .14 for small, medium, and large effects [65]. Estimates and standardized estimates from regressions are also included in the tables.
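The effect size definition above can be made concrete with a short sketch. It is illustrative only: the function names are hypothetical, the sums of squares in the consistency check are made-up numbers, and the thresholds are simply the interpretive benchmarks quoted in the text (.01, .06, .14). The three labeled values are partial η2 estimates reported later in this article.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Proportion of variance attributable to an effect, other terms partialled out."""
    return ss_effect / (ss_effect + ss_error)

def partial_eta_squared_from_f(f_stat, df_effect, df_error):
    """Equivalent form computed from an F statistic and its degrees of freedom."""
    return (f_stat * df_effect) / (f_stat * df_effect + df_error)

def benchmark_label(eta_p2):
    """Map a partial eta squared onto the benchmarks quoted in the text."""
    if eta_p2 >= 0.14:
        return "large"
    if eta_p2 >= 0.06:
        return "medium"
    if eta_p2 >= 0.01:
        return "small"
    return "below small benchmark"

# Consistency check with hypothetical sums of squares: both forms agree.
ss_effect, ss_error, df_effect, df_error = 3.0, 97.0, 1, 100
f_stat = (ss_effect / df_effect) / (ss_error / df_error)
from_ss = partial_eta_squared(ss_effect, ss_error)
from_f = partial_eta_squared_from_f(f_stat, df_effect, df_error)

# Values reported in this article, labeled against the benchmarks.
for value in (0.013, 0.062, 0.46):
    print(value, benchmark_label(value))
```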


In CHOP models including all terms (4 main effects and 3 interactions), there is 80% power to detect effects of f2 = 0.03, where Cohen’s guidelines suggest that effects of f2 > 0.02 are small and f2 > 0.15 are medium. In Yale models with all terms (3 main effects and 2 interactions), there is 80% power to detect effects of f2 = 0.08. Thus, the CHOP sample is powered to detect small effects, and the Yale sample is powered to detect small-to-medium effects.
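Because power is stated here in Cohen's f2 while results are reported as partial η2, it can help to convert between the two using the standard identity f2 = η2 / (1 − η2). The sketch below applies it to the detectable effect sizes quoted above; the function names are hypothetical, and the power calculation itself (which requires the noncentral F distribution) is not reproduced.

```python
def f2_from_eta(eta_p2):
    """Cohen's f-squared from partial eta squared."""
    return eta_p2 / (1.0 - eta_p2)

def eta_from_f2(f2):
    """Partial eta squared from Cohen's f-squared (inverse of the above)."""
    return f2 / (1.0 + f2)

# Smallest effects detectable at 80% power, as stated in the text.
chop_eta = eta_from_f2(0.03)
yale_eta = eta_from_f2(0.08)
print(f"CHOP: f2 = 0.03 -> partial eta2 = {chop_eta:.3f}")
print(f"Yale: f2 = 0.08 -> partial eta2 = {yale_eta:.3f}")
```

On this scale, the CHOP sample can detect effects of roughly partial η2 = 0.03 and the Yale sample roughly 0.07, which gives a reference point for the effect sizes reported in the tables.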

Extreme size subgroup analysis

Some prior work has suggested that there is a higher rate of macrocephaly (head circumference above the 98th percentile) among autistic people. A recent meta-analysis found 15.7% of autistic participants had macrocephaly, and 9.1% showed brain overgrowth [68]. Higher rates of microcephaly have also been reported in ASD [30]. We conducted a post hoc analysis to explore the possibility that group-average differences in brain volume were driven by a subgroup of macrocephalic individuals in the ASD group. Within the TDC group, the mean and standard deviation of TBV were calculated within 3-year age bins separately by sex, and the number of individuals within the ASD group whose TBV exceeded 2 SD from the mean of their respective age/sex bin was examined. Individuals were excluded from this analysis when there were fewer than 2 TDC individuals within an age bin, because standard deviation could not be calculated. No individuals were excluded from the CHOP sample for this reason; 7 age bins including a total of 13 individuals were excluded from the Yale sample due to insufficient TDCs in the age bin.
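The binning procedure described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the bin origin (age 6) and the record layout `(age, sex, tbv)` are hypothetical choices, the toy volumes are invented, and bins with fewer than 2 TDC individuals are skipped because a standard deviation cannot be computed.

```python
from collections import defaultdict
from statistics import mean, stdev

def age_bin(age, start=6, width=3):
    """Index of the 3-year age bin (bin origin at age 6 is an assumption)."""
    return int((age - start) // width)

def flag_extreme(tdc, asd, n_sd=2.0):
    """Classify ASD records against TDC norms from the same age/sex bin.

    tdc, asd: lists of (age, sex, tbv) tuples.
    Returns (high, low, skipped) lists of ASD records.
    """
    reference = defaultdict(list)
    for age, sex, tbv in tdc:
        reference[(age_bin(age), sex)].append(tbv)

    high, low, skipped = [], [], []
    for record in asd:
        age, sex, tbv = record
        values = reference.get((age_bin(age), sex), [])
        if len(values) < 2:  # SD undefined with fewer than 2 controls
            skipped.append(record)
            continue
        m, s = mean(values), stdev(values)
        if tbv > m + n_sd * s:
            high.append(record)
        elif tbv < m - n_sd * s:
            low.append(record)
    return high, low, skipped

# Toy data (invented volumes in cm^3), for illustration only.
tdc = [(7, "M", 1200), (8, "M", 1300), (6.5, "M", 1250), (10, "F", 1100)]
asd = [(7.5, "M", 1400), (8, "M", 1100), (6, "M", 1260),
       (10, "F", 1500), (20, "M", 1300)]
high, low, skipped = flag_extreme(tdc, asd)
```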


Group volume differences

Final models are presented in Table 2, with group means, standard deviations, and Cohen’s d effect size estimates presented in Table 3, and models including all non-significant interactions in Table S2, Additional file 1. Figure 2 graphically displays the relationships between TBV, GMV, WMV, and age and IQ, separated by diagnosis and sex in the CHOP sample. Diagnosis significantly predicted all brain tissue variables used in the analyses (TBV, GMV, WMV, cortical GMV, cortical WMV, cerebellar GMV, cerebellar WMV), except the ratio of GM to WM. All significant effects of diagnosis were in the direction of ASD showing larger volume than TDC. There also was a significant diagnosis-by-IQ interaction predicting TBV, GMV, WMV, cortical GMV, cortical WMV, and cerebellar GMV. There was no significant diagnosis-by-IQ interaction for cerebellar WMV.

Table 2 Models for primary sample. Uncorrected p-values are reported. One outlier was removed from the lateral ventricle model.
Table 3 Mean and standard deviation of volumes in each group in the CHOP sample, and Cohen’s d for the difference between groups. Note that groups are not matched for age, sex, and IQ, and that the Cohen’s d effect size estimate does not account for these factors
Fig. 2

Relationships of IQ and age with total brain volume (a, b), gray matter volume (c, d), and white matter volume (e, f) in the primary sample, by diagnosis and sex. IQ shows a significant interaction with diagnosis predicting all three outcome measures (b, d, f). Age did not significantly predict TBV (a), negatively predicted GMV (c), and positively predicted WMV (e). Significant main effects of diagnosis and sex were observed in all 3 measures. Dashed lines indicate regions-of-significance, where the effect of diagnosis is not significant within the shaded region

To further understand this interaction, the regions of significance of the diagnosis-by-IQ interaction were evaluated using the Johnson-Neyman procedure, which indicates at which levels of a moderator an independent variable has a significant effect on the dependent variable [7]. Controlling for age and sex, the effect of diagnosis on TBV was significant for IQ scores less than 115.3 (ASD > TDC) and greater than 140.1 (TDC > ASD). This means that TBV differed significantly between ASD and TDC for IQ scores below about 115 and above about 140, but not for scores in between. Within the TDC group, the semi-partial correlation of IQ with TBV given age and sex was r = 0.38, p < 0.001. Within the ASD group, this correlation was r = 0.045, p = 0.47. Thus, the typical positive correlation between IQ and TBV was absent in the ASD group.
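The logic of the Johnson-Neyman procedure can be illustrated with a short sketch. In a model with a diagnosis-by-IQ interaction, the conditional effect of diagnosis at moderator value m is b_dx + b_int·m, with variance var(b_dx) + 2m·cov(b_dx, b_int) + m²·var(b_int); the region boundaries are the values of m at which the resulting t statistic equals the critical value. All coefficients below are hypothetical (they are not the fitted values from this study), the moderator is assumed to be centered, and 1.96 is a normal-approximation critical value.

```python
import math

def conditional_t(m, b_dx, b_int, var_dx, var_int, cov):
    """t statistic for the effect of diagnosis at moderator value m."""
    effect = b_dx + b_int * m
    se = math.sqrt(var_dx + 2 * m * cov + m ** 2 * var_int)
    return effect / se

def johnson_neyman(b_dx, b_int, var_dx, var_int, cov, t_crit=1.96):
    """Moderator values where |t| equals t_crit (roots of a quadratic in m)."""
    a = b_int ** 2 - t_crit ** 2 * var_int
    b = 2 * (b_dx * b_int - t_crit ** 2 * cov)
    c = b_dx ** 2 - t_crit ** 2 * var_dx
    disc = b * b - 4 * a * c
    if a == 0 or disc < 0:
        return []  # no finite boundaries under these coefficients
    root = math.sqrt(disc)
    return sorted([(-b - root) / (2 * a), (-b + root) / (2 * a)])

# Hypothetical coefficients and (co)variances, for illustration only.
params = dict(b_dx=40.0, b_int=-1.5, var_dx=100.0, var_int=0.25, cov=0.0)
bounds = johnson_neyman(**params)
midpoint = sum(bounds) / 2
```

With these illustrative numbers the effect of diagnosis is significant outside the two boundaries and non-significant between them, which is the same qualitative pattern as the region between IQ 115.3 and 140.1 reported above.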

Across both groups, age negatively predicted GMV, cortical GMV, and cerebellar GMV. Age positively predicted WMV, cortical WMV, and cerebellar WMV. Age did not significantly predict TBV. Notably, there were no significant age-by-diagnosis interactions in any of the models tested.

Sex was a significant predictor in every model, with large effects (male larger than female) on all measures except cerebellar WMV, on which it had a small effect. Notably, there were no significant sex-by-diagnosis interactions. There were significant main effects of IQ in all models except the cerebellar WMV. All significant IQ effects occurred in the presence of an IQ-by-diagnosis interaction.

Models were all tested including height as a predictor in the subset of individuals for whom there was an available measure of height within a year of the scan, with few changes to the significance of results. In white matter models (WMV, cortical WMV, and cerebellar WMV), age became non-significant as a predictor, likely due to the multicollinearity (age and height were highly correlated, r = 0.86, p < 0.001). Within the TBV model, age became a significant predictor (partial η2 =0.03, p < 0.01). The only other qualitative change was in the model of cerebellar WMV, in which the effects of sex and diagnosis became non-significant, likely due to less statistical power compared to the full sample (n = 456 versus n = 281).


Controlling for age, sex, and IQ, diagnosis was a significant predictor of lateral ventricular volume (partial η2 = 0.013, p < 0.05). Visual inspection of data indicated one extreme outlier in lateral ventricular volume; removing this outlier reduced the size of the effect (partial η2 = 0.009, p < 0.05, Fig. 3). There was a significant age-by-diagnosis interaction in the third ventricles (partial η2 = 0.013, p < 0.05). In the ventricles, unlike in the majority of the tissue volume measures, there were no significant IQ-by-diagnosis interactions in predicting volume.

Fig. 3

Ventricular volume in the primary sample

Gray matter-to-white matter ratio

To examine the relative contributions of GMV and WMV to differences between the groups, the GMV-to-WMV ratio was examined in a model similar to those used to examine primary volumetric measures. This yielded no significant interaction terms, and no significant main effect of group or IQ. There were significant effects of age (partial η2 = 0.46, p < 0.001) and sex (partial η2 = 0.03, p < 0.05), with a greater gray-to-white ratio in females, and with this ratio decreasing with age (Figure S4, Additional file 1).

Clinical correlates of brain size

When controlling for age, sex, and IQ, neither the ADOS CSS, the SRS, nor the SCQ significantly predicted TBV within the ASD group (Fig. 4, Table 4). That is, none of the three measures of ASD severity correlated with brain volume in the ASD group.

Fig. 4

Relationships of clinical severity measures (a, SRS; b, ADOS CSS; c, SCQ) with TBV within the CHOP ASD group only. No severity measure showed a significant relationship with TBV, controlling for age, sex, and IQ

Table 4 Models showing the relationships of clinical severity measures with TBV within the CHOP ASD group only. No severity measure showed a significant relationship with TBV, controlling for age, sex, and IQ

Parental education

In order to obtain precise statistics accounting for the rank-order nature of the parental education data, zero-order correlations between parental education and brain volume within each group were examined with Kendall’s Tau. Within the TDC group, the relationship between TBV and parental education was significant and positive (τ = .21, p < 0.001, Figure S5, Additional file 1). This relationship was negative (although non-significant) within the ASD group (τ = − 0.09, p = 0.09). To explore the significance of this apparent disordinal interaction, parental education was added to the model of TBV, such that the full model was TBV ~ diagnosis + age + sex + IQ + parental education + IQ*diagnosis + parental education*diagnosis. In this model, the interaction of parental education and diagnosis was significant (partial η2 = 0.03, p < 0.01). This interaction is also significant in separate models in which father’s education is included as a binary factor (college degree or no college degree) indicating that this interaction effect is robust to choice of statistical method. When mother’s education is included as a binary factor, it is not significant (p = 0.12).
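Kendall's Tau suits ordinal data such as education level because it depends only on the ordering of pairs, and the tau-b variant adjusts for ties, which are common when many parents share the same education category. The sketch below is a simple O(n²) illustration with invented data; the source does not state which tau variant was computed, so tau-b is an assumption here, and the function name is hypothetical.

```python
import math

def kendall_tau_b(x, y):
    """Kendall's tau-b: rank correlation with a correction for ties."""
    concordant = discordant = ties_x_only = ties_y_only = 0
    n = len(x)
    for i in range(n):
        for j in range(i + 1, n):
            dx = x[i] - x[j]
            dy = y[i] - y[j]
            if dx == 0 and dy == 0:
                continue  # tied on both variables: contributes to neither term
            if dx == 0:
                ties_x_only += 1
            elif dy == 0:
                ties_y_only += 1
            elif dx * dy > 0:
                concordant += 1
            else:
                discordant += 1
    untied = concordant + discordant
    denom = math.sqrt((untied + ties_x_only) * (untied + ties_y_only))
    if denom == 0:
        return float("nan")  # undefined when one variable is constant
    return (concordant - discordant) / denom

# Invented example: ordinal education codes and brain volumes.
education = [1, 1, 2, 3]
volume = [10, 12, 11, 15]
tau = kendall_tau_b(education, volume)
```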


Because race was imbalanced between groups, all of the models in Table 2 were examined within only the White participants, to rule out the explanation that racial differences accounted for group differences. The significance of terms changed in only three models: in the model predicting cerebellar GMV, IQ, diagnosis, and the IQ-by-diagnosis interaction were no longer significant (possibly due to reduced power); in the model predicting cerebellar WMV, diagnosis was no longer significant; and in the model predicting third ventricle volume, sex and diagnosis became significant. Although diagnosis remained a significant predictor in all other models, the effect size of diagnosis was somewhat reduced.

Extreme size subgroup analysis

To identify ASD participants with extremely large or small brains, we calculated the mean and standard deviation of TBV within 3-year age bins separately by sex within the TDC group, and examined ASD individuals whose TBV exceeded 2 SD from the mean for their age and sex. In the ASD group, there were 10 individuals with brains 2 SD above the mean for their age/sex bin (4.1%), and 10 with brains 2 SD below (4.1%, compared to 2.3% above and 2.8% below in the TDC group). Although a higher proportion of ASD individuals had brains with extreme sizes than TDC individuals, this difference was not statistically significant (χ2 (2, N = 456) = 1.9, p = 0.38). Information on the age, IQ, and sex ratio for the ASD individuals with larger, smaller, and typically-sized brains is presented in Table S3, Additional file 1. Comparing these groups statistically, there is a significant difference in age (F (2, 237) = 3.95, p < 0.05), with the mean age of the extreme ASD groups higher than the mean age of the ASD individuals with typically sized brains. There were no significant differences in IQ (F (2, 237) = 0.12, p = 0.88), sex ratio (Fisher’s exact test p = 0.25), or ADOS CSS (F (2, 235) = 1.34, p = 0.26) between the groups. To investigate whether a subgroup of individuals with extremely sized brains drove the between-group diagnostic differences, the TBV model presented in Table 2 was re-examined excluding all ASD individuals with TBV > 2 SD from the mean, and the effect of diagnosis remained significant (partial η2 = 0.062, p < 0.001).
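The chi-square comparison of extreme-size proportions can be approximately reconstructed from the figures above. In the sketch below, the ASD counts (10 above, 10 below, out of 240) are as reported, whereas the TDC counts (5 above, 6 below, out of 216) are back-calculated from the reported percentages and are therefore an assumption. For 2 degrees of freedom the chi-square upper-tail probability has the closed form exp(−χ2/2), so no statistics library is needed; the function name is hypothetical.

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Columns: above 2 SD, below 2 SD, typical.
# TDC counts are inferred from reported percentages (assumption).
counts = [
    [10, 10, 220],  # ASD, n = 240
    [5, 6, 205],    # TDC, n = 216
]
chi2, df = chi_square_independence(counts)
p_value = math.exp(-chi2 / 2)  # exact upper-tail probability when df == 2
print(f"chi2({df}) = {chi2:.2f}, p = {p_value:.2f}")
```

Under these inferred counts the statistic and p-value agree with the reported χ2 = 1.9, p = 0.38 to the precision given.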

Replication results

As in the primary dataset, diagnosis was a significant predictor of TBV, GMV, and WMV in the Yale dataset (Tables 5 and 6, Fig. 5). There was no significant interaction of age and diagnosis in any of the models. There was also no significant interaction of diagnosis and IQ. These interactions were dropped from the models for simplicity, but are presented in Table S4, Additional file 1. In the full models with all interaction terms, the main effects of diagnosis on TBV, GMV, and WMV were in the same direction as in the main-effects-only models (ASD > TDC), but were not significant, potentially due to the loss of degrees of freedom. Although there was not a significant IQ-by-diagnosis interaction, the correlation between IQ and TBV was qualitatively smaller in the ASD sample than the TDC sample, which was the pattern observed in the primary dataset. Within the TDC group, the semi-partial correlation of IQ with TBV given age was r = 0.35, p < 0.001. Within the ASD group, this correlation was r = 0.25, p < 0.05. IQ was a significant predictor of TBV, GMV, and WMV. Age was a significant predictor of TBV and GMV. As in the primary dataset, diagnosis did not predict the ratio of GMV-to-WMV (Figure S6, Additional file 1). Additionally, both lateral ventricles and third ventricles were enlarged in the ASD group (Figure S7, Additional file 1). In the ASD subgroup analysis, there were 7 individuals with brains 2 SD above the mean for their age bin (9.1%), and 2 with brains 2 SD below (2.6%, compared to 2.4% above and 0% below in the TDC group). Fisher’s exact test indicates that the proportion of individuals with extremely-sized brains is different between the ASD and TDC groups (p = 0.035). Within the ASD group, there were no differences between the small, large, and typically-sized subgroups in age (F (2,74) = 0.49, p = 0.62) or IQ (F (2,74) = 0.33, p = 0.72). 
In models testing the main effects of IQ, age, and diagnosis on TBV when excluding the extremely-sized ASD individuals, diagnosis remained significant (partial η2 = 0.027, p = 0.04).

Table 5 Main effects of IQ, Age, and Diagnosis in the Yale sample.
Table 6 Mean and standard deviation of volumes in each group in the Yale sample, and Cohen’s d for the difference between groups. Note that groups are not matched for age and IQ, and that Cohen’s d does not account for these factors
Fig. 5

Relationships of IQ and age with total brain volume (a, b), gray matter volume (c, d), and white matter volume (e, f) in the Yale sample, by diagnosis. Age negatively predicted TBV and GMV (a, c). IQ positively predicted all three measures (b, d, f). ASD status positively predicted all three measures


We do not find evidence to support the prediction that brain size in ASD normalizes over development. If normalization occurs as the Early Overgrowth hypothesis proposes, then either (1) there should be no significant main effect of diagnosis on brain size (i.e., brain size has already normalized between the groups across the 6–25-year age range of our sample) or (2) there should be a significant interaction of age and diagnosis, with volumetric differences for the youngest autistic children and normalization of brain volume across the age range. Our findings do not support either prediction. In the primary (CHOP) sample, we found a significant main effect of diagnosis for GMV, WMV, and TBV, and no significant interaction with age, with volumes about 2.8–3.2% larger in the ASD group. This finding was replicated in the Yale sample, in which TBV and GMV were 3.1% and 5.3% larger in the ASD group, with no interactions with age. Furthermore, our exploration of subgroups of autistic individuals with particularly enlarged brains (> 2 SD from the typical mean for their age and sex) suggested that this enlargement was slightly more common in older youth in the CHOP sample and consistent across ages in the Yale sample. This indicates that the group-level enlargement in ASD was not driven by a subset of only the youngest children in our samples having enlarged brains. In addition, we observed a significant interaction in the primary sample between diagnosis and IQ, such that the overall brain enlargement effect in ASD was driven by participants with IQ scores less than 115. This interaction reflects a stronger correlation between IQ and brain volume in the TDC sample than in the ASD sample, which showed no relationship between IQ and brain volume. The main effect of diagnosis should be interpreted in light of this significant interaction with IQ. Nevertheless, this study finds converging evidence in two large datasets that early brain overgrowth persists through adolescence and into early adulthood in ASD, failing to support the “normalization” prediction of the Early Overgrowth hypothesis.
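
To make the two predictions concrete, the following is a minimal sketch, not the analysis code used in this study, of a linear model with a diagnosis main effect and an age-by-diagnosis interaction, fit to noise-free synthetic data. The function names (`solve_linear_system`, `fit_ols`, `simulate_tbv`) and all numeric values (a 1200 cm³ baseline, a 36 cm³ group difference, a 2 cm³ per year age slope) are illustrative assumptions, and the sketch omits the covariates and significance tests that a real analysis would need.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_ols(design, y):
    """Ordinary least squares via the normal equations (X'X) beta = X'y."""
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def simulate_tbv(diagnosis_effect, interaction, ages=range(6, 26)):
    """Hypothetical noise-free total brain volumes (cm^3) for both groups."""
    design, y = [], []
    for diag in (0, 1):  # 0 = TDC, 1 = ASD
        for age in ages:
            centered = age - 6  # age measured from the youngest participant
            # Columns: intercept, diagnosis, age, age-by-diagnosis interaction
            design.append([1.0, diag, centered, diag * centered])
            y.append(1200.0 - 2.0 * centered + diagnosis_effect * diag
                     + interaction * diag * centered)
    return design, y


# Persistent enlargement: a diagnosis main effect and no age interaction.
persistent = fit_ols(*simulate_tbv(diagnosis_effect=36.0, interaction=0.0))
# Normalization: the difference at age 6 shrinks to zero by age 25.
normalizing = fit_ols(*simulate_tbv(diagnosis_effect=36.0, interaction=-36.0 / 19))

for label, beta in (("persistent", persistent), ("normalizing", normalizing)):
    print(label, [round(b, 3) for b in beta])
```

In the first scenario the fitted interaction coefficient is zero and the diagnosis coefficient carries the whole group difference, which is the pattern the present data resemble. In the second, the negative interaction coefficient is what normalization across the age range would look like.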

Increased brain volume has been one of the most consistently observed biomarkers of ASD in young children. Our results suggest that brain enlargement persists into early adulthood. This is consistent with some prior publications, including MRI studies [32, 33, 57, 61, 76] and a very large study of head circumference [17], but not with others [3, 35, 37, 52, 63, 70, 80]. The current results are noteworthy because of the large samples collected on the same scanner (by sample), inclusion of a broad age and IQ range, and large female representation. The size and quality of the samples we report allow for more generalizable and definitive conclusions about the development of brain size in autism than have previously been possible from smaller studies.

These results support neither the model of GM/WM imbalance predicting increased GM but decreased WM in ASD [12], nor the model predicting greater effect sizes of diagnostic group on WM than GM [41]. Rather, the data suggest that structural differences in WM occur in roughly equal proportion to GM between groups.

There are several potential mechanisms underlying the persistent brain volume difference in ASD. Brain volume is the product of cortical thickness and cortical surface area, which are independently heritable and have unique mechanistic underpinnings [58]. Increased surface area has been identified in some ASD samples [56] but not others [62]. Greater cortical thickness or differential rates of change have also been observed in some studies [25, 46, 62, 73], but not all [56]. Using a subset of the CHOP sample reported in the current study, we recently reported that regional deviations from a normative model of brain development in diffusion metrics, volume, thickness, and surface area can accurately classify diagnostic status, although diffusion metrics outperformed anatomical measures in this age-based approach [82]. We plan to further investigate regional differences in cortical surface area and thickness in future work.

In typical development, dendritic arborization and synaptogenesis occur rapidly in the first year of life, followed by dendritic pruning [42]. The emergence of brain volume differences and clinical symptoms across the first 2 years of life in ASD points to these as candidate mechanisms. One potential mechanism of reduced dendritic pruning is mammalian target of rapamycin (mTOR) kinase, which is regulated by a number of genes associated with ASD, including TSC1/TSC2, NF1, and PTEN [13]. Hyperactive mTOR can produce excessive synaptic proteins and impair autophagy, and has been correlated with increased dendritic spine density in post-mortem brains of autistic individuals [79]. Another potential genetic source of this effect is chromodomain helicase DNA binding protein 8 (CHD8). This regulatory gene with neurodevelopmental targets has been strongly associated with ASD [69], and has been clinically associated with macrocephaly in ASD and in zebrafish models [9]. These cellular processes may be expressed differentially in different regions of the brain. For example, post-mortem studies of neuronal density revealed higher density in some regions of autistic brains compared to controls [16], but lower density in other regions [84].

In the context of significantly larger brains on average in ASD, we find no correlation between any of our severity measures and TBV, consistent with prior findings in preschoolers [1]. The failure to find correlations with symptom severity complicates the clinical implications of the enlarged-brain biomarker, given the conceptualization of autism as a spectrum disorder. If the degree of brain enlargement is not associated with the degree of core ASD symptoms, it is unclear how increased brain volume is functionally important to causal mechanisms of ASD. It might be that increased brain volume is not an underlying source of core ASD symptom differences, but represents a collateral consequence of the true underlying source. If true, increased brain size could be a biomarker of ASD, without being central to the pathophysiology [77]. Alternatively, enlarged brain volume may represent a categorical diathesis for ASD, with dimensional causes and symptoms overlaid. Another alternative is that global volume differences may not entirely reflect localized differences in regions, pathways, and networks, which may correlate more closely with symptom severity. Another important alternative is that group-level volume differences may be driven by a subsample of individuals with both ASD and enlarged brains, who are not distinguished by clinical severity [1]. We find that 4.1% of the CHOP ASD group and 9.1% of the Yale ASD group had brain volume greater than 2 SD above the mean for their age and sex. However, neither IQ nor clinical severity differed between the subgroups with extremely large or small brains and the subgroup falling within the typical range. Importantly, the group-average difference in brain volume remains significant when excluding the autistic individuals with extremely large or small brains, indicating that group-level differences are not entirely driven by a subgroup of individuals. Finally, the absence of a correlation between brain size and ASD severity might indicate that our ASD symptom metrics fail to capture important aspects of ASD heterogeneity, as our three measures are poorly correlated with one another in this sample and in others [11].
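
The subgroup definition used here (volumes more than 2 SD from the typical mean) can be sketched as follows. This is a simplified illustration rather than our pipeline: the function name `flag_extreme_volumes`, the participant identifiers, and the toy volumes are hypothetical, and the sketch stratifies by sex only, assuming that the volumes have already been adjusted for age.

```python
from statistics import mean, stdev


def flag_extreme_volumes(asd, tdc, threshold=2.0):
    """Return (id, z) for ASD participants whose volume lies more than
    `threshold` SDs from the TDC mean of the same sex.

    Each record is a tuple (participant_id, sex, volume_cm3); volumes are
    assumed to be age-adjusted already.
    """
    flagged = []
    for pid, sex, volume in asd:
        reference = [v for _, s, v in tdc if s == sex]
        z = (volume - mean(reference)) / stdev(reference)
        if abs(z) > threshold:
            flagged.append((pid, z))
    return flagged


# Toy, made-up data for illustration only.
tdc = [("t1", "M", 1180), ("t2", "M", 1200), ("t3", "M", 1220),
       ("t4", "M", 1190), ("t5", "M", 1210),
       ("t6", "F", 1080), ("t7", "F", 1100), ("t8", "F", 1120),
       ("t9", "F", 1090), ("t10", "F", 1110)]
asd = [("a1", "M", 1240), ("a2", "M", 1215),
       ("a3", "F", 1105), ("a4", "F", 1060)]

extreme = flag_extreme_volumes(asd, tdc)
print([pid for pid, _ in extreme])
```

Re-running the group comparison after dropping the flagged participants is then a direct check of whether a small subgroup drives the group-level difference.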

Diagnostic group differences in brain volume are also complicated by an interaction between diagnosis and IQ. In humans, the relationship between brain size and intelligence has long been noted [53, 60, 91]. Indeed, in both the primary and replication datasets, we find a significant correlation within the TDC sample between brain size and IQ after correcting for sex and age. However, correlations within the ASD group are smaller, and in the primary sample are not significant. The lack of a correlation with IQ suggests that individual differences in brain size have a different meaning in ASD, and that additional tissue volume does not confer cognitive advantages.

What, mechanistically, might disrupt the relationship between brain volume and IQ in ASD? The relationship may be weakened through a combination of underlying, unmeasured microstructural differences or differences in network organization. Alternatively, if IQ measurement is less reliable in ASD than TDC, the correlation between IQ and brain volume would be attenuated. While the IQ measures used in this study have evidence of validity and reliability in both typical and clinical [26] and ASD-specific samples [90], evidence of test-retest reliability in an ASD sample is lacking, and the factor structure of IQ may be different in ASD [18].
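
The attenuation argument can be illustrated with Spearman's classical attenuation formula, under which the observed correlation equals the true correlation multiplied by the square root of the product of the two measures' reliabilities. The sketch below is only an illustration: the function name `attenuated_correlation` and the correlation and reliability values are assumptions, not estimates from either sample.

```python
import math


def attenuated_correlation(true_r, reliability_x, reliability_y=1.0):
    """Expected observed correlation given measurement reliabilities
    (Spearman's attenuation formula)."""
    for rel in (reliability_x, reliability_y):
        if not 0.0 <= rel <= 1.0:
            raise ValueError("reliabilities must lie between 0 and 1")
    return true_r * math.sqrt(reliability_x * reliability_y)


# Hypothetical example: the same true IQ-volume correlation of 0.40,
# observed through IQ scores of differing reliability.
for label, reliability in (("higher reliability", 0.90), ("lower reliability", 0.64)):
    observed = attenuated_correlation(0.40, reliability)
    print(f"{label}: observed r = {observed:.2f}")
```

Under these assumed values, lower reliability alone reduces the observed correlation, but it cannot reduce it to zero unless reliability itself is near zero, so unreliability is unlikely to be the whole explanation for an absent correlation.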

Our regions-of-significance analyses suggest that the ability of a study to detect brain enlargement in ASD depends on the IQ of the TDC sample. We expect little between-group difference when TDCs have high IQs, and greater difference when TDCs have lower IQs. Even a sample well-matched on IQ would be expected to show little difference if both groups are high in IQ. Failing to include lower-IQ TDC participants in imaging studies may bias results toward null brain volume differences between ASD and TDC, and contribute to controversy over the persistence of brain enlargement in ASD. Although the IQ-by-diagnosis interaction was not significant in the Yale sample, there are several reasons to believe the CHOP dataset is superior in accuracy and sensitivity (i.e., greater sample size, 3T versus 1.5T scanner, improved scan sequences, diversity of sex, superior matching of demographics). Post hoc power analyses using the effect sizes of the interaction obtained in the CHOP sample indicate that the power to detect this interaction in the Yale sample was 0.53 for TBV, 0.65 for GMV, and 0.29 for WMV. Thus, even the relatively large Yale dataset was likely underpowered to detect these interactions.
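
The dependence of the expected group difference on IQ can be sketched with two straight lines, one per group. The coefficients below are hypothetical and chosen only for illustration; they are not our fitted estimates. The sketch shows point predictions only, whereas a regions-of-significance analysis additionally uses the standard error of the conditional group effect to locate the IQ values at which that effect is statistically significant.

```python
def predicted_group_difference(iq, difference_at_center, tdc_slope,
                               asd_slope=0.0, iq_center=100.0):
    """Predicted ASD-minus-TDC volume difference (cm^3) at a given IQ.

    Assumes one regression line per group; the lines differ by
    `difference_at_center` at `iq_center` and have the given slopes.
    """
    return difference_at_center + (asd_slope - tdc_slope) * (iq - iq_center)


# Hypothetical coefficients: a 36 cm^3 difference at IQ 100, a TDC slope of
# 1.5 cm^3 per IQ point, and no IQ-volume relationship in the ASD group.
for iq in (85, 100, 115, 130):
    diff = predicted_group_difference(iq, difference_at_center=36.0, tdc_slope=1.5)
    print(f"IQ {iq}: predicted ASD - TDC difference = {diff:+.1f} cm^3")
```

With a positive slope in the TDC group and a flat slope in the ASD group, the predicted difference shrinks as IQ rises, which is why a study that recruits only high-IQ participants would be expected to find little or no enlargement.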

Although the expected sex effects were observed (i.e., larger brains in males than females), no sex-by-diagnosis interactions were observed in any of the measures. These findings suggest that diagnostic group differences in global brain morphology are not related to sex.

In the TDC sample, higher parental educational attainment (a proxy for socioeconomic status, SES) was associated with increased TBV and higher child IQ. These findings are both consistent with a theoretical model of brain structure mediating the relationship between parent SES and child cognitive ability [55]. Interestingly, follow-up analyses found that the relationship between parental education and child’s IQ is attenuated in the ASD sample, and that the relationship between parental education and child’s brain volume is weak and reversed in ASD. These findings suggest that the mechanisms that result in enlarged brains in ASD disrupt the typical relationship between SES and neurocognitive development, as well as the relationship between brain size and cognitive ability. Additionally, including the interaction of parental education-by-diagnosis in a regression model predicting TBV increases the partial η² value of the main effect of diagnosis (from 0.05 to 0.11). This finding highlights the importance of obtaining information about cognitive ability and educational attainment of parents. Such information allows for the study of not only the autistic individual’s ability, but also how much that ability deviates from predicted familial relationships in the absence of ASD.
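
For readers less familiar with the effect size reported here, partial eta squared is the effect sum of squares divided by the sum of the effect and error sums of squares [65]. The sketch below uses hypothetical sums of squares, chosen only so that the results match the two reported values; the function name `partial_eta_squared` is likewise an assumption.

```python
def partial_eta_squared(ss_effect, ss_error):
    """Partial eta squared: SS_effect / (SS_effect + SS_error)."""
    if ss_effect < 0 or ss_error < 0 or ss_effect + ss_error == 0:
        raise ValueError("sums of squares must be non-negative and not both zero")
    return ss_effect / (ss_effect + ss_error)


# Hypothetical sums of squares for the diagnosis effect, before and after
# adding the parental education-by-diagnosis interaction to the model.
without_interaction = partial_eta_squared(ss_effect=5.0, ss_error=95.0)
with_interaction = partial_eta_squared(ss_effect=11.0, ss_error=89.0)
print(round(without_interaction, 2), round(with_interaction, 2))
```

Because adding an interaction term can change both the variance attributed to diagnosis and the residual variance, the two values describe somewhat different effects and should be compared with that in mind.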


Our samples’ age range (6–25 years) is a significant limitation to our ability to fully evaluate the Early Brain Overgrowth hypothesis. Normalization is proposed to occur immediately following the period of overgrowth [20], with brain sizes of ASD and TDC equalizing by approximately age 5. Therefore, it is possible that the magnitude of the group differences observed in our sample would have been larger had the participants been assessed as toddlers. A second limitation is that our data are cross-sectional. This would be most problematic if there were a systematic difference in the brain volumes of individuals who chose to participate at different ages. The ideal test of the Early Brain Overgrowth hypothesis would follow a cohort prospectively from diagnosis as a toddler to adulthood. Longitudinal study is particularly important to assess individual differences in growth trajectories. For example, it is possible that a subset of our sample had larger brains relative to peers as toddlers, experienced normalization, and now have average-sized brains, while other individuals’ brain size did not normalize and remained enlarged. The group differences we report demonstrate clearly that brain volume differences do persist at a group level in autistic adolescents and adults, and further longitudinal study should investigate the potential clinical implications of differing individual trajectories.


In summary, this work provides evidence that brain volume does not normalize by school age, adolescence, or young adulthood in ASD. While the effect sizes obtained in both samples are somewhat smaller than those often reported in samples of toddlers, enlargement remains. As we do not have MRIs from younger ages, it is possible that some degree of normalization occurred prior to the present measurements; if so, this normalization was not complete. It is important that cellular and molecular researchers understand this developmental context in the search for mechanisms that might account for brain overgrowth in ASD.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author, upon reasonable request.


  1. We use identity-first language per our stakeholder preference [45].



ASD: Autism spectrum disorder

GMV: Gray matter volume

TBV: Total brain volume

TDC: Typically developing control

WMV: White matter volume


  1. Amaral DG, Li D, Libero L, Solomon M, Van de Water J, Mastergeorge A, Naigles L, Rogers S, Wu Nordahl C. In pursuit of neurophenotypes: The consequences of having autism and a big brain. Autism Res. 2017.

  2. Auzias G, Breuil C, Takerkart S, Deruelle C. Detectability of brain structure abnormalities related to autism through MRI-derived measures from multiple scanners. 2014. p. 314–317.

  3. Aylward EH, Minshew NJ, Field K, Sparks BF, Singh N. Effects of age on brain volume and head circumference in autism. Neurology. 2002;59(2):175–83.

  4. Bailey A, Le Couteur A, Gottesman I, Bolton P, Simonoff E, Yuzda E, Rutter M. Autism as a strongly genetic disorder: Evidence from a British twin study. Psychol Med. 1995;25(1):63–77.

  5. Baio J. Prevalence of Autism Spectrum Disorder Among Children Aged 8 Years—Autism and Developmental Disabilities Monitoring Network, 11 Sites, United States, 2014. MMWR. Surveillance Summaries. 2018:67.

  6. Bartholomeusz HH, Courchesne E, Karns CM. Relationship Between Head Circumference and Brain Volume in Healthy Normal Toddlers, Children, and Adults. Neuropediatrics. 2002;33(5):239–41. doi:10.1055/s-2002-36735.

  7. Bauer DJ, Curran PJ. Probing Interactions in Fixed and Multilevel Regression: Inferential and Graphical Techniques. Multivariate Behavioral Research. 2005;40(3):373–400.

  8. Beacher FD, Minati L, Baron-Cohen S, Lombardo MV, Lai M-C, Gray MA, Harrison NA, Critchley HD. Autism Attenuates Sex Differences in Brain Structure: A Combined Voxel-Based Morphometry and Diffusion Tensor Imaging Study. Am J Neuroradiol. 2012;33(1):83–9.

  9. Bernier R, Golzio C, Xiong B, Stessman HA, Coe BP, Penn O, Witherspoon K, Gerdts J, Baker C, Vulto-van Silfhout AT, Schuurs-Hoeijmakers JH, Fichera M, Bosco P, Buono S, Alberti A, Failla P, Peeters H, Steyaert J, Vissers LELM, et al. Disruptive CHD8 mutations define a subtype of autism early in development. Cell. 2014;158(2):263–76.

  10. Bloss CS, Courchesne E. MRI Neuroanatomy in Young Girls With Autism: A Preliminary Study. J Am Acad Child Adolesc Psychiatry. 2007;46(4):515–23.

  11. Bölte S, Westerwald E, Holtmann M, Freitag C, Poustka F. Autistic Traits and Autism Spectrum Disorders: The Clinical Validity of Two Measures Presuming a Continuum of Social Communication Skills. J Autism Dev Disord. 2011;41(1):66–72.

  12. Bonilha L, Cendes F, Rorden C, Eckert M, Dalgalarrondo P, Li LM, Steiner CE. Gray and white matter imbalance – Typical structural abnormality underlying classic autism? Brain Dev. 2008;30(6):396–401.

  13. Bourgeron T. A synaptic trek to autism. Curr Opin Neurobiol. 2009;19(2):231–4.

  14. Button KS, Ioannidis JPA, Mokrysz C, Nosek BA, Flint J, Robinson ESJ, Munafò MR. Power failure: Why small sample size undermines the reliability of neuroscience. Nat Rev Neurosci. 2013;14(5):365–76.

  15. Carper RA, Moses P, Tigue ZD, Courchesne E. Cerebral lobes in autism: Early hyperplasia and abnormal age effects. NeuroImage. 2002;16(4):1038–51.

  16. Casanova MF, van Kooten IAJ, Switala AE, van Engeland H, Heinsen H, Steinbusch HWM, Hof PR, Trippe J, Stone J, Schmitz C. Minicolumnar abnormalities in autism. Acta Neuropathologica. 2006;112(3):287.

  17. Chaste P, Klei L, Sanders SJ, Murtha MT, Hus V, Lowe JK, Willsey AJ, Moreno-De-Luca D, Yu TW, Fombonne E, Geschwind D, Grice DE, Ledbetter DH, Lord C, Mane SM, Lese Martin C, Martin DM, Morrow EM, Walsh CA, et al. Adjusting head circumference for covariates in autism: Clinical correlates of a highly heritable continuous trait. Biological Psychiatry. 2013;74(8):576–84.

  18. Clements CC, Watkins MW, Schultz RT, Yerys BE. Does the Factor Structure of IQ Differ Between the Differential Ability Scales (DAS-II) Normative Sample and Autistic Children? Autism Res. 2020.

  19. Constantino J, Gruber C. The Social Responsiveness Scale Manual, Second Edition (SRS-2). Western Psychological Services; 2012.

  20. Courchesne E. Brain development in autism: Early overgrowth followed by premature arrest of growth. Ment Retard Dev Disabil Res Rev. 2004;10(2):106–11.

  21. Courchesne E, Carper R, Akshoomoff N. Evidence of brain overgrowth in the first year of life in autism. JAMA. 2003;290(3):337–44.

  22. Courchesne E, Karns CM, Davis HR, Ziccardi R, Carper RA, Tigue ZD, Chisum HJ, Moses P, Pierce K, Lord C, Lincoln AJ, Pizzo S, Schreibman L, Haas RH, Akshoomoff NA, Courchesne RY. Unusual brain growth patterns in early life in patients with autistic disorder: An MRI study. Neurology. 2001;57(2):245–54.

  23. Dale AM, Fischl B, Sereno MI. Cortical Surface-Based Analysis: I. Segmentation and Surface Reconstruction. NeuroImage. 1999;9(2):179–94.

  24. Di Martino A, Yan C-G, Li Q, Denio E, Castellanos FX, Alaerts K, Anderson JS, Assaf M, Bookheimer SY, Dapretto M, Deen B, Delmonte S, Dinstein I, Ertl-Wagner B, Fair DA, Gallagher L, Kennedy DP, Keown CL, Keysers C, et al. The autism brain imaging data exchange: Towards a large-scale evaluation of the intrinsic brain architecture in autism. Mol Psychiatry. 2014;19(6):659–67.

  25. Ecker C, Shahidiani A, Feng Y, Daly E, Murphy C, D’Almeida V, Deoni S, Williams SC, Gillan N, Gudbrandsen M, Wichers R, Andrews D, Van Hemert L, Murphy DGM. The effect of age, diagnosis, and their interaction on vertex-based measures of cortical thickness and surface area in autism spectrum disorder. J Neural Transm. 2014;121(9):1157–70.

  26. Elliot C. The Differential Abilities Scale, Second Edition. Harcourt Assessments, Inc.; 2007.

  27. Fischl B, Dale AM. Measuring the thickness of the human cerebral cortex from magnetic resonance images. Proc Natl Acad Sci. 2000;97(20):11050–5.

  28. Fischl B, Salat DH, Busa E, Albert M, Dieterich M, Haselgrove C, van der Kouwe A, Killiany R, Kennedy D, Klaveness S, Montillo A, Makris N, Rosen B, Dale AM. Whole Brain Segmentation: Automated Labeling of Neuroanatomical Structures in the Human Brain. Neuron. 2002;33(3):341–55.

  29. Fischl B, Salat DH, van der Kouwe AJW, Makris N, Ségonne F, Quinn BT, Dale AM. Sequence-independent segmentation of magnetic resonance images. NeuroImage. 2004;23(Suppl 1):S69–84.

  30. Fombonne E, Rogé B, Claverie J, Courty S, Frémolle J. Microcephaly and macrocephaly in autism. J Autism Dev Disord. 1999;29(2):113–9.

  31. Fortin J-P, Parker D, Tunç B, Watanabe T, Elliott MA, Ruparel K, Roalf DR, Satterthwaite TD, Gur RC, Gur RE, Schultz RT, Verma R, Shinohara RT. Harmonization of multi-site diffusion tensor imaging data. NeuroImage. 2017;161:149–70.

  32. Freitag CM, Luders E, Hulst HE, Narr KL, Thompson PM, Toga AW, Krick C, Konrad C. Total Brain Volume and Corpus Callosum Size in Medication-Naïve Adolescents and Young Adults with Autism Spectrum Disorder. Biol Psychiatry. 2009;66(4):316–9.

  33. Goldman S, Wang C, Salgado MW, Greene PE, Kim M, Rapin I. Motor stereotypies in children with autism and other developmental disorders. Dev Med Child Neurol. 2009;51(1):30–8.

  34. Gotham K, Pickles A, Lord C. Standardizing ADOS Scores for a Measure of Severity in Autism Spectrum Disorders. J Autism Dev Disord. 2009;39(5):693–705.

  35. Hallahan B, Daly EM, McAlonan G, Loth E, Toal F, O’Brien F, Robertson D, Hales S, Murphy C, Murphy KC, Murphy DGM. Brain morphometry volume in autistic spectrum disorder: A magnetic resonance imaging study of adults. Psychol Med. 2009;39(2):337–46.

  36. Hardan AY, Minshew NJ, Mallikarjuhn M, Keshavan MS. Brain Volume in Autism. J Child Neurol. 2001;16(6):421–4.

  37. Hardan AY, Muddasani S, Vemulapalli M, Keshavan MS, Minshew NJ. An MRI Study of Increased Cortical Thickness in Autism. Am J Psychiatry. 2006;163(7):1290–2.

  38. Hazlett HC, Gu H, McKinstry RC, Shaw DWW, Botteron KN, Dager SR, Styner M, Vachet C, Gerig G, Paterson SJ, Schultz RT, Estes AM, Evans AC, Piven J, IBIS Network. Brain volume findings in 6-month-old infants at high familial risk for autism. Am J Psychiatry. 2012;169(6):601–8.

  39. Hazlett HC, Gu H, Munsell BC, Kim SH, Styner M, Wolff JJ, Elison JT, Swanson MR, Zhu H, Botteron KN, Collins DL, Constantino JN, Dager SR, Estes AM, Evans AC, Fonov VS, Gerig G, Kostopoulos P, McKinstry RC, et al. Early brain development in infants at high risk for autism spectrum disorder. Nature. 2017;542(7641):348–51.

  40. Hazlett HC, Poe M, Gerig G, Smith RG, Provenzale J, Ross A, Gilmore J, Piven J. Magnetic Resonance Imaging and Head Circumference Study of Brain Size in Autism: Birth Through Age 2 Years. Arch Gen Psychiatry. 2005;62(12):1366–76.

  41. Herbert MR, Ziegler DA, Makris N, Filipek PA, Kemper TL, Normandin JJ, Sanders HA, Kennedy DN, Caviness VS. Localization of white matter volume increase in autism and developmental language disorder. Ann Neurol. 2004;55(4):530–40.

  42. Huttenlocher PR. Morphometric study of human cerebral cortex development. Neuropsychologia. 1990;28(6):517–27.

  43. Kanner L. Autistic disturbances of affective contact. Nerv Child. 1943;2:217–50.

  44. Katuwal GJ, Baum SA, Cahill ND, Dougherty CC, Evans E, Evans DW, Moore GJ, Michael AM. Inter-Method Discrepancies in Brain Volume Estimation May Drive Inconsistent Findings in Autism. Front Neurosci. 2016;10.

  45. Kenny L, Hattersley C, Molins B, Buckley C, Povey C, Pellicano E. Which terms should be used to describe autism? Perspectives from the UK autism community. Autism. 2016;20(4):442–62.

  46. Khundrakpam BS, Lewis JD, Kostopoulos P, Carbonell F, Evans AC. Cortical Thickness Abnormalities in Autism Spectrum Disorders Through Late Childhood, Adolescence, and Adulthood: A Large-Scale MRI Study. Cerebral Cortex. 2017;27(3):1721–31.

  47. Kvarven A, Strømland E, Johannesson M. Comparing meta-analyses and preregistered multiple-laboratory replication projects. Nat Hum Behav. 2019:1–12.

  48. Lin L. Bias caused by sampling error in meta-analysis with small sample sizes. PloS One. 2018;13(9):e0204056.

  49. Lindley AA, Benson JE, Grimes C, Cole TM, Herman AA. The relationship in neonates between clinically measured head circumference and brain volume estimated from head CT-scans. Early Hum Dev. 1999;56(1):17–29.

  50. Lord C, Risi S, Lambrecht L, Cook EH, Leventhal BL, DiLavore PC, Pickles A, Rutter M. The autism diagnostic observation schedule-generic: A standard measure of social and communication deficits associated with the spectrum of autism. J Autism Dev Disord. 2000;30.

  51. Masi A, DeMayo MM, Glozier N, Guastella AJ. An Overview of Autism Spectrum Disorder, Heterogeneity and Treatment Options. Neurosci Bull. 2017;33(2):183–93.

  52. McAlonan GM, Cheung V, Cheung C, Suckling J, Lam GY, Tai KS, Yip L, Murphy DGM, Chua SE. Mapping the brain in autism. A voxel-based MRI study of volumetric differences and intercorrelations in autism. Brain. 2005;128(2):268–76.

  53. McDaniel MA. Big-brained people are smarter: A meta-analysis of the relationship between in vivo brain volume and intelligence. Intelligence. 2005;33(4):337–46.

  54. Movsas TZ, Pinto-Martin JA, Whitaker AH, Feldman JF, Lorenz JM, Korzeniewski SJ, Levy SE, Paneth N. Autism Spectrum Disorder is associated with ventricular enlargement in a Low Birth Weight Population. J Pediatr. 2013;163(1):73–8.

  55. Noble KG, Houston SM, Brito NH, Bartsch H, Kan E, Kuperman JM, Akshoomoff N, Amaral DG, Bloss CS, Libiger O, Schork NJ, Murray SS, Casey BJ, Chang L, Ernst TM, Frazier JA, Gruen JR, Kennedy DN, Van Zijl P, et al. Family income, parental education and brain structure in children and adolescents. Nat Neurosci. 2015;18(5):773–8.

  56. Ohta H, Nordahl CW, Iosif A-M, Lee A, Rogers S, Amaral DG. Increased Surface Area, but not Cortical Thickness, in a Subset of Young Boys With Autism Spectrum Disorder: Cortical thickness in autism spectrum disorder. Autism Res. 2016;9(2):232–48.

  57. Palmen SJMC, Pol HEH, Kemner C, Schnack HG, Janssen J, Kahn RS, van Engeland H. Larger Brains in Medication Naive High-Functioning Subjects with Pervasive Developmental Disorder. J Autism Dev Disord. 2004;34(6):603–13.

  58. Panizzon MS, Fennema-Notestine C, Eyler LT, Jernigan TL, Prom-Wormley E, Neale M, Jacobson K, Lyons MJ, Grant MD, Franz CE, Xian H, Tsuang M, Fischl B, Seidman L, Dale A, Kremen WS. Distinct Genetic Influences on Cortical Surface Area and Cortical Thickness. Cerebral Cortex (New York, NY). 2009;19(11):2728–35.

  59. Papademetris X, Jackowski MP, Rajeevan N, DiStasio M, Okuda H, Constable RT, Staib LH. BioImage Suite: An integrated medical image analysis suite: An update. Insight J. 2006;2006:209.

  60. Pietschnig J, Penke L, Wicherts JM, Zeiler M, Voracek M. Meta-analysis of associations between human brain volume and intelligence differences: How strong are they and what do they mean? Neuroscience & Biobehavioral Reviews. 2015;57:411–32.

  61. Piven J, Arndt S, Bailey J, Havercamp S, Andreasen N, Palmer P. An MRI study of brain size in autism. Am J Psychiatry. 1995;152(8):1145–9.

  62. Raznahan A, Lenroot R, Thurm A, Gozzi M, Hanley A, Spence SJ, Swedo SE, Giedd JN. Mapping cortical anatomy in preschool aged children with autism using surface-based morphometry. NeuroImage Clin. 2013;2:111–9.

  63. Redcay E, Courchesne E. When Is the Brain Enlarged in Autism? A Meta-Analysis of All Brain Size Reports. Biol Psychiatry. 2005;58(1):1–9.

  64. Reiss AL, Abrams MT, Singer HS, Ross JL, Denckla MB. Brain development, gender and IQ in children. A volumetric imaging study. Brain: A Journal of Neurology. 1996;119(Pt 5):1763–74.

  65. Richardson JTE. Eta squared and partial eta squared as measures of effect size in educational research. Educ Res Rev. 2011;6(2):135–47.

  66. Rutter M, Bailey A, Lord C, et al. Social Communication Questionnaire. Western Psychological Services; 2003.

  67. Rutter M, Le Couteur A, Lord C, Faggioli R. ADI-R: Autism diagnostic interview—Revised: Manual. Organizzazioni speciali: OS; 2005.

  68. Sacco R, Gabriele S, Persico AM. Head circumference and brain size in autism spectrum disorder: A systematic review and meta-analysis. Psychiatry Research: Neuroimaging. 2015;234(2):239–51.

  69. Sanders SJ. First glimpses of the neurobiology of autism spectrum disorder. Curr Opin Genet Dev. 2015;33:80–92.

  70. Scheel C, Rotarska-Jagiela A, Schilbach L, Lehnhardt FG, Krug B, Vogeley K, Tepest R. Imaging derived cortical thickness reduction in high-functioning autism: Key regions and temporal slope. NeuroImage. 2011;58(2):391–400.

  71. Shi F, Wang L, Dai Y, Gilmore JH, Lin W, Shen D. LABEL: Pediatric brain extraction using learning-based meta-algorithm. NeuroImage. 2012;62(3):1975–86.

  72. Simonoff E, Pickles A, Charman T, Chandler S, Loucas T, Baird G. Psychiatric Disorders in Children With Autism Spectrum Disorders: Prevalence, Comorbidity, and Associated Factors in a Population-Derived Sample. J Am Acad Child Adolesc Psychiatry. 2008;47(8):921–9.

  73. Smith E, Thurm A, Greenstein D, Farmer C, Swedo S, Giedd J, Raznahan A. Cortical thickness change in autism during early childhood. Human Brain Mapping. 2016;37(7):2616–29.

  74. Smith SM. Fast robust automated brain extraction. Human Brain Mapping. 2002;17(3):143–55.

  75. Sparks BF, Friedman SD, Shaw DW, Aylward EH, Echelard D, Artru AA, Maravilla KR, Giedd JN, Munson J, Dawson G, Dager SR. Brain structural abnormalities in young children with autism spectrum disorder. Neurology. 2002;59(2):184–92.

  76. Stanfield AC, McIntosh AM, Spencer MD, Philip R, Gaur S, Lawrie SM. Towards a neuroanatomy of autism: A systematic review and meta-analysis of structural magnetic resonance imaging studies. European Psychiatry: The Journal of the Association of European Psychiatrists. 2008;23(4):289–99.

  77. Strimbu K, Tavel JA. What are Biomarkers? Curr Opin HIV AIDS. 2010;5(6):463–6.

  78. Styner MA, Charles HC, Park J, Gerig G. Multisite validation of image analysis methods: Assessing intra- and intersite variability. 2002;4684:278–286.

  79. Tang G, Gudsnuk K, Kuo S-H, Cotrina ML, Rosoklija G, Sosunov A, Sonders MS, Kanter E, Castagna C, Yamamoto A, Yue Z, Arancio O, Peterson BS, Champagne F, Dwork AJ, Goldman J, Sulzer D. Loss of mTOR-Dependent Macroautophagy Causes Autistic-like Synaptic Pruning Deficits. Neuron. 2014;83(5):1131–43.

  80. Tate DF, Bigler ED, McMahon W, Lainhart J. The Relative Contributions of Brain, Cerebrospinal Fluid-Filled Structures and Non-Neural Tissue Volumes to Occipital-Frontal Head Circumference in Subjects with Autism. Neuropediatrics. 2007;38(01):18–24.

  81. Tick B, Bolton P, Happé F, Rutter M, Rijsdijk F. Heritability of autism spectrum disorders: A meta-analysis of twin studies. J Child Psychol Psychiatry. 2016;57(5):585–95.

  82. Tunç B, Yankowitz LD, Parker D, Alappatt JA, Pandey J, Schultz RT, Verma R. Deviation from normative brain development is associated with symptom severity in autism spectrum disorder. Mol Autism. 2019;10(1):46.

  83. Tustison NJ, Avants BB, Cook PA, Zheng Y, Egan A, Yushkevich PA, Gee JC. N4ITK: Improved N3 bias correction. IEEE Trans Med Imaging. 2010;29(6):1310–20.


  84. van Kooten IAJ, Palmen SJMC, von Cappeln P, Steinbusch HWM, Korr H, Heinsen H, Hof PR, van Engeland H, Schmitz C. Neurons in the fusiform gyrus are fewer and smaller in autism. Brain. 2008;131(4):987–99.


  85. Wechsler D. Manual for the Wechsler Adult Intelligence Scale—Revised. Psychological Corporation; 1981.

  86. Wechsler D. Wechsler Intelligence Scale for Children: Third Edition manual. Psychological Corporation; 1991.

  87. Wechsler D. Wechsler Adult Intelligence Scale–Third Edition. The Psychological Corporation; 1997.

  88. Wechsler D. Wechsler Abbreviated Scale of Intelligence. The Psychological Corporation: Harcourt Brace & Company; 1999.

  89. Wechsler D. Wechsler Abbreviated Scale of Intelligence–Second Edition (WASI-II). NCS Pearson; 2011.

  90. Wechsler D, Kaplan E, Fein D, Kramer J, Morris R, Delis D, Maelender A. Wechsler intelligence scale for children: Fourth edition (WISC-IV). Pearson; 2003.

  91. Willerman L, Schultz R, Rutledge N, Bigler E. In vivo brain size and intelligence. Intelligence. 1991;15:223–8.


  92. Ypma RJF, Moseley RL, Holt RJ, Rughooputh N, Floris DL, Chura LR, Spencer MD, Baron-Cohen S, Suckling J, Bullmore ET, Rubinov M. Default Mode Hypoconnectivity Underlies a Sex-Related Autism Spectrum. Biological Psychiatry: Cognitive Neuroscience and Neuroimaging. 2016;1(4):364–71.


  93. Yushkevich PA, Piven J, Hazlett HC, Smith RG, Ho S, Gee JC, Gerig G. User-guided 3D active contour segmentation of anatomical structures: Significantly improved efficiency and reliability. NeuroImage. 2006;31(3):1116–28.




Not applicable.


This work was supported by the National Institutes of Health NIMH R01MH073084, RC1MH088791, R21MH098153, and R21MH092615, by NICHD 5U54HD086984, by the National Science Foundation Graduate Research Fellowship Program under Grant No. DGE-132185, by the Pennsylvania Department of Health SAP #4100042728 and 4100047863, by the Robert Wood Johnson Foundation #66727, by Pfizer Inc. (no award number), and by Shire Development LLC (no award number).

Author information

Authors and Affiliations



LDY, JDH, and RTS conceived of the study. JP, BEY, JDH, and RTS conducted and supervised data collection, including MRI acquisition. JP, BEY, and JDH contributed to consensus clinical diagnosis. LDY and JAP performed data quality control. LDY performed the statistical analysis. LDY, JDH, BEY, and RTS contributed to the theoretical approach presented in the manuscript. All authors contributed to writing the manuscript, and they read and approved the final manuscript.

Corresponding author

Correspondence to Lisa D. Yankowitz.

Ethics declarations

Ethics approval and consent to participate

All study procedures were approved by the institutional review board of the Children’s Hospital of Philadelphia or Yale University. Written informed consent was obtained from all participants and their parents or legal guardians.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Additional file 1:

Table S1. Height, race, and parental education for the primary sample. Fisher’s Exact Test was used to test for differences in proportion of race due to small sizes within cells.

Figure S1. Age and IQ distributions in the primary sample.

Figure S2. Age and IQ distributions in the replication sample.

Figure S3. Volume of cerebellar white matter in original and reprocessed images in the primary sample. Low reliability of this measure is clearly driven by one subject.

Figure S4. Ratio of gray to white matter in the primary sample.

Table S2 part 1. CHOP models with effects of IQ, Age, Sex, Diagnosis, IQ*Diagnosis, Age*Diagnosis, Sex*Diagnosis.

Table S2 part 2. CHOP models with effects of IQ, Age, Sex, Diagnosis, IQ*Diagnosis, Age*Diagnosis, Sex*Diagnosis.

Figure S5. Relationships of parental education with (a) brain volume and (b) IQ, in the subset of the CHOP sample for which parental education was available. Within the TDC sample, a positive relationship was observed between parent education and both TBV and IQ. Within the ASD group, the positive relationship between parent education and IQ was attenuated, and the relationship with TBV was reversed.

Table S4. Yale models with effects of IQ, Age, Diagnosis, IQ*Diagnosis, Age*Diagnosis.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. A copy of this licence is available from Creative Commons. The Creative Commons Public Domain Dedication waiver applies to the data made available in this article, unless otherwise stated in a credit line to the data.


About this article


Cite this article

Yankowitz, L.D., Herrington, J.D., Yerys, B.E. et al. Evidence against the “normalization” prediction of the early brain overgrowth hypothesis of autism. Molecular Autism 11, 51 (2020).

