Description of included studies
Study selection is presented with a PRISMA flow diagram (Additional file 1: eAppendix-4.1), and the list of included/excluded full-texts in Additional file 1: eAppendix-4.2/4.3. From 203 eligible trials, 125 trials in children/adolescents (n = 7450 participants) and 18 in adults (n = 1104) were included in the quantitative analysis.
Study characteristics are presented in Additional file 1: eAppendix-5.1 and the distribution of potential effect-modifiers in Additional file 1: eAppendix-6.1. The majority of trials were double-blind (k = 138 studies), placebo-controlled (k = 137), parallel-design (k = 110) and two-arm (k = 125). They were recently published (median publication year of 2015, interquartile range [2008–2019]), had a short duration (12 [8–13] weeks), small sample sizes (40 [23–76]) and few sites (1 [1–3]), which were mainly academic (k = 102 trials had only academic sites).
The median age of participants was 8.2 [6.3–9.5] years in children/adolescents and 24.6 [21.9–27.9] years in adults. The overall male-to-female ratio was 5.3 [3.9–8.2]. Standardized diagnostic criteria were used in most of the studies (95%), and seven studies used only diagnostic evaluation tools. Associated symptoms were required as an inclusion criterion in about a third of the studies, mainly irritability and ADHD symptoms (in 30 trials), and a genetic syndrome (neurofibromatosis-type-I) in one trial [58]. At baseline, the sample was moderately to markedly ill with a CGI-S score of 4.8 [4.4–5.1], and ABC-Irritability of 16.9 [13.3–22.3], and about half of the participants had intellectual disability (50% [0–73.5%]). Nevertheless, reporting of participant characteristics was poor in about two thirds of the studies.
Risk of bias assessment is presented in Additional file 1: eAppendix-5.2. About 25% of the studies had an overall low risk of bias, 55% had moderate and 17% high. About half adequately reported methods of random sequence generation and allocation concealment, and blinding was adequately addressed in about 65%. High risk of bias was assigned in about 26% of studies for incomplete outcome data, 36% for selective reporting and about 12% for other biases, mainly due to baseline imbalance or early trial termination. Finally, about 30% of the studies were funded by industry or their investigators applied for a patent.
Forty-one drugs were investigated in 100 trials (antipsychotics and antidepressants in about a third) and 17 dietary-supplements in 43 trials (Additional file 1: eAppendix-5.1). Interventions were connected in mainly star-shaped networks with placebo as the main node (Additional file 2: Fig. S1). Therefore, we focused on comparisons with placebo (Fig. 1, Additional file 3: Fig. S2), and league tables with all comparisons are presented in Additional file 4: Table S1. The results of pairwise meta-analyses and individual studies are presented in Additional file 5: Fig. S3. In addition, incoherence could not be evaluated when there were no closed loops (i.e., networks for anxiety/depression, quality of life, caregiver stress and all networks in adults). There was no clear indication of incoherence for the rest of the networks, except for irritability, response, weight gain and sedation in children/adolescents for which pairwise meta-analyses were conducted (Additional file 1: eAppendix-6.8).
Primary outcomes
Social-communication difficulties (SCD)
Social-communication difficulties were measured mainly with ABC-L/SW (55%) and VABS-S (18%).
In children/adolescents, social-communication difficulties were improved by risperidone (k = 4 studies in the analysis, n = 133 participants treated with risperidone; SMD = 0.31 95%CI [0.06, 0.55]; low quality of evidence) and aripiprazole (k = 6, n = 341; SMD = 0.27 [0.09, 0.44]; low). Some trends of improvement were noted for folinic acid (k = 2, n = 32, SMD = 0.44 [− 0.05, 0.93]; very low), tideglusib (k = 1, n = 40; SMD = 0.38 [− 0.06, 0.82]; low), omega-3-fatty-acids (k = 10, n = 171; SMD = 0.21 [0.00, 0.43], very low), probiotics (k = 5, n = 92; SMD = 0.21 [− 0.08, 0.51]; low) and bumetanide (k = 4, n = 174; SMD = 0.14 [− 0.08, 0.37]; low). There were no clear differences between other medications and placebo with very low-to-moderate confidence. Heterogeneity was low (τ2 = 0).
In adults, none of the investigated medications (sulforaphane, balovaptan, oxytocin) improved social-communication difficulties with very-low- or low-quality evidence. There were high levels of heterogeneity (τ2 = 0.096).
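As a reading aid for the SMDs and 95% confidence intervals reported in this section, the following minimal sketch back-calculates an approximate standard error and z-value from a reported interval. It assumes a symmetric normal-approximation interval, which is an assumption and not something stated in the text; the function names `se_and_z_from_ci` and `ci_excludes_zero` are hypothetical. The inputs are the risperidone and folinic acid estimates for social-communication difficulties in children/adolescents given above.

```python
from statistics import NormalDist


def se_and_z_from_ci(estimate, lower, upper, level=0.95):
    """Approximate SE and z-value from a confidence interval.

    Assumes the interval is symmetric and based on a normal
    approximation (an assumption made for illustration only).
    """
    crit = NormalDist().inv_cdf(0.5 + level / 2)  # about 1.96 for 95%
    se = (upper - lower) / (2 * crit)
    z = estimate / se
    return se, z


def ci_excludes_zero(lower, upper):
    """True when the interval lies entirely on one side of zero."""
    return lower > 0 or upper < 0


# SMD, lower and upper 95% limits as reported in the text.
reported = {
    "risperidone": (0.31, 0.06, 0.55),
    "folinic acid": (0.44, -0.05, 0.93),
}

for name, (smd, lower, upper) in reported.items():
    se, z = se_and_z_from_ci(smd, lower, upper)
    print(name, round(se, 3), round(z, 2), ci_excludes_zero(lower, upper))
```

Under these assumptions, the larger point estimate for folinic acid comes with roughly twice the standard error of the risperidone estimate, which is why its interval includes zero and the text describes it only as a trend.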
Repetitive behaviors (RB)
Repetitive behaviors were measured mainly with ABC-S (47%) and YBOCS-versions (27%).
In children/adolescents, repetitive behaviors were improved by risperidone (k = 4, n = 133; SMD = 0.60 [0.29, 0.90]; low), aripiprazole (k = 6, n = 322; SMD = 0.48 [0.26, 0.70]; very low), atomoxetine (k = 3, n = 107; SMD = 0.49 [0.18, 0.80]; very low) and bumetanide (k = 4, n = 175; SMD = 0.35 [0.09, 0.62], low). There were trends for valproate (k = 1, n = 9; SMD = 1.33 [− 0.03, 2.68]; very low) and guanfacine (k = 1, n = 30; SMD = 0.55 [− 0.02, 1.11]; low), and no clear differences for other medications with very low-to-moderate confidence. Heterogeneity was low-to-moderate (τ2 = 0.017).
In adults, repetitive behaviors were improved by fluoxetine (k = 1, n = 21; SMD = 1.20 [0.45, 1.96]; low), fluvoxamine (k = 1, n = 15; SMD = 1.04 [0.27, 1.81]; low), risperidone (k = 1, n = 14; SMD = 0.97 [0.21, 1.74]; very low), and oxytocin (k = 6, n = 147; SMD = 0.41 [0.16, 0.66]; moderate). Sulforaphane, balovaptan, milnacipran and citalopram were not found efficacious with very low or low confidence. Heterogeneity was low (τ2 = 0).
Overall core symptoms (OCS)
Overall core symptoms were measured mainly with SRS (47%) and CARS (22%).
In children/adolescents, overall core symptoms were improved by risperidone (k = 3, n = 81; SMD = 1.18 [0.75, 1.61]; very low), and bumetanide (k = 4, n = 189; SMD = 0.61 [0.31, 0.91]; low). There were some trends for haloperidol (k = 3, n = 36; SMD = 0.56 [− 0.03, 1.15]; very low) and carnosine (k = 3, n = 53; SMD = 0.42 [− 0.04, 0.88]; very low), and no clear differences for other medications with very low-to-moderate confidence. There were moderate levels of heterogeneity (τ2 = 0.038) and no indication of incoherence. Nevertheless, a small study (n = 30) [59] that found no difference between risperidone and memantine (SMD = 0.00 [− 0.71, 0.72]) introduced incoherence and was excluded from the primary analysis of this outcome (Additional file 1: eAppendix-6.8), and the results were robust after inclusion of this study (Additional file 6: Fig. S4).
In adults, none of the investigated medications (risperidone, sulforaphane, balovaptan and oxytocin) was found to be more efficacious than placebo in reducing overall core symptoms, though a trend was noted for sulforaphane (k = 2, n = 53; SMD = 0.38 [− 0.05, 0.81]; low). Confidence in evidence was very low or low. Heterogeneity was low (τ2 = 0).
Sensitivity analysis
The results did not materially change in sensitivity analyses (Additional file 1: eAppendix-6.6, Additional file 6: Fig. S4). There were some potential differences in omega-3-fatty-acids. Omega-3-fatty-acids did not reduce social-communication difficulties in children/adolescents when studies on associated symptoms were excluded (k = 6, n = 112, SMD = 0.05 [− 0.21, 0.32]) or when clinician-ratings were used (k = 3, n = 53, SMD = 0.03 [− 0.36, 0.42]). Yet, their effect-size was larger when ABC-L/SW was used (k = 6, n = 79, SMD = 0.45 [0.13, 0.77]). In addition, the results for some interventions, i.e., folinic acid, carnosine, vitamin-D, were not robust in sensitivity analyses, which were based on one or two small trials with potentially inflated effect-sizes.
Small-study effects and publication bias
There was asymmetry in funnel plots for social-communication difficulties in children/adolescents, indicating small-study effects (Additional file 1: eAppendix-6.8). Funnel plots for the other co-primary outcomes were inconclusive. Reporting bias was suspected for some medications, and quality of evidence was downgraded accordingly (Additional file 1: eAppendix-6.9).
Secondary outcomes
Irritability
Irritability was measured mainly with ABC-I (83%).
In children/adolescents, there was evidence of incoherence (none of the closed loops were incoherent, but p-design-by-treatment = 0.014) and pairwise meta-analyses were conducted. Irritability was improved by risperidone (k = 4 studies in the analysis, n = 138 participants treated with risperidone; SMD = 1.05 [0.76, 1.33], τ2 = 0.02), sulforaphane (k = 1, n = 12; SMD = 0.97 [0.12, 1.83]), aripiprazole (k = 5, n = 312; SMD = 0.63 [0.44, 0.82], τ2 = 0), and citalopram (k = 1, n = 73; SMD = 0.37 [0.04, 0.69]), and there were trends for guanfacine (k = 1, n = 30; SMD = 0.50 [0.00, 1.01]) and riluzole (k = 1, n = 29; SMD = 0.43 [− 0.09, 0.95]). On the other hand, irritability was worsened by vitamin-B12 (k = 1, n = 27; SMD = − 0.62 [− 1.19, − 0.05]) and levetiracetam (k = 1, n = 10; SMD = − 1.47 [− 2.48, − 0.46]).
In adults, risperidone was found efficacious (k = 1, n = 14; SMD = 1.19 [0.34, 2.04]), and heterogeneity was moderate (τ2 = 0.028).
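The pooled SMDs and τ2 values reported for the pairwise meta-analyses above come from random-effects models. The sketch below shows one common way such a pooled estimate and τ2 can be computed, using the DerSimonian-Laird estimator. Which estimator was actually used is not stated in this section, so the choice is an assumption, the function name `dersimonian_laird` is hypothetical, and the study-level inputs are invented for illustration and are not data from the included trials.

```python
import math


def dersimonian_laird(effects, variances):
    """Random-effects pooling with the DerSimonian-Laird tau^2 estimator.

    effects: study-level effect sizes (e.g. SMDs)
    variances: their within-study variances
    Returns (pooled effect, standard error, tau^2).
    """
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    # Fixed-effect (inverse-variance) pooled estimate.
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    # Cochran's Q statistic and its degrees of freedom.
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    se = math.sqrt(1.0 / sum_re)
    return pooled, se, tau2


# Hypothetical study-level SMDs and variances (NOT from the review).
example_effects = [0.9, 1.2, 1.0, 1.1]
example_variances = [0.04, 0.09, 0.05, 0.06]

pooled, se, tau2 = dersimonian_laird(example_effects, example_variances)
lower, upper = pooled - 1.96 * se, pooled + 1.96 * se
print(round(pooled, 2), round(lower, 2), round(upper, 2), round(tau2, 3))
```

When the observed between-study variation is no larger than expected from sampling error alone, the estimator truncates τ2 at zero, which is consistent with the several τ2 = 0 values reported in this section.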
ADHD symptoms
ADHD symptoms were measured in the majority of the studies with ABC-H (79%).
In children/adolescents, ADHD symptoms were improved by olanzapine (k = 1, n = 6; SMD = 2.08 [0.48, 3.68], based only on indirect evidence), guanfacine (k = 1, n = 30; SMD = 1.39 [0.73, 2.05]), aripiprazole (k = 7, n = 363; SMD = 0.82 [0.59, 1.05]), risperidone (k = 5, n = 155; SMD = 0.79 [0.47, 1.11]), naltrexone (k = 1, n = 23; SMD = 0.85 [0.12, 1.59]), and atomoxetine (k = 3, n = 107; SMD = 0.64 [0.30, 0.99]), and a trend was noted for sulforaphane (k = 1, n = 12; SMD = 0.88 [− 0.03, 1.80]). Heterogeneity was moderate (τ2 = 0.032).
In adults, none of the investigated medications were found efficacious for ADHD symptoms, and heterogeneity was low (τ2 = 0).
Anxiety/depressive symptoms
Different scales measured anxiety/depression in children/adolescents (e.g., CBCL-I, BASC-I, CASI, DBC-Anxiety), and STAI-state was used in half of the studies in adults. None of the investigated medications was found to improve anxiety or depressive symptoms, except for a trend for risperidone in adults (k = 1, n = 14; SMD = 0.67 [− 0.07, 1.41]). There were moderate-to-high levels of heterogeneity in children/adolescents (τ2 = 0.041) and low in adults (τ2 = 0).
Caregiver stress
Caregiver stress was measured mainly with PSI (36%), CSQ (22%) and CGSQ (14%) in children/adolescents, and with PedsQL-Family Impact in adults. In children/adolescents, it was reduced by melatonin (k = 1, n = 54; SMD = 0.51 [0.12, 0.91]), and there were trends of small improvements by cannabinoids (k = 1, n = 80; SMD = 0.32 [− 0.06, 0.69]) and atomoxetine (k = 3, n = 104; SMD = 0.21 [− 0.06, 0.48]). There were no clear differences between other medications and placebo in either age group, and heterogeneity was low (τ2 = 0).
Global functioning
Global functioning was measured with GAF or CGAS. In children/adolescents, it was improved by risperidone (k = 3, n = 62, SMD = 0.83 [0.40, 1.26]) and aripiprazole (k = 2, n = 69, SMD = 0.75 [0.33, 1.17]). No clear differences between other investigated medications and placebo were found in either age group. Heterogeneity was moderate in children/adolescents (τ2 = 0.016) and low in adults (τ2 = 0).
Quality of life
Quality of life was measured with PedsQL in children/adolescents, and with PedsQL (40%) and WHO-QOL (60%) in adults. There were no clear differences between medications and placebo in children/adolescents. In adults, quality of life was improved by balovaptan (k = 2, n = 217; SMD = 0.22 [0.02, 0.43]), and potentially by oxytocin (k = 3, n = 41; SMD = 0.44 [− 0.02, 0.90]). Heterogeneity was low in both age groups (τ2 = 0).
Response
Pairwise meta-analyses were conducted in children/adolescents due to incoherence (50% of the closed loops were incoherent; p-design-by-treatment = 0.068). In comparison with placebo, more participants responded with risperidone (k = 5, n = 161; OR = 11.33 [4.99, 25.70]; τ2 = 0.294), guanfacine (k = 1, n = 30; OR = 9.67 [2.41, 38.71]), whey-protein (k = 1, n = 22; OR = 4.56 [1.25, 16.63]), aripiprazole (k = 5, n = 317; OR = 4.26 [2.32, 7.83]; τ2 = 0.212), vitamin-B12 (k = 1, n = 28; OR = 3.83 [1.20, 12.28]), atomoxetine (k = 3, n = 109; OR = 3.18 [1.56, 6.48]; τ2 = 0), melatonin (k = 1, n = 60; OR = 3.06 [1.38, 6.77]), bumetanide (k = 3, n = 155; OR = 2.78 [1.48, 5.21]; τ2 = 0), and cannabinoids (k = 1, n = 100; OR = 2.56 [1.15, 5.70]), while fewer with oral human immunoglobulins (IGOH) (k = 1, n = 94; OR = 0.40 [0.16, 0.99]). There were no clear differences for other medications.
In adults, there were more responders with risperidone (k = 1, n = 15; OR = 37.40 [1.62, 865.22]) and fluvoxamine (k = 1, n = 15; OR = 35.13 [1.52, 814.72]). There were high levels of heterogeneity (τ2 = 0.257).
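The very wide intervals for the adult odds ratios are easier to interpret on the log scale, where confidence intervals for odds ratios are typically symmetric. The following minimal sketch checks this for two of the reported estimates and derives an approximate standard error of the log odds ratio. It assumes a Wald-type 95% interval built on the log scale, which is an assumption and not stated in the text; the function name `log_scale_summary` is hypothetical.

```python
import math


def log_scale_summary(lower, upper, crit=1.96):
    """Summarise an odds-ratio CI on the log scale.

    Assumes the interval was built as exp(log(OR) +/- crit * SE),
    an assumption made for illustration only.
    Returns (geometric midpoint of the interval, SE of log OR).
    """
    log_lower, log_upper = math.log(lower), math.log(upper)
    midpoint = math.exp((log_lower + log_upper) / 2)
    se_log_or = (log_upper - log_lower) / (2 * crit)
    return midpoint, se_log_or


# Reported OR with its 95% limits, as given in the text.
reported = {
    "risperidone, children/adolescents": (11.33, 4.99, 25.70),
    "risperidone, adults": (37.40, 1.62, 865.22),
}

for name, (odds_ratio, lower, upper) in reported.items():
    midpoint, se_log_or = log_scale_summary(lower, upper)
    print(name, odds_ratio, round(midpoint, 2), round(se_log_or, 2))
```

In both cases the geometric midpoint of the interval lies close to the reported point estimate, and the standard error for the single small adult trial is several times larger than for the pooled estimate in children/adolescents, so the adult result should be read as very imprecise.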
Dropouts due to any cause
In children/adolescents, fewer overall dropouts were noted with risperidone (k = 10, n = 274; OR = 0.38 [0.22, 0.65]), lurasidone (k = 1, n = 100; OR = 0.35 [0.14, 0.88]) and aripiprazole (k = 8, n = 399; OR = 0.46 [0.29, 0.75]), as well as potentially with melatonin (k = 4, n = 239; OR = 0.52 [0.26, 1.03]). More dropouts were observed with arbaclofen (k = 1, n = 76; OR = 3.39 [1.16, 9.88]), and a trend was noted for fluoxetine (k = 3, n = 161; OR = 1.59 [0.97, 2.58]). There were no clear differences for other medications, and there were some indications of incoherence (12.5% of the loops were incoherent; p-design-by-treatment = 0.334). In adults, there were no clear differences for the investigated medications. Heterogeneity was low in both age groups (τ2 = 0.006 and τ2 = 0).
Dropouts due to adverse events
There were no clear differences between investigated medications and placebo in either age group, and heterogeneity was low (τ2 = 0).
Any adverse event
In children/adolescents, more participants had adverse events with risperidone (k = 4, n = 123; OR = 4.74 [2.24, 10.04]), citalopram (k = 1, n = 73; OR = 5.38 [1.14, 25.46]), fluvoxamine (k = 1, n = 18; OR = 4.50 [1.02, 19.90]) and aripiprazole (k = 6, n = 348; OR = 2.62 [1.65, 4.15]), as well as potentially with guanfacine (k = 1, n = 30; OR = 17.94 [0.98, 329.56]) and lurasidone (k = 1, n = 100; OR = 1.92 [0.95, 3.90]). In adults, more participants had adverse events with risperidone (k = 1, n = 15; OR = 14.30 [2.19, 93.37]). There were no clear differences between other medications and placebo. Heterogeneity was low in children/adolescents (τ2 = 0) and moderate in adults (τ2 = 0.049).
Sedation
In children/adolescents, pairwise meta-analyses were conducted due to incoherence (75% of the closed loops were incoherent; p-design-by-treatment = 0.051). More participants had sedation with guanfacine (k = 1, n = 30; OR = 62.83 [12.84, 307.45]), haloperidol (k = 1, n = 20; OR = 44.33 [4.78, 410.96]), risperidone (k = 4, n = 142; OR = 11.95 [5.86, 24.36]; τ2 = 0), aripiprazole (k = 5, n = 317; OR = 3.56 [1.62, 7.86]; τ2 = 0) and melatonin (k = 1, n = 60; OR = 3.28 [1.25, 8.59]).
In adults, there were no clear differences, and heterogeneity was low (τ2 = 0).
Weight gain
In children/adolescents, there was evidence of incoherence (50% of the closed loops were incoherent; p-design-by-treatment = 0.032) and pairwise meta-analyses were conducted. More participants had weight gain with aripiprazole (k = 5, n = 317; OR = 3.78 [2.09, 6.84], τ2 = 0) and risperidone (k = 5, n = 161; OR = 3.39 [1.80, 6.38], τ2 = 0) in comparison with placebo, while aripiprazole caused less weight gain in comparison with risperidone (k = 2, n = 104; OR = 0.22 [0.09, 0.55], τ2 = 0.045). There were no clear differences between other medications.
In adults, none of the investigated medications (sulforaphane, oxytocin and balovaptan) was associated with weight gain, and heterogeneity was low (τ2 = 0).
Extrapyramidal symptoms
The network of children/adolescents was disconnected; therefore, pairwise meta-analyses were conducted. In comparison with placebo, more participants had extrapyramidal symptoms with risperidone (k = 4, n = 142; OR = 3.02 [1.22, 7.48]; τ2 = 0) and aripiprazole (k = 4, n = 300; OR = 2.38 [1.18, 4.77]; τ2 = 0).
There were no data available for adults.