
Table 3 Governance structure of LEAP, pre-registration, quality control, and reporting of findings

From: The EU-AIMS Longitudinal European Autism Project (LEAP): design and methodologies to identify and validate stratification biomarkers for autism spectrum disorders

Data analysis is split into expert core analysis groups, broadly defined by data modality (e.g. clinical measures, cognition, EEG, structural MRI, functional MRI). Each group leads core analyses and coordinates modality-relevant exploratory bottom-up projects. Core analysis groups are closely linked to each other and to ‘cross-cutting’ interest groups (e.g. sex differences, excitatory-inhibitory balance).
Registration of projects: All individual projects (whether part of core analyses or bottom-up projects) are pre-registered on an internal website and shared among the group. Project information includes lead and senior investigators, active collaborators, primary and secondary project goals, and an outline of core measures and methodologies. Individual login details for the central EU-AIMS database are given upon project review and approval.
Quality control, standardisation of definitions and analyses: To maximise coherence and comparability between projects, expert groups lead on modality-specific quality control procedures, which are documented and shared. Where applicable, processing and analysis scripts are also shared to increase transparency and enable replication. Expert groups provide study-wide recommendations, including, for example, a core set of clinical outcome measures, the use of specific covariates, particular analysis approaches pertaining to a given data modality, procedures to correct for multiple comparisons (e.g. permutations), and a priori decisions as to whether and when the data set should be split into a test/replication sample (depending on whether exact or approximate external validation data sets are available). For example, for cognitive analyses, entering IQ as a covariate is not recommended, as in the present cohort IQ is partially collinear with group status [116]. For all but machine learning approaches, the data set is not split into test and replication samples (e.g. 70:30): for cross-domain or cross-modal analyses, data loss due to missing values is expected and the number and size of empirically derived subgroups are unknown a priori, so a replication sample would likely have limited power to replicate findings. In these instances, internal cross-validation strategies (e.g. bootstrapping) should be used. For neuroimaging analyses, core analysis groups carry out centralised pre-processing using a homogeneous automated motion detection algorithm and several quality control procedures, based on consensus agreement on specific parameters, as well as first-level values, e.g. of cortical thickness/surface area. For second-level neuroimaging analyses, parametric and non-parametric permutation-based inference methods will be applied depending on the distribution properties of the data. While parametric analyses offer the advantage of efficiency and reproducibility if the underlying distribution assumptions are met, non-parametric approaches offer greater robustness when normality assumptions are violated.
These efforts are aimed at increasing consistency between individual projects/analyses, reducing duplication of effort, and allowing LEAP researchers to benefit from each other’s expertise. In addition, we aim to create a culture that discourages practices such as ‘undisclosed analytic flexibility’, i.e. using multiple approaches for one analysis question but reporting only the ‘best’ results (‘fishing’, ‘p value hunting’). However, to strike a balance between standardisation and supporting novel or different approaches, all LEAP researchers can access raw data and use different pre-processing methods or outcome measures, as long as these choices are justified a priori in a project proposal and/or the number of analyses performed is reported and appropriately corrected for.
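As an illustration of the permutation-based correction for multiple comparisons mentioned above, the following is a minimal sketch, not the LEAP analysis pipeline. It assumes a simple two-group design, uses the absolute difference in group means as the test statistic, and controls the family-wise error rate with the maximum statistic across measures. The function name `max_stat_permutation_test`, the synthetic data, and the assumption that all measures are on a comparable (e.g. z-scored) scale are hypothetical choices made for this example.

```python
import numpy as np


def max_stat_permutation_test(x, groups, n_perm=2000, seed=0):
    """Two-group permutation test with max-statistic correction.

    x      : array of shape (n_subjects, n_measures), measures on a comparable scale
    groups : boolean array of length n_subjects (True = case, False = control)
    Returns the observed absolute mean differences and corrected p-values.
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    groups = np.asarray(groups, dtype=bool)

    def abs_mean_diff(labels):
        # Absolute difference in group means, one value per measure.
        return np.abs(x[labels].mean(axis=0) - x[~labels].mean(axis=0))

    observed = abs_mean_diff(groups)

    # Null distribution of the largest statistic across all measures,
    # obtained by shuffling the group labels.
    max_null = np.empty(n_perm)
    for i in range(n_perm):
        max_null[i] = abs_mean_diff(rng.permutation(groups)).max()

    # Corrected p-value: share of permutations whose maximum is at least as
    # large as the observed statistic (the +1 counts the observed labelling).
    exceed = (max_null[None, :] >= observed[:, None]).sum(axis=1)
    p_corrected = (1 + exceed) / (n_perm + 1)
    return observed, p_corrected


# Synthetic example: 40 cases, 40 controls, 5 measures; only measure 0 differs.
rng = np.random.default_rng(1)
data = rng.normal(size=(80, 5))
labels = np.arange(80) < 40
data[labels, 0] += 3.0
observed, p_corrected = max_stat_permutation_test(data, labels, n_perm=500, seed=2)
print(np.round(p_corrected, 3))
```

Comparing every measure against the same null distribution of maxima is what makes the p-values corrected across measures. If measures are on very different scales, a standardised statistic (such as a t-statistic) would be the more usual choice.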
Standardised framework for reporting and evaluating biomarkers: Each project gives summary statistics on effect size, frequency and severity of abnormalities, sensitivity, specificity and, where applicable, cut-offs for dimensional stratification biomarkers. These criteria were identified as a priority for the validation of biomarkers by the European Medicines Agency and follow efforts made to increase consistency in reporting and evaluating case-control studies (see STROBE) and clinical trials (see Consolidated Standards of Reporting Trials, CONSORT [117]).
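The summary statistics listed above can be sketched as follows. This is an illustrative example only: it assumes Cohen's d (with a pooled standard deviation) as the effect size and a single cut-off at or above which a score counts as ‘positive’. The function names and the toy scores are hypothetical and do not come from the LEAP reporting framework.

```python
import numpy as np


def cohens_d(cases, controls):
    """Standardised mean difference using the pooled standard deviation."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    n1, n2 = len(cases), len(controls)
    pooled_var = (
        (n1 - 1) * cases.var(ddof=1) + (n2 - 1) * controls.var(ddof=1)
    ) / (n1 + n2 - 2)
    return float((cases.mean() - controls.mean()) / np.sqrt(pooled_var))


def sensitivity_specificity(cases, controls, cutoff):
    """Scores >= cutoff are classed as positive (assumed direction of effect)."""
    cases = np.asarray(cases, dtype=float)
    controls = np.asarray(controls, dtype=float)
    sensitivity = float(np.mean(cases >= cutoff))    # true positive rate
    specificity = float(np.mean(controls < cutoff))  # true negative rate
    return sensitivity, specificity


# Toy scores for a dimensional measure (hypothetical values).
case_scores = [2, 3, 4, 5]
control_scores = [0, 1, 2, 3]
print(round(cohens_d(case_scores, control_scores), 3))
print(sensitivity_specificity(case_scores, control_scores, cutoff=2.5))
```

Reporting sensitivity and specificity alongside the cut-off that produced them lets readers judge how the trade-off would shift at a different threshold.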
Increased transparency of analyses and findings by depositing a summary of results: EU-AIMS researchers will deposit a summary of results for each registered project upon completion. The aim is to increase transparency of findings from planned analyses, including ‘negative results’, which are both less frequently written up for publication and currently more difficult to publish in peer-reviewed journals than positive results [112].