Classification family | Models used | Built-in sparsifying or other penalization | Under-sampling used | Relevance |
---|---|---|---|---|
Penalized linear regression | Linear regression<br>Lasso<br>Ridge<br>Elastic net<br>Relaxed lasso | L1 penalization (lasso, relaxed lasso)<br>L2 penalization (ridge)<br>L1 + L2 (elastic net) | Yes | ∙ Very interpretable<br>∙ Simple model<br>∙ Linear like ADOS<br>∙ Can use gradation in label (ASD vs spectrum) |
Nearest neighbors | Nearest shrunken centroids | L1 penalization | Yes | ∙ Can identify subgroups within classes, which is likely for our sample<br>∙ Simple model |
General linear models for classification | LDA (L1)<br>Logistic regression (L1, L2) | L1 and L2 penalization | No | ∙ Simple model<br>∙ Interpretable<br>∙ Based on linear assumptions |
Support vector machines | Linear kernel (L1)<br>Polynomial kernel<br>Radial kernel<br>Exponential kernel | L1 penalization (linear kernel)<br>Regularization parameter | No | ∙ Can capture more complex shapes in data when using nonlinear kernels |
Tree-based classifiers | Decision tree<br>Random forest<br>Gradient boosting<br>AdaBoost | Tree depth<br>Number of trees | No | ∙ Performs well on categorical data<br>∙ Better captures feature interactions<br>∙ Tree is interpretable<br>∙ Boosting techniques often give higher accuracy than simpler models |
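A minimal sketch of how representatives of these families could be fit and compared with scikit-learn. The synthetic dataset, the particular models chosen, and all hyperparameter values (`C`, `shrink_threshold`, `n_estimators`) are illustrative assumptions, not the configuration used in this work:

```python
# Hedged sketch: one representative per classification family from the table,
# scored with 5-fold cross-validation on synthetic data (not the study's data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import NearestCentroid
from sklearn.svm import SVC

# Illustrative binary classification problem.
X, y = make_classification(n_samples=200, n_features=20, random_state=0)

models = {
    # L1-penalized logistic regression: sparsifying and interpretable.
    "logreg_l1": LogisticRegression(penalty="l1", solver="liblinear", C=1.0),
    # Nearest shrunken centroids: NearestCentroid with a shrink threshold.
    "shrunken_centroids": NearestCentroid(shrink_threshold=0.5),
    # SVM with a radial kernel; C is the regularization parameter.
    "svm_rbf": SVC(kernel="rbf", C=1.0),
    # Tree-based ensemble; n_estimators sets the number of trees.
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

for name, model in models.items():
    scores = cross_val_score(model, X, y, cv=5)
    print(f"{name}: mean CV accuracy = {scores.mean():.3f}")
```

Under-sampling of the majority class, where the table indicates it, would be applied to the training folds before fitting; it is omitted here to keep the sketch short.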