Topics/material included are:
From over-fitting to apparently complex methods which can nonetheless work well, analysed via the VC dimension and shattering.
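As an illustration of shattering, the sketch below (my own toy example, not from the source) checks whether a family of threshold classifiers on the real line can realise every labelling of a point set; thresholds shatter any single point but no pair, so their VC dimension is 1.

```python
def shatters(points, classifiers):
    """True if the classifier family realises every possible labelling of the points."""
    achieved = {tuple(c(p) for p in points) for c in classifiers}
    return len(achieved) == 2 ** len(points)

# Threshold classifiers h_t(x) = 1[x >= t]; a family with VC dimension 1.
thresholds = [lambda x, t=t: int(x >= t) for t in [-10, -1, 0, 1, 10]]

one_point = shatters([0.5], thresholds)        # both labels achievable
two_points = shatters([0.0, 1.0], thresholds)  # labelling (1, 0) is impossible
```

Any two distinct points defeat thresholds because the labelling "left point positive, right point negative" cannot be produced by a monotone rule.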
PAC bounds. Loss functions. Risk (in the learning theoretic sense) and posterior expected risk. Generalisation error.
Supervised, unsupervised and semi-supervised learning.
The use of distinct training, test and validation sets, particularly in the context of prediction problems.
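A minimal sketch of a three-way split (the function name and fractions are my own, purely illustrative): the model is fitted on the training set, tuned on the validation set, and assessed once on the held-out test set.

```python
import random

def train_val_test_split(data, train_frac=0.6, val_frac=0.2, seed=0):
    """Randomly partition data into disjoint training, validation and test sets."""
    rng = random.Random(seed)
    shuffled = data[:]
    rng.shuffle(shuffled)
    n = len(shuffled)
    n_train = int(train_frac * n)
    n_val = int(val_frac * n)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test

train, val, test = train_val_test_split(list(range(100)))
```

The key point is that the test set is touched only once, after all model selection is finished, so the reported error is an honest estimate of generalisation error.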
The Bootstrap revisited. Bags of Little Bootstraps. Bootstrap aggregation. Boosting.
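A bare-bones bootstrap sketch (my own minimal version, stdlib only): resample with replacement, recompute a statistic on each resample, then aggregate the replicates — averaging them is the essence of bootstrap aggregation, and their spread estimates the standard error.

```python
import random
import statistics

def bootstrap_replicates(sample, stat, n_boot=1000, seed=1):
    """Draw bootstrap resamples (with replacement) and apply a statistic to each."""
    rng = random.Random(seed)
    n = len(sample)
    return [stat([rng.choice(sample) for _ in range(n)]) for _ in range(n_boot)]

data = [2.1, 1.9, 2.4, 2.0, 2.2, 1.8, 2.3, 2.5, 1.7, 2.0]
reps = bootstrap_replicates(data, statistics.mean)
bagged_estimate = statistics.mean(reps)   # bootstrap-aggregated estimate
std_error = statistics.stdev(reps)        # bootstrap estimate of the standard error
```

The Bag of Little Bootstraps applies the same idea to small subsamples of a large dataset and rescales, avoiding full-size resamples.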
Big Data and Big Model – issues and (partial) solutions:
The “curse of dimensionality”. Multiple testing: “voodoo” correlations, the false-discovery rate and the family-wise error rate. Corrections: Bonferroni, Benjamini-Hochberg.
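Both corrections can be stated in a few lines; the sketch below (my own implementation, with made-up p-values) contrasts them. Bonferroni controls the family-wise error rate by comparing each p-value to alpha/m; Benjamini-Hochberg controls the false-discovery rate with a step-up rule and is typically less conservative.

```python
def bonferroni(p_values, alpha=0.05):
    """Reject H_i iff p_i <= alpha/m; controls the family-wise error rate."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]

def benjamini_hochberg(p_values, alpha=0.05):
    """BH step-up procedure; controls the false-discovery rate at level alpha."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    k = 0  # largest rank whose p-value clears its BH threshold rank*alpha/m
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank * alpha / m:
            k = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= k:
            rejected[i] = True
    return rejected

pvals = [0.01, 0.013, 0.014, 0.19, 0.35, 0.5, 0.63, 0.67, 0.75, 0.81]
n_bonf = sum(bonferroni(pvals))          # rejects nothing here
n_bh = sum(benjamini_hochberg(pvals))    # rejects the three smallest
```

On these p-values Bonferroni's threshold 0.05/10 = 0.005 rejects nothing, while BH rejects the three smallest — a concrete instance of FDR control buying power over FWER control.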
Sparsity and regularisation. Variable selection in regression. Spike-and-slab priors.
Inequalities (Hoeffding, Chernoff bounds); concentration inequalities.
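Hoeffding's inequality states that for i.i.d. [0,1]-valued variables, P(|sample mean - mu| >= t) <= 2 exp(-2 n t^2). A quick Monte Carlo sketch (my own check, stdlib only) compares the bound with the actual tail probability for Bernoulli(1/2) samples; the bound holds but is loose.

```python
import math
import random

def hoeffding_bound(n, t):
    """Hoeffding: P(|mean - mu| >= t) <= 2 exp(-2 n t^2) for [0,1]-valued iid."""
    return 2 * math.exp(-2 * n * t * t)

def empirical_tail(n, t, mu=0.5, trials=2000, seed=3):
    """Monte Carlo estimate of P(|sample mean - mu| >= t) for Bernoulli(mu)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mean = sum(rng.random() < mu for _ in range(n)) / n
        if abs(mean - mu) >= t:
            hits += 1
    return hits / trials

bound = hoeffding_bound(100, 0.1)   # 2*exp(-2), about 0.27
freq = empirical_tail(100, 0.1)     # noticeably smaller in practice
```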
Ridge Regression. The Lasso. The Dantzig Selector.
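In the single-predictor case both penalised estimators have closed forms, which the sketch below (my own one-dimensional illustration, not a general solver) implements: the ridge penalty shrinks the OLS coefficient smoothly, while the lasso's soft-thresholding can set it exactly to zero — the mechanism behind lasso variable selection.

```python
def ridge_1d(x, y, lam):
    """Minimise sum((y - b*x)^2) + lam * b^2; closed form for one predictor."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    return sxy / (sxx + lam)

def lasso_1d(x, y, lam):
    """Minimise 0.5*sum((y - b*x)^2) + lam*|b|; soft-thresholding solution."""
    sxy = sum(xi * yi for xi, yi in zip(x, y))
    sxx = sum(xi * xi for xi in x)
    mag = max(abs(sxy) - lam, 0.0)   # soft-threshold: shrink towards, and possibly to, zero
    return (mag if sxy >= 0 else -mag) / sxx

x = [-2.0, -1.0, 0.0, 1.0, 2.0]
y = [-4.1, -1.9, 0.1, 2.0, 3.9]      # roughly y = 2x
b_ols = ridge_1d(x, y, 0.0)          # ordinary least squares, about 1.99
b_ridge = ridge_1d(x, y, 10.0)       # shrunk, but never exactly zero
b_lasso = lasso_1d(x, y, 25.0)       # large penalty: exactly zero
```

The Dantzig selector pursues the same sparsity goal but constrains the maximum correlation of residuals with the predictors rather than penalising the coefficients directly.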
Concentration of measure and related inferential issues.
Metropolis-Hastings; Gibbs sampling.
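A minimal random-walk Metropolis-Hastings sketch (my own toy, stdlib only): with a symmetric Gaussian proposal the acceptance probability reduces to min(1, pi(x')/pi(x)), and only an unnormalised log-density of the target is needed.

```python
import math
import random

def metropolis_hastings(log_target, x0, n_steps, step=1.0, seed=4):
    """Random-walk Metropolis-Hastings with a symmetric Gaussian proposal."""
    rng = random.Random(seed)
    x = x0
    samples = []
    for _ in range(n_steps):
        prop = x + rng.gauss(0.0, step)
        # Symmetric proposal: accept with probability min(1, pi(prop)/pi(x)).
        if math.log(rng.random()) < log_target(prop) - log_target(x):
            x = prop
        samples.append(x)
    return samples

# Target: standard normal; an unnormalised log-density suffices.
samples = metropolis_hastings(lambda z: -0.5 * z * z, 0.0, 20000)
mh_mean = sum(samples) / len(samples)
mh_var = sum((s - mh_mean) ** 2 for s in samples) / len(samples)
```

Gibbs sampling replaces the accept/reject step by drawing each coordinate exactly from its full conditional distribution, so every move is accepted.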
MCMC in high dimensions: preconditioned Crank-Nicolson (pCN), MALA, HMC. Preconditioning. Rates of convergence.
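The pCN idea can be shown in one dimension (my own sketch; the single-observation likelihood is a made-up example): the proposal x' = sqrt(1 - beta^2) x + beta xi leaves the N(0,1) prior invariant, so the acceptance ratio involves only the likelihood — the property that makes pCN robust as the dimension of the prior grows.

```python
import math
import random

def pcn_sampler(log_likelihood, n_steps, beta=0.3, seed=5):
    """Preconditioned Crank-Nicolson MCMC for a N(0,1) prior, in one dimension.

    The proposal preserves the prior exactly, so the Metropolis-Hastings
    acceptance probability is min(1, exp(loglik(prop) - loglik(x)))."""
    rng = random.Random(seed)
    x = 0.0
    samples = []
    for _ in range(n_steps):
        prop = math.sqrt(1 - beta * beta) * x + beta * rng.gauss(0.0, 1.0)
        if math.log(rng.random()) < log_likelihood(prop) - log_likelihood(x):
            x = prop
        samples.append(x)
    return samples

# Hypothetical likelihood: one observation y = 1 with unit noise, so the
# posterior is N(0.5, 0.5).
samples = pcn_sampler(lambda x: -0.5 * (x - 1.0) ** 2, 20000)
post_mean = sum(samples) / len(samples)
```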