Statistical Data Analysis - Course program


Preliminary version of the course program. As the course progresses, the font color changes from gray to black. 

lesson
date
lesson topics
total time

Part 1: Probability theory and probability models


1
3/10/2017
Introduction to the course. Review of the basic concepts of probability. Basic combinatorics.
2
2
5/10/2017
Set-theoretic representation of the probability space. Addition law for probabilities of mutually exclusive (incompatible) events. Addition law for probabilities of non-mutually exclusive events. Conditional probabilities. Bayes' theorem.
4
3
17/10/2017
Independent events. Statistical independence and dimensional reduction. Overview of random walk models in physics. Conditional probabilities in stochastic modeling: gambler's ruin and probabilistic diffusion models. Discrete and continuous forms of gambler's ruin.
6
4
19/10/2017
Statement of the first and second Borel-Cantelli lemmas.
The transition from sample space to random variables. Review of basic concepts and definitions on discrete and continuous random variables. Uniform distribution. Buffon's needle.
Review of dispersion (variance) and its properties. Chebyshev's inequality.
8
5
24/10/2017
From Chebyshev's inequality to the weak law of large numbers. Other inequalities in probability theory: Markov's inequality, generalized Chebyshev's inequality. Strong law of large numbers.
Moment generating function. Brief review of common probability models: 1. the uniform distribution; 2. the Bernoulli distribution; 3. the binomial distribution.
10
6
26/10/2017
Distributions (ctd): 4. the geometric distribution; 5. the negative binomial distribution; 6. the hypergeometric distribution; 7. the multinomial distribution.
12
7
31/10/2017
Distributions (ctd): 8. the Poisson distribution; memoryless random processes. Examples: lottery tickets and Poisson statistics; cell plating statistics in biology. 9. The exponential distribution. Example: paralyzable and non-paralyzable detectors. 10. The De Moivre-Laplace theorem and the normal distribution.
14
9
7/11/2017
Distributions (ctd): Properties of the normal distribution. Transformations of random variables. Sum of random variables (convolution). Functions of random variables. Approximate transformation of random variables: error propagation. Linear (orthogonal) transformation of random variables. The multivariate normal distribution.
16
10
9/11/2017
Distributions (ctd): Other important distributions (lognormal, gamma, beta, Rayleigh, logistic, Laplace, Cauchy). Example of a complex model used to set up a null hypothesis: the distribution of nearest-neighbor distance.
18
11
13/11/2017
Distribution of nearest-neighbor distance (ctd.). Bertrand's paradox. Overview of Jaynes' solution of Bertrand's paradox. Generating functions.
20
12
14/11/2017
Introduction to probability generating and characteristic functions. Generating and characteristic functions of some common distributions. PGF of the Galton-Watson branching process.
22
13
16/11/2017
Photomultiplier noise. Poisson distribution as limiting case of a binomial distribution. Moments of a distribution. Skewness and kurtosis. Mode and median. Properties of characteristic functions. The Central Limit Theorem (CLT).  The Berry-Esseen theorem. Additive and multiplicative processes.
24

Part 2: Statistical inference


14
21/11/2017
Descriptive and exploratory statistics. Sample mean, sample variance, estimate of covariance and correlation coefficient. PDF of sample mean for exponentially distributed data.
26
15
23/11/2017
Statistical estimators in the German Tank Problem. Limitations of the standard deviation as a descriptor of the width of a distribution. Shannon entropy and information concentration in PDFs. Box plots. Outliers. Violin plots. Rug plots. Kernel density plots.
Principal Component Analysis (PCA).
28
15
28/11/2017
Principal Component Analysis with the use of Singular Value Decomposition (SVD). Face recognition as an example of PCA and SVD.
Introduction to the Monte Carlo method. Early history of the Monte Carlo method. Pseudorandom numbers.
30
16
5/12/2017
Uniformly distributed pseudorandom numbers. Transformation method. Transformation method and the transformation of differential cross sections. Acceptance-rejection method. Example: one-dimensional diffusion process with absorption.
Monte Carlo method examples: generation of angles in e+e- -> mu+mu- scattering; generation of angles in Bhabha scattering. The structure of a complete MC program to simulate low-energy electron transport.
32
17
7/12/2017
The structure of a complete MC program to simulate low-energy electron transport (ctd.). 
Statistical bootstrap.
34
18
12/12/2017
(11-13)
Introduction to Bayesian methods. Example of Bayesian parametric estimation: estimate of the probability in a binomial model (duality with the Beta distribution, considerations on different priors, importance of a wise choice of prior, information embedded in the prior distribution and effect of data, Bernstein-von Mises theorem) (link to presentation)
36
19
12/12/2017
(14-16)
Maximum likelihood method 1. Connection with Bayes' theorem. Point estimators. Properties of estimators. Consistency of the maximum likelihood estimators. Asymptotic optimality of ML estimators. Bartlett's identities. Cramér-Rao-Fisher bound. Variance of ML estimators.
38
20
14/12/2017
Maximum likelihood method 2. Efficiency and Gaussianity of ML estimators. Introduction to Shannon's entropy. Information measures based on Shannon's entropy: Kullback-Leibler divergence, Jeffreys distance, Fisher information.
40
21
19/12/2017
(9-11)
Maximum likelihood method 3. Non-uniqueness of the likelihood function. Introduction to confidence intervals. Confidence intervals and confidence level. Detailed analysis of the Neyman construction of confidence intervals (link to the Neyman paper). Confidence intervals for the sample mean of exponentially distributed samples. Confidence intervals for the correlation coefficient of a bivariate Gaussian distribution from MC simulation. More properties of the likelihood function. Graphical method for the variance of ML estimators. Example with two counting channels. 
42
22
20/12/2017
(9-11)
Maximum likelihood method 4. Extended maximum likelihood. Examples. Introduction to ML with binned data.  Example of ML with binned data: decay rate in radioactive decay. Chi-square and its relation to ML. Very brief overview of chi-square and least squares fits. Chi-square distribution.
44
23
20/12/2017
(14-16)
Hypothesis tests, significance level. Examples. Critical region and acceptance region. Errors of the first and of the second kind. p-value and rejection of the null hypothesis. Chi-square as a test statistic. Neyman-Pearson lemma. Confidence intervals and the Feldman-Cousins construction (link to the FC paper). CLb, CLs+b and CLs.
46
24
11/01/2018
Final seminar by Diego Tonelli: Statistical Methods for the Large Hadron Collider (link to slides).
48

 Edoardo Milotti - Dec. 2017