Statistical Data Analysis - Course program


Preliminary version of the course program. As the course progresses, the font color changes from gray to black. 

Columns: lesson, date, lesson topics, total time (hours)

Part 1: Basics of probability theory and probability models


1
29/9/2015
Introduction to the course. Review of the basic concepts of probability. Basic combinatorics. Stirling's Approximation. Set theory representation of probability space. Addition law for probabilities of incompatible events. 2
2
1/10/2015
Addition law for probabilities of non-mutually exclusive events. Conditional probabilities. Bayes' theorem. Independent events. Conditional probabilities in stochastic modeling: gambler's ruin and probabilistic diffusion models (1). 4
3
6/10/2015
Conditional probabilities in stochastic modeling: gambler's ruin and probabilistic diffusion models (2). Bertrand's Paradox (1).
First Borel-Cantelli lemma. Second Borel-Cantelli lemma.
6
4
8/10/2015
The transition from sample space to random variables.
Review of basic concepts and definitions on discrete and continuous random variables. Uniform distribution. Buffon's needle. Transformations of random variables. Brief review of mathematical expectation, dispersion (variance), and of the properties of expectation and variance.
8
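As an illustration of the Buffon's needle topic listed for this lesson, the following is a minimal sketch (not course material) that estimates pi by simulated needle throws. The function name and parameters are assumptions made for this example, and the needle is assumed to be no longer than the line spacing.

```python
import math
import random

def buffon_pi_estimate(n_throws, needle_length=1.0, line_spacing=1.0, seed=0):
    """Estimate pi by dropping needles on a floor ruled with parallel lines.

    Assumes needle_length <= line_spacing, so that the crossing
    probability is 2 * needle_length / (pi * line_spacing).
    """
    rng = random.Random(seed)
    crossings = 0
    for _ in range(n_throws):
        # Distance from the needle centre to the nearest line.
        y = rng.uniform(0.0, line_spacing / 2.0)
        # Acute angle between the needle and the lines.
        theta = rng.uniform(0.0, math.pi / 2.0)
        if y <= (needle_length / 2.0) * math.sin(theta):
            crossings += 1
    if crossings == 0:
        raise ValueError("no crossings observed; increase n_throws")
    # Invert p = 2 L / (pi d) with p estimated by crossings / n_throws.
    return 2.0 * needle_length * n_throws / (line_spacing * crossings)

if __name__ == "__main__":
    print(buffon_pi_estimate(200_000))
```

Sampling the angle uniformly already uses the value of pi, so this is a demonstration of the crossing probability rather than an independent computation of pi.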
5
13/10/2015
Chebyshev's inequality. From Chebyshev's inequality to the weak law of large numbers. Other inequalities in probability theory: Markov's inequality, generalized Chebyshev's inequality. Strong law of large numbers. Chernoff bound. 10
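Chebyshev's inequality states that P(|X - mu| >= k sigma) <= 1/k^2 for any distribution with finite variance. The sketch below, with a hypothetical helper name, compares the observed tail fraction of a sample with this bound; it uses the sample mean and the population-style standard deviation of the sample, so the bound holds exactly for the empirical distribution.

```python
import random
import statistics

def chebyshev_check(samples, k):
    """Return (observed tail fraction, Chebyshev bound 1/k^2)."""
    mu = statistics.fmean(samples)
    sigma = statistics.pstdev(samples)
    # Fraction of points at least k standard deviations from the mean.
    tail = sum(1 for x in samples if abs(x - mu) >= k * sigma)
    return tail / len(samples), 1.0 / k**2

if __name__ == "__main__":
    rng = random.Random(0)
    # Exponential data: a skewed distribution chosen only for illustration.
    data = [rng.expovariate(1.0) for _ in range(10_000)]
    for k in (1.5, 2.0, 3.0):
        observed, bound = chebyshev_check(data, k)
        print(f"k={k}: observed {observed:.4f} <= bound {bound:.4f}")
```

For most distributions the observed fraction is far below the bound, which is why sharper results such as the Chernoff bound are of interest.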
6
20/10/2015
Moment generating function. Bernoulli (indicator) random variables. Proof of the Chernoff bound with applications (ctd.).
12
7
22/10/2015
Brief review of common probability models. 1. The uniform distribution; 2. the Bernoulli distribution; 3. the binomial distribution; 4. the multinomial distribution; 5. the Poisson distribution; memoryless random processes. Examples: lottery tickets and Poisson statistics; cell plating statistics in biology.
14
8
27/10/2015
Distributions (ctd): 6. the exponential distribution. Example: paralyzable and non-paralyzable detectors; 7. the De Moivre-Laplace theorem and the Gaussian distribution. 16
9
29/10/2015
Distributions (ctd): 8. the multivariate normal distribution. Transformations of random variables. Sum of random variables (convolution). Product of two random variables (Mellin's convolution). Functions of random variables. Approximate transformation of random variables: error propagation. Linear (orthogonal) transformation of random variables. 18
10-11
6/11/2015
(make-up session, room B, 9-11 and 11-13)
Distributions (ctd): other important distributions. Example: the distribution of nearest-neighbor distance.
Jaynes' solution of Bertrand's paradox. The nature of randomness. Randomness in a coin toss. Bell's inequalities and quantum probabilities.
Introduction to generating functions: examples. 
22
12
10/11/2015
Probability generating functions (PGF). PGF of the Poisson distribution. PGF of uniform and binomial distributions. Poisson distribution as limiting case of a binomial distribution. 

PGF of the Galton-Watson branching process. Photomultiplier noise.
24
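The Poisson limit of the binomial distribution mentioned in this lesson can be checked numerically. The sketch below (helper names are assumptions for this example) fixes the mean lam = n p and shows that the largest pointwise difference between the two probability mass functions shrinks as n grows.

```python
import math

def binomial_pmf(k, n, p):
    """P(K = k) for a binomial distribution with n trials."""
    return math.comb(n, k) * p**k * (1.0 - p) ** (n - k)

def poisson_pmf(k, lam):
    """P(K = k) for a Poisson distribution with mean lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)

def max_pmf_difference(n, lam, k_max=20):
    """Largest |binomial - Poisson| over k = 0..k_max, with p = lam / n."""
    p = lam / n
    return max(
        abs(binomial_pmf(k, n, p) - poisson_pmf(k, lam))
        for k in range(k_max + 1)
    )

if __name__ == "__main__":
    for n in (10, 100, 1000, 10_000):
        print(n, max_pmf_difference(n, lam=3.0))
```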
13
12/11/2015
Photomultiplier noise (ctd.).
Characteristic functions. Moments of a distribution. Skewness and kurtosis. Mode and median. Properties of characteristic functions.
The Central Limit Theorem (CLT).  The Berry-Esseen theorem. Additive and multiplicative processes.
26

Part 2: Introduction to statistical inference


14 17/11/2015 Descriptive statistics. Sample mean, sample variance, estimate of covariance and correlation coefficient. Statistics of sample mean for exponentially distributed data.
Example of synthesis of nonparametric estimators in the "German Tank Problem". Order statistics.
28
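For the "German Tank Problem", a commonly quoted estimator of the total number N, given k serial numbers drawn without replacement and their maximum m, is m + m/k - 1. The sketch below uses this textbook estimator as an assumption; the estimator derived in the lesson may differ. The simulation helper is hypothetical and only checks that the estimate is unbiased on average.

```python
import random

def german_tank_estimate(serials):
    """Estimate the population size from observed serial numbers.

    Uses m + m/k - 1, where m is the largest serial and k the sample size.
    """
    k = len(serials)
    m = max(serials)
    return m + m / k - 1

def mean_estimate(n_total, k, n_trials, seed=0):
    """Average the estimator over repeated samples drawn without replacement."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        sample = rng.sample(range(1, n_total + 1), k)
        total += german_tank_estimate(sample)
    return total / n_trials

if __name__ == "__main__":
    print(german_tank_estimate([19, 40, 42, 60]))
    print(mean_estimate(n_total=1000, k=10, n_trials=20_000))
```

The estimator depends on the data only through the sample maximum, which is the link to order statistics listed for the same lesson.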
15
19/11/2015
Limitations of the standard deviation as a descriptor of the width of a distribution.
The Allan variance for noise processes with infrared divergences. Shannon entropy and information concentration in pdfs.
Introduction to the Monte Carlo method. Early history of the Monte Carlo method.
30
16
24/11/2015 Pseudorandom numbers. Uniformly distributed pseudorandom numbers. Transformation method. Acceptance-rejection method. Examples: generation of angles in e+e- -> mu+mu- scattering; generation of angles in Bhabha scattering.
32
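As a sketch of the acceptance-rejection method for the e+e- -> mu+mu- example, the code below assumes the lowest-order, high-energy angular distribution proportional to 1 + cos^2(theta) and samples x = cos(theta) on [-1, 1] under a flat envelope of height 2. The function names are assumptions made for this example, not the course's program.

```python
import random

def sample_cos_theta(rng):
    """Draw x = cos(theta) with density proportional to 1 + x^2 on [-1, 1]."""
    while True:
        x = rng.uniform(-1.0, 1.0)   # candidate from the flat proposal
        u = rng.uniform(0.0, 2.0)    # envelope height is max(1 + x^2) = 2
        if u <= 1.0 + x * x:         # accept with probability (1 + x^2) / 2
            return x

def sample_many(n, seed=0):
    """Return n accepted values of cos(theta)."""
    rng = random.Random(seed)
    return [sample_cos_theta(rng) for _ in range(n)]

if __name__ == "__main__":
    xs = sample_many(100_000)
    # For the normalized density (3/8)(1 + x^2) the exact value is 2/5.
    print(sum(x * x for x in xs) / len(xs))
```

With this envelope the acceptance efficiency is 2/3, the ratio of the area under 1 + x^2 to the area of the bounding rectangle.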
17
26/11/2015 The structure of a complete MC program to simulate low-energy electron scattering. 
Transformation method and the transformation of differential cross sections.
Statistical bootstrap.
34
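The statistical bootstrap can be sketched in a few lines: resample the data with replacement, recompute the statistic on each resample, and take the spread of the replicates as an estimate of its standard error. The function name and defaults below are assumptions made for this example.

```python
import random
import statistics

def bootstrap_standard_error(data, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by nonparametric bootstrap."""
    rng = random.Random(seed)
    n = len(data)
    replicates = []
    for _ in range(n_resamples):
        # Draw n points from the data with replacement.
        resample = [data[rng.randrange(n)] for _ in range(n)]
        replicates.append(statistic(resample))
    return statistics.stdev(replicates)

if __name__ == "__main__":
    rng = random.Random(1)
    data = [rng.expovariate(1.0) for _ in range(200)]
    print("bootstrap SE of the mean:  ", bootstrap_standard_error(data, statistics.fmean))
    print("analytic estimate:         ", statistics.pstdev(data) / len(data) ** 0.5)
    print("bootstrap SE of the median:", bootstrap_standard_error(data, statistics.median))
```

For the sample mean the bootstrap result can be compared with the familiar sigma/sqrt(n) estimate; for statistics such as the median, where no simple formula is available, the same function applies unchanged.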
18
1/12/2015 Statistical bootstrap (ctd.).
Maximum likelihood method 1. Point estimators. Connection with Bayes' theorem.
36
19
3/12/2015
Maximum likelihood method 2. Properties of estimators. Consistency of the maximum likelihood estimators. Asymptotic optimality of ML estimators. Maximum likelihood method 3. Bartlett's Identities. Cramer-Rao-Fisher bound. Variance of ML estimators. Introduction to Shannon's entropy. 38
20-21
11/12/2015
(make-up session, room B, 9-11 and 11-13)
Maximum likelihood method 4. Information measures based on Shannon's entropy: Kullback-Leibler divergence, Jeffreys distance, Fisher information. Introduction to confidence intervals.
Maximum likelihood method 5. Confidence intervals and confidence level. Confidence intervals for the sample mean of exponentially distributed samples. Confidence intervals for the correlation coefficient of a bivariate Gaussian distribution from MC simulation. Graphical method for the variance of ML estimators. Maximum likelihood method 6. Extended maximum likelihood. Examples. Introduction to ML with binned data. Example with two channels. Other examples of ML with binned data: decay rate in radioactive decay; exponent of a power law.
42
22
15/12/2015 Very brief overview of chi-square and least squares fits, chi-square distribution, weighted straight line fits, general least squares fits, least squares fitting of binned data, and nonlinear least squares. Fit quality and dimension of parameter space. Chi-square and chi-square tests. 44
23
17/12/2015
Hypothesis tests, significance level. Examples. Critical region and acceptance region. Errors of the first and of the second kind. p-value and rejection of the null hypothesis. Chi-square as a test statistic. Neyman-Pearson lemma. Significance of a signal. Detailed analysis of the statistical significance of a peak in spectral estimation. 46
24
21/12/2015
(make-up session, room C, 15-17)
Detailed analysis of the Neyman construction of confidence intervals (link to the Neyman paper). Confidence intervals and the Feldman-Cousins construction (link to the FC paper, link to the presentation).   48
25
22/12/2015
"Statistical topics at the LHC", talk given by Dr. D. Tonelli (link to the presentation).
50

 Edoardo Milotti - December 2015