Statistical Data Analysis - Course program


Preliminary version of the course program. As the course progresses, the font color changes from gray to black. 

lesson
date
lesson topics
total time

Part 1: Basics of probability theory and probability models


1
14/10/2014
Introduction to the course. Review of the basic concepts of probability. Basic combinatorics. Stirling's approximation.
2
2
16/10/2014
Set-theoretic representation of the probability space. Addition law for probabilities of incompatible events. Addition law for probabilities of non-mutually exclusive events. Conditional probabilities. Bayes' theorem. Independent events. Conditional probabilities in stochastic modeling: gambler's ruin and probabilistic diffusion models (1).
4
3
20/10/2014
Conditional probabilities in stochastic modeling: gambler's ruin and probabilistic diffusion models (2). Bertrand's Paradox (1).
First Borel-Cantelli lemma.
6
4
23/10/2014
Second Borel-Cantelli lemma.
The transition from sample space to random variables.
Review of basic concepts and definitions on discrete and continuous random variables. Uniform distribution. Buffon's needle. Transformations of random variables. Brief review of mathematical expectation, dispersion (variance), and of the properties of expectation and variance. Chebyshev's inequality.
8
5
27/10/2014
From Chebyshev's inequality to the weak law of large numbers. Other inequalities in probability theory: Markov's inequality, generalized Chebyshev's inequality. Strong law of large numbers. Chernoff bound.
10
6
29/10/2014
Very brief review of common probability models. Examples: 1. the uniform distribution; 2. the Bernoulli distribution; 3. the binomial distribution; 4. the multinomial distribution; 5. the Poisson distribution. Memoryless random processes. Examples: lottery tickets and Poisson statistics; cell plating statistics in biology.
12
7
5/11/2014
Distributions (ctd): 6. the exponential distribution. Example: paralyzable and non-paralyzable detectors; 7. the De Moivre-Laplace theorem and the Gaussian distribution; 8. the multivariate normal distribution.
14
8
6/11/2014
(make-up lesson, room B)
Transformations of random variables. Sum of random variables (convolution). Product of two random variables (Mellin's convolution). Functions of random variables. Approximate transformation of random variables: error propagation. Linear (orthogonal) transformation of random variables.
Other important distributions.
16
9
7/11/2014
(make-up lesson, room B)
Jaynes' solution of Bertrand's paradox. The nature of randomness. Randomness in a coin toss. Bell's inequalities.
18
10
10/11/2014
Introduction to generating functions: examples. Probability generating functions (PGF). PGF of the Poisson distribution. PGF of uniform and binomial distributions. Poisson distribution as limiting case of a binomial distribution. 
20
11
12/11/2014
PGF of the Galton-Watson branching process. Photomultiplier noise.
Characteristic functions. Moments of a distribution. Skewness and kurtosis. Mode and median. Properties of characteristic functions.
The Central Limit Theorem (CLT). 
22
12
19/11/2014
The Berry-Esseen theorem. Additive and multiplicative processes. Stable distributions.
Introduction to discrete-time stochastic processes. Markov chains. Transient and persistent states in Markov chains.
24
13
20/11/2014
(make-up lesson, room B)
Transient and persistent states in higher-dimensional random walks.
Invariant distribution (1).
26
14
24/11/2014
Invariant distribution (2). Time reversal and detailed balance. Continuous-time Markov processes. Boltzmann's H-theorem. Hidden Markov models.
28

Part 2: Introduction to statistical inference


15
26/11/2014
Descriptive statistics. Sample mean, sample variance, estimate of covariance and correlation coefficient. Statistics of sample mean for exponentially distributed data.
Example of synthesis of nonparametric estimators in the "German Tank Problem".
30
16
1/12/2014
Order statistics. Limitations of the standard deviation as a descriptor of the width of a distribution.
The Allan variance for noise processes with infrared divergences. Shannon entropy and information concentration in pdfs.
Introduction to the Monte Carlo method.
32
17
3/12/2014
Pseudorandom numbers. Uniformly distributed pseudorandom numbers. Transformation method. Acceptance-rejection method. Examples: generation of angles in e+e- -> mu+mu- scattering; generation of angles in Bhabha scattering.
34
18
10/12/2014
The structure of a complete MC program to simulate low-energy electron scattering.
Transformation method and the transformation of differential cross sections. Early history of the Monte Carlo method.
Statistical bootstrap.
36
19
11/12/2014
(make-up lesson, room B)
Maximum likelihood method 1. Point estimators. Connection with Bayes' theorem.
37
20
16/12/2014
(make-up lesson, room B, building A)
Maximum likelihood method 2. Properties of estimators. Consistency of maximum likelihood estimators. Asymptotic optimality of ML estimators.
39
21
17/12/2014
Maximum likelihood method 3. Bartlett's identities. Cramér-Rao-Fisher bound. Variance of ML estimators. Introduction to Shannon's entropy.
41
22
7/1/2015
Maximum likelihood method 4. Information measures based on Shannon's entropy: Kullback-Leibler divergence, Jaynes distance, Fisher information. Introduction to confidence intervals.
43
23
8/1/2015
(make-up lesson)
Maximum likelihood method 5. Confidence intervals and confidence level. Confidence intervals for the sample mean of exponentially distributed samples. Confidence intervals for the correlation coefficient of a bivariate Gaussian distribution from MC simulation. Graphical method for the variance of ML estimators.
45
24
9/1/2015
(make-up lesson)
Maximum likelihood method 6. Extended maximum likelihood. Examples. Introduction to ML with binned data. Example with two channels. Other examples of ML with binned data: decay rate in radioactive decay; exponent of a power law.
Very brief overview of chi-square and least squares fits, chi-square distribution, weighted straight line fits, general least squares fits, least squares fitting of binned data, and nonlinear least squares. Fit quality and dimension of parameter space. 
47
25
12/1/2015
Maximum likelihood method 7. Hypothesis tests, significance level. Examples. Critical region. Construction of test statistics. Neyman-Pearson lemma. Chi-square test. Significance of a signal. Detailed analysis of the statistical significance of a peak in spectral estimation. The Feldman-Cousins construction. Brief introduction to the test statistics used by the LHC experiments.
49

 Edoardo Milotti - January 2015