2024-2025 / MATH2021-1

High-dimensional statistics

Duration

30h Th, 15h Pr, 30h Proj.

Number of credits

Master MSc. in Data Science, professional focus: 5 credits
Master MSc. in Data Science and Engineering, professional focus: 5 credits

Lecturer

Gentiane Haesbroeck

Language(s) of instruction

English language

Organisation and examination

Teaching in the first semester, review in January

Schedule

Schedule online

Prerequisite and corequisite units

Prerequisite or corequisite units are presented within each program

Learning unit contents

The course is devoted to the following themes:

- Exploratory data analysis
- Dimension reduction techniques: Principal Component Analysis, t-SNE
- Multivariate estimation, with a particular emphasis on the estimation of the covariance matrix (classic technique under normality, penalized version and robust version)
- Multiple regression and generalized linear models (e.g. Poisson and logistic models)
- Independent Component Analysis

Learning outcomes of the learning unit

Students will gain enough knowledge to select the appropriate multivariate statistical technique, whether to reduce the dimension of a problem or to build a linear or nonlinear model that explains a dependent variable by means of explanatory variables.

Prerequisite knowledge and skills

A strong background in univariate statistics is required. Moreover, even though the mathematical justifications are not developed in detail, students must be familiar with the basic notions of linear algebra (vectors, matrices, determinants, eigenvalues and eigenvectors, etc.).

Planned learning activities and teaching methods

The theory is presented in ex cathedra lectures. After the theory lectures, the techniques available in the R software (which is compulsory for this course) are briefly illustrated. Students are then invited to apply them at home to concrete practical situations. The results and interpretation of these personal analyses are regularly discussed during the following lectures.

 

Mode of delivery (face to face, distance learning, hybrid learning)

Blended learning


Further information:

Additional information:

 

The course is in principle scheduled face-to-face, but some lectures might, exceptionally, be given as videos (the information will be available in Celcat).

The practicals are done at home.

Course materials and recommended or required readings

There are no lecture notes. The slides will be available on eCampus. In addition, a reference book will be indicated for each theme as suggested additional reading.
 

Exam(s) in session

Any session

- In-person

Written exam (open-ended questions)

Written work / report

Out-of-session test(s)


Further information:

Exam(s) in session

Any session

- In-person

Written exam (open-ended questions, with access to the R software)

Written work / report

A report based on a data analysis must be handed in and presented to the professor and her assistant before the winter break.


Additional information:

The final grade is a weighted mean of the grades obtained for:

- the preparation and presentation of a personal project: the release date of the project statement, as well as the deadlines for submitting and presenting the projects, will be announced in Celcat.

- the written exam, which consists of data analyses and detailed, explained applications of the techniques taught in the lectures.

When both grades are greater than or equal to 6/20, the weighted average uses equal weights (50%-50%). If at least one of the grades is below 6/20, the weights become 25%-75%, with the larger weight (75%) assigned to the lower grade.
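As an illustration only, the weighting rule above can be sketched in a few lines of Python. The function name `final_grade` and its parameters are hypothetical and are not part of the course material. The sketch assumes both grades are out of 20 and applies no rounding, since the rounding rule is not stated here.

```python
def final_grade(project: float, exam: float, threshold: float = 6.0) -> float:
    """Weighted mean of the project and exam grades (both out of 20).

    Hypothetical helper: a sketch of the rule described in the text,
    not an official grading tool. No rounding is applied.
    """
    if project >= threshold and exam >= threshold:
        # Both grades reach the threshold: equal weights (50%-50%).
        return 0.5 * project + 0.5 * exam
    # At least one grade is below the threshold:
    # 75% goes to the lower grade, 25% to the higher one.
    low, high = min(project, exam), max(project, exam)
    return 0.75 * low + 0.25 * high


print(final_grade(14, 10))  # equal weights: 12.0
print(final_grade(16, 4))   # 0.75 * 4 + 0.25 * 16 = 7.0
```

In the second call, a very low exam grade pulls the final grade down to 7/20, whereas equal weights would have given 10/20.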

Work placement(s)

Organisational remarks and main changes to the course

The lectures are taught in English.

The lecture room has no podcast equipment by default, so lectures given face-to-face will not be available in any other form.

Contacts

Lecturer: Gentiane HAESBROECK, Institute of Mathematics (B37), g.haesbroeck@uliege.be

Association of one or more MOOCs