Final CRoNoS Spring Course
The Spring Course will consist of a series of tutorials in representative areas of CRoNoS (Computationally-Intensive methods for the robust analysis of non-standard data).

Dates: 14-16 April 2019
Venue: Poseidonia Beach Hotel, Limassol, Cyprus.
Speakers:
Stefan Van Aelst, KU Leuven, Belgium.
Tim Verdonck, KU Leuven, Belgium.
Karel Hron, Palacký University, Czech Republic.
Alastair Young, Imperial College, UK.
Peter Winker, University of Giessen, Germany.
Zlatko Drmac, University of Zagreb, Croatia.
Simon Caton, National College of Ireland, Ireland.
Ivette Gomes, Universidade de Lisboa, Portugal.
Vladimir Batagelj, University of Ljubljana, Slovenia.
Daniela Zaharie, West University of Timisoara, Romania.

Grants
PhD students and Early Career Investigators (who have obtained their PhD degree in 2010 or after) from eligible COST countries* can apply for a limited number of grants. The granted participants will be reimbursed up to 600 Euro for accommodation and travelling plus the (standard) registration fee.
  • To apply for a grant, candidates should submit their CV by e-mail to cronos.cost@gmail.com.
  • Deadline for applications: 8th January 2019.
  • Granted candidates will be informed by e-mail after the deadline and must register within 7 days of the notification, by e-mail to cronos.cost@gmail.com, to secure their grants. Otherwise, their grants will be revoked and assigned to other candidates.
  • The granted candidates must attend all the sessions of the Spring Course and sign the attendance list in order to obtain their grants.
*Eligible COST countries: Austria, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Luxembourg, Malta, Montenegro, The Netherlands, Norway, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey, United Kingdom and the former Yugoslav Republic of Macedonia.
Robust high-dimensional data analysis (4 hours)

Stefan Van Aelst, KU Leuven, Belgium, and Tim Verdonck, KU Leuven, Belgium.

TBA.

Applied compositional data analysis (3.5 hours)

Karel Hron, Palacký University, Czech Republic.

Compositional data are multivariate observations that carry relative information. They are measured in units like proportions, percentages, mg/l, mg/kg, ppm, and so on, i.e., as data whose components may or may not sum to a constant. Due to their specific features, the statistical analysis of compositional data must respect the geometry of the simplex sample space. To enable processing of compositional data using standard statistical methods, compositions can be conveniently expressed by means of real vectors of logratio coordinates. Their meaningful interpretability is of primary importance in practice. The aim of the course is to introduce the logratio methodology of compositional data together with a wide range of its possible applications. The first part of the course will be devoted to theoretical aspects of the methodology, including principles of compositional data analysis, geometrical representation of compositions, construction of logratio coordinates and their interpretability. In the second part, exploratory data analysis including visualization will be presented, followed by popular statistical methods, e.g. correlation and regression analysis or principal component analysis, and methods for processing high-dimensional data adapted within the logratio methodology. Robust counterparts to some of these methods will also be discussed. Numerical examples will be presented using the package robCompositions of the statistical software R.
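
As a rough illustration of what logratio coordinates look like, the following minimal Python sketch computes centred logratio (clr) coefficients, one common logratio representation. It is not taken from the course material and does not use robCompositions; the function names closure and clr and the toy values are assumptions made for this example only.

```python
import math

def closure(parts, total=1.0):
    """Rescale positive parts so that they sum to `total` (e.g. 1 or 100)."""
    s = sum(parts)
    return [total * p / s for p in parts]

def clr(parts):
    """Centred logratio coefficients: log of each part over the geometric mean."""
    if any(p <= 0 for p in parts):
        raise ValueError("clr requires strictly positive parts")
    logs = [math.log(p) for p in parts]
    mean_log = sum(logs) / len(logs)  # log of the geometric mean
    return [v - mean_log for v in logs]

# Hypothetical three-part composition, e.g. concentrations in mg/kg.
raw = [20.0, 30.0, 50.0]

# The same composition rescaled to proportions carries the same relative
# information, so its clr coefficients agree up to rounding error.
print(clr(raw))
print(clr(closure(raw)))
```

The clr coefficients of a composition sum to zero, and rescaling the parts (mg/kg versus proportions) leaves them unchanged, which reflects the statement that only relative information matters.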

Selective inference (2-3 hours approx.)

Alastair Young, Imperial College, UK.

Selective inference is concerned with performing valid statistical inference when the questions being addressed are suggested by examination of data, rather than being specified before data collection. In this tutorial we describe key ideas in selective inference, from both frequentist and Bayesian perspectives. In frequentist analysis, the fundamental notion is that valid inference, in the sense of control of error rates, is only obtained by conditioning on the selection event, that is, by considering hypothetical repetitions which lead to the same inferential questions being asked. The Bayesian standpoint is less clear, but it may be argued that such conditioning on the selection is required if this takes place on the parameter space as well as on the sample space. We provide an overview of conceptual and computational challenges, as well as asymptotic properties of selective inference in both frameworks, under the assumption that selection is made in a well-defined way.
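
The frequentist idea of conditioning on the selection event can be sketched with a toy example that is not part of the tutorial itself. Assume a single observation X ~ N(mu, 1) that is only examined when it exceeds a threshold; under H0: mu = 0, the selective p-value is the tail probability of X conditional on X exceeding that threshold. The function names and the numerical values below are assumptions made for illustration.

```python
import math

def normal_sf(z):
    """Upper tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

def naive_p_value(x):
    """One-sided p-value for H0: mu = 0 that ignores the selection step."""
    return normal_sf(x)

def selective_p_value(x, threshold):
    """One-sided p-value conditional on the selection event X > threshold."""
    if x <= threshold:
        raise ValueError("x does not satisfy the selection event x > threshold")
    return normal_sf(x) / normal_sf(threshold)

# Hypothetical values: the effect is only reported because it exceeded 1.5.
x_obs, threshold = 2.0, 1.5
print(naive_p_value(x_obs))
print(selective_p_value(x_obs, threshold))
```

The conditional p-value is never smaller than the naive one, which shows in miniature why ignoring the selection step overstates the evidence.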

Text mining in econometrics (2-3 hours approx.)

Peter Winker, University of Giessen, Germany.

There is a growing interest in the use of textual information in different fields of economics, ranging from financial markets (analysts’ statements, communication of central banks) through innovation activities (patent abstracts, websites) to the history of economic science (journal articles). In order to draw meaningful conclusions from this type of data, the analysis has to cover a substantial number of steps including 1) the selection of appropriate sources (corpora) and establishing access, 2) the preparation of the text data for further analysis, 3) the identification of themes within documents, 4) quantifying the relevance of themes in different documents, 5) aggregating relevance information, e.g. across sectors or over time, 6) analysis of the generated indicators. The course will provide some first insights into these steps of the analysis and indicate open issues regarding, e.g., computational complexity and robustness of the methods. It will be illustrated with empirical examples.
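
Steps 2) to 5) above can be sketched in a few lines of Python. This is only a toy illustration under strong assumptions: themes are given as fixed keyword sets rather than being identified from the data (for example by a topic model), and the stop-word list, theme keywords and documents are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list and theme keyword sets (assumptions).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "is", "on"}
THEMES = {
    "inflation": {"inflation", "prices"},
    "innovation": {"patent", "innovation"},
}

def tokenize(text):
    """Step 2: lower-case the text, keep alphabetic tokens, drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]

def theme_relevance(tokens, themes):
    """Steps 3-4: share of a document's tokens that belong to each theme."""
    n = len(tokens)
    return {name: (sum(t in kw for t in tokens) / n if n else 0.0)
            for name, kw in themes.items()}

def aggregate_by_year(docs, themes):
    """Step 5: average theme relevance over all documents of the same year."""
    by_year = defaultdict(list)
    for year, text in docs:
        by_year[year].append(theme_relevance(tokenize(text), themes))
    return {year: {name: sum(s[name] for s in scores) / len(scores)
                   for name in themes}
            for year, scores in by_year.items()}

# Hypothetical mini-corpus of (year, text) pairs.
DOCS = [
    (2017, "Inflation is rising and prices are high"),
    (2018, "The patent describes an innovation in batteries"),
    (2018, "Prices of energy and inflation expectations"),
]
print(aggregate_by_year(DOCS, THEMES))
```

The resulting yearly indicators would then enter step 6), e.g. as regressors in an econometric model.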

Numerical aspects of computational statistics (3 hours approx.)

Zlatko Drmac, University of Zagreb, Croatia.

TBA.

Analysing social media data (2-3 hours approx.)

Simon Caton, National College of Ireland, Ireland.

TBA.

Statistical extreme value analysis and R packages (3 hours approx.)

Ivette Gomes, Universidade de Lisboa, Portugal.

TBA.

Temporal network analysis (2-3 hours approx.)

Vladimir Batagelj, University of Ljubljana, Slovenia.

TBA.

Machine learning methods for multivariate data analysis (2-3 hours approx.)

Daniela Zaharie, West University of Timisoara, Romania.

TBA.