2018 CRoNoS Spring Course on Multivariate Data Analysis and Software

Dates: 3-5 April 2018
Venue: Poseidonia Beach Hotel, Limassol, Cyprus.
Organizers: Ana Colubi and Erricos Kontoghiorghes on behalf of the CRoNoS COST Action.

Anne Ruiz-Gazen, Toulouse School of Economics, France.
Simon Caton, National College of Ireland, Ireland.
Roland Fried, TU Dortmund University, Germany.
Cristian Gatu, Alexandru Ioan Cuza University of Iasi, Romania.

Multivariate Outlier Detection With ICS (8 hours)

Anne Ruiz-Gazen, Toulouse School of Economics, France.

After a practical introduction to the general use of R for multivariate data analysis, the course presents the Invariant Coordinate Selection (ICS) method as a tool for multivariate outlier detection. ICS was proposed by Tyler et al. (2009) and shows remarkable properties for revealing data structures such as outliers or clusters. It is based on the simultaneous spectral decomposition of two scatter matrices and leads to an affine invariant coordinate system in which the Euclidean distance corresponds to a Mahalanobis Distance (MD) in the original system. However, unlike MD, ICS makes it possible to select relevant components. This proves useful for detecting outliers that lie in a low-dimensional subspace of a high-dimensional data set. This situation arises in particular in fields with high reliability standards, such as automotive, avionics or aerospace, where ICS can be useful for detecting anomalies with a small proportion of false positives. The method will be illustrated on several artificial and real data sets using the recent R packages ICSOutlier and ICSShiny. The ICSOutlier package allows the user to choose the scatter matrices, automatically select the most relevant components, calculate an outlierness index and identify potentially outlying observations. The ICSShiny package provides a user-friendly application for ICS, in particular for outlier detection.
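
The core idea of the abstract above can be sketched in a few lines. The course itself uses R and the ICSOutlier package; the Python/NumPy code below is only an illustrative sketch, not that package's implementation. It assumes one particular scatter pair (the sample covariance and a fourth-moment scatter matrix), and the function name `ics` and the simulated data are hypothetical.

```python
import numpy as np

def ics(X):
    """Invariant coordinates for the scatter pair (covariance, fourth-moment scatter).

    Returns the generalized kurtosis values (eigenvalues, largest first)
    and the data expressed in the invariant coordinate system.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    Xc = X - X.mean(axis=0)
    S1 = np.cov(Xc, rowvar=False)                 # first scatter: covariance
    w, V = np.linalg.eigh(S1)
    Z = Xc @ (V @ np.diag(w ** -0.5) @ V.T)       # whiten the data with S1^(-1/2)
    d2 = np.sum(Z ** 2, axis=1)                   # squared Mahalanobis distances
    S2 = (Z * d2[:, None]).T @ Z / (n * (p + 2))  # second scatter: fourth moments
    kurt, U = np.linalg.eigh(S2)                  # spectral decomposition of S2 after whitening
    order = np.argsort(kurt)[::-1]
    return kurt[order], Z @ U[:, order]

# Simulated data (assumption): 500 Gaussian points in 4 dimensions,
# the first 10 shifted along one direction to act as outliers.
rng = np.random.default_rng(0)
X = rng.normal(size=(500, 4))
X[:10, 0] += 10.0

kurt, scores = ics(X)

# The Euclidean norm in invariant coordinates matches the Mahalanobis distance.
Xc = X - X.mean(axis=0)
md2 = np.einsum("ij,jk,ik->i", Xc, np.linalg.inv(np.cov(X, rowvar=False)), Xc)
same_distance = np.allclose(np.sum(scores ** 2, axis=1), md2)

# Keep only the first component and flag the 10 largest absolute scores.
flagged = np.argsort(-np.abs(scores[:, 0]))[:10]
print(same_distance, sorted(flagged.tolist()))
```

Using all components reproduces the Mahalanobis distance, whereas keeping only the component with the largest eigenvalue concentrates on the direction in which a small group of outliers stands out. How many components to keep is the selection step that the course addresses.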

Parallelisation of R Models with h2o (7 hours)

Simon Caton, National College of Ireland, Ireland.

Model scaling is becoming increasingly necessary as datasets increase in size, but also to facilitate core aspects of model building, model prototyping, and model selection. In this session, we will explore the application of h2o to facilitate the parallelisation of R models. The session will begin with parallelising a selection of multivariate methods to use multiple cores on participants' machines. From here, it will move towards leveraging cloud resources to further increase model scalability and correspondingly reduce runtimes. It will culminate with advice on appropriate uses of cloud and other parallel architectures for model building.

robts - an R-package for robust time series and changepoint analysis (2 hours)

Roland Fried, TU Dortmund University, Germany.

We report on the progress of our R package robts, which is available from R-Forge. The package works under the assumption of short-range dependence and provides different techniques for robust estimation of autocorrelations, partial autocorrelations and spectral densities, for robust fitting of autoregressive time series models, and for model diagnostics and prediction. Since many time series models assume second-order stationarity, we include robust tests for checking the stationarity of the mean, the variance and the autocovariances. Extensions to multivariate time series analysis are a task for future work.
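
To give a feel for what robust estimation of autocorrelations means, here is a minimal Python/NumPy sketch. It is not code from robts. It assumes one classical construction, a Gnanadesikan-Kettenring type estimator built on the median absolute deviation (MAD) as a robust scale. The function names and the simulated AR(1) series are hypothetical.

```python
import numpy as np

def mad(x):
    """Median absolute deviation, scaled to be consistent for the Gaussian sd."""
    x = np.asarray(x, dtype=float)
    return 1.4826 * np.median(np.abs(x - np.median(x)))

def gk_autocorr(x, lag):
    """Robust autocorrelation from robust scales of sums and differences.

    Relies on the identity rho = (var(u+v) - var(u-v)) / (var(u+v) + var(u-v))
    for u, v with equal variance, with the variance replaced by MAD^2.
    """
    x = np.asarray(x, dtype=float)
    u, v = x[:-lag], x[lag:]
    s_plus, s_minus = mad(u + v) ** 2, mad(u - v) ** 2
    return (s_plus - s_minus) / (s_plus + s_minus)

def sample_autocorr(x, lag):
    """Classical (non-robust) sample autocorrelation."""
    xc = np.asarray(x, dtype=float) - np.mean(x)
    return float(xc[:-lag] @ xc[lag:] / (xc @ xc))

# Simulated AR(1) series with coefficient 0.6 (assumption for illustration).
rng = np.random.default_rng(1)
n, phi = 2000, 0.6
x = np.zeros(n)
for t in range(1, n):
    x[t] = phi * x[t - 1] + rng.normal()

# Contaminate 20 isolated observations with large additive outliers.
x_out = x.copy()
x_out[50::100] += 50.0

classical = sample_autocorr(x_out, 1)
robust = gk_autocorr(x_out, 1)
print(f"classical: {classical:.2f}, robust: {robust:.2f}")
```

On the contaminated series the classical estimate is pulled towards zero, while the MAD-based estimate stays close to the lag-one autocorrelation of the underlying AR(1) process.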

Computational strategies for regression model selection (2 hours)

Cristian Gatu, Alexandru Ioan Cuza University of Iasi, Romania.

Computational strategies for computing the best-subset regression models are proposed. The algorithms are based on a regression tree structure that generates all possible subset models. An efficient branch-and-bound algorithm that finds the best submodels without generating the entire tree is described. Specifically, the computational burden is reduced by pruning the non-optimal subtrees. Strategies and approximate algorithms that improve the computational performance are investigated. Further, these strategies are adapted to solve the problem of regression subset selection under the condition of non-negative coefficients. The solution is based on an alternative approach to quadratic programming that derives the non-negative least squares by solving the normal equations for a number of unrestricted least squares subproblems. This approach is computationally superior to the straightforward method, which would estimate the corresponding non-negative least squares of all possible submodels in order to select the best one. The R package "lmSubsets" for regression subset selection is introduced and described. The package aims to provide a versatile tool for subset regression.
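
The pruning idea can be illustrated with a small Python/NumPy sketch. It is not the algorithm implemented in lmSubsets. It assumes a simple "drop one variable" tree, searches for the best subset of a single fixed size k, and uses the fact that removing variables can never decrease the residual sum of squares (RSS). All names and the simulated data are hypothetical.

```python
import itertools
import numpy as np

def rss(X, y, cols):
    """Residual sum of squares of the least-squares fit on the given columns."""
    A = X[:, cols]
    beta = np.linalg.lstsq(A, y, rcond=None)[0]
    r = y - A @ beta
    return float(r @ r)

def best_subset_bb(X, y, k):
    """Best size-k subset by branch and bound over a variable-dropping tree."""
    best = {"rss": np.inf, "cols": None, "visited": 0}

    def visit(kept, free):
        # A node stands for the model on kept + free; only 'free' may be dropped.
        cols = kept + free
        node_rss = rss(X, y, cols)
        best["visited"] += 1
        if len(cols) == k:
            if node_rss < best["rss"]:
                best["rss"], best["cols"] = node_rss, tuple(sorted(cols))
            return
        # Bound: every submodel of this node has RSS >= node_rss, so prune.
        if node_rss >= best["rss"]:
            return
        for i in range(len(free)):
            new_kept = kept + free[:i]   # variables before i can no longer be dropped
            if len(new_kept) > k:
                break                    # too many fixed variables to reach size k
            visit(new_kept, free[i + 1:])  # child node drops free[i]

    visit([], list(range(X.shape[1])))
    return best["cols"], best["rss"], best["visited"]

# Simulated data (assumption): 8 candidate regressors, 3 of them relevant.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 8))
y = 2.0 * X[:, 1] - 3.0 * X[:, 4] + X[:, 6] + 0.1 * rng.normal(size=60)

cols, value, visited = best_subset_bb(X, y, k=3)

# Exhaustive search over all size-3 subsets, for comparison.
brute = min(itertools.combinations(range(8), 3), key=lambda c: rss(X, y, list(c)))
print(cols, brute, visited)
```

The bound is what allows whole subtrees to be skipped: once a size-k model has been found, any node whose full model already fits worse cannot contain a better size-k submodel.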

PhD students and Early Career Investigators (who have obtained their PhD degree in 2008 or after) from eligible COST countries* can apply for a limited number of grants. The granted participants will be reimbursed up to 650 Euro for accommodation and travelling plus the standard registration fee.
  • To apply for a grant, candidates should submit their CV by e-mail to cronos.cost@gmail.com.
  • Deadline for applications: 10th February 2018.
  • Granted candidates will be informed by e-mail after the deadline and must send their flight tickets and accommodation booking to cronos.cost@gmail.com within 7 days of the notification to secure their grants. Otherwise, their grants will be revoked and assigned to other candidates.
  • The granted candidates must attend all the sessions of the Spring course and sign the attendance list in order to obtain their grants.
*Eligible COST countries: Austria, Belgium, Bosnia and Herzegovina, Bulgaria, Croatia, Cyprus, Czech Republic, Denmark, Estonia, Finland, France, Germany, Greece, Hungary, Iceland, Ireland, Israel, Italy, Latvia, Lithuania, Luxembourg, Malta, Montenegro, The Netherlands, Norway, Poland, Portugal, Romania, Serbia, Slovakia, Slovenia, Spain, Sweden, Switzerland, Turkey, United Kingdom and the former Yugoslav Republic of Macedonia.

Registration fees

The registration fee includes participation in all sessions of the Spring Course, course material, coffee breaks and a welcome reception (pre-registration is mandatory). The registration also includes attendance at the CRoNoS Workshop on Multivariate Data Analysis and Software.

Registration type (deadline)                          CRoNoS Member    Non-CRoNoS Member
Early bird registration (until February 15, 2018)     120€             250€
Standard registration (until March 9, 2018)           170€             270€
Late registration (until March 23, 2018)              250€             350€
Cash registration (after March 23, 2018)              350€             450€

Attendees are responsible for making their own lodging and travel arrangements.

The Poseidonia Beach Hotel, the venue of the events, is offering special prices to Spring Course and Workshop participants. To book at the special prices, you should first register for the events to obtain your code and then enter it on the hotel's reservation page.