HiTEc & CoDES 2024: Spring Course

HiTEc Spring Course

The Spring Course will consist of a series of tutorials on representative HiTEc topics. Trainees are requested to bring their own laptop with R and Anaconda installed. You can install Anaconda, which enables the use of Jupyter Notebook, by visiting this website.

Dates: 23-25 March 2024
Venue: CUT Tassos Papadopoulos Building, Cyprus University of Technology, Limassol, Cyprus.
Room: Lecture Room 2, Floor 1.
Speakers:
Jamal Abdul Nasir, University of Galway, Ireland.
Cristina Mollica, Sapienza University of Rome, Italy.
Marta Crispino, Bank of Italy, Italy.
Degui Li, University of York, UK.

Module I

Large language models (LLMs)

Jamal Abdul Nasir, University of Galway, Ireland.

Software required:
Python: Anaconda

OpenAI account:
To create an OpenAI account and access OpenAI services, follow these steps:
1. Visit the official OpenAI website at https://www.openai.com/.
2. Sign up for an account. Upon sign-up, OpenAI provides $5 worth of free tokens, valid for 3 months.

The training schedule:

The training schedule below is structured into four sessions under Module I. It covers an introduction to Python programming, text mining, Large Language Models (LLMs), and their applications in econometrics.

Session 1.1 - Module I: Starting with Python (08:30 – 10:00)
This session serves as a foundational introduction to Python programming. Participants will delve into the basics of Python syntax, data types, and fundamental programming concepts. The goal is to establish a strong programming foundation for subsequent modules.

Session 1.2 - Module I: Text Pre-processing and Text Mining (10:30 – 12:30)
In this session, participants will focus on the crucial steps of text pre-processing and mining. Topics covered may include tokenization, stemming, lemmatization, and other techniques essential for preparing text data for analysis. The session aims to equip participants with the skills needed to handle textual data effectively.
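
As a rough illustration of the steps named above, the following Python sketch chains tokenization, stop-word removal, and stemming using only the standard library. The stop-word list, the suffix list, and the function names are assumptions made for this example and are not course material. The stemmer is a deliberately crude suffix stripper, not the Porter algorithm, and lemmatization is omitted because it needs a dictionary resource.

```python
import re

# Tiny illustrative stop-word list (an assumption, not a standard resource).
STOP_WORDS = {"the", "a", "an", "of", "and", "to", "in", "is", "are", "on"}

# Suffixes stripped by the toy stemmer, longest first.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lowercase the text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def toy_stem(token):
    """Strip at most one common suffix; a crude stand-in for a real stemmer."""
    for suffix in SUFFIXES:
        # Keep at least three characters so short words are left alone.
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token


def preprocess(text):
    """Tokenize, remove stop words, then stem each remaining token."""
    return [toy_stem(t) for t in remove_stop_words(tokenize(text))]


if __name__ == "__main__":
    print(preprocess("The banks are raising interest rates."))
```

On this sentence the toy stemmer reduces "rates" to "rat", which shows why practical work usually relies on an established stemmer or on lemmatization rather than plain suffix stripping.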

Session 1.3 - Module I: Large Language Models I (14:00 – 15:30)
This session introduces Large Language Models: their architecture, functionality, and applications. Participants may examine the inner workings of models such as GPT-3 and learn how to apply these models to various natural language processing tasks.

Session 1.4 - Module I: Large Language Models II (16:00 – 18:30)
Building on the previous session, this session turns to practical applications of Large Language Models, with a focus on econometrics. Participants may explore case studies, real-world examples, and hands-on exercises to gain a deeper understanding of how to implement and use Large Language Models effectively.

Module II

Preference learning via models for ranking data

Cristina Mollica, Sapienza University of Rome, Italy.
Marta Crispino, Bank of Italy, Italy.

The course will introduce ranking data, a special form of multivariate ordinal data that arises, for instance, in experiments on choice behavior, in preference characterization studies such as recommender systems, in sports competitions, and in political polls. These data gave rise to two streams of research in the machine learning community, known as rank aggregation and preference learning. In this tutorial we will favor the probabilistic approach, focusing on two alternative benchmark models designed to make inference from rankings: i) the Plackett-Luce model and ii) the Mallows model. We will cover theoretical insights as well as empirical analysis from both the frequentist and the Bayesian inferential points of view, including model-based clustering via finite mixtures.
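
To make the two benchmark models concrete, here is a minimal Python sketch of their probability functions for complete rankings. It assumes the standard textbook forms: Plackett-Luce as a sequence of choices with probability proportional to positive support parameters, and the Mallows model with the Kendall distance and its closed-form normalizing constant. The function names, the parameter names (`support`, `reference`, `theta`), and the example values are assumptions made for this illustration. The lab sessions themselves use R, and the instructors' notation may differ.

```python
import math
from itertools import permutations


def plackett_luce_prob(ordering, support):
    """Probability of an ordering (best item first) under Plackett-Luce.

    support maps each item to a positive support parameter.
    """
    remaining = sum(support[item] for item in ordering)
    prob = 1.0
    for item in ordering:
        # Choose the next item among those not yet ranked.
        prob *= support[item] / remaining
        remaining -= support[item]
    return prob


def kendall_distance(ordering, reference):
    """Number of item pairs ranked in opposite order by the two orderings."""
    position = {item: i for i, item in enumerate(reference)}
    ranks = [position[item] for item in ordering]
    return sum(
        1
        for i in range(len(ranks))
        for j in range(i + 1, len(ranks))
        if ranks[i] > ranks[j]
    )


def mallows_prob(ordering, reference, theta):
    """Probability of an ordering under the Mallows model (Kendall distance)."""
    n = len(reference)
    # Closed-form normalizing constant: product of truncated geometric sums.
    z = 1.0
    for j in range(1, n + 1):
        z *= sum(math.exp(-theta * k) for k in range(j))
    return math.exp(-theta * kendall_distance(ordering, reference)) / z


if __name__ == "__main__":
    items = ("A", "B", "C")
    support = {"A": 0.5, "B": 0.3, "C": 0.2}  # hypothetical values
    for ordering in permutations(items):
        pl = plackett_luce_prob(ordering, support)
        ma = mallows_prob(ordering, reference=items, theta=1.0)
        print(ordering, round(pl, 4), round(ma, 4))
```

Both functions define proper distributions over the permutations of the items, so the printed probabilities in each column sum to one. Under the Mallows model the reference ordering is the mode whenever `theta` is positive.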

During the lab sessions, we will work through a number of hands-on examples, including heterogeneous, partially observed rankings. Installing R and RStudio on personal devices before the course is recommended.

Topics: Ranking data, Mallows model, Plackett-Luce, EM algorithm, MCMC sampling methods, clustering

Module III

High-dimensional functional time series

Degui Li, University of York, UK.

This module provides a selective review of recent advances in high-dimensional functional time series. Both the number of functional processes and time series length are allowed to diverge to infinity. The following four topics are covered: high-dimensional functional regression and autoregression models, high-dimensional functional factor models, high-dimensional functional covariance matrix estimation, and high-dimensional nonstationary functional time series models.

Programme

Saturday, 23 March 2024

08:30 – 10:00 Session 1.1 - Module I
10:00 – 10:30 Coffee break
10:30 – 12:30 Session 1.2 - Module I
12:30 – 14:00 Lunch break
14:00 – 15:30 Session 1.3 - Module I
15:30 – 16:00 Coffee break
16:00 – 18:30 Session 1.4 - Module I

Sunday, 24 March 2024

08:30 – 10:00 Session 2.1 - Module II
10:00 – 10:30 Coffee break
10:30 – 12:30 Session 2.2 - Module II
12:30 – 14:00 Lunch break
14:00 – 15:30 Session 2.3 - Module II
15:30 – 16:00 Coffee break
16:00 – 18:30 Session 2.4 - Module II

Monday, 25 March 2024

08:30 – 10:00 Session 3.1 - Module III
10:00 – 10:30 Coffee break
10:30 – 12:30 Session 3.2 - Module III
12:30 – 14:00 Lunch break
14:00 – 16:00 Session 3.3 - Module III