View Submission - HiTECCoDES2024
A0195
Title: Data augmentation for testing subject alignment with a COST Action
Authors: Louisa Kontoghiorghes - King's College London (United Kingdom) [presenting]
Ana Colubi - University of Giessen (Germany)
Abstract: The objective is to use a generative pre-trained transformer (GPT) model for data augmentation when only limited text data are available. The focus is on assessing how well a research abstract aligns with the scientific objectives of a specific COST Action. To achieve this, a GPT model is employed to create abstracts based on the proposal of the COST Action, ensuring contextual relevance. The same model is also used to generate a set of variations of the abstract of interest. To quantify the differences between the two sets, Latent Dirichlet Allocation (LDA), a topic modelling method, is fitted, the topic prevalence in each set is estimated, and a two-sample bootstrap test is performed, providing a statistical comparison of subject alignment.
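The final testing step could be sketched as follows. This is only an illustration under stated assumptions, not the authors' procedure: the abstract does not specify the test statistic or the resampling scheme, so the sketch assumes a test of equal mean prevalence for a single topic, with each sample re-centred at the pooled mean before resampling. The function name `bootstrap_mean_diff_test` and the prevalence values are hypothetical; in practice the inputs would be the per-document topic proportions estimated by LDA for the two sets of generated abstracts.

```python
import random
from statistics import mean


def bootstrap_mean_diff_test(x, y, n_boot=2000, seed=0):
    """Two-sample bootstrap test of H0: mean(x) == mean(y).

    x, y: per-document prevalence of one topic in each set of abstracts.
    Returns the observed absolute mean difference and a bootstrap p-value.
    """
    rng = random.Random(seed)
    mean_x, mean_y = mean(x), mean(y)
    observed = abs(mean_x - mean_y)

    # Assumption: impose H0 by shifting both samples to the pooled mean.
    pooled = mean(list(x) + list(y))
    x0 = [v - mean_x + pooled for v in x]
    y0 = [v - mean_y + pooled for v in y]

    exceed = 0
    for _ in range(n_boot):
        # Resample each set with replacement, keeping its original size.
        xb = rng.choices(x0, k=len(x0))
        yb = rng.choices(y0, k=len(y0))
        if abs(mean(xb) - mean(yb)) >= observed:
            exceed += 1

    # Add-one correction keeps the p-value strictly positive.
    return observed, (exceed + 1) / (n_boot + 1)


# Hypothetical prevalence values, for illustration only.
cost_set = [0.80, 0.82, 0.79, 0.81, 0.83]       # abstracts generated from the COST Action proposal
variation_set = [0.10, 0.12, 0.09, 0.11, 0.13]  # variations of the abstract of interest
diff, p_value = bootstrap_mean_diff_test(cost_set, variation_set)
```

Under this reading, a small p-value would suggest that the topic is not equally prevalent in the two sets, which would count as evidence against subject alignment on that topic.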