View Submission - HiTECCoDES2024
A0211
Title: Spectral CLTs with long memory for causal inference in augmented large language models
Authors: Andrej Srakar - Institute for Economic Research Ljubljana (Slovenia) [presenting]
Abstract: Since the pioneering work of the 1980s, central and noncentral limit theorems have been continually refined, extended, and applied to an increasing range of situations. A recent study extended these results to spectral central limit theorems valid for additive functionals of isotropic and stationary Gaussian fields. That study uses the Malliavin-Stein method and Fourier analysis techniques to treat situations where the functional $Y_t$ admits Gaussian fluctuations in a long memory context. Another recent article proposed a framework that augments large language models (LLMs) with long-term memory, enabling them to memorize long histories. The two perspectives are combined with CausalNLP, a toolkit for inferring causality from observational data that includes text in addition to traditional numerical and categorical variables, to develop spectral central limit theorems in a causal setting for text data from LLMs augmented with long-term memory. The main stochastic calculus tools are derived from the Malliavin-Stein method, Fourier analysis, and free probability. Applications to datasets from finance and medical imaging are presented. Possible Bayesian extensions are discussed in conclusion.