Title: Modelling non-stationary 'big data'
Authors: Jennifer Castle - Oxford University (United Kingdom) [presenting]
Jurgen Doornik - Oxford University (United Kingdom)
David Hendry - University of Oxford (United Kingdom)
Abstract: Seeking substantive relationships among vast numbers of spurious connections when modelling Big Data requires an appropriate approach. Big Data are useful if they can increase the probability that the data generation process (DGP) is nested in the postulated model and increase the power of specification and mis-specification tests, without raising the chances of adventitious significance. Simply choosing the best-fitting equation, or trying hundreds of empirical fits and selecting a preferred one (perhaps contradicted by others that go unreported), will not lead to a useful outcome. A crucial issue addressed in this paper is that wide-sense non-stationarity (cointegration and location shifts) must be taken into account if statistical modelling by mining Big Data is to be productive. A factor approach to identifying cointegrating relationships is investigated. Moreover, important computational problems must be resolved, given the huge number of possible models to select among.
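The scale of the two problems named in the abstract, the number of possible models and the risk of adventitious significance, can be illustrated with simple arithmetic. The sketch below is not taken from the paper. It assumes N candidate regressors that may each be included or excluded (giving 2^N possible models) and that each irrelevant candidate is tested at a nominal significance level alpha, so that about alpha * N of them would be expected to appear significant by chance. The function names and the example values of N and alpha are hypothetical.

```python
def count_models(n_candidates: int) -> int:
    """Number of distinct models when each candidate is either in or out."""
    return 2 ** n_candidates


def expected_chance_retentions(n_irrelevant: int, alpha: float) -> float:
    """Expected number of irrelevant variables that look significant by chance.

    Assumes each irrelevant variable is tested at nominal size alpha
    (an illustrative assumption, not a result from the paper).
    """
    return alpha * n_irrelevant


if __name__ == "__main__":
    # Hypothetical settings chosen only to show orders of magnitude.
    for n, alpha in [(10, 0.05), (100, 0.01), (1000, 0.001)]:
        n_models = count_models(n)
        n_spurious = expected_chance_retentions(n, alpha)
        print(f"N={n}: about {n_models:.3e} models, "
              f"about {n_spurious:.2f} chance retentions at alpha={alpha}")
```

Under these assumptions, tightening alpha as N grows keeps the expected number of chance retentions small, while the count of possible models still grows exponentially. This is why exhaustive search is infeasible and the computational problems mentioned in the abstract arise.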