Title: Text as a new source of data: First experience with conference abstracts
Authors: Peter Winker - University of Giessen (Germany) [presenting]
Abstract: The use of textual information has gained momentum in economics over recent years. Text is considered the new data in fields such as financial markets, innovation activities, and economic history. To draw meaningful conclusions from this type of data, a substantial number of pre-processing and analysis steps have to be taken. Usually, the implementation of these methods is based on previous experience, statistical methods, or human judgement. Thus, typical issues that arise with conventional quantitative data also apply to textual information. They may merely appear in a different guise, while new challenges also show up. Some relevant steps in using textual data in a time series context are sketched. Abstracts of a conference series serve as an example. In particular, the following issues will be addressed: 1) selection of appropriate sources (corpora) and establishing access, 2) preparation of the textual data, 3) identification of themes, 4) quantifying the relevance of themes across documents, 5) aggregating relevance information over time. Finally, some remarks on the use of the generated indicators in further analysis will be provided. Open issues regarding, e.g., computational complexity and robustness of the methods will be discussed.
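As a rough illustration of steps 2) to 5), the pipeline might be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the method used in the talk: the toy corpus, the stopword list, and the theme seed words are all invented for illustration, and a hand-written keyword dictionary stands in for whatever theme identification procedure the author applies (the abstract does not name one). Relevance is assumed here to be the share of a document's tokens that match a theme, and aggregation is assumed to be the yearly mean.

```python
import re
from collections import defaultdict

# Hypothetical toy corpus of (year, abstract text); invented for illustration only.
CORPUS = [
    (2018, "Volatility forecasting for financial markets with GARCH models."),
    (2018, "Patent counts as a measure of innovation in firms."),
    (2019, "News text and financial markets: sentiment and volatility."),
]

# Assumed minimal stopword list; a real application would use a fuller one.
STOPWORDS = {"a", "and", "as", "for", "in", "of", "the", "with"}

# Hypothetical seed words per theme (stand-in for step 3, theme identification).
THEMES = {
    "finance": {"financial", "markets", "volatility", "garch"},
    "innovation": {"patent", "innovation", "firms"},
}


def preprocess(text):
    """Step 2: lowercase, keep alphabetic tokens, drop stopwords."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def relevance(tokens, keywords):
    """Step 4: share of a document's tokens that belong to a theme."""
    if not tokens:
        return 0.0
    return sum(t in keywords for t in tokens) / len(tokens)


def theme_series(corpus, themes):
    """Step 5: average document-level relevance per theme and year."""
    per_year = defaultdict(lambda: defaultdict(list))
    for year, text in corpus:
        tokens = preprocess(text)
        for name, keywords in themes.items():
            per_year[year][name].append(relevance(tokens, keywords))
    return {
        year: {name: sum(vals) / len(vals) for name, vals in scores.items()}
        for year, scores in sorted(per_year.items())
    }


series = theme_series(CORPUS, THEMES)
for year, scores in series.items():
    print(year, {name: round(value, 3) for name, value in scores.items()})
```

The result is one indicator per theme and year, which is the kind of time series the abstract refers to for further analysis. Each choice made here (tokenisation, stopwords, seed words, averaging) is a judgement call of the sort the abstract flags as a potential robustness issue.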