View Submission - COMPSTAT2024
A0204
Title: Textual content and academic journals selectivity: A case of economic journals
Authors: Pawel Baranowski - Institute of Economic and Financial Research, Lodz, Poland (Poland) [presenting]
Szymon Wojcik - University of Lodz (Poland)
Abstract: A large supply of papers obstructs the editorial procedures in scientific journals, especially in top-quality academic journals. Moreover, this phenomenon stimulates the emergence of low- (or non-) selective journals, which attract authors with short editorial procedures in exchange for high fees. We argue that introducing natural language processing (NLP) can help distinguish the papers worth reading by the editor from those whose scientific quality does not meet the standards. To test this hypothesis, we apply a state-of-the-art large language model, namely bidirectional encoder representations from transformers (BERT). Our sample consists of approximately 500 academic papers representing economics and finance or business. The papers were collected from journals of three levels of selectivity, namely: highly selective (top-tier journals), moderately selective (journals listed on the DOAJ list), and non-selective (predatory journals). More specifically, we apply both pre-trained and fine-tuned Sci-BERT models to anonymised texts of academic papers. The results show that the pure textual content may give over 80% out-of-sample accuracy in classifying texts into the three levels of selectivity. The outcomes prove the usefulness of NLP in distinguishing the scientific quality of the paper and support Beall's classification of predatory journals.
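The evaluation step mentioned in the abstract (out-of-sample accuracy over three selectivity levels) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the abstract does not state the split protocol, so a stratified hold-out split is assumed; the feature vectors are random toy data standing in for text embeddings (in the study these would come from Sci-BERT); and the nearest-centroid classifier is a simple stand-in, not the authors' model. All function names are hypothetical.

```python
import random
from collections import defaultdict

# Three selectivity levels named in the abstract (label strings are illustrative).
LABELS = ["highly_selective", "moderately_selective", "non_selective"]


def make_toy_data(n_per_class=50, dim=8, seed=1):
    """Toy stand-in for paper embeddings: classes differ only in coordinate 0."""
    rng = random.Random(seed)
    data = []
    for k, label in enumerate(LABELS):
        for _ in range(n_per_class):
            vec = [rng.gauss(3.0 * k if j == 0 else 0.0, 1.0) for j in range(dim)]
            data.append((vec, label))
    return data


def stratified_split(items, test_fraction=0.2, seed=0):
    """Hold out the same share of every class (assumed protocol)."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for vec, label in items:
        by_label[label].append((vec, label))
    train, test = [], []
    for label in sorted(by_label):
        group = by_label[label]
        rng.shuffle(group)
        n_test = max(1, round(len(group) * test_fraction))
        test.extend(group[:n_test])
        train.extend(group[n_test:])
    return train, test


def fit_centroids(train):
    """Nearest-centroid 'training': average the vectors of each class."""
    sums, counts = {}, defaultdict(int)
    for vec, label in train:
        if label not in sums:
            sums[label] = [0.0] * len(vec)
        sums[label] = [s + v for s, v in zip(sums[label], vec)]
        counts[label] += 1
    return {label: [s / counts[label] for s in total] for label, total in sums.items()}


def predict(centroids, vec):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(centroids, key=lambda label: sq_dist(centroids[label], vec))


def accuracy(centroids, test):
    """Share of held-out items classified correctly (out-of-sample accuracy)."""
    hits = sum(predict(centroids, vec) == label for vec, label in test)
    return hits / len(test)


data = make_toy_data()
train, test = stratified_split(data)
centroids = fit_centroids(train)
print(f"out-of-sample accuracy: {accuracy(centroids, test):.2f}")
```

The key point of the sketch is that accuracy is computed only on papers the classifier never saw during fitting, which is what "out-of-sample" refers to; the number it prints reflects the toy data and says nothing about the study's results.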