The impact of preprint servers in the formation of novel ideas

Swarup Satish, Zonghai Yao, Andrew Drozdov, Boris Veytsman

First Workshop on Scholarly Document Processing (SDP 2020) Workshop Paper

Abstract: We study whether novel ideas in the biomedical literature appear first in preprints or in traditional journals. We develop a Bayesian method to estimate the time at which a phrase first appears in the literature, and apply it to a number of phrases, both automatically extracted and suggested by experts. We find that at present most phrases appear first in traditional journals, but a number of phrases appear first on preprint servers. A comparison of the general composition of texts from bioRxiv and traditional journals shows a growing trend of bioRxiv being predictive of traditional journals. We discuss the application of the method to related problems.
