Sleeping beauties in Computer Science: characterization and early identification


While the large majority of scientific publications receive most of their citations within the first few years after publication, a notable number of papers, termed sleeping beauties, receive few citations for several years after being published and then suddenly start being cited heavily. In this work, we focus on sleeping beauties (SBs) in the domain of Computer Science. We identify more than 5,000 sleeping beauties in Computer Science and characterise them based on their sub-field and their citation profile after awakening. We also reveal some of the factors that led to their awakening long after publication. Furthermore, we propose a methodology for early identification of sleeping beauties and develop a machine learning-based classification approach that attempts to classify publications according to whether they are likely to be SBs. The classifier achieves a precision of 0.73 and a recall of 0.45 in identifying SBs immediately after their year of publication, and its performance improves significantly with time. To our knowledge, this is the first study on sleeping beauties in Computer Science.
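
To make the notion of a sleeping beauty concrete, the following is a minimal sketch of how a yearly citation profile could be checked for a quiet period followed by an awakening. It is an illustrative heuristic only, not the identification criterion used in this work: the function name, the five-year sleep window, and both thresholds are hypothetical values chosen for the example.

```python
from typing import Sequence


def looks_like_sleeping_beauty(
    yearly_citations: Sequence[int],
    sleep_years: int = 5,        # hypothetical length of the "sleep" window
    max_sleep_avg: float = 1.0,  # hypothetical cap on citations/year while asleep
    min_awake_avg: float = 5.0,  # hypothetical floor on citations/year once awake
) -> bool:
    """Return True if the profile is quiet early and heavily cited later.

    yearly_citations[i] is the number of citations received in the i-th
    year after publication. The rule is an assumption made for illustration.
    """
    # A paper needs some history beyond the sleep window to be judged.
    if len(yearly_citations) <= sleep_years:
        return False

    sleep = yearly_citations[:sleep_years]
    awake = yearly_citations[sleep_years:]

    sleep_avg = sum(sleep) / len(sleep)
    awake_avg = sum(awake) / len(awake)

    return sleep_avg <= max_sleep_avg and awake_avg >= min_awake_avg


# A typical paper: most citations arrive in the first few years.
typical = [12, 20, 15, 8, 4, 2, 1, 1]

# A delayed-recognition profile: almost uncited at first, then cited heavily.
delayed = [0, 1, 0, 1, 0, 6, 14, 25]

print(looks_like_sleeping_beauty(typical))
print(looks_like_sleeping_beauty(delayed))
```

A fixed threshold of this kind only labels papers in hindsight, once the awakening has already happened. Early identification, as proposed here, instead has to predict the outcome from information available shortly after publication.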
