CiteTracked: A Longitudinal Dataset of Peer Reviews and Citations

Barbara Plank, Reinard van Dalen

Research output: Conference Article in Proceeding or Book/Report chapter, Article in proceedings, Research, peer-review


Scientific dissemination is of central importance for the scientific process. This paper presents CiteTracked, a dataset of peer reviews and citation statistics covering scientific papers from the machine learning community and spanning six years. We describe and analyze the data collection of over 3,000 published papers, their peer review texts and citation counts, and outline possible directions for its use. The dataset aims to foster novel interdisciplinary work between fields such as scientometrics, information retrieval, computational linguistics and natural language processing to study the scientific publishing process.
Original language: English
Title of host publication: 4th Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2019)
Editors: Muthu Kumar Chandrasekaran, Philipp Mayr
Publisher: CEUR Workshop Proceedings
Publication date: 25 Jul 2019
Publication status: Published - 25 Jul 2019
Series: CEUR Workshop Proceedings

