Over the last two years, a group of researchers used a shared dataset to compare their approaches to identifying thematic structures in a set of 111,616 papers on astronomy and astrophysics published in 59 journals between 2003 and 2010. The outcomes of this comparative exercise are published in a special issue of Scientometrics (Gläser et al. 2017). Now that Clarivate Analytics has kindly agreed to make this dataset available to interested researchers in the bibliometrics community, we suggest extending this comparative approach.

We challenge you to participate in the comparative topic identification exercise.

The challenge is not to develop the best partitioning of the dataset. We believe this to be impossible because there is no single best solution. Instead, we challenge you to learn as much as possible about your own approach, the reasons why it produced a particular solution, and how that solution compares to those produced by other approaches. We also challenge you to discuss the advantages and disadvantages of approaches to topic identification comparatively, and thus to contribute to a cumulative body of knowledge on the suitability of data models and algorithms for the identification of topics.

Thanks to Clarivate Analytics, we are able to offer access to the dataset under a streamlined license agreement.

While participating in the “Web of Science comparative topic identification exercise,” you will be provided with access to the Clarivate Analytics “Web of Science comparative topic identification exercise” dataset. You may access and use this dataset from March 1, 2017 through December 31, 2018 only for the exercise above, subject to the “Clarivate Analytics Terms”, including the “Web of Science: Custom Data Set Product Terms” in the “Product/Service Terms”, available on our Terms of Business site: http://clarivate.com/tob/. By accessing and/or using our data, you are legally bound by and hereby consent to these terms. If you do not agree to these terms, then you may not access or use our data. Any extension or further use of our data beyond December 31, 2018 is strictly prohibited unless you receive prior written permission from Clarivate Analytics.

The dataset can be obtained by sending an email to jason.rollins@thomsonreuters.com.

We will also offer access to a website where solutions can be deposited and downloaded for comparison. The website, which also offers some tools for the comparative analysis of solutions and individual clusters, is www.topic-challenge.info.

If there are enough participants, we will run sessions on the comparative exercise at upcoming ISSI conferences and at dedicated workshops.

We hope that many of you will take up the challenge and thus contribute to cumulative progress in bibliometrics.