Data science. How scientists select their research topics. Paper in the journal Scientific Reports by a team of researchers affiliated with the Sant’Anna School, the Scuola Normale Superiore, the University of Bologna and the Pennsylvania State University

Producing new knowledge is an increasingly complex endeavor. How do scientists decide which topics to focus their work on? A team of researchers affiliated with the Sant’Anna School (the Scuola Superiore Sant’Anna), the Scuola Normale Superiore, the University of Bologna and the Pennsylvania State University addressed this question by analyzing data on over three decades of publications (from 1977 to 2009) by a very large sample of physicists. Their results, which were published in the journal Scientific Reports, suggest that contemporary science is a quintessentially social enterprise. When scientists move out of their immediate specialization fields, they do so through collaborations.

Scientific and technological discovery is shaped by a multiplicity of factors, but two general mechanisms are believed to affect scientists' choices: an “essential tension” in the trade-off between exploration and exploitation, already postulated by Thomas Kuhn as far back as the 1970s, and an increasing “burden of knowledge” in keeping up with a fast-evolving scientific frontier, which leads scientists to seek narrower specializations and rely on collaborations. The aim of the study was to disentangle and quantify different factors shaping individual research choices, and in particular the roles played by knowledge relatedness (among research topics) and social relatedness (among scholars).

“We used innovative network and data science tools to compute a measure of similarity among topics and a measure of social proximity among scholars, to test whether and how these affect the scientists' diversification strategies. Both variables play a significant role, but social interactions emerge as the dominant means to exchange and acquire new knowledge” – says Giorgio Tripodi, first author of the study and Ph.D. student in Data Science, a joint program of the Scuola Normale Superiore, the University of Pisa, the Sant’Anna School (Scuola Superiore Sant’Anna), the Scuola IMT Alti Studi Lucca and CNR (the National Research Council). “Moreover, the data clearly show that collaborations modulate knowledge transfer: the more scientists move away from their primary expertise, the more critical interactions with other scientists become.”

“Tracing the scientific publications of roughly 200,000 physicists active in 9 fields and 68 subfields of research, we see that their diversification strategies have a systematic component. We used various statistical techniques to evaluate and quantify the effects of topic similarity and social proximity on such diversification, controlling for several potential confounding variables” – adds Francesca Chiaromonte, author of the study, professor of Statistics at the Scuola Superiore Sant’Anna and at the Pennsylvania State University, and scientific coordinator of the EMbeDS Department of Excellence (Economics and Management in the era of Data Science) at the Sant’Anna School.

“Contemporary advances in science and technology are profoundly shaped by collaborative projects. The data at our disposal show that interactions among scientists, and thus our measure of social proximity, account for approximately 30% of their diversification strategies, while similarity among topics, and thus the choice to investigate a subject “close” to one’s competences, accounts for only 10%” – stresses Fabrizio Lillo, author of the study, professor of Mathematical Methods for Economics and Finance at the Scuola Normale Superiore and at the University of Bologna.

From an operational perspective, the study highlights the importance of fostering interactions among scientists with different specializations, which can be crucial to the development of innovative research. Individual knowledge and skills remain essential, but working in teams, especially with individuals with different expertise, is critical to today’s science. “In many ways, this study is proof of its own conclusions – says Francesca Chiaromonte – Giorgio Tripodi, the first author of the study, belongs to the first cohort of students in the Data Science PhD program, coordinated by Dino Pedreschi, professor of Computer Science at the University of Pisa, which was created in 2017 with the specific goal of fostering interdisciplinary collaborations across the Scuola Normale, the University of Pisa, the Scuola Sant’Anna, the Scuola IMT Alti Studi Lucca and CNR.”

Here is the link to the study