INTRODUCTION
Modern scholarship has long debated the importance of research evaluation for ensuring scientific rigor, significance, and impact in both the natural and social sciences. Furthermore, in the era of globalization, determining standards and criteria for evaluating research has become one of the major preoccupations not only of the academic community but also of ministries, scientific committees, nonprofit agencies, foundations and other stakeholders engaged in the research investment cycle.
Quantification, standardization, and hierarchization of knowledge production have become major pillars of academic work, affecting the organization and orientation of the whole sector. As in any other branch, competition in academia created the need to develop indicators for assessing, evaluating and ranking research work, academics, departments, projects and institutions. Thus, sets of criteria, standards and evaluation methods have been established in order to measure performance in academic work. Traditionally, the most important ones have been the prestige of the publication venue, expressed in journal-level measures such as the impact factor, and metrics such as the citation index. These indicators enabled the academic community to distinguish clearly between good and bad research work, to establish merit-based practices and to shape review, promotion, and tenure (RPT) processes within institutions.
However, in the process, academics started complaining that surveillance, evaluation, and ranking had become so oppressive and demanding that the pressure to publish in a particular way and venue was hindering academic production, its quality and its pertinence[1][2]. In response to this criticism, scholars have in recent years started advocating new criteria for evaluating research, such as research impact and influence in society, which has become one of the most elaborated topics on the social science research agenda[3][4][5][6][7]. Nowadays, the evaluation of research quality relies just as much on the different kinds of impact discussed in the ‘impact’ literature, such as popularization[8][9], business engagement[10], scientists’ response to policy[11], teaching and collaborations[12][13], and influence on politics[14].
The goal of this article is to review the following topics relevant to the evaluation methods of research quality in the social sciences and humanities. First, we discuss the evaluation of the research quality in general. Second, we discuss the notion of assessing the quality of interdisciplinary research. Third, we question the usage of evaluative inquiry, as the new concept for assessing the quality of scientific research. Finally, we consider the relevance of societal impact as the measurement of research quality in the social sciences and humanities.
EVALUATION OF RESEARCH QUALITY
Traditionally, most research evaluation relied on measures of academic output, such as peer review and bibliometrics, as tools for assessing research quality and scientific merit[15][16]. Thus, the major criteria for evaluating research quality are considered to be the prestige of the journal in which the publication appeared (impact factor, etc.) and the number of subsequent citations in other scholarly works[17]. Although heavily criticized, journal reputation has been used for years as a measurable and reliable tool for judging research results in social sciences and humanities, even in transdisciplinary fields. But this excludes from the competition a number of other publication formats common in social sciences – such as books, book chapters, conference papers, reports, reviews, etc. Citation count is also highly flawed or, to say the least, exclusionary – as most of the social science articles indexed by Web of Science are in English[18], meaning that publications in national languages, regardless of their genuine quality, remain under the radar most of the time.
These evaluation procedures have been heavily criticized lately, especially when it comes to evaluating research in humanities and social sciences, as they seem not to be fully compatible with the research and scientific communication practices employed in these fields[19][20]. In one of his essays on the topic, Wouters argues that the discrepancy between how evaluators and researchers perceive quality in research creates the ‘evaluation gap’, where scholars argue that the established assessment criteria do not coincide with what they value in their research work[21]. Especially in social sciences and humanities, scholars have been criticizing evaluation indicators and metrics as ‘unlikely to fully reflect the quality of their work’[22]. Indeed, while citations, peer review and impact factor remain relevant for evaluating social sciences, it has been noted that additional criteria should be established to address the variety of innovative approaches, actors, research designs and mechanisms, as well as the societal impact of social sciences and humanities, especially when it comes to transdisciplinary research[22].
The major issue with the dominant evaluation techniques is that they encourage researchers to focus on publishing “safe papers” in major journals, rather than on conceptualizing ground-breaking alternative research which might not fit into the framework of mainstream journals[23]. Furthermore, “one-size-fits-all” evaluation tools might not be appropriate for innovative research designs and alternative research concepts and methods. Different criteria and factors affect the quality of research, and it is difficult to compare research quality across fields. What might be considered bad research in one field could easily be recognized as high-quality research in another. Similarly, research quality depends on the context, time and place of publication – what might be highly relevant for certain communities at a particular moment often goes unacknowledged in other settings or in a different period.
ASSESSING QUALITY OF TRANSDISCIPLINARY RESEARCH
As pointed out by Belcher and colleagues, in every activity we need sets of principles, comparison criteria or benchmarks that help us evaluate its quality, potential, progress, and success[24]. It is necessary to provide reliable quality criteria in order not only to improve scientific rigor, research design and methodological tools but also to inform funders about the outcomes of their support and funding in terms of research success. Research quality is determined mostly using two main criteria – scientific excellence and scientific relevance[24]. In most disciplines, there is an established set of measures and criteria for evaluating the quality of research design, the soundness of methodology and the originality of results. These processes are much more challenging when it comes to transdisciplinary research in social sciences and humanities, since to this day there has been no consensus on, or widely accepted set of, principles and criteria for evaluating transdisciplinary work.
The majority of authors reflecting on quality evaluation in transdisciplinary social science research emphasize the need to expand existing evaluation criteria and adopt new ones for these research articles and projects[25]. These criteria should be made explicit and widely agreed upon, but only a few of the reviewed articles suggested specific criteria to be used. Furthermore, it has been argued that the quality assessment of transdisciplinary research articles should be conducted by reviewers from beyond the discipline, or at least by reviewers from various disciplines[26][27][28]. This is particularly important as researchers doing transdisciplinary research struggle with selecting publishing outlets and are often inclined to first choose a journal, and then tailor their research methods and design to “fit” the disciplinary scope of the journal. This limits advances in knowledge and the creation of innovative transdisciplinary methods.
Concerning the quality criteria for evaluating transdisciplinary research, Boaz and Ashby distinguish four criteria – methodological quality, quality of reporting, appropriateness of methods and relevance to policy and practice[29]. Spaapen, Dijstelbloem, and Wamelink suggest that the evaluation of each research project should be conducted against its own goals and not rely on a comparison between projects[30]. According to Jahn and Keil, quality criteria for evaluation of transdisciplinary research are quality of the research problems, quality of the research process, and quality of the research results[31]. Other important criteria mentioned in the literature are stakeholder engagement, integration of epistemologies, impact agenda, diversity of result outputs, etc.
EVALUATIVE INQUIRY
Evaluative inquiry is one of the newest concepts in the field, aiming to challenge previous instruments and the organization of research evaluation in terms of how academic achievement and impact are understood and how they should be measured. Evaluative inquiry was first introduced by Fochler and de Rijcke, who argue that research quality cannot be understood as a straightforward and universal concept and thus there cannot be a one-size-fits-all instrument for measuring it[1]. Instead, they suggest reflecting on academic work as a process and understanding quality as a result of the interactions between values and networks of people, outputs and resources through which knowledge is generated[32].
The main idea of this approach is that academic achievement is distributed among both academic and non-academic participants and thus needs to be studied through a portfolio approach, namely a combination of multiple methods offering various insights into academic work and its quality. With reference to impact, this method problematizes the demand to produce both high-quality academic publications and societal relevance through them. It criticizes the idea of passive stakeholders receiving benefits from academic expertise (impact), emphasizing instead the concept of ‘productive interactions’[15] between stakeholders. In practice, this means that stakeholders are co-producers not only of knowledge and impact, but also of the criteria by which such impact is to be evaluated[32].
SOCIAL IMPACT
As previously elaborated, most quality evaluation in social sciences has relied on scientific impact[15]. An idea that appears widely in the literature on impact creation and evaluation is the concept of productive interactions. In their influential study, Spaapen and Van Drooge provided a new way to think about how research creates impact, which they termed ‘productive interactions’ and defined as ‘exchanges between researchers and stakeholders in which knowledge is produced and valued that is both scientifically robust and socially relevant. These exchanges are mediated through various ‘tracks’, for instance, a research publication, an exhibition, a design, people or financial support’[15]. This concept highlighted the importance of stakeholder collaboration in research design, publication, and implementation, arguing that societal impact in social sciences and humanities, rather than being some unattainable goal, can be easily achieved simply by enhancing productive discussions between academics and policy makers.
According to Muhonen, Benneworth and Olmos-Penuela, impact creation in social sciences and humanities cases can be achieved through different forms of scientific and popular publishing, but also extensive media and public engagement, stakeholder interaction, commercialization or policy, legislation and epistemic training[7]. In their research, they develop the pipeline model detailing 12 major impact pathways, which are the interactive dissemination model, the collaboration model, the public engagement model, the expertise model, the mobility model, the ‘anticipating adversaries’ model, the ‘seize the day’ model, the social innovation model, the commercialization model, the research engagement model, the knowledge ‘creeps’ into society model and the building ‘new epistemic communities’ model. This framework might be very useful for conceptualizing different impact pathways.
Reale and colleagues distinguish between three major impact categories – scientific, social and political impact[33]. According to their findings, scientific impact in SSH could be understood as a scientific change produced by a certain piece of research, such as the transformation of the research process. Political impact is defined as the transferability of research results into the political sphere with the aim of contributing to policymaking, while social impact refers to the contribution of research to social challenges by inspiring social activism or civil society interventions. For all the impact categories, the authors highlight the tendency toward a participatory approach, which includes new stakeholders and engages in public debates between academics and policymakers, civil society, etc.
Impact is often understood as the usefulness of the research, and it is determined by its purpose. Thus, researchers are increasingly encouraged to undertake research commissioned by the government, local authorities or companies, which makes the research highly impactful even before it begins. This type of research is thus designed to answer specific societal needs or challenges. It is often conducted in close cooperation with other stakeholders and includes the perspectives of several co-creators, which often makes it more relevant and applicable. Furthermore, the results of these research projects are in general widely disseminated – published in newspapers, advertised in media, etc.
However, it may be argued that, due to the lack of clarity in the way policy absorbs research and the sometimes very long delays between research and impact production, it is difficult to evaluate research impact and relevance. Thus, one might raise the question of the fluidity of impact, as certain topics which did not seem highly impactful at some point turned out to be extremely relevant some years later. Similarly, research designed to answer certain problems in society can be irrelevant by the time it is finished if another solution arises in the meantime.
The growing awareness of the importance of the social impact of research instigated the emergence of the Social Impact Open Repository (SIOR), launched by the European Commission in 2015, aiming to disseminate different social impact stories in order to inspire and encourage future impactful research[34]. It cites evidence of both real social impact, where research has already created certain societal change, and potential impact, where the research results have not yet been completely translated into societal improvements, but there are some indicators that they will be[35]. This open repository became a reliable tool for evaluating the social benefits of research and communicating different impact pathways in social sciences and humanities.
Pulido and colleagues analyzed channels of dissemination of research impact, focusing on the social impact coverage ratio (SICOR) in order to identify the percentage of tweets and Facebook posts related to impact in the total volume of social media data on a particular research project[35]. As social media has increasingly become a tool for academics to boost the visibility of their research, some of the communication on these platforms, as their results demonstrated, refers to the social impact of research. While their research was limited in scope (only 10 projects were analysed), it showed that there is some, but not much, social impact evidence in social media. Thus, this tool should be further exploited in years to come, and scholars should seriously consider publishing concrete qualitative or quantitative evidence of their real or potential research impact.
A similar methodology was introduced by Cabre-Olive and colleagues, who suggested using social media as a tool for understanding emerging topics in society in order to define research that may create significant social improvements[36]. Another important contribution in the field was made by Gomez, Puigvert and Flecha, who apply the principles of critical communicative methodology to research in order to advocate for more stakeholder engagement and shared creation of knowledge (and thus also of impact)[37]. Their approach highlights the importance of dialogue between researchers and social actors in order to use the community’s cultural intelligence in designing and conducting research. These interactions ensure not only that the research responds to the challenges important for society, but also that the community in question better accepts and more quickly implements the research results, translating them into a long-term impact.
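To make the ratio concrete, the following is a minimal sketch of how a SICOR-style percentage could be computed for a single project, based only on the description given here (impact-related posts as a share of all social media posts about the project). The function name `sicor`, its parameter names and the example counts are hypothetical and are not taken from the cited study, whose exact counting rules may differ.

```python
def sicor(impact_posts: int, total_posts: int) -> float:
    """Share of posts that mention social impact, as a percentage.

    Assumption: both counts refer to the same project and the same
    set of platforms (e.g. tweets and Facebook posts combined).
    """
    if total_posts <= 0:
        raise ValueError("total_posts must be a positive integer")
    if not 0 <= impact_posts <= total_posts:
        raise ValueError("impact_posts must lie between 0 and total_posts")
    return 100.0 * impact_posts / total_posts


# Hypothetical counts for one project (illustrative only, not real data).
print(f"{sicor(12, 400):.1f}%")  # prints "3.0%"
```

A low value on such a measure would correspond to the observation that social media contains some, but not much, evidence of social impact.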
MEASURING SOCIETAL IMPACT IN EUROPEAN UNION PROJECTS
In EU assessment and evaluation, what we understand as “impact” has often been referred to as “relevance”. According to the European Commission’s reports Better Regulation “Toolbox” and “Applying relevance-assessing methodologies to Horizon 2020”, “relevance looks at the relationship between the needs and problems in society and the objectives of the intervention.” Indeed, it is indispensable to continuously screen and benchmark the objectives and activities of EU projects against the major strategic goals and priorities of the European Commission. Thus, one of the major challenges is to assess the “relevance” of the framework programs, in order to verify to what extent the original objectives of a particular framework program still coincide with current priorities and needs.
The general methodology for evaluating the relevance of the framework programs includes three major steps, which aim to determine the degree of compatibility of the framework program with the institutional perspective (Is the program in line with the EU and international priorities?), the citizens’ perspective (Is the program in line with the needs of the EU citizens?) and the scientific and technological perspective (How well adapted is the program to the subsequent technological or scientific advances?). In order to respond to each of these questions, two sub-questions have been formulated, as the assessment of relevance requires (1) identifying policy priorities, citizens’ needs and scientific and technological advances in the first place, so that (2) the framework program can be put into perspective and benchmarked against the identified priorities, needs and advances.
Different methods have been employed for answering each of these questions. First, in order to identify EU and international priorities and assess the compatibility of the framework program with them, an experts’ exploratory approach and computer content analysis (text mining) are employed. Additionally, societal needs within the EU are identified using online content analysis (Eurobarometer surveys and EC consultation reports, social media analysis). Besides the experts’ exploratory approach, several other methods are used to assess the relevance of the framework program to technological and scientific advances, such as bibliometric analysis, social media content analysis, and patent analysis.
What the document specifically emphasizes is that the goals set at the beginning of the framework program do not necessarily correspond to contemporary challenges, as political priorities, societal needs and technological advances change over time. This is also true for most research projects in social sciences and humanities that are designed to create a certain impact: by the time the project is finalized and the “impact” achieved, the problem might be “outdated” and no longer relevant for society. Therefore, the “relevance” or impact analysis is an ongoing process, aiming to continuously question and adapt projects to better reflect current challenges and needs. Moreover, when designing a project, one has to take into consideration not only its current but also its future relevance, and try to foresee future priorities, challenges and developments.
The Better Regulation “Toolbox” highlights the importance of identifying and assessing the most significant impacts in the process of project/policy evaluation. The process consists of first mapping out all potentially relevant impacts and then selecting for in-depth analysis those which are likely to be significant. The key impacts for screening, according to the document, may be split into three major groups – economic, social and environmental. The selection of significant impacts is based on the relevance of the impact, its absolute scope, the relative size of the expected impact for specific stakeholders and the importance of impacts for Commission horizontal objectives and policies. The key economic impact categories to be closely monitored include, amongst others, the impact on operating costs and conduct of business, the impact on administrative burdens, trade and investment flows, competitiveness, the position of SMEs, innovation and research, public authorities, consumers or the macroeconomic environment. For the social impact, the Toolbox suggests monitoring the impact on employment, working conditions, effects on income, distribution, social protection and inclusion, governance, public health systems, security, education and training, culture and the social impact in third countries. The major environmental impacts include the impact on climate, air quality, water quality, biodiversity, soil quality, waste production and recycling, efficient use of resources, sustainable consumption, international environmental impacts, transport and energy use, animal welfare, prevention of environmental risks and land use. Finally, it is possible to reflect on the impact in the field of fundamental rights, such as dignity, liberty of individuals, private and family life, freedom of expression and information, personal data protection, asylum, property rights, gender equality, children’s rights, administration and justice.
CONCLUSION
The aim of this article was to provide an initial overview of the main debates in the field of research evaluation and impact creation in social sciences and humanities. The existing evaluation tools and methods do not necessarily reflect the quality of transdisciplinary research, nor do they encourage advances across fields or innovative projects. This is why a number of scholars have started advocating for the introduction of new evaluative criteria and methods. We aimed to identify these innovative strategies for evaluating research in social sciences and humanities, as well as the main challenges and dilemmas researchers face when it comes to assessing transdisciplinary research. We concluded that, while most of the new evaluation strategies still need to be re-shaped and put into practice, societal impact has become a widely accepted tool for enhancing research quality. It has been increasingly used to assess research potential, outreach and practical implications, and as such represents an important evaluation mechanism on which most funding has become dependent. Finally, we analyzed some of the major mechanisms for assessing societal impact in EU projects, highlighting major impact categories and tools for their evaluation. Some of these strategies might in the future be translated into the evaluation of research articles and smaller-scale projects, using the same categorization and assessment process.