The Vocabulary of Emotions Test (VET): Psychometric Properties of the Serbian Version

Considering the need to broaden the range of valid emotional intelligence (EI) measures, this study further examined the psychometric properties of the Vocabulary of Emotions Test (VET; Takšić, Harambašić, & Velemir, 2003). Participants were 333 university students (75.4% female) from Serbia, 245 studying education sciences or humanities and 88 pursuing natural/technical sciences. All were administered the Serbian version of the VET and two standard tests of verbal intelligence (VI), and were asked to report their average grade. The VET had good internal consistency (α = .83) and correlated positively with both measures of VI (r = .37 and .45), as well as with participants’ grades (r = .20). Significant group differences emerged on the VET, but not on the two VI tests, with female participants and the Education Sciences/Humanities group scoring higher than their respective counterparts. A hierarchical regression analysis with VI (Step 1) and VET scores (Step 2) as predictors, and grades as the criterion, yielded a significant model (R² = .04), with emotional vocabulary explaining additional variance over VI (ΔR² = .02) and surfacing as the only independent predictor (β = .18) of academic achievement. Further analyses showed emotional vocabulary to incrementally predict achievement in education sciences and humanities (ΔR² = .03, β = .19), but not in natural/technical sciences, where neither VI nor emotional vocabulary was a statistically significant predictor of students’ grades. The current results are interpreted as promising evidence of the reliability and validity of the Serbian VET, encouraging further use of this instrument in EI research.


Introduction
A major challenge taken up by the proponents of emotional intelligence (EI) has been to establish the construct as a true or standard intelligence (see e.g., Mayer, Caruso, & Salovey, 1999; Mayer, Salovey, Caruso, & Sitarenios, 2001). Indeed, the ability model of EI proposed by Mayer and Salovey (1997) has been developed precisely so as to fulfil the conceptual criteria for a distinct intellectual ability. Moreover, the model has provided a good framework to operationalize the construct and test whether it can also meet the empirical criteria for a new intelligence (Mayer et al., 1999). It has thus been established that measures of EI derived from the Mayer-Salovey model indeed conform to the positive manifold of traditional measures of cognitive abilities and that they can be integrated into the Cattell-Horn-Carroll (CHC) model as yielding another broad factor of human intelligence (MacCann, Joseph, Newman, & Roberts, 2014; Mestre, MacCann, Guil, & Roberts, 2016). Evidence has also been presented that ability EI contributes over and above academic intelligence, personality or both to the prediction of socioemotional outcomes such as quality of social interactions and interpersonal relationships (Lopes et al., 2004; Lopes, Salovey, Côté, & Beers, 2005; Lopes, Salovey, & Straus, 2003) or psychological well-being (Altaras Dimitrijević, Jolić Marjanović, & Dimitrijević, 2018; Extremera, Ruiz-Aranda, Pineda-Galán, & Salguero, 2011). On top of this, ability EI has been shown to possess some incremental predictive power in relation to academic achievement (Gil-Olarte Márquez, Palomera Martín, & Brackett, 2006; Ivcevic & Brackett, 2014; Rivers et al., 2012), in which context it is speculated to be particularly relevant at the level of post-secondary education and to potentially have a differential impact on success in different academic fields (Parker, Saklofske, Wood, & Collin, 2009).
While, overall, the prospects of establishing EI as a true intelligence appear good, there are two major challenges that still need to be tackled, both of which concern the measurement of the construct. The first is the requirement for more tests that would serve as alternative or partly overlapping measures of (particular aspects of) ability EI. While there are a number of tests to assess any aspect of traditional, academic intelligence, the assessment of ability EI has relied mostly on one procedure: the Mayer-Salovey-Caruso Emotional Intelligence Test (MSCEIT). This obvious lack of diversity of EI measures presents a serious obstacle for resolving some important questions; for instance, the issue of placing EI within the CHC model of intelligence cannot be conclusively resolved without a broader spectrum of tests to measure the construct. Ultimately, unless they can be confirmed with multiple tests of EI, any empirical results regarding the construct may be disputed as being specific to the measurement method employed (MacCann & Roberts, 2008). The problem seems to be particularly pressing when it comes to assessing the two higher branches of EI, i.e., the abilities of understanding and managing emotions, which capture the "strategic" essence of the construct. Although there have been several laudable attempts to develop such tests (e.g., MacCann & Roberts, 2008), particularly in the Croatian context (e.g., Babić Čikeš, Marić, & Šincek, 2018; Buško & Babić Čikeš, 2013; Kulenović, Balenović, & Buško, 2000; Mohorić, 2016; Takšić, Harambašić, & Velemir, 2004), so far only a few of the proposed measures have been validated in (culturally) different samples and introduced to a wider, international audience.
The second is the requirement for any test of ability EI to be much like a standard intelligence test. Despite obvious efforts to design the MSCEIT in this vein, this test is often criticized for using weighted consensus scoring (regardless of whether the normative sample is drawn from the general population or a pool of experts), rather than a scoring key based on a single, indisputably correct answer for each item (MacCann, Roberts, Matthews, & Zeidner, 2004). To circumvent this problem, both MacCann and Roberts (2008) and Mohorić (2016) relied on Roseman's appraisal theory of emotions to determine the correct answers for their tests of emotional understanding; nevertheless, while yielding some evidence of validity, both attempts faced issues regarding internal consistency, which, at best, reached acceptable levels (< .80). In other words, although they came closer to standard intelligence tests in terms of scoring, they did not do so when it comes to the reliability of measurement. Thus, the quest for psychometrically sound ways to supplement the MSCEIT in assessing ability EI remains ongoing.

Measuring Ability EI with the Vocabulary of Emotions Test (VET)
Given the above, our attention has turned to an instrument proposed to target the strategic area of EI in a manner that most closely resembles that of a standard intelligence test: the Vocabulary of Emotions Test (VET; Takšić et al., 2003). In the following sections, we will present the rationale of this instrument and review available evidence on its psychometric properties.
Rationale: Why focus on emotional vocabulary? The VET probes into the ability to label emotions with words and understand the relations (of similarity and difference) between emotions as reflected in language. By focusing on emotional vocabulary, the authors of the test were able to resolve some important theoretical and practical issues in measuring ability EI.
According to Mayer and Salovey's ability model, the Understanding Emotions branch also encompasses "linguistic information about emotions" (Mayer et al., 2001). In fact, it is precisely the ability to label emotions with words and to "recognize the relationships among exemplars of the affective lexicon" that is regarded as the most fundamental competency within this branch of ability EI (Salovey, Mayer, & Caruso, 2002, p. 161). At the same time, the Understanding Emotions branch has been pinpointed by Mayer et al. (2001) as the "central locus of abstract processing and reasoning about emotions" (p. 235) and the "core" cognitive component of the construct of EI (p. 234). Thus, by drawing on emotional vocabulary as content for their test, the authors of the VET made a crucial step towards ensuring the substantive validity of the proposed measure.
This choice of content also allowed the authors to avoid some of the most common shortcomings of the present tests of ability EI (Costa, Faria, & Takšić, 2011). For one, they circumvented the use of relatively complex item formulations (vignettes) which are characteristic of situational judgment tests and which commonly result in low(er) reliabilities, especially in different cultural settings. Second, they were able to deal with the requirement of constructing a test of EI that would essentially have the same appearance as a standard test of cognitive ability, in this case a regular vocabulary test (such as the one from the California Tests of Mental Maturity). In other words, the VET was given the same multiple-choice format as a well-established measure of crystallized intelligence yet differentiated from the latter by employing target words that refer exclusively to emotions. Last but not least, a major advantage of using emotional vocabulary as the sole content of the test was the availability of a correct answer as provided by the dictionary of the Croatian language.
Psychometric properties and validity evidence. The VET has hitherto been empirically tested and validated on samples of high-school students from Croatia and Portugal (Costa et al., 2011; Takšić et al., 2004) and a sample of elementary school students from Croatia (Mohorić, Takšić, & Šekuljica, 2016).
Sensitivity and reliability. In the above-mentioned studies, the VET has demonstrated good sensitivity, with distributions of scores approximating the normal curve (for details on mean scores in particular samples see Table 1). Previous studies have also established excellent (α = .90; Takšić & Mohorić, 2008) or good internal consistency (α = .84 in Costa et al., 2011, and α = .84 in a second Croatian study) for the original Croatian version of the test; however, alpha was lower (α = .71) for the Portuguese translation (Costa et al., 2011).
Meaningful group differences. Significant gender differences on the VET were found in Croatian samples, with girls scoring higher than boys both at the elementary and the secondary school level (Costa et al., 2011). Considering that women are theoretically expected to be more emotionally intelligent and have indeed been found to outscore men on the MSCEIT, at least in samples of adolescents and young adults (Altaras Dimitrijević & Jolić Marjanović, in press), the established gender differences on the original version of the VET speak in favour of the test's validity.
Convergent-discriminant validity. Available evidence also supports the convergent-discriminant validity of the VET: The test was found to display positive correlations in the .30-.35 range with measures of fluid intelligence (Takšić et al., 2004), and a larger positive association, slightly above .50, with the TEU (Mohorić et al., 2016); not surprisingly, the largest correlation, r = .67, was observed between the VET and the Vocabulary Test from the California Tests of Mental Maturity (Takšić et al., 2004). Despite using the same item format as the latter, the VET still had 44% of specific variance not explained by general vocabulary.
Criterion validity. In two instances, the VET was found to act as a significant predictor of scholastic achievement: In a longitudinal study with Portuguese secondary school students, it made a significant, unique contribution to students' grade point average in grades 10 through 12 (Costa & Faria, 2015), while in a study in Croatia, the VET emerged as an independent predictor of elementary school students' mean grades (β = .36 for girls and .18 for boys), adding to the prediction of this criterion over and above nonverbal intelligence and personality factors. The latter study also found the VET to incrementally and positively predict girls' aggressive behaviour, although it did not surface as a statistically significant predictor of prosocial behaviour at school. Finally, a first insight into the Serbian version of the VET showed the test to incrementally predict intercultural judgment and decision making, as an aspect of intercultural effectiveness (Altaras Dimitrijević, Starčević, & Jolić Marjanović, 2019).

The Present Study
Overall, available examinations of the VET have yielded rather promising results on its psychometric properties, but to gain broader relevance, the instrument would need to be further validated with different samples and in different cultural settings. The aim of this study, therefore, was to test the Serbian version of the VET, which, in linguistic terms, presents only slight modifications to the Croatian original and was thus expected to show a comparable level of sensitivity and reliability. Following the recommendations of other researchers, we focused particularly on examining the VET's convergent-discriminant validity in relation to verbal intelligence (cf. Costa et al., 2011; Mohorić, Takšić, & Duran, 2010), as well as its predictive validity vis-à-vis academic achievement at the level of post-secondary education and in different fields of study (cf. Parker et al., 2009). As evidence of validity, the VET was expected to be positively associated but not redundant with verbal intelligence and to add to the prediction of academic achievement, particularly in fields of study where EI should matter more, such as education sciences and humanities (Matthews, Zeidner, & Roberts, 2002).

Participants and Procedure
Participants in the study were 333 undergraduate students from four Serbian universities, 245 of whom (73.6%) pursued studies in education sciences or humanities, and 88 of whom (26.4%) were involved with natural and technical sciences. The sample included 251 (75.4%) female and 82 (24.6%) male students, and their age ranged from 19 to 42 years (Mage = 21.44; SDage = 2.53). As within the target population, samples from the two study fields were unbalanced with respect to gender, with 225 (91.5%) female participants in the Education Sciences/Humanities and 26 (29.9%) in the Natural/Technical Sciences group.
All participants were invited to participate and tested during their regular class hours at university premises. Before data collection, all were informed on the general nature of the study, after which those who agreed to participate provided their informed consent. Participation in the study was voluntary and compensated with provision of extra course credits or individual feedback on test results (depending on the faculty and field of study).

Measures
Participants filled in a brief information sheet, asking them to specify their gender, year and field of study, and average grade in all exams previously taken. The following measures were administered to assess their EI and verbal abilities:
Vocabulary of Emotions Test (VET; Takšić et al., 2003). As described above, the VET is a 35-item test designed to assess an important aspect of the Understanding Emotions branch of the ability model of EI (Mayer & Salovey, 1997), i.e., emotional vocabulary. The test adopts a classical vocabulary test format, but with target words referring to feelings and emotions. Each target word is paired with six alternatives, only one of which is equivalent to it in meaning (e.g., happy: sad, lonely, angry, merry, satisfied, or none of the above). Items are scored 0 or 1, the overall score thus ranging from 0 to 35. Data on the VET's previously established psychometric properties are given in the Introduction.
Verbal intelligence tests (Stevanović, 1988). Verbal intelligence (VI) was measured using two tests targeting verbal reasoning and verbal working memory. Both are well-established and reliable measures of verbal abilities for Serbian-speaking participants. Verbal reasoning was assessed via the 30-item Verbal Analogies Test (VAT), using the traditional word analogy item format (e.g., shoe : foot = glove : ? with the alternatives skin, hand, winter, wool, textile) and yielding scores ranging from 0 to 30. Verbal working memory was measured using the 25-item Scrambled Sentences Test (SST), which requires test-takers to quickly mentally rearrange words to arrive at a meaningful sentence and then follow the instruction contained within it (e.g., this with pencil your sentence underline). Scores on this test take values from 0 to 25.

Table 1 gives the mean, standard deviation and range of scores on the VET in the present sample, in comparison to data obtained with the original Croatian version of the instrument. The median score in the present sample was 25, and the average item difficulty was .71.
The Kolmogorov-Smirnov test yielded a statistically significant Z value of 1.42 (p < .05), whereas both Skew and Kurtosis were well below 1 (-.39 and -.49, respectively).
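The descriptive indices reported here (item difficulty, skew, kurtosis) can be illustrated with a minimal sketch. It assumes a matrix of 0/1 item scores with one row per participant; the function names and the toy data are hypothetical and are not the study's data or software. The moment-based formulas below are the simplest textbook definitions, so statistical packages that apply small-sample corrections may return slightly different values.

```python
def item_difficulty(responses):
    """Proportion of correct answers per item.

    `responses` holds one list of 0/1 item scores per participant.
    """
    n_people = len(responses)
    n_items = len(responses[0])
    return [sum(p[i] for p in responses) / n_people for i in range(n_items)]


def _central_moment(scores, order):
    """Population central moment of the given order."""
    m = sum(scores) / len(scores)
    return sum((x - m) ** order for x in scores) / len(scores)


def skewness(scores):
    """Moment-based skewness; negative values indicate a left-skewed distribution."""
    return _central_moment(scores, 3) / _central_moment(scores, 2) ** 1.5


def excess_kurtosis(scores):
    """Moment-based kurtosis minus 3, so a normal curve has a value of 0."""
    return _central_moment(scores, 4) / _central_moment(scores, 2) ** 2 - 3.0


# Hypothetical toy data: 5 participants x 4 items (illustration only).
toy = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 1], [1, 0, 1, 1], [0, 1, 1, 1]]
totals = [sum(person) for person in toy]  # total score = number of correct items
print(item_difficulty(toy))
print(skewness(totals), excess_kurtosis(totals))
```

Averaging the per-item proportions gives the kind of mean item difficulty quoted above, and the total score per participant is simply the row sum.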

Internal Consistency
As indicated by the results of reliability analyses, the Serbian adaptation of the VET has good internal consistency, with α = .83. There were no relevant changes in overall alpha when individual items were deleted.
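As an illustration of the two reliability checks mentioned here, the following sketch computes Cronbach's alpha and alpha-if-item-deleted from a matrix of 0/1 item scores. It uses the standard formula α = k/(k − 1) · (1 − Σ item variances / variance of total scores); the function names and the toy data are hypothetical and do not reproduce the study's data.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Cronbach's alpha for a list of per-participant item-score lists."""
    k = len(responses[0])
    item_vars = [variance([p[i] for p in responses]) for i in range(k)]
    total_var = variance([sum(p) for p in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def alpha_if_item_deleted(responses):
    """Alpha recomputed with each item left out in turn (needs at least 3 items)."""
    k = len(responses[0])
    return [
        cronbach_alpha([[v for j, v in enumerate(p) if j != i] for p in responses])
        for i in range(k)
    ]


# Hypothetical toy data: 5 participants x 4 items (illustration only).
toy = [[1, 1, 1, 0], [1, 1, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [0, 1, 1, 1]]
print(cronbach_alpha(toy))
print(alpha_if_item_deleted(toy))
```

An item whose removal raises alpha noticeably would be a candidate for revision; the text above reports that no such item was found in the Serbian VET.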

Correlations
Correlations between scores on the VET and the two measures of verbal intelligence are given in Table 1. These correlations were uniformly statistically significant, positive and of moderate size, with no statistically significant difference in the size of the VET's correlations with verbal analogical reasoning and verbal working memory (z = 1.53, p = .063). Both the VET and the Verbal Analogies Test were also positively albeit only weakly related to the average grade, which was not the case for the test of verbal working memory (see Table 1). Performance on none of the three ability tests was related to age (rVETxage = .01, rVATxage = -.01, and rSSTxage = -.07, all ns), and neither was average grade (r = .02, ns), which is why age was not considered in further analyses.
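The text does not name the procedure used to compare the VET's two correlations with the VI tests. One common option for two dependent correlations that share a variable is the z test of Meng, Rosenthal, and Rubin (1992), sketched below as an assumption rather than as the authors' method. The function name is hypothetical, and the correlation between the two VI tests (`r12`) is not given in the text, so the value in the usage line is a placeholder. The reported p = .063 does correspond to the one-tailed normal probability for z = 1.53, which is why the sketch returns a one-tailed p.

```python
from math import atanh, sqrt
from statistics import NormalDist


def compare_dependent_correlations(r1, r2, r12, n):
    """z test of H0: rho(Y, X1) == rho(Y, X2) in one sample of size n.

    r1, r2: correlations of Y with X1 and X2; r12: correlation of X1 with X2.
    Returns (z, one-tailed p). Formula after Meng, Rosenthal, & Rubin (1992).
    """
    mean_r_sq = (r1 ** 2 + r2 ** 2) / 2
    f = min((1 - r12) / (2 * (1 - mean_r_sq)), 1.0)  # f is capped at 1
    h = (1 - f * mean_r_sq) / (1 - mean_r_sq)
    z = (atanh(r1) - atanh(r2)) * sqrt((n - 3) / (2 * (1 - r12) * h))
    p_one_tailed = 1 - NormalDist().cdf(abs(z))
    return z, p_one_tailed


# r1 and r2 are the reported VET-VI correlations; r12 = .50 is a placeholder
# assumption, because the VAT-SST correlation is not stated in the text.
z, p = compare_dependent_correlations(0.45, 0.37, 0.50, 333)
print(round(z, 2), round(p, 3))
```

Because the result depends on the placeholder `r12`, the printed value should not be expected to match the reported z exactly.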

Hierarchical Regression Analysis
Finally, the VET's incremental validity was tested using a two-step hierarchical regression analysis, with the two VI tests entered in Step 1 and the VET score in Step 2 as predictors of academic achievement, indicated by participants' average grade during the course of their studies (criterion measure). As shown by the results of this analysis (Table 2), the two VI measures alone did not significantly predict academic achievement. In fact, the regression model explained a statistically significant amount of the variance in grades (4%) only after the VET was entered as a predictor. The obtained beta coefficients also indicated that emotional vocabulary was the only predictor that made a statistically significant contribution to explaining the criterion. The same pattern was observed when the criterion measure was standardized within field of study, to account for the fact that field of study had a statistically significant effect on students' grades.
The same regression model was then separately tested in the two subsamples differentiated by field of study. In the Education Sciences/Humanities group, the VI measures entered in Step 1 explained 3% of the variance in average grades, and the VET added to the prediction in Step 2, with R² increasing so as to account for a total of 6% of criterion variance. In the Natural/Technical Sciences group, neither Step 1 nor Step 2 of the regression analysis yielded any statistically significant parameters.
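The increment in explained variance at Step 2 is conventionally evaluated with the F test for R² change, F = (ΔR² / k_added) / ((1 − R²_full) / (N − k_full − 1)). The sketch below applies this standard formula to the rounded values reported in this section. It assumes that all participants in each (sub)sample had complete data on grades, which the text does not confirm, so the resulting F values are approximate; the function name is hypothetical.

```python
def f_change(r2_full, delta_r2, n, k_full, k_added):
    """F statistic for the R-squared increment of a hierarchical regression step.

    r2_full: R-squared of the model after the step; delta_r2: increment due to
    the step; n: sample size; k_full: predictors in the full model;
    k_added: predictors added in the step.
    Returns (F, numerator df, denominator df).
    """
    df_residual = n - k_full - 1
    f = (delta_r2 / k_added) / ((1 - r2_full) / df_residual)
    return f, k_added, df_residual


# Overall sample: R2 = .04 after Step 2, increment = .02, three predictors
# (two VI tests plus the VET); N = 333 assumes no missing grades.
print(f_change(0.04, 0.02, 333, 3, 1))

# Education Sciences/Humanities: R2 = .06, increment = .03; N = 245 assumed.
print(f_change(0.06, 0.03, 245, 3, 1))
```

With a single added predictor, the square root of F equals the absolute t value of that predictor's regression coefficient, which is why a significant R² change and a significant β for the VET go together.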


Discussion
This study aimed to test the psychometric properties of the Serbian version of the Vocabulary of Emotions Test (VET), as a possible addition to the currently very limited range of psychometrically sound measures of ability EI, particularly its Emotional Understanding branch. Considering that this version of the test presents only slight linguistic modifications to the Croatian original, it was expected to perform similarly to the latter, yielding comparably good results when it comes to its distributional properties, internal consistency, associations with verbal intelligence (as evidence of convergent-discriminant validity), and the prediction of academic achievement (as evidence of incremental validity).
As a first observation, the mean and range of scores obtained in the present sample were slightly higher than previously reported for Croatian and Portuguese students (Costa et al., 2011). This is quite understandable, given that the present study was performed with university students, whereas previous research recruited younger, school-aged participants. Indeed, comparing the means of samples drawn from different populations (Table 1), there seems to be a steady and meaningful increase in VET scores from elementary school to high school to university students. Of course, this regularity would have to be confirmed with samples stemming from one country, yet it can be taken as a preliminary indication of the VET's validity, as EI is expected to increase with age and level of education (e.g., Altaras Dimitrijević & Jolić Marjanović, in press; Mayer, Salovey, Caruso, & Cherkasskiy, 2011).
Concerning the VET's sensitivity, the distribution of scores in the present sample was approximated by a left-skewed curve, which, according to the Kolmogorov-Smirnov test, diverges from the normal one. Still, an inspection of other relevant statistics leads us to conclude that any asymmetry in the distribution of scores is only a minor one. Both Skew and Kurtosis were well below 1, which is regarded as acceptable (George & Mallery, 2010); the lowest and highest score were distant enough to ensure adequate score dispersion; finally, the mean was almost equal to the median. All of this speaks for sufficient sensitivity of the Serbian VET, although it should be noted that the test might be somewhat less sensitive and exhibit ceiling effects for respondents of higher education (which is expected given that item selection for the standard 35-item version of the VET was based on the performance of a high-school sample and emotional vocabulary is assumed to develop with further education).
The Serbian VET also showed good internal consistency in the present sample, with Cronbach's alpha reaching an almost identical value as in two studies with Croatian students (Costa et al., 2011). The fact that equivalent alpha values were obtained in Croatia and Serbia, while internal consistency was considerably lower in the Portuguese context (Costa et al., 2011), brings to mind the possibility that translating the VET into languages that are less related to Croatian might present an issue and require a meticulous approach to preserve the psychometric qualities of the instrument and achieve cultural equivalence of its different versions.
While the VET was previously found to have a substantial proportion of specific variance in relation to a standard vocabulary test (Takšić et al., 2004), it has also been noted that further evidence was necessary to firmly establish its distinctness from tests of verbal intelligence (Mohorić et al., 2010). The present results do provide some support for the VET's convergent-discriminant validity, by showing emotional vocabulary to be positively associated with two aspects of VI, i.e., verbal analogical reasoning and working memory, yet not completely overlapping with them. Admittedly, the correlation between the two VI tests was only slightly larger than their correlation with the VET, suggesting that, from a statistical point of view, all three tests might also be subsumed under a general construct of verbal intelligence. Nevertheless, the VET score was further differentiated from standard tests of VI through its associations with gender and field of study, both of which had a nontrivial effect on emotional vocabulary, yet not on verbal analogical reasoning or working memory. A clear implication of these results is that, although EI relies heavily on language, whereby emotions are labelled and communicated, it may ultimately be a distinct quality from traditionally conceived verbal intelligence (cf. Mayer et al., 2001).
The above-mentioned group differences on the VET stand as evidence of validity in their own right, too. Females scored higher than males, which is consistent with the results obtained in Croatian samples with the original version of the VET (Costa et al., 2011); it also corresponds with findings reported for the MSCEIT, at least in samples of adolescents and young adults, and conforms to the theoretical expectation that women would excel in EI (Altaras Dimitrijević & Jolić Marjanović, in press). The finding of higher VET scores in students of education sciences and humanities (in comparison to those studying natural/technical sciences) can also be meaningfully interpreted, as these studies, and the corresponding professions, are ultimately concerned with understanding and managing human needs and behaviour, and usually involve much social interaction, all of which requires skills from the domain of EI (Matthews et al., 2002). It might precisely be higher EI abilities that orient some students (the majority of whom are female) towards these professions, but it may also be the pursuit of the respective studies that tends to promote EI-related skills.
Although we found the VET to act as a statistically significant predictor of academic achievement in the full sample, our additional analyses revealed that it was in the Education Sciences/Humanities group that some portion of the criterion variance could be explained by verbal intelligence and emotional vocabulary, with the latter exhibiting incremental predictive power over the former; in the Natural/Technical Sciences group, neither VI nor VET scores were predictive of academic achievement. Given that university students are generally highly selected in terms of verbal intelligence, it seems that in the Natural/Technical Sciences group a threshold has been reached, beyond which individual differences in VI do not produce corresponding differences in achievement (which more likely would have been predicted by logical-mathematical and visualization abilities); in education sciences and humanities, however, VI remains a relevant predictor of success even at such high levels of competence building as are those taking place in university education. Nevertheless, with VET scores entering the equation, only emotional vocabulary made an independent contribution towards explaining academic achievement differences in the Education Sciences/Humanities group, testifying not only to the incremental validity of the VET but, more generally, to the relevance of EI in academic fields devoted to the understanding and managing of human behaviour (cf. Parker et al., 2009). Admittedly, the amount of additional variance explained by the VET was small (3%) but should be weighed against the fact that the test covers only one facet of EI, with the possibility of obtaining larger effects for the full range of EI abilities (cf. Costa & Faria, 2015).

Limitations and Future Directions
Several shortcomings with regard to sampling potentially limit the validity of our findings. First, the present sample was not balanced with respect to gender and field of study but included a considerably larger number of female participants and those involved in education sciences and humanities. Most problematically, the gender imbalance was also present at the level of subsamples, with the Education Sciences/Humanities group composed of predominantly female participants. Obviously, this raised the question as to what extent the observed differences by field of study might be attributed to the effect of gender, yet at the same time prevented us from resolving it adequately, as one of the key assumptions for two-way ANOVA (an equal number of observations within each subsample) was not met. Ultimately, we may argue at this point that, even if the effect of field of study could be statistically reduced to that of gender, the latter variable would not in itself bear much explanatory potential when addressing group differences in EI (cf. Fernández-Berrocal, Cabello, Castillo, & Extremera, 2012); these differences eventually have to be interpreted with regard to the groups' specific experiences, which in turn brings us back to such variables as field of study. As a further limitation, however, we categorized participants into two large groups based on their academic field, disregarding potential differences between particular faculties within these major categories. Finally, participants were unequally distributed across different years of study, which prevented us from exploring any systematic effects of this variable on VET scores and could have affected the reliability and validity of the criterion measure (average grade based on all exams taken up to the moment of data collection).
Clearly, further studies of the VET would be needed to confirm the present findings and broaden the scope of validity evidence pertaining to this instrument. Considering that it has hitherto been tested in samples of students from three levels of education (elementary school through university), the logical next step would be to see how the test performs with adults pursuing different professions and to establish at which ages and levels of education it achieves optimal sensitivity. A related issue that needs to be attended to is the VET's predictive power beyond the academic context. Last but not least, no study has examined the VET concurrently with the MSCEIT, so that it remains for future research to establish how the former relates to this comprehensive measure of ability EI.
Overall, the findings obtained in this study certainly encourage more research on the VET, including efforts to adapt it into other languages. Providing further evidence of its reliability and validity, they also raise the prospects of enriching the spectrum of ability EI measures, particularly for the purpose of assessing emotional understanding abilities as a core aspect of EI.