Predicting Students' Outcomes in Blended Learning: An Empirical Investigation in the Higher Education Context

The main goal of this research was to clarify which aspects of blended learning increase a student's knowledge level as measured by the final course grade. A questionnaire-based survey was used to gather students' attitudes towards selected aspects of blended learning. A principal component analysis and hierarchical clustering of variables were applied to extract the components that describe dimensions of blended learning and serve as explanatory variables in a multiple regression model with the student's final grade as the dependent variable. A Two-Step cluster analysis, used to reveal natural groupings based on the questionnaire answers, produced two clusters with a statistically significant difference between their mean final grades. The research revealed that the organization of a course and the study material supporting face-to-face teaching are essential features with an impact on a student's final success. The study also showed that the aspects of traditional face-to-face teaching are more strongly linked to higher grades than the aspects of e-courses.


INTRODUCTION
Since the beginning of the new millennium, universities and other higher education institutions have increasingly involved information and communication technology (ICT) in the education process. We are faced with the digitalization of learning and teaching in higher education, resulting in various forms of e-learning. Supported by technology, the pedagogical process enriches students' learning experience. Blended learning (BL), where face-to-face (F2F) is combined with online learning, appeared as a novel trend in teaching and learning modes [38] and is continuously growing at all levels of education. In a survey investigating e-learning courses in the USA, the authors reported that the number of BL use cases in colleges and universities is growing faster than that of traditional ones [2].
A study of the types of e-learning in 249 higher education institutions (almost all of them universities) from 38 European countries showed that almost all of the participating institutions had started embracing e-learning, with the majority using BL (91%) and 82% offering fully online courses. The results also revealed that the institutions, although from different countries and various systems, give the same reasons for introducing e-learning, namely greater flexibility of learning, more efficient use of time in F2F mode, and more learning opportunities for students in online learning [14]. An efficient learning management system (LMS) supports teaching and learning in an online environment with many functionalities. Therefore, an LMS is not just a system that supports sending messages, providing learning material or keeping an online gradebook, but should also allow teachers and learners to be active participants in the e-learning environment, e.g. by using quizzes and assignments or even problem-solving teamwork activities, question-and-answer forums or (virtual) online simulations [4,19].
Teachers are faced with challenges on how to change their pedagogical process and redesign the methods deployed to comply with these new environments. A significant change was the shift from a teacher-centered educational process to the student-based learning mode [1].
In the new learning paradigm, students take an active role in their own learning process and take responsibility for acquiring knowledge. Thus, the student-based style emphasizes the student's individuality, interests, abilities, and learning style, while the teacher acts as a tutor and learning consultant for the student, helping and offering support in accumulating and acquiring knowledge.
The paper describes a questionnaire-based survey among students of the Faculty of Public Administration, University of Ljubljana. The main goal of the research was to identify any aspects of BL that increase a student's knowledge level measured by the course final grades. In the study presented, two consecutive academic years' surveys were analysed. Due to the variability of the courses, the analysis was made on each course separately.

LITERATURE REVIEW
With the development of ICT, blended learning, also known as hybrid learning or mixed-mode instruction, has changed and its definitions have evolved. In the beginning, BL was defined as a combination of traditional F2F and distance delivery systems [30]. A few years later, Graham [15] defined BL as a combination of F2F instruction and computer-mediated instruction. Köse [20] proposed that "blended learning is a learning approach that contains different types of education techniques and technologies". The proportion between online and F2F instruction can vary from mostly F2F with minimal activities and resources in online learning, to mostly e-learning [17]. In the literature, the definition of BL is not uniform; however, it is usually based on the fact that F2F time in class is reduced and replaced with online instruction. Owston & York (2018) recommended that the proportion of online content delivered in an e-course should range between 33% and 50%. Namely, the lower limit is high enough to exclude "incidental uses of Internet, such as downloading references and turning in assignments" [28], whereas the upper limit separates BL from fully implemented e-learning [2].
Many authors agree that BL brings together the best of the traditional F2F and online learning methods (e.g. [16,21,31,35]). As Makkar, Alsadoon, Prasad & Elchouemi [25] stated, BL provides: (1) an environment for learning and teaching without time or distance restrictions; and (2) flexibility in students' desire to enhance their academic performance. Technology has facilitated access to knowledge through online learning platforms without restrictions on time or space. However, when the student-centred mode and pedagogy are at the forefront, e-learning also implies an interactive learning mode, where interactions and communication between participants are frequent and feedback is effective and useful [17].
BL courses are often implemented within platforms reachable via the Internet. With increasingly powerful learning management systems, besides providing rich and interactive learning resources, the teacher has the opportunity to use different tools for collaborative learning, where the interactions can go "student to student", "student to teacher" and "student to teacher and back to student". Therefore, it is very important to evaluate learning management systems [32], focusing on the effectiveness of BL [18] and the impact on student performance. According to Martinez-Caro [27], the increasing use of BL is clearly changing the traditional understanding of educational activities. Since student satisfaction in the classroom is a naturally desirable goal for all teachers and educational institutions, effective service quality measures are urgently required for BL, in which student satisfaction and continuous improvement of the learning environment should be the two main areas of focus [40].
In recent years, research studies have also focused on satisfaction with e-learning in higher education and its benefits for students, particularly for student performance [27], retention [10] and class attendance or student engagement [7], with the purpose of providing guidelines for improvements. Several models and methods have been applied to measure the effectiveness of BL and student satisfaction, each with its own advantages and disadvantages, and various aspects of BL have been examined for their influence on learning performance (e.g. [18,39,40]). The results of many studies show that BL has a positive impact on student performance, such as raised exam grades (e.g. [22,23,29,37]). Lopez-Perez et al. [23] also found that BL decreases dropout rates. Besides the positive effects on learning, the results of some studies point out that students performed better in a traditional F2F educational process than in pure online learning where all the content is delivered exclusively online [11,13]. Brown & Liedholm [3] compared three modes of instruction (pure F2F, blended and pure online) and concluded that, for the most complex content, students in the F2F mode did significantly better than online students and better than students in a BL environment. The outcomes of a study by Kwak, Menezes & Sherwood [21] strongly suggest that BL does not influence student performance at all. Moreover, student performance is not affected by the introduction of BL, irrespective of students' age, nationality, primary language or achievement level. Nevertheless, they found that introducing BL had a different impact on male and female students; more specifically, where it is positive for female students, it is negative for male students.
In recent years, the need to analyse the data generated during the educational process in order to determine factors influencing the learning performance of learners [5] has triggered research focused on various factors [24,33,34]. The process known as learning analytics uses educational data to improve learning and teaching [12]. One of the common procedures is analysing students' outcomes in the e-learning environment and predicting their final performance. The prediction is based on variables measuring students' behaviours and outcomes in the online environment, extended with data from other sources. One of the benefits of predicting results is the early identification of students at risk and the reduction of attrition rates.
For example, Romero et al. [33] analysed students' forum usage in a first-year computer science course, trying to predict students' success or failure in the course. They compared different classification and clustering algorithms, and their overall conclusion was that a subset of variables, such as the number of messages sent, the number of words written and the average evaluation obtained in messages, allows for accurate prediction of students' success. An examination performed at the Madison School of Pharmacy identified best practices for the use of BL. Using the focus group method, the authors identified 10 best practices, including instructors' feedback and the introduction of user-friendly technologies [26]. Recently, Chen, Breslow, & DeBoer [6] found that higher engagement with a computer-based feedback tool is positively correlated with the performance of students in an introductory physics course. Research conducted in a calculus course revealed seven factors having a significant impact on students' final academic performance (Lu et al., 2018). Four were linked to e-learning (number of activities per week, watching videos as measured by the number of clicks on "play" and on "backward", and weekly quiz scores), while three were linked to F2F learning (homework, offline practice scores, interaction with tutors). Conijn, Snijders, Kleingeld & Matzat [9] studied the correlation between final exam grades and students' activities in 17 BL courses. Their research showed that the results of the predictive model vary across courses. They also found that discussion forums and wikis were significantly correlated with grades in only a few courses. Consequently, they suggested including additional data sources in further research to provide more accurate prediction.

EMPIRICAL RESEARCH

Data and Methodology
The empirical research was conducted among students of the Faculty of Public Administration (FPA), which is part of the University of Ljubljana, Slovenia. The FPA implemented BL in the 2010/11 academic year, using LMS Moodle [36]. Currently, 70% of each obligatory course is held in the traditional F2F way while for the remaining 30% study materials and activities for students are prepared in online courses.
The data used in the study were taken from the questionnaire-based surveys conducted in two consecutive academic years, 2014/15 and 2015/16 [37]. The survey was carried out online in the FPA's LMS Moodle environment. At the FPA, teachers come from three chairs: (1) Chair of Economics and Public Sector Management (EPSM), (2) Chair of the Administrative-Legal Area (ALA), and (3) Chair of Organization and Informatics (OI). Undergraduate study lasts three years, and there are two undergraduate study programs (UN: university study program, PS: professional study program).
Students voluntarily participated in the survey, without any coercion or undue influence. The questionnaire consisted of 23 statements on how BL related to the characteristics of an e-course (Tab. 1): students' attitudes to e-learning (EC1-EC6, GI1-GI7), extended with three questions regarding F2F learning (FF1-FF3) and seven questions on general attitudes to e-learning and LMS Moodle (GE1-GE7). The students expressed their level of agreement with the statements on an ordinal scale from 1 ("totally disagree") to 7 ("totally agree").

Table 1. Statements from the questionnaire measuring aspects of blended learning

Abb.  Aspect of BL
GE1   Working with computers for study purposes suits me.
GE2   The Moodle e-learning system is easy to use.
GE3   The Moodle system is reliable and stable (it does not crash, submitted tasks are not lost).
GE4   I am satisfied with the support and assistance in the event of technical problems.
GE5   Working with computers for study purposes is not difficult for me.
GE6   E-learning contributes to higher student academic performance.
GE7   E-learning is a quality replacement for traditional learning in the classroom.
FF1   The content of the course interests me.
FF2   Course lectures are interesting for me and I like to attend them.
FF3   I find the face-to-face tutorial attractive and useful.
EC1   The virtual classroom of the course is organized transparently.
EC2   The goals (workload demands, grading) of this e-course were clearly stated at the start of the semester.
EC3   This e-course offers a variety of ways of assessing my learning (quizzes, written work, forums, files…).
EC4   I receive the teacher's comment/feedback on an assignment within less than 7 days.
EC5   I prefer fewer lectures in the traditional way (face-to-face) and more learning material processed in the e-course.
EC6   More course exercises could be carried out in the e-course instead of in the classroom.
GI1   The general impression of the e-course is good.
GI2   Study material and tasks of the e-course are presented in a clear and understandable way.
GI3   Finding certain activities in the e-course is simple.
GI4   The prepared learning material and tasks are consistent with the lectures in the classroom and supplement them.
GI5   The prepared material and assignments supplement the tutorial in the classroom.
GI6   Learning materials and activities in the e-course helped me to effectively study this subject matter.
GI7   The teacher gives me feedback/a response on my submissions (assignment, forum posts).

Responses of 639 students, evaluating 46 undergraduate obligatory courses, were collected. The final data included 3334 records. In addition, the student's final course grade was added to each record; it was obtained from the students' information system database via the Student ID number, one of the items requested in the questionnaire.
In order to reduce the high dimensionality of the data set and make the results more comprehensible, a principal component analysis was performed on the obtained evaluations. The Kaiser criterion determined the number of components, and a varimax rotation was used to increase the interpretability of the obtained components. The values of the new components were calculated as arithmetic means of the variables with high factor loadings (above the threshold of 0.5). The new components were evaluated using Cronbach's alpha, and only the components with Cronbach's alpha above 0.7 were used to predict the student's grade. Additionally, hierarchical clustering of the aspects was used to confirm the findings from the principal component analysis.
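The scoring and reliability steps described above can be sketched in a few lines. The sketch below is only illustrative: the loadings and the answers are invented, the function names are hypothetical, and the PCA with varimax rotation itself is assumed to have been run beforehand in a statistics package. It shows how a component value is formed as the arithmetic mean of the high-loading items and how Cronbach's alpha is computed for those items.

```python
from statistics import mean, variance


def component_items(loadings, threshold=0.5):
    """Names of the items whose absolute loading exceeds the threshold."""
    return [name for name, value in loadings.items() if abs(value) > threshold]


def component_score(response, items):
    """Arithmetic mean of one respondent's answers on the selected items."""
    return mean(response[name] for name in items)


def cronbach_alpha(responses, items):
    """Cronbach's alpha of the selected items across all respondents."""
    k = len(items)
    item_vars = [variance([r[name] for r in responses]) for name in items]
    total_var = variance([sum(r[name] for name in items) for r in responses])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


# Hypothetical loadings on one rotated component and answers on the 1-7 scale;
# these numbers are invented for illustration and are not the study's data.
loadings = {"FF1": 0.81, "FF2": 0.85, "FF3": 0.78, "GE2": 0.12}
responses = [
    {"FF1": 7, "FF2": 6, "FF3": 7, "GE2": 3},
    {"FF1": 5, "FF2": 5, "FF3": 4, "GE2": 6},
    {"FF1": 2, "FF2": 3, "FF3": 2, "GE2": 5},
    {"FF1": 6, "FF2": 6, "FF3": 5, "GE2": 2},
]

items = component_items(loadings)                      # FF1, FF2, FF3
scores = [component_score(r, items) for r in responses]
alpha = cronbach_alpha(responses, items)
keep_component = alpha > 0.7                           # reliability rule from the text
```

Sample variances are used for both the items and the total score; any consistent variance estimator gives the same alpha.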
Linear regression analysis was performed for each course with extracted components as independent variables and final grade as a dependent variable.
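As a rough illustration of the per-course regression, the sketch below fits an ordinary least squares model separately for each course and reports the coefficients together with R². It is a minimal sketch under stated assumptions: the record layout, the function names and the `min_n` parameter are hypothetical, the data are invented, and only two predictors are used to keep the example short (the study uses the extracted components).

```python
from collections import defaultdict


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_ols(rows, y):
    """Return (coefficients, r_squared); coefficients[0] is the intercept."""
    x = [[1.0] + list(r) for r in rows]
    p = len(x[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(x, y)) for i in range(p)]
    beta = solve(xtx, xty)
    fitted = [sum(b * v for b, v in zip(beta, r)) for r in x]
    y_mean = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return beta, 1 - ss_res / ss_tot


def fit_per_course(records, min_n):
    """Fit one model per course, skipping courses with fewer than min_n records."""
    by_course = defaultdict(list)
    for course, components, grade in records:
        by_course[course].append((components, grade))
    return {
        course: fit_ols([c for c, _ in data], [g for _, g in data])
        for course, data in by_course.items()
        if len(data) >= min_n
    }


# Invented records: (course, (component 1, component 2), final grade).
records = [
    ("A", (1, 2), 6.0), ("A", (2, 1), 6.25), ("A", (3, 4), 7.5),
    ("A", (4, 3), 7.75), ("A", (5, 5), 8.75),
    ("B", (3, 3), 7.0), ("B", (6, 2), 9.0),
]
results = fit_per_course(records, min_n=5)
beta, r_squared = results["A"]
```

In this toy data the grades of course A follow 5 + 0.5·x1 + 0.25·x2 exactly, so the fit recovers those coefficients with R² close to 1, while course B is skipped because it has too few records.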

Results
Principal component analysis (PCA) reduced 23 aspects of BL to 6 latent components with 67% of total variance explained. Since components 4 and 5 resulted in a poor Cronbach's alpha (factor loadings below 0.5), they were excluded from further analysis. The remaining components represent four BL dimensions based on the meaning of the aspects with the highest loadings, namely aspects of the e-course, technical properties and support, F2F learning, and teacher's feedback. These four latent components were then used in the regression analysis as predictors (independent variables). The factor loadings, the names of the components with the percentage of total variance explained (TVE), and Cronbach's alpha are shown in Tab. 2.
Tab. 2 provides a grouping of the analysed aspects based on the results of the PCA. To confirm the stability of the grouping, hierarchical clustering of the aspects was performed. The dissimilarity between two aspects was measured with Euclidean distance, and Ward's linkage was applied as the agglomerative hierarchical clustering procedure. The results of the clustering approach are shown in the dendrogram. The other variables form clusters in a very similar way compared to the results of the PCA. The group of three aspects of F2F learning is well isolated from the others. Similarly, the largest group, which describes the aspects of an e-course, is also coherent. The same is true for the latent variable measuring the technical aspect. The last latent variable from Tab. 2 measures teachers' feedback with just two variables with high loadings, i.e. EC4 (I receive the teacher's comment/feedback on an assignment within less than 7 days.) and GI7 (The teacher gives me feedback/a response on my submissions (assignment, forum posts)). The hierarchical clustering approach groups them together but indicates their similarity with the general aspects of e-courses. This is confirmed by the cross-loadings of these two variables. It means that students perceive teachers' feedback as an important aspect of an e-course.

Having confirmed the four "dimensions" of BL using two different methods, we tried to link them to the final grade. On the full data set, we failed to detect any relationship between the four identified latent components and the final course grade. Because the students' response rate was low, certain courses received very few evaluations. Therefore, further analysis focused on specific courses, namely those with more than 50 student evaluations. This approach is suitable for several reasons: (1) it considers the courses' specifics; (2) it is more reasonable to compare grades within a course than between different courses; (3) it reveals in which courses BL is of great help in achieving higher grades.
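The hierarchical clustering of aspects used earlier in this section to check the PCA grouping can be sketched as follows. This is an illustrative sketch, not the procedure as run in the study: the function name and the answer vectors are invented, and the Ward update is written out with the Lance-Williams formula on squared Euclidean distances so that the example needs only the standard library. In practice a library routine such as SciPy's `scipy.cluster.hierarchy.linkage` with `method="ward"` would normally be used.

```python
from itertools import combinations


def ward_merges(columns):
    """Agglomerate variables with Ward's linkage.

    `columns` maps a variable name to its list of answers. Returns the merges
    in order as (members_a, members_b, distance at which they were joined).
    """
    names = list(columns)
    members = {i: [name] for i, name in enumerate(names)}
    # Squared Euclidean distance between every pair of variables.
    dist = {
        frozenset((i, j)): sum(
            (a - b) ** 2 for a, b in zip(columns[names[i]], columns[names[j]])
        )
        for i, j in combinations(range(len(names)), 2)
    }
    merges = []
    next_id = len(names)
    while len(members) > 1:
        pair = min(dist, key=dist.get)          # closest pair of clusters
        i, j = tuple(pair)
        d_ij = dist.pop(pair)
        n_i, n_j = len(members[i]), len(members[j])
        for k in members:
            if k in (i, j):
                continue
            n_k = len(members[k])
            d_ki = dist.pop(frozenset((k, i)))
            d_kj = dist.pop(frozenset((k, j)))
            # Lance-Williams update for Ward's method (squared distances).
            dist[frozenset((k, next_id))] = (
                (n_i + n_k) * d_ki + (n_j + n_k) * d_kj - n_k * d_ij
            ) / (n_i + n_j + n_k)
        merges.append((sorted(members[i]), sorted(members[j]), d_ij ** 0.5))
        members[next_id] = members.pop(i) + members.pop(j)
        next_id += 1
    return merges


# Invented answers of five students to four statements (1-7 scale).
columns = {
    "FF1": [7, 6, 2, 3, 5],
    "FF2": [7, 5, 2, 3, 6],
    "GE1": [2, 3, 6, 7, 4],
    "GE2": [1, 3, 6, 6, 5],
}
merges = ward_merges(columns)
```

With these invented answers, the two F2F statements are joined first, the two general statements second, and the two pairs are joined last, which is the kind of grouping a dendrogram would display.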
Regression analysis revealed six courses where the four latent variables have an impact on the final grade (linear regression model with more than 13% of explained variability of the final grade [8]). Tab. 3 shows unstandardized regression coefficients (B) with the corresponding significances (Sig.), R² and the number of responses (N). The course names were anonymized; only the chair to which the course belongs, the year of study, and the study program are revealed. Because the analysis was limited to the courses that received more than 50 evaluations, and because third-year students were poorly responsive, no results for third-year courses are given.
Five of six identified courses belonged to the Chair of Economics and Public Sector Management (EPSM) and one course (Course 2) was from the Chair of the Administrative-Legal Area (ALA). None of the resulting courses belonged to the Chair of Organization and Informatics, which is slightly surprising, because more computer-based skills are used in the teaching and learning process (Tab. 3).
However, the result is consistent with some previous research. The difficulty of generalizing a method that predicts student performance from one educational environment to another has already been pointed out as a problem [34]. Moreover, even within the same educational institution, predictive models can vary significantly [9].
It is interesting that the aspects of e-courses had a significant positive influence on the final grade in three courses (Course 2, Course 4 and Course 6) from the first year of study. The results therefore suggest that the characteristics related to an e-course, such as the design and organization of an e-course, clearly stated goals and study material and tasks which are consistent with the lectures in the classroom, have a positive impact on the final grade in the first year of study. It can be assumed that the students in higher years of study become used to the Moodle environment and all their e-course obligations (quizzes, assignments, etc.). It can also be concluded that students in higher years of study are more independent and self-regulated, so these aspects are not important any more. Therefore, the organization of an e-course in higher years of study plays a less important role than in the first year.
In contrast, the component that measures the ease of use and stability of the LMS Moodle and satisfaction with technical support did not have a significant impact on students' grades for any of the examined courses. Since this component is the only one not related to the grade in any of the six courses, its correlation with the final grade across all courses together was explored. This additional analysis revealed no significant correlation (r = 0.007, p = 0.681) for the entire data set. We can therefore conclude that the technical aspect and administrative support exert no influence on students' final grades at the analysed levels.
Among the courses with the highest R², a significant positive impact of the F2F aspects, such as interesting course lectures, attractive tutorials in the classroom and content that captures students' interest, was detected for three courses (Course 1, Course 2 and Course 3), covering both chairs (EPSM and ALA), the first two years of study and both study programs.

Tab. 3, row "Teacher's Feedback": unstandardized regression coefficients (B) and significances (Sig.)

Course    B      Sig.
Course 1  -0.04  0.900
Course 2  -0.38  0.000***
Course 3   0.02  0.921
Course 4  -0.48  0.021**
Course 5  -0.02  0.908
Course 6   0.28  0.006***

Regression coefficient is significant at the levels: * 0.1, ** 0.05, *** 0.01.
The results suggest that the influence of this component was strongest for the course from the Chair of the Administrative-Legal Area (Course 2), which is not surprising since the lectures and tutorials from this chair focus their teaching process on traditional classroom discussions. The regression coefficient of the F2F learning aspect for Course 2 was highly significant (B = 0.31, p = 0.002). Therefore, if students' attitude towards and interest in the content (fostered by quality lectures and attractive tutorials) increased by 1 point (on a 7-level scale), an average increase in the final grade of 0.31 (on a scale of 1 to 10) could be expected. For the other two courses, the increase would be even higher; for Course 3, an increase in the average grade of more than 0.5 could be expected.
In the literature, timely and properly given teacher's feedback has been identified as an important factor in students' performance in BL. For example, while Chen, Breslow & DeBoer (2018) focused on a computer-based feedback tool, Margolis [26] pointed out instructors' feedback as one of the best practices in the use of BL. Although we discovered its significant impact on the final grade for three courses (Course 2, Course 4 and Course 6), the empirical findings are only promising for Course 6. The regression coefficients of teacher's feedback are negative for Course 2 and Course 4 (-0.38 and -0.48, respectively). The empirical findings suggest that students with higher grades expected richer and more useful feedback from the teacher, whereas the feedback was more useful for students with lower grades. This surprising finding requires further research.
The study also aimed to define clusters based on students' attitudes towards BL that would group students with similar learning achievements. To reveal natural groupings in our data, TwoStep Clustering was performed in SPSS. The procedure automatically determined the optimal number of clusters.
All 23 original variables from the questionnaire, on the entire data set, were taken into account. The variables were treated as continuous and their standardized values were used in the analyses, with the default parameters for TwoStep Clustering. The log-likelihood distance measure and Schwarz's Bayesian criterion (BIC) were used as the clustering criterion.
Based on the Silhouette measure, the TwoStep Clustering determined that the data set consisted of two clusters. The average Silhouette score of 0.3 indicated a fair degree of cohesion and separation (Fig. 2). The two clusters differ significantly in nearly all mean values. This result is not surprising, since the same variables were used to create the clusters. The lowest p-values were detected for variables GI1, GI2, GI4, GI5, GI6 and EC1. All except one were variables from the "students' attitudes to e-learning" set. Students in Cluster 2 rated the transparent organization, understandable study material and supplementation of the traditional part of the course as the most important. Additionally, the two clusters differ in students' performance as measured by final grades. The students from Cluster 2 outperform the students from Cluster 1, since the average final grade in Cluster 2 is higher than that in Cluster 1 (Tab. 5). The difference in means is statistically significant (p = 2.21E−8). Therefore, it can be hypothesized that students in Cluster 2 study the content of the e-classroom in depth, so they notice and appreciate good organization and content that are in line with what is presented in the lecture room. The other observation from Tab. 4 is that the mean value of every variable is higher in Cluster 2. The variables EC5 and EC6, describing the preferences regarding traditional teaching in a lecture room, are the only two variables where the differences between the clusters are not significant, and their means are by far the smallest. Therefore, it can be assumed that all students accept a BL approach.
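To make the silhouette measure and the comparison of cluster means more concrete, the sketch below computes both for a handful of invented points. It does not reproduce the SPSS TwoStep procedure: the cluster labels are simply assumed to be given, plain Euclidean distance is used instead of the log-likelihood distance, and all names and numbers are hypothetical.

```python
from math import dist
from statistics import mean


def silhouette_score(points, labels):
    """Mean silhouette over all points, using Euclidean distance."""
    scores = []
    for i, (p, own) in enumerate(zip(points, labels)):
        same = [
            dist(p, q)
            for j, (q, lab) in enumerate(zip(points, labels))
            if lab == own and j != i
        ]
        if not same:  # a singleton cluster gets a silhouette of 0
            scores.append(0.0)
            continue
        a = mean(same)  # cohesion: mean distance within the own cluster
        b = min(        # separation: mean distance to the nearest other cluster
            mean(dist(p, q) for q, lab in zip(points, labels) if lab == other)
            for other in set(labels)
            if other != own
        )
        scores.append((b - a) / max(a, b))
    return mean(scores)


def mean_grade_by_cluster(grades, labels):
    """Average final grade within each cluster."""
    return {
        lab: mean(g for g, l in zip(grades, labels) if l == lab)
        for lab in sorted(set(labels))
    }


# Invented standardized answers of four students on two variables,
# their assumed cluster labels and their final grades.
points = [(1, 1), (1, 2), (6, 6), (6, 7)]
labels = [1, 1, 2, 2]
grades = [6, 7, 9, 10]

score = silhouette_score(points, labels)
grade_means = mean_grade_by_cluster(grades, labels)
```

Values of the silhouette close to 1 indicate compact, well-separated clusters, so the toy data score far higher than the 0.3 reported for the survey data; whether a difference in mean grades is statistically significant would still have to be checked with an appropriate test.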

CONCLUSION
The present study was conducted to investigate the aspects of BL which increase a student's knowledge level and have an impact on final grades. The research revealed that the organization of a course and study materials supporting F2F teaching are essential features. However, regression analysis could not identify an overall (global) significant relationship between the different aspects of BL and the final grade. These findings support those of previous research [9], which states that the results of prediction vary between courses from the same institution.
Nevertheless, six courses were identified where the students' final grade was significantly linked to the aspects of BL. Four of them were in the first year of study, the other two in the second. However, the F2F approach still has the strongest influence. Further, the research also demonstrates that the technical aspects and administrative support were not factors that influence final success. The most surprising finding was the identification of two courses where the teacher's feedback was significantly negatively linked to the students' final grades. We suspect that the teachers of these two courses did not fulfil the expectations of students with better grades, while students with lower grades were satisfied with their feedback. The study also suggests that teachers should pay attention to the pedagogical aspects of the e-course and use the technology to support traditional F2F teaching. Therefore, teachers must pay attention to designing the content and integrating the study materials into the e-course.
One of the main limitations of this study is that teachers were not addressed in the research. They were not asked to express their opinions on BL, and the teachers' activities in the e-courses were not investigated. This remains a challenge for future research. Another challenge is to increase the participation rate of students in the third year of study. They have much greater experience with various e-courses, have overcome the technical challenges of the first year and hold different expectations regarding e-course quality, which could contribute to a more effective e-course design that supports effective and efficient study.