Vo Thi Ngoc Chau, Nguyen Hua Phung

Main Article Content


Educational data clustering on the students’ data collected with a program can find several groups of the students sharing the similar characteristics in their behaviors and study performance. For some programs, it is not trivial for us to prepare enough data for the clustering task. Data shortage might then influence the effectiveness of the clustering process and thus, true clusters can not be discovered appropriately. On the other hand, there are other programs that have been well examined with much larger data sets available for the task. Therefore, it is wondered if we can exploit the larger data sets from other source programs to enhance the educational data clustering task on the smaller data sets from the target program. Thanks to transfer learning techniques, a transfer-learning-based clustering method is defined with the kernel k-means and spectral feature alignment algorithms in our paper as a solution to the educational data clustering task in such a context. Moreover, our method is optimized within a weighted feature space so that how much contribution of the larger source data sets to the clustering process can be automatically determined. This ability is the novelty of our proposed transfer learning-based clustering solution as compared to those in the existing works. Experimental results on several real data sets have shown that our method consistently outperforms the other methods using many various approaches with both external and internal validations.