Influential Text-Based Features in Predicting Admission Status of Online Degree Applicants
L@S '22: Proceedings of the Ninth ACM Conference on Learning @ Scale
This paper presents progress towards developing an equitable predictive model of admission success for an online Master's program with a large pool of applicants. The overarching goal of the project is to support the future development of a systematic evaluation tool for programs that receive large numbers of applications. In the first phase of the project, we collected and processed data on 9,044 applications and trained a predictive model using applicants' profile information, such as demographic data, academic background, and test scores. In an ongoing phase, we seek to expand the applicant database by incorporating information from letters of recommendation (LORs) and statements of purpose (SOPs), which are essential components of graduate application packages and are used extensively in admission decisions. In this study, we assess various aspects of the LORs and SOPs using natural language processing to extract a comprehensive list of text features, which are then used to develop a classifier. We implement machine-learning algorithms such as Gradient Boosting to predict admission status and to identify the text features that carry the most weight in applicants' success. This work provides an understanding of the relative significance of a variety of text features, which will ultimately aid the development of a comprehensive predictive model.
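As a rough illustration of what extracting text features from an LOR or SOP can look like, the following minimal sketch computes a few surface-level features from a document. The function name `extract_text_features` and the four features shown are hypothetical choices made for this example; the abstract does not enumerate the actual feature list, so this should not be read as the authors' pipeline.

```python
import re


def extract_text_features(text: str) -> dict:
    """Compute a few simple surface features of a document.

    These features are illustrative assumptions, not the feature
    set used in the study.
    """
    # Lowercased alphabetic tokens (apostrophes kept for contractions).
    words = re.findall(r"[A-Za-z']+", text.lower())
    # Naive sentence split on terminal punctuation; empty pieces dropped.
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    n_words = len(words)
    return {
        "word_count": n_words,
        "sentence_count": len(sentences),
        # Guard against division by zero on empty documents.
        "avg_word_length": sum(len(w) for w in words) / n_words if n_words else 0.0,
        # Share of distinct words: a crude measure of lexical diversity.
        "type_token_ratio": len(set(words)) / n_words if n_words else 0.0,
    }


sample = "I enjoy research. I enjoy teaching."
features = extract_text_features(sample)
print(features)
```

Feature dictionaries of this kind, one per application, could then be assembled into a table and passed to a classifier such as Gradient Boosting, whose per-feature importance scores indicate which features carry the most weight.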