Faculty Advisor or Committee Member
Randy C. Paffenroth, Advisor
As machine learning becomes an increasingly relevant field being incorporated into everyday life, so does the need for consistently high performing models. With these high expectations, along with potentially restrictive data sets, it is crucial to be able to use techniques for machine learning that increase the likelihood of success. Robust Principal Component Analysis (RPCA) not only extracts anomalous data, but also finds correlations among the given features in a data set, in which these correlations can themselves be used as features. By taking a novel approach to utilizing the output from RPCA, we address how our method effects the performance of such models. We take into account the efficiency of our approach, and use projectors to enable our method to have a 99.79% faster run time. We apply our method primarily to cyber security data sets, though we also investigate the effects on data sets from other fields (e.g. medical).
Worcester Polytechnic Institute
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted. If you have any questions, please contact email@example.com.
Bennett, Marissa A., "Improving Model Performance with Robust PCA" (2020). Masters Theses (All Theses, All Years). 1366.
Machine Learning, Robust PCA, Feature Engineering, Cyber Security, Projectors, PCA