Faculty Advisor or Committee Member
Randy C. Paffenroth, Advisor
Faculty Advisor or Committee Member
Lane T. Harrison, Committee Member
Identifier
etd-121218-150536
Abstract
Principal Components Analysis (PCA) is a statistical procedure commonly used to analyze high-dimensional data. It is often used for dimensionality reduction, which is accomplished by determining the orthogonal components that contribute most to the underlying variance of the data. While PCA is widely used for identifying patterns and capturing the variability of data in lower dimensions, it has some known limitations. In particular, PCA represents its results as linear combinations of data attributes and is therefore often seen as difficult to interpret. In addition, because of the underlying optimization problem being solved, it is not robust to outliers. In this thesis, we examine extensions to PCA that address these limitations. Specific techniques researched in this thesis include variations of Robust and Sparse PCA, as well as novel combinations of these two methods that result in a structured low-rank approximation that is robust to outliers. Our work is inspired by the well-known machine learning method of the Least Absolute Shrinkage and Selection Operator (LASSO), as well as by pointwise and group matrix norms. Practical applications, including robust and non-linear methods for anomaly detection in Domain Name System network data and interpretable feature selection for a website classification problem, are discussed along with implementation details and techniques for the analysis of regularization parameters.
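The interpretability limitation described in the abstract can be illustrated with a minimal sketch. It is not the thesis's method: it finds the leading principal component of a small made-up data set by power iteration, then applies the soft-thresholding operator associated with LASSO directly to the loadings as a simple stand-in for Sparse PCA. The data, the function names, and the threshold `lam = 0.1` are all assumptions chosen for illustration.

```python
import math

def center(rows):
    """Subtract each column's mean so the data has zero mean per attribute."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [[r[j] - means[j] for j in range(d)] for r in rows]

def covariance(rows):
    """Sample covariance matrix of already-centered rows."""
    n, d = len(rows), len(rows[0])
    return [[sum(r[i] * r[j] for r in rows) / (n - 1) for j in range(d)]
            for i in range(d)]

def leading_component(cov, iters=200):
    """Power iteration: unit vector along the direction of largest variance."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def soft_threshold(x, lam):
    """Shrink x toward zero by lam; values with |x| <= lam become zero."""
    return math.copysign(max(abs(x) - lam, 0.0), x)

# Hypothetical toy data: attributes 0 and 1 move together, attribute 2 is small noise.
data = [
    [2.0, 1.9, 0.1],
    [1.0, 1.1, -0.1],
    [0.0, 0.1, 0.0],
    [-1.0, -1.2, 0.1],
    [-2.0, -1.9, -0.1],
]

cov = covariance(center(data))

# Ordinary PCA: every attribute receives a nonzero loading.
dense = leading_component(cov)

# Naive sparsification (an assumption, not the thesis's algorithm):
# soft-threshold the loadings, then rescale to unit length.
lam = 0.1
shrunk = [soft_threshold(x, lam) for x in dense]
scale = math.sqrt(sum(x * x for x in shrunk))
sparse = [x / scale for x in shrunk]

print("dense loadings: ", [round(x, 4) for x in dense])
print("sparse loadings:", [round(x, 4) for x in sparse])
```

In this sketch the dense component mixes all three attributes, while the thresholded version drops the noise attribute entirely, which is the kind of readability gain that sparsity-inducing penalties aim for. Sparse PCA formulations typically build the penalty into the optimization problem itself rather than thresholding after the fact.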
Publisher
Worcester Polytechnic Institute
Degree Name
MS
Department
Data Science
Project Type
Thesis
Date Accepted
2018-12-11
Copyright Statement
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work, subject to other agreements. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted.
Accessibility
Unrestricted
Repository Citation
Jutras, Melanie A., "Dimension Reduction and LASSO using Pointwise and Group Norms" (2018). Masters Theses (All Theses, All Years). 1254.
https://digitalcommons.wpi.edu/etd-theses/1254
Subjects
LASSO, Robust, Sparse, PCA, Dimension Reduction