Predictor Selection in Linear Regression: L1 regularization of a subset of parameters and Comparison of L1 regularization and stepwise selection

Hu, Qing

Etd

Public

Background: Feature selection, also known as variable selection, is a technique that selects a subset from a large collection of candidate predictors to improve prediction accuracy in a regression model. The first objective of this project is to investigate under what data structures LASSO outperforms the forward stepwise method. The second objective is to develop a feature selection method, Feature Selection by L1 Regularization of Subset of Parameters (LRSP), which selects the model by combining prior knowledge about the inclusion of some covariates, if any, with the information collected from the data. Mathematically, LRSP minimizes the residual sum of squares subject to the sum of the absolute values of a subset of the coefficients being less than a constant. In this project, LRSP is compared with LASSO, forward selection, and ordinary least squares to investigate their relative performance under different data structures. Results: Simulation results indicate that, for a moderate number of small effects, forward selection outperforms LASSO in both prediction accuracy and variable selection performance when the variance of the model error term is smaller, regardless of the correlations among the covariates. Forward selection also performs better in variable selection when the variance of the error term is larger but the correlations among the covariates are smaller. LRSP was shown to be an efficient method for problems in which prior knowledge about the inclusion of covariates is available, and it can also be applied to problems with nuisance parameters, such as linear discriminant analysis.
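
The LRSP criterion stated in the abstract (least squares with an L1 constraint on only a subset of the coefficients) can be illustrated with a minimal sketch. This is not the report's implementation. It assumes the penalized (Lagrangian) form, 0.5 * RSS + lam * (sum of absolute values of the penalized coefficients), which corresponds to the constrained form for a suitable constant, and it assumes cyclic coordinate descent as the solver. The names `lrsp_fit`, `soft_threshold`, `penalized`, and `lam`, and the toy data, are hypothetical and chosen only for illustration.

```python
# Sketch only: penalized (Lagrangian) form of an LRSP-style criterion,
#   minimize 0.5 * RSS(beta) + lam * sum_{j in penalized} |beta_j|,
# solved by cyclic coordinate descent. All names here are hypothetical.

def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0.0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def lrsp_fit(X, y, penalized, lam, n_sweeps=200):
    """Fit coefficients with an L1 penalty on the columns in `penalized` only.

    X is a list of rows, y a list of responses, `penalized` a set of column
    indices, and lam the penalty weight. Columns outside `penalized`
    (covariates assumed known to belong in the model) receive plain
    least-squares coordinate updates, so they are never shrunk.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    residual = list(y)  # equals y - X @ beta while beta is all zeros
    col_sq = [sum(X[i][j] ** 2 for i in range(n)) for j in range(p)]
    for _ in range(n_sweeps):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue  # an all-zero column carries no information
            # Inner product of column j with the residual that excludes j.
            rho = sum(X[i][j] * residual[i] for i in range(n)) + col_sq[j] * beta[j]
            if j in penalized:
                new_value = soft_threshold(rho, lam) / col_sq[j]
            else:
                new_value = rho / col_sq[j]
            delta = new_value - beta[j]
            if delta != 0.0:
                for i in range(n):
                    residual[i] -= X[i][j] * delta
                beta[j] = new_value
    return beta


# Toy design with mutually orthogonal columns (made up for this sketch).
X = [[1, 0, 1], [0, 1, 1], [1, 1, 0], [1, -1, 0], [0, 1, -1], [-1, 0, 1]]
# Responses generated exactly as y = 3 * x0 + 0.1 * x2.
y = [3.1, 0.1, 3.0, 3.0, -0.1, -2.9]

beta_ols = lrsp_fit(X, y, penalized=set(), lam=0.0)        # no penalty at all
beta_lrsp = lrsp_fit(X, y, penalized={1, 2}, lam=1.0)      # column 0 forced in
beta_lasso = lrsp_fit(X, y, penalized={0, 1, 2}, lam=1.0)  # every column penalized
```

In this toy example, column 0 plays the role of a covariate known in advance to belong in the model. Leaving it out of `penalized` keeps its coefficient at the least-squares value, while penalizing every column (the ordinary LASSO case) shrinks it toward zero. The small effect on column 2 is set to zero in both penalized fits.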

Creator

Hu, Qing

Contributors

Kim, Ryung S.

Degree

Unit

Mathematical Sciences

Publisher

Worcester Polytechnic Institute

Language

English

Identifier

etd-051107-154052

Keyword

Advisor

Kim, Ryung S.

Defense date

2007-05-14

Year

2007

Date created

2007-05-11

Resource type

Report

Rights statement

In Copyright

Relations

In Collection:

Masters Reports

Items

Title	Visibility
main.pdf	Public
listoftables.pdf	Public
abstract.pdf	Public
contents.pdf	Public
acknowledgments.pdf	Public
listoffigures.pdf	Public
title.pdf	Public

Permanent link to this page: https://digital.wpi.edu/show/0k225b199
