Faculty Advisor or Committee Member

Professor Carolina Ruiz, Advisor

Faculty Advisor or Committee Member

Professor Murali Mani

Faculty Advisor or Committee Member

Professor Michael Gennert

Identifier

etd-081606-083026

Abstract

"Our goal in this research is twofold: to develop clinical performance databases of cancer patients, and to conduct data mining and machine learning studies on collected patient records. We use these studies to develop models for predicting cancer patient medical outcomes. The clinical database is developed in conjunction with surgeons and oncologists at UMass Memorial Hospital. Aspects of the database design and representation of patient narrative are discussed here. Current predictive model design in medical literature is dominated by linear and logistic regression techniques. We seek to show that novel machine learning methods can perform as well as or better than these traditional techniques. Our machine learning focus for this thesis is on pancreatic cancer patients. Classification and regression prediction targets include patient survival, wellbeing scores, and disease characteristics. Information research in oncology is often constrained by type variation, missing attributes, high dimensionality, skewed class distribution, and small data sets. We compensate for these difficulties using preprocessing, meta-learning, and other algorithmic methods during data analysis. The predictive accuracy and regression error of various machine learning models are presented as results, as are t-tests comparing these to the accuracy of traditional regression methods. In most cases, it is shown that the novel machine learning prediction methods offer comparable or superior performance. We conclude with an analysis of results and discussion of future research possibilities."

Publisher

Worcester Polytechnic Institute

Degree Name

MS

Department

Computer Science

Project Type

Thesis

Date Accepted

2006-08-16

Accessibility

Unrestricted

Subjects

Clinical Performance, Databases, Cancer, oncology, Knowledge Discovery in Databases, data mining, Cancer, Treatment, Data processing, Data mining
