Etd

Mining Oncology Data: Knowledge Discovery in Clinical Performance of Cancer Patients

Public

Downloadable Content

open in viewer

Our goal in this research is twofold: to develop clinical performance databases of cancer patients, and to conduct data mining and machine learning studies on collected patient records. We use these studies to develop models for predicting cancer patient medical outcomes. The clinical database is developed in conjunction with surgeons and oncologists at UMass Memorial Hospital. Aspects of the database design and representation of patient narrative are discussed here. Current predictive model design in medical literature is dominated by linear and logistic regression techniques. We seek to show that novel machine learning methods can perform as well or better than these traditional techniques. Our machine learning focus for this thesis is on pancreatic cancer patients. Classification and regression prediction targets include patient survival, wellbeing scores, and disease characteristics. Information research in oncology is often constrained by type variation, missing attributes, high dimensionality, skewed class distribution, and small data sets. We compensate for these difficulties using preprocessing, meta-learning, and other algorithmic methods during data analysis. The predictive accuracy and regression error of various machine learning models are presented as results, as are t-tests comparing these to the accuracy of traditional regression methods. In most cases, it is shown that the novel machine learning prediction methods offer comparable or superior performance. We conclude with an analysis of results and discussion of future research possibilities.

Creator
Contributors
Degree
Unit
Publisher
Language
  • English
Identifier
  • etd-081606-083026
Keyword
Advisor
Defense date
Year
  • 2006
Date created
  • 2006-08-16
Resource type
Rights statement

Relations

In Collection:

Items

Items

Permanent link to this page: https://digital.wpi.edu/show/1v53jx02k