This thesis focuses on the use of data mining techniques to investigate the expected survival time of patients with pancreatic cancer. Clinical patient data have been useful in showing overall population trends in patient treatment and outcomes. Models built on patient-level data also have the potential to yield insights into the best course of treatment and the long-term outlook for individual patients. Within the medical community, logistic regression has traditionally been the method of choice for building predictive models from explanatory variables (features). Our research demonstrates that using machine learning algorithms for both feature selection and prediction can significantly increase the accuracy of models of patient survival. We evaluated Artificial Neural Networks, Bayesian Networks, and Support Vector Machines. We demonstrated (p < 0.05) that data mining techniques yield better prognostic predictions of pancreatic cancer patient survival than logistic regression alone.
Worcester Polytechnic Institute
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted. If you have any questions, please contact firstname.lastname@example.org.
Floyd, Stuart, "Data Mining Techniques for Prognosis in Pancreatic Cancer" (2007). Masters Theses (All Theses, All Years). 671.
Data Mining, Machine Learning, Feature Selection, Pancreatic Cancer