Flaherty, Patrick J.
Advances in error correction for next-generation sequencing have not kept pace with increases in data production, and as a result the quality of the generated nucleotide sequences has suffered. The purpose of this project was to develop a processing pipeline that removes errors from intensity data to enable faster and more accurate analysis. To achieve this, the algorithm used to correct for bleaching and phasing was redesigned to capture a greater number of misidentified bases. Two pipelines were created: pipeline 1 (Illumina) and pipeline 2 (Oracle). Pipeline 2 outperformed pipeline 1 in accuracy, but pipeline 1 had the shorter processing time. The main question is therefore whether processing speed should be sacrificed for accuracy.
Worcester Polytechnic Institute
Humanities and Arts
Major Qualifying Project
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work, subject to other agreements. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted.