Faculty Advisor
Eltabakh, Mohamed Y.
Abstract
This project benchmarked three data platforms against each other: CouchDB, MongoDB, and Apache Spark. Each platform executed a series of queries on a specific dataset. The benchmarks ran on AWS EC2 instances to keep hardware resources consistent across platforms. Query latency was the quantitative metric used to compare performance, and each platform was also evaluated using ease-of-use metrics. This report introduces each platform and provides background information to explain the purpose of the evaluation. The motives behind the queries and performance metrics are explained to provide a foundation for the project's methodology. The metrics are then used to analyze the test results and draw conclusions about each platform's performance.
Publisher
Worcester Polytechnic Institute
Date Accepted
March 2017
Major
Computer Science
Project Type
Major Qualifying Project
Copyright Statement
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work, subject to other agreements. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted.
Accessibility
Unrestricted
Advisor Department
Computer Science