Eltabakh, Mohamed Y.
This project benchmarked three data platforms against one another: CouchDB, MongoDB, and Apache Spark. Each platform was used to execute a series of queries on a specific dataset. All benchmarks were run on AWS EC2 instances to keep hardware resources consistent across platforms. Query latency was the quantitative performance metric used to analyze the benchmark results, and each platform was also evaluated using ease-of-use metrics. This report introduces each platform and provides background information that explains the purpose of the evaluation. It then explains the motivation behind the queries and performance metrics, which provides the foundation for the project's methodology. Finally, the metrics are used to analyze the test results and draw conclusions about each platform's performance.
Worcester Polytechnic Institute
Major Qualifying Project
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work, subject to other agreements. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted.