Faculty Advisor

Eltabakh, Mohamed Y.

Abstract

This project benchmarked three data platforms against one another: CouchDB, MongoDB, and Apache Spark. Each platform was used to execute a series of queries on a specific dataset. The benchmarks were run on AWS EC2 instances to keep hardware resources consistent across platforms. Query latency served as the quantitative performance metric, and each platform was also evaluated using ease-of-use metrics. This report introduces each platform and provides the background needed to explain the purpose of the evaluation. It explains the motives behind the queries and performance metrics as a foundation for the project's methodology, then uses the metrics to analyze the test results and draw conclusions about each platform's performance.

Publisher

Worcester Polytechnic Institute

Date Accepted

March 2017

Major

Computer Science

Project Type

Major Qualifying Project

Accessibility

Unrestricted

Advisor Department

Computer Science
