Eltabakh, Mohamed Y.
Today, with the rapid development of technology, humanity has entered a new era of information technology. Data is being transferred from paper to digital form second by second, so the demand for data storage is increasing quickly. A new technique was needed to handle Big Data, which is why Hadoop was created. However, conflicting and duplicate data still occur in many cases. In this report, we illustrate a new technique for entity resolution in big data that uses Hadoop's MapReduce framework.
Worcester Polytechnic Institute
Major Qualifying Project
All authors have granted to WPI a nonexclusive royalty-free license to distribute copies of the work, subject to other agreements. Copyright is held by the author or authors, with all rights reserved, unless otherwise noted.