Asian Journal of Information Technology

Year: 2013

Volume: 12

Issue: 7

Page No. 236 - 241

DOI: 10.36478/ajit.2013.236.241

Synonym Based Duplicate Record Detection

Authors: K. Amshakala and R. Nedunchezhian

Abstract: As the volume of data and the number of data providers grow rapidly, there is high demand for integrating data from heterogeneous sources. In the real world, an entity often has two or more representations, and data are not defined consistently across different sources. When a user's query is answered by combining data from several databases, the combined results include duplicate entries. Duplicate detection techniques identify multiple representations of the same real-world entity. Without such techniques, the quality of the extracted data remains low. This study presents an unsupervised duplicate record detection technique that does not require expert knowledge or hand-coded rules. WordNet, a large lexical database, is used to match the entities.
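
The abstract does not spell out the matching procedure, so the following is only a minimal sketch of the general idea of synonym-based matching, not the authors' method. It assumes a small hand-written synonym table (`SYNONYM_GROUPS`, hypothetical) in place of WordNet, Jaccard similarity over normalized tokens as the comparison measure, and an arbitrary threshold of 0.8; none of these choices come from the paper.

```python
# Hypothetical synonym groups standing in for WordNet synonym sets.
# In practice these would be looked up in WordNet rather than hard-coded.
SYNONYM_GROUPS = [
    {"car", "automobile", "auto"},
    {"doctor", "physician"},
    {"street", "road"},
]


def canonical(token):
    """Map a token to one fixed representative of its synonym group."""
    token = token.lower()
    for group in SYNONYM_GROUPS:
        if token in group:
            return min(group)  # alphabetically first member, chosen for determinism
    return token


def normalize(record):
    """Convert a record (dict of field name -> text) into a set of canonical tokens."""
    tokens = set()
    for value in record.values():
        for word in value.split():
            word = word.strip(".,;")
            if word:
                tokens.add(canonical(word))
    return tokens


def similarity(a, b):
    """Jaccard similarity between the canonical token sets of two records."""
    ta, tb = normalize(a), normalize(b)
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def is_duplicate(a, b, threshold=0.8):
    """Flag two records as duplicates when their similarity reaches the threshold.

    The threshold value is an assumption made for this sketch.
    """
    return similarity(a, b) >= threshold


r1 = {"name": "John Smith", "job": "doctor", "vehicle": "car"}
r2 = {"name": "John Smith", "job": "physician", "vehicle": "automobile"}
r3 = {"name": "Jane Doe", "job": "teacher", "vehicle": "car"}
```

Here `r1` and `r2` use different words for the same job and vehicle, yet they normalize to the same token set, so `is_duplicate(r1, r2)` holds, while `r3` shares only one token with `r1` and is not flagged. No labeled training data or hand-coded matching rules are involved, which is the sense in which such an approach is unsupervised.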

How to cite this article:

K. Amshakala and R. Nedunchezhian, 2013. Synonym Based Duplicate Record Detection. Asian Journal of Information Technology, 12: 236-241.

DOI: 10.36478/ajit.2013.236.241

URL: https://medwelljournals.com/abstract/?doi=ajit.2013.236.241
