Abstract: Business Intelligence (BI), a technology-driven process for presenting actionable information, has become important for improving organisations, business units, etc. BI requires mining information from huge volumes of unstructured text data, and this mining requires sophisticated natural language processing. One crucial task is identifying the chains of referential entities in a given text, known as coreference resolution. Two expressions corefer when they have the same referent, and that referent must exist in the real world. A coreference chain is formed by connecting the expressions that refer to the same entity. We approach this resolution task for Tamil, a morphologically rich language, as two subtasks, each handled with a different machine learning approach: pronominal resolution is done using Tree Conditional Random Fields (Tree CRFs), and noun phrase coreference resolution is done using Support Vector Machines (SVMs). The resulting coreference chains are evaluated with standard metrics and the results are encouraging.
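The chain-forming step mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the two layers (pronominal resolution and noun phrase coreference) have already produced pairwise links between mention identifiers, and it merges those links into chains with a union-find structure. The function name `build_chains`, the integer mention identifiers, and the use of union-find are illustrative assumptions, not the authors' implementation.

```python
from collections import defaultdict


def build_chains(mentions, links):
    """Group mentions into coreference chains from pairwise links.

    mentions: mention ids in document order (hypothetical integer ids).
    links: (mention, antecedent) pairs, assumed to come from the
           pronominal and noun-phrase resolution layers.
    """
    parent = {m: m for m in mentions}

    def find(m):
        # Walk up to the set representative, halving the path on the way.
        while parent[m] != m:
            parent[m] = parent[parent[m]]
            m = parent[m]
        return m

    for a, b in links:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two partial chains

    chains = defaultdict(list)
    for m in mentions:  # document order keeps each chain ordered
        chains[find(m)].append(m)

    # A mention linked to nothing is a singleton, not a chain.
    return [chain for chain in chains.values() if len(chain) > 1]


# Hypothetical example: mentions 2 and 3 are pronouns linked to 0 and 1.
example_chains = build_chains([0, 1, 2, 3, 4], [(2, 0), (3, 1)])
# example_chains == [[0, 2], [1, 3]]
```

Because the merge is transitive, a pronoun linked to a noun phrase that is itself linked to an earlier noun phrase ends up in a single chain with both, regardless of which layer produced each link.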
Vijay Sundar Ram and Sobha Lalitha Devi, 2016. Two Layer Machine Learning Approach for Mining Referential Entities for a Morphologically Rich Language. Asian Journal of Information Technology, 15: 2831-2838.