
A Method for Entity Resolution in High Dimensional Data Using Ensemble Classifiers

Author

Listed:
  • Liu Yi
  • Diao Xing-chun
  • Cao Jian-jun
  • Zhou Xing
  • Shang Yu-ling

Abstract

To make better use of the features in high-dimensional data, an ensemble learning method based on feature selection is developed for entity resolution. Entity resolution is treated as a binary classification problem, and an optimization model is designed to maximize each classifier's classification accuracy and the dissimilarity between classifiers while minimizing the number of selected features. A modified multiobjective ant colony optimization algorithm is employed to solve the model for each base classifier: two pheromone matrices are set up, the weighted product method is applied to aggregate their values, and each feature's Fisher discriminant rate, computed on the records' similarity vectors, serves as heuristic information. A solution, called the complementary subset, is selected from the Pareto archive according to the descending order of the three objectives and is used to train the given base classifier. After all base classifiers are trained, their classification outputs are aggregated by the max-wins voting method to obtain the ensemble's final result. A simulation experiment is carried out on three classical datasets. The results show that the method is effective and that it performs better than the two other methods it is compared with.
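The abstract names three building blocks without giving their formulas: a Fisher discriminant rate per feature, a weighted product of two pheromone values, and max-wins voting. The Python sketch below illustrates each using common textbook forms. These forms are assumptions, not the definitions from the paper, and the function names and the `weight` parameter are hypothetical.

```python
from collections import Counter
from statistics import mean, pvariance


def fisher_ratio(values, labels):
    """Fisher discriminant ratio of one similarity feature (binary task).

    values: similarity scores of record pairs on a single attribute.
    labels: 1 for matching pairs, 0 for non-matching pairs.
    Assumed form: (mean_1 - mean_0)^2 / (var_1 + var_0).
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    gap = (mean(pos) - mean(neg)) ** 2
    spread = pvariance(pos) + pvariance(neg)
    if spread == 0:
        # Both classes are constant: perfectly separable or useless.
        return float("inf") if gap > 0 else 0.0
    return gap / spread


def weighted_product(tau_a, tau_b, weight):
    """Aggregate two pheromone values as tau_a^w * tau_b^(1 - w).

    Assumed form of the weighted product method; weight lies in [0, 1].
    """
    return (tau_a ** weight) * (tau_b ** (1.0 - weight))


def max_wins_vote(predictions):
    """Return the label chosen by the most base classifiers.

    Ties go to the label that appears first in `predictions`.
    """
    return Counter(predictions).most_common(1)[0][0]
```

For example, `max_wins_vote([1, 0, 1])` returns `1` because two of the three base classifiers predict a match. A feature whose similarity scores separate matching and non-matching pairs cleanly receives a large `fisher_ratio`, which is why such a ratio is a natural heuristic for guiding feature selection.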

Suggested Citation

  • Liu Yi & Diao Xing-chun & Cao Jian-jun & Zhou Xing & Shang Yu-ling, 2017. "A Method for Entity Resolution in High Dimensional Data Using Ensemble Classifiers," Mathematical Problems in Engineering, Hindawi, vol. 2017, pages 1-11, February.
  • Handle: RePEc:hin:jnlmpe:4953280
    DOI: 10.1155/2017/4953280

    Download full text from publisher

    File URL: http://downloads.hindawi.com/journals/MPE/2017/4953280.pdf
    Download Restriction: no

    File URL: http://downloads.hindawi.com/journals/MPE/2017/4953280.xml
    Download Restriction: no

