
Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation

Author

Listed:
  • Andrei-Marius Avram

    (Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania)

  • Verginica Barbu Mititelu

    (Research Institute for Artificial Intelligence “Mihai Drăgănescu”, Romanian Academy, 050711 Bucharest, Romania)

  • Vasile Păiș

    (Research Institute for Artificial Intelligence “Mihai Drăgănescu”, Romanian Academy, 050711 Bucharest, Romania)

  • Dumitru-Clementin Cercel

    (Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania)

  • Ștefan Trăușan-Matu

    (Computer Science and Engineering Department, Faculty of Automatic Control and Computers, University Politehnica of Bucharest, 060042 Bucharest, Romania
    Research Institute for Artificial Intelligence “Mihai Drăgănescu”, Romanian Academy, 050711 Bucharest, Romania)

Abstract

Correctly identifying multiword expressions (MWEs) is an important task for most natural language processing systems, since misidentifying them can lead to ambiguity and misunderstanding of the underlying text. In this work, we evaluate the performance of the mBERT model for MWE identification in a multilingual context by training it on all 14 languages available in version 1.2 of the PARSEME corpus. We also incorporate lateral inhibition and language adversarial training into our methodology to create language-independent embeddings and improve the model's ability to identify multiword expressions. Our evaluation shows that this approach outperforms the best system of the PARSEME 1.2 competition, MTLB-STRUCT, on 11 out of 14 languages for global MWE identification and on 12 out of 14 languages for unseen MWE identification. Additionally, averaged across all languages, our best approach outperforms the MTLB-STRUCT system by 1.23% on global MWE identification and by 4.73% on unseen MWE identification.

Suggested Citation

  • Andrei-Marius Avram & Verginica Barbu Mititelu & Vasile Păiș & Dumitru-Clementin Cercel & Ștefan Trăușan-Matu, 2023. "Multilingual Multiword Expression Identification Using Lateral Inhibition and Domain Adaptation," Mathematics, MDPI, vol. 11(11), pages 1-18, June.
  • Handle: RePEc:gam:jmathe:v:11:y:2023:i:11:p:2548-:d:1161950

    Download full text from publisher

    File URL: https://www.mdpi.com/2227-7390/11/11/2548/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/2227-7390/11/11/2548/
    Download Restriction: no


