Entity name recognition of cross-border e-commerce commodity titles based on TWs-LSTM

Author

Listed:
  • Yongcong Luo

    (Nanjing University of Aeronautics and Astronautics)

  • Jing Ma

    (Nanjing University of Aeronautics and Astronautics)

  • Chi Li

    (Cainiao Logistics Co., Ltd.)

Abstract

Commodity information must be matched to an HS code (HSCode) so that exported goods can clear customs quickly. It is therefore particularly important to identify the entity names in commodity titles on e-commerce platforms quickly and accurately. To address this problem, an approach based on TWs-LSTM is proposed for recognizing commodity entity names. In this paper, we apply the TF-IDF algorithm to the commodity text corpus to obtain a weight matrix of the commodity words. Meanwhile, we use the Word2Vec model to represent the semantics of the words extracted from the bag of words. The weight vector of each commodity title and the word vector of every word in that title are then combined into a new one-dimensional vector. We use these one-dimensional vectors to represent the commodity titles and call this representation the TWs model. Finally, we feed the TWs vectors into an LSTM for commodity entity name recognition. In the experiments, we divide the commodity entity name data into a training set and a test set and compare the TWs-LSTM model with other text processing models. With TWs-LSTM, the F1-score reached 64.58% on the commodity title corpus of the Tmall platform, where TWs-LSTM achieves state-of-the-art performance compared with the baseline models and previous studies.
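
The construction of a TWs vector described in the abstract can be illustrated with a minimal sketch. The abstract does not say how the title's weight vector and the word vectors are combined, so the concatenation and zero-padding below are assumptions, as are the smoothed IDF variant, the function names (`tfidf_weights`, `toy_embeddings`, `tws_vector`), and the toy titles. Random vectors stand in for trained Word2Vec embeddings, and the LSTM step is omitted.

```python
import math
import random

def tfidf_weights(titles):
    """Return one {word: TF-IDF weight} dict per tokenized title."""
    n_docs = len(titles)
    doc_freq = {}
    for title in titles:
        for word in set(title):
            doc_freq[word] = doc_freq.get(word, 0) + 1
    rows = []
    for title in titles:
        row = {}
        for word in set(title):
            tf = title.count(word) / len(title)
            # Assumed IDF variant: log(N / df) + 1, so shared words keep a nonzero weight.
            idf = math.log(n_docs / doc_freq[word]) + 1.0
            row[word] = tf * idf
        rows.append(row)
    return rows

def toy_embeddings(vocab, dim, seed=0):
    """Random stand-in for Word2Vec vectors (assumption: real vectors would be trained)."""
    rng = random.Random(seed)
    return {w: [rng.uniform(-1.0, 1.0) for _ in range(dim)] for w in sorted(vocab)}

def tws_vector(title, weights, embeddings, max_len, dim):
    """Concatenate the title's weight vector with its flattened word vectors.

    Titles are truncated or zero-padded to max_len words, so every result
    has length max_len * (1 + dim).
    """
    words = title[:max_len]
    pad = max_len - len(words)
    weight_part = [weights[w] for w in words] + [0.0] * pad
    vector_part = []
    for w in words:
        vector_part.extend(embeddings[w])
    vector_part.extend([0.0] * (dim * pad))
    return weight_part + vector_part

# Hypothetical tokenized commodity titles, for illustration only.
titles = [
    ["cotton", "men", "shirt"],
    ["leather", "women", "handbag"],
    ["cotton", "women", "dress", "summer"],
]
DIM, MAX_LEN = 4, 4
weights = tfidf_weights(titles)
embeddings = toy_embeddings({w for t in titles for w in t}, DIM)
tws = [tws_vector(t, weights[i], embeddings, MAX_LEN, DIM) for i, t in enumerate(titles)]
```

Each entry of `tws` is a flat, fixed-length list, which is the shape a sequence model such as an LSTM would need after reshaping into per-word steps. In this toy corpus a word that appears in several titles ("cotton") receives a lower weight than one that appears in a single title ("shirt").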

Suggested Citation

  • Yongcong Luo & Jing Ma & Chi Li, 2020. "Entity name recognition of cross-border e-commerce commodity titles based on TWs-LSTM," Electronic Commerce Research, Springer, vol. 20(2), pages 405-426, June.
  • Handle: RePEc:spr:elcore:v:20:y:2020:i:2:d:10.1007_s10660-019-09371-6
    DOI: 10.1007/s10660-019-09371-6

    Download full text from publisher

    File URL: http://link.springer.com/10.1007/s10660-019-09371-6
    File Function: Abstract
    Download Restriction: Access to the full text of the articles in this series is restricted.


    References listed on IDEAS

    1. Daqian Wei & Bo Wang & Gang Lin & Dichen Liu & Zhaoyang Dong & Hesen Liu & Yilu Liu, 2017. "Research on Unstructured Text Data Mining and Fault Classification Based on RNN-LSTM with Malfunction Inspection Report," Energies, MDPI, vol. 10(3), pages 1-22, March.
    2. Shuqing Li & Ying Sun & Dagobert Soergel, 2016. "Erratum to: A new method for automatically constructing domain-oriented term taxonomy based on weighted word co-occurrence analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 108(2), pages 1005-1005, August.
    3. Kai Hu & Huayi Wu & Kunlun Qi & Jingmin Yu & Siluo Yang & Tianxing Yu & Jie Zheng & Bo Liu, 2018. "A domain keyword analysis approach extending Term Frequency-Keyword Active Index with Google Word2Vec model," Scientometrics, Springer;Akadémiai Kiadó, vol. 114(3), pages 1031-1068, March.
    4. Tom Magerman & Bart Looy & Xiaoyan Song, 2010. "Exploring the feasibility and accuracy of Latent Semantic Analysis based text mining techniques to detect similarity between patent documents and scientific publications," Scientometrics, Springer;Akadémiai Kiadó, vol. 82(2), pages 289-306, February.
    5. Iwona Grabska-Gradzińska & Andrzej Kulig & Jarosław Kwapień & Stanisław Drożdż, 2012. "Complex Network Analysis Of Literary And Scientific Texts," International Journal of Modern Physics C (IJMPC), World Scientific Publishing Co. Pte. Ltd., vol. 23(07), pages 1-15.

    Most related items

    These are the items that most often cite the same works as this one and are cited by the same works as this one.
    1. Julie Callaert & Joris Grouwels & Bart Looy, 2012. "Delineating the scientific footprint in technology: Identifying scientific publications within non-patent references," Scientometrics, Springer;Akadémiai Kiadó, vol. 91(2), pages 383-398, May.
    2. Jongchan Kim & Jaehyun Choi & Sangsung Park & Dongsik Jang, 2018. "Patent Keyword Extraction for Sustainable Technology Management," Sustainability, MDPI, vol. 10(4), pages 1-18, April.
    3. Xuefeng Wang & Huichao Ren & Yun Chen & Yuqin Liu & Yali Qiao & Ying Huang, 2019. "Measuring patent similarity with SAO semantic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 121(1), pages 1-23, October.
    4. Wei Du & Yibo Wang & Wei Xu & Jian Ma, 2021. "A personalized recommendation system for high-quality patent trading by leveraging hybrid patent analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 126(12), pages 9369-9391, December.
    5. Jiyoung Woo & Jaeseok Yun, 2020. "Content Noise Detection Model Using Deep Learning in Web Forums," Sustainability, MDPI, vol. 12(12), pages 1-16, June.
    6. Magerman, Tom & Looy, Bart Van & Debackere, Koenraad, 2015. "Does involvement in patenting jeopardize one’s academic footprint? An analysis of patent-paper pairs in biotechnology," Research Policy, Elsevier, vol. 44(9), pages 1702-1713.
    7. Kai Ding & Chen Yao & Yifan Li & Qinglong Hao & Yaqiong Lv & Zengrui Huang, 2022. "A Review on Fault Diagnosis Technology of Key Components in Cold Ironing System," Sustainability, MDPI, vol. 14(10), pages 1-28, May.
    8. Karol Król & Dariusz Zdonek, 2023. "Cultural Heritage Topics in Online Queries: A Comparison between English- and Polish-Speaking Internet Users," Sustainability, MDPI, vol. 15(6), pages 1-20, March.
    9. Davi Alves Oliveira & Hernane Borges de Barros Pereira, 2024. "Modeling texts with networks: comparing five approaches to sentence representation," The European Physical Journal B: Condensed Matter and Complex Systems, Springer;EDP Sciences, vol. 97(6), pages 1-12, June.
    10. Xiang Zhu & Yunqiu Zhang, 2020. "Co-word analysis method based on meta-path of subject knowledge network," Scientometrics, Springer;Akadémiai Kiadó, vol. 123(2), pages 753-766, May.
    11. Lu Huang & Xiang Chen & Yi Zhang & Changtian Wang & Xiaoli Cao & Jiarun Liu, 2022. "Identification of topic evolution: network analytics with piecewise linear representation and word embedding," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(9), pages 5353-5383, September.
    12. Hongchen Li & Zhong Yang & Jiaming Han & Shangxiang Lai & Qiuyan Zhang & Chi Zhang & Qianhui Fang & Guoxiong Hu, 2020. "TL-Net: A Novel Network for Transmission Line Scenes Classification," Energies, MDPI, vol. 13(15), pages 1-15, July.
    13. Veugelers, Reinhilde & Wang, Jian, 2019. "Scientific novelty and technological impact," Research Policy, Elsevier, vol. 48(6), pages 1362-1372.
    14. Ghosh, Dipak & Chakraborty, Sayantan & Samanta, Shukla, 2019. "Study of translational effect in Tagore’s Gitanjali using Chaos based Multifractal analysis technique," Physica A: Statistical Mechanics and its Applications, Elsevier, vol. 523(C), pages 1343-1354.
    15. Diego R Amancio, 2015. "Probing the Topological Properties of Complex Networks Modeling Short Written Texts," PLOS ONE, Public Library of Science, vol. 10(2), pages 1-17, February.
    16. Lijie Feng & Yuxiang Niu & Zhenfeng Liu & Jinfeng Wang & Ke Zhang, 2019. "Discovering Technology Opportunity by Keyword-Based Patent Analysis: A Hybrid Approach of Morphology Analysis and USIT," Sustainability, MDPI, vol. 12(1), pages 1-35, December.
    17. Lu Huang & Yijie Cai & Erdong Zhao & Shengting Zhang & Yue Shu & Jiao Fan, 2022. "Measuring the interdisciplinarity of Information and Library Science interactions using citation analysis and semantic analysis," Scientometrics, Springer;Akadémiai Kiadó, vol. 127(11), pages 6733-6761, November.
    18. Sabrina L. Woltmann & Lars Alkærsig, 2018. "Tracing university–industry knowledge transfer through a text mining approach," Scientometrics, Springer;Akadémiai Kiadó, vol. 117(1), pages 449-472, October.
    19. Higham, Kyle & de Rassenfosse, Gaétan & Jaffe, Adam B., 2021. "Patent Quality: Towards a Systematic Framework for Analysis and Measurement," Research Policy, Elsevier, vol. 50(4).
    20. Samira Ranaei & Arho Suominen & Alan Porter & Stephen Carley, 2020. "Evaluating technological emergence using text analytics: two case technologies and three approaches," Scientometrics, Springer;Akadémiai Kiadó, vol. 122(1), pages 215-247, January.
