
Beyond six digits: Automated tariff line HS transposition using Natural Language Processing

Author

Listed:
  • Bayona, Pamela

Abstract

This paper explores the application of Natural Language Processing (NLP) techniques to automate Harmonized System (HS) tariff line transposition, employing a three-stage process: unique 1:1 tariff code matching (Round 1), exact description matching (Round 2), and "smart" description matching (Round 3) using Artificial Intelligence (AI) and lexical similarity methods paired with harmonized 6-digit concordance and cosine similarity. Similarity is calculated using either Term Frequency-Inverse Document Frequency (TF-IDF) vectors or Sentence-BERT (SBERT) embeddings, and two scenarios are compared: a straightforward case (Economy A) with standardized descriptions, and a complex case (Economy B) with more detailed technical descriptions. Results indicate that automated HS transposition can significantly improve on the efficiency of traditionally manual methods, reducing processing time from two to three weeks to approximately half a day (up to 30 times faster). The overall accuracy rate is 99.6% for the simpler scenario and 98.8% for the complex one, for a standard set of approximately 10,000 HS codes. While non-AI techniques cover most of the accurate matches, AI-based Round 3 techniques address the cases requiring the most manual effort. SBERT generally outperforms TF-IDF; however, including subheadings tends to reduce its accuracy. In certain cases, particularly for highly technical tariffs, TF-IDF's straightforward approach provides an advantage over SBERT. Overall, NLP techniques hold significant potential for improving HS transposition methods and facilitating the development of richer tariff and trade datasets to enable more in-depth analyses. Future research should focus on refining these techniques across diverse datasets to optimize their broader application in tariff and trade data analysis.
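
The three rounds described in the abstract can be illustrated with a minimal sketch. This is not the paper's implementation: the function names (`tokenize`, `tfidf_vectors`, `cosine`, `transpose`), the reading of Round 1 as "only one candidate line under the 6-digit concordance", the smoothed IDF formula, and all codes and descriptions in the sample data are assumptions made for illustration. TF-IDF is written out in plain Python so the example has no dependencies; an SBERT variant would swap the TF-IDF vectors for sentence embeddings and keep the same cosine comparison.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace, treating commas and semicolons as spaces."""
    return text.lower().replace(",", " ").replace(";", " ").split()


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict term -> weight) per document."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = Counter(term for toks in tokenized for term in set(toks))
    # Smoothed IDF (an assumed variant): terms found in every document keep weight 1.
    idf = {t: math.log((1 + n) / (1 + c)) + 1 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(toks).items()} for toks in tokenized]


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm = math.sqrt(sum(w * w for w in u.values())) * math.sqrt(sum(w * w for w in v.values()))
    return dot / norm if norm else 0.0


def transpose(old_lines, new_lines, concordance):
    """Map each old tariff line to (new_code, round_label).

    old_lines / new_lines: dict of tariff code -> description.
    concordance: dict of old 6-digit code -> set of new 6-digit codes.
    """
    result = {}
    for old_code, old_desc in old_lines.items():
        targets = concordance.get(old_code[:6], set())
        # The 6-digit concordance narrows the candidate set before any text matching.
        candidates = {c: d for c, d in new_lines.items() if c[:6] in targets}
        if not candidates:
            result[old_code] = (None, "unmatched")
            continue
        # Round 1 (assumed reading): a single candidate line, so the match is unique.
        if len(candidates) == 1:
            result[old_code] = (next(iter(candidates)), "round1")
            continue
        # Round 2: exactly one candidate has an identical description (ignoring case).
        exact = [c for c, d in candidates.items()
                 if d.strip().lower() == old_desc.strip().lower()]
        if len(exact) == 1:
            result[old_code] = (exact[0], "round2")
            continue
        # Round 3: pick the candidate whose TF-IDF vector is closest by cosine similarity.
        codes = list(candidates)
        vecs = tfidf_vectors([old_desc] + [candidates[c] for c in codes])
        scores = [cosine(vecs[0], v) for v in vecs[1:]]
        best = max(range(len(codes)), key=scores.__getitem__)
        result[old_code] = (codes[best], "round3")
    return result


# Hypothetical sample data, invented for this sketch.
concordance = {"010121": {"010121"}, "080390": {"080390"}}
old_lines = {
    "01012100": "Pure-bred breeding horses",
    "08039010": "Bananas, fresh",
    "08039090": "Dried bananas and plantains",
}
new_lines = {
    "01012100": "Pure-bred breeding horses",
    "08039011": "Bananas, fresh",
    "08039019": "Bananas, dried, including plantains",
}

result = transpose(old_lines, new_lines, concordance)
for old_code, (new_code, label) in result.items():
    print(old_code, "->", new_code, label)
```

In this toy data, the first line is resolved in Round 1, the second in Round 2, and only the reworded "dried bananas" line falls through to the similarity step, which mirrors the abstract's observation that non-AI rounds cover most matches while Round 3 handles the residual cases. A production version would likely also need a minimum-similarity threshold and manual review for low-scoring matches; neither is shown here.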

Suggested Citation

  • Bayona, Pamela, 2025. "Beyond six digits: Automated tariff line HS transposition using Natural Language Processing," WTO Staff Working Papers ERSD-2025-04, World Trade Organization (WTO), Economic Research and Statistics Division.
  • Handle: RePEc:zbw:wtowps:314422

    Download full text from publisher

    File URL: https://www.econstor.eu/bitstream/10419/314422/1/192058840X.pdf
    Download Restriction: no

    More about this item

    Keywords

    Harmonized System; tariff line; HS transposition; correlation tables; concordance; natural language processing

    JEL classification:

    • F10 - International Economics - - Trade - - - General
    • F13 - International Economics - - Trade - - - Trade Policy; International Trade Organizations
