Imports, exports and tariffs are quite the theme in the news these days, be it in the context of Brexit, the US-China trade war or the Iran nuclear deal. Executive decisions on what duties should be levied on goods crossing borders are the order of the day. Have you ever wondered how these decisions are practically implemented at the ground level, though? The answer: Harmonized Tariff Schedules (HTS), a taxonomy built by the World Customs Organization (WCO) to classify and define internationally traded goods. Semantics3 offers automated HTS code classification solutions to help logistics providers modernize their customs workflows.

Harmonized Tariff Schedule (HTS) code classification is a surprisingly challenging machine learning problem - while at face value it is a simple multi-class classification task, the real-world specifics are often deceptively intractable:

  • For starters, the quality of data available from most sources is rather poor, so automated decision-making systems have to learn to pull in external knowledge, and to develop a working understanding of industry norms.
  • In addition, target code classes change across geographies and with time, requiring algorithms to keep an eye out for stale data.
  • What's more, it's surprisingly difficult to have trained human annotators agree on what the right HS code for a given product should be - in datasets annotated by trained professionals, we usually see differing labels for the same product at least 30% of the time.

How do you build automated systems that can deal with these challenges? In this article, I'll cover five techniques that have helped us deal with these problems.

1) Data Augmentation through Catalogs and Crawling

It's not uncommon for HTS specialists to spend up to 50% of their time Googling for product information on the web. When your input product simply reads "A1920" or "Consuming Impulse", you need a good old search engine to even understand what the product is. At times, even when you understand what a product is (for example, "Winnipeg Chair"), you may need to search for more information to identify what the 6th or the 10th digit should be (for example, what material is it made of?).

For products with inadequate information, we run searches against our e-commerce catalog of 200 million+ products to look for matches. In the absence of matches, we also run multiple concurrent live crawls for each product, and use our patent-pending unsupervised content extraction systems to gather enhanced data.
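The catalog-first, crawl-second fallback described above can be sketched as follows. This is a minimal illustration, not our production system: the `CATALOG` dictionary is a hypothetical stand-in for a search index over 200M+ product records, and the fuzzy matching uses Python's built-in `difflib` in place of a real retrieval engine.

```python
from difflib import SequenceMatcher

# Hypothetical stand-in for a catalog search index; a production system
# would query a full-text index over hundreds of millions of records.
CATALOG = {
    "winnipeg chair": {"name": "Winnipeg Chair", "material": "wood"},
    "a1920 usb cable": {"name": "A1920 USB Cable", "material": "plastic"},
}

def enrich_product(query, threshold=0.6):
    """Look for a catalog match; signal a live-crawl fallback when none is found."""
    query_norm = query.lower().strip()
    best_key, best_score = None, 0.0
    for key in CATALOG:
        score = SequenceMatcher(None, query_norm, key).ratio()
        if score > best_score:
            best_key, best_score = key, score
    if best_score >= threshold:
        return {"source": "catalog", **CATALOG[best_key]}
    # No confident match: fall back to live crawling (not implemented here).
    return {"source": "crawl", "query": query}

print(enrich_product("Winnipeg Chair"))    # resolved from the catalog
print(enrich_product("Consuming Impulse")) # routed to the crawl fallback
```

The key design point is the two-tier lookup: a cheap catalog hit avoids an expensive live crawl, and the crawl path only triggers below a confidence threshold.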

2) Structured and Normalized Attributes

Conceptually, we like to frame HS code classification as a two-step challenge. First, identify what the product is (table? chair?). Second, look for the specific attributes required to make your decision (is the material wood, metal or plastic?). The second part is arguably the tougher challenge for an algorithm - not only does it have to identify which attributes to look for, it also has to extract those attributes from the product data, while also understanding which attribute values are permissible ("MDF" and "Ply" should both be interpreted as "material:wood").

To tame this problem, we build statistical and Named Entity Recognition (NER) models on data crawled from e-commerce websites. These websites often provide data in a structured form, albeit in their own custom formats. With human assistance to interpret this data, we can build large training datasets that allow for heuristics and models that can extract structured attributes from unstructured product information.

Having these attributes cleanly set up as normalized key-value pairs makes the process of picking a matching HS code, whether through automated means or through human classification, significantly easier.
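The normalization step above ("MDF" and "Ply" both map to "material:wood") can be sketched as a simple lookup pass over extracted key-value pairs. The synonym table here is a hypothetical illustration; in practice such tables are built from crawled e-commerce data with human assistance.

```python
# Hypothetical table mapping raw material values, as extracted by an NER
# model, onto the canonical values an HS code decision actually needs.
MATERIAL_SYNONYMS = {
    "mdf": "wood",
    "ply": "wood",
    "plywood": "wood",
    "stainless steel": "metal",
    "abs": "plastic",
}

def normalize_attributes(raw_attributes):
    """Map raw key-value pairs onto normalized ones; pass unknowns through."""
    normalized = {}
    for key, value in raw_attributes.items():
        value_norm = value.lower().strip()
        if key == "material":
            value_norm = MATERIAL_SYNONYMS.get(value_norm, value_norm)
        normalized[key] = value_norm
    return normalized

print(normalize_attributes({"material": "MDF", "colour": "Oak Brown"}))
# material resolves to "wood"; colour is lowercased and passed through
```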

3) Transfer Learning with Pre-Trained Weights

Finding training datasets for HS code classification is difficult. But finding e-commerce data is relatively easier - over the years, we’ve gathered hundreds of millions of product records by crawling the web.

This matters because the hardest challenge in any NLP (Natural Language Processing) problem is building that first embedding / transformer module. Publicly available weights trained on corpora like Wikipedia don’t translate well to the e-commerce domain, since product names rarely follow real grammatical structure and are dominated by proper nouns to a much larger degree.

We deal with this by initializing our classifiers with tokenizers and embeddings pre-trained in an unsupervised fashion on the mounds of data in our product catalog.
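Mechanically, this kind of initialization means copying pre-trained vectors into a classifier's embedding matrix wherever the vocabulary overlaps, and randomly initializing the rest. A minimal NumPy sketch, with a tiny made-up vocabulary and vectors standing in for ones pre-trained on a product catalog:

```python
import numpy as np

def init_embedding_matrix(vocab, pretrained, dim, seed=0):
    """Build an embedding matrix for a classifier, copying pre-trained
    vectors where available and randomly initializing the remainder."""
    rng = np.random.default_rng(seed)
    matrix = rng.normal(0.0, 0.1, size=(len(vocab), dim))
    hits = 0
    for i, token in enumerate(vocab):
        if token in pretrained:
            matrix[i] = pretrained[token]
            hits += 1
    return matrix, hits

# Hypothetical vectors "pre-trained" on catalog data, for illustration only.
pretrained = {"chair": np.array([0.1, 0.2]), "wood": np.array([0.3, 0.4])}
vocab = ["chair", "wood", "winnipeg"]  # "winnipeg" is out-of-vocabulary
matrix, hits = init_embedding_matrix(vocab, pretrained, dim=2)
print(matrix.shape, hits)  # (3, 2) 2
```

In a real pipeline the `matrix` would seed the embedding layer of the downstream classifier, which is then fine-tuned on the labeled HS code data.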

4) One-Shot Learning with Taxonomy Vectors

From a deep learning perspective, the easiest way to deal with this problem would be to adopt a multi-class classification approach. This approach, however, doesn’t work well if you have HS codes with very limited, or even no data available.

One way in which we’ve approached this is to set up the architecture as a Q&A problem. Instead of asking the model “what is the HS code for this product”, we run through each of the possible HS codes at each decision point (chapter, heading, sub-heading and so on) and ask the model “does the product fit this particular HS code description”. With this setup, the model only has to learn how to understand the taxonomy, rather than having to build a unique representation for each of the possible HS codes.
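The Q&A framing can be sketched at one decision point as follows. The scoring function here is a crude token-overlap (Jaccard) stand-in for a learned matching model, and the two candidate chapter descriptions are simplified paraphrases for illustration; the structure of the loop - score every (product, code description) pair and keep the best - is the point.

```python
def jaccard(a, b):
    """Token-overlap score; a stand-in for a learned matching model."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

def classify_level(product_text, candidates, score=jaccard):
    """At one decision point (chapter, heading, ...), ask of each candidate
    "does the product fit this description?" and keep the best answer."""
    return max(candidates, key=lambda c: score(product_text, c["description"]))

# Simplified, hypothetical chapter descriptions for illustration.
chapters = [
    {"code": "94", "description": "furniture chairs seats bedding lamps"},
    {"code": "61", "description": "apparel clothing knitted crocheted"},
]
best = classify_level("wooden dining chairs", chapters)
print(best["code"])  # 94
```

Because the model scores descriptions rather than memorizing one output unit per code, a new or rarely seen HS code is handled the same way as a common one - only its description text is needed.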

5) Search Space Reduction and more, with Humans-in-the-loop

Our vision of a real-world system for HTS classification is one in which humans work in tandem with algorithms. Practically, we achieve this in a few ways:

  • When human classification is required, we frame the problem as a multiple-choice question, rather than an open-ended task that requires the classifier to scour the entire HS code taxonomy. This form of “search space reduction” helps significantly reduce the time taken to successfully assign an HS code.
  • We provide data gathered by crawling as supporting information for the annotation task, so that the overhead time to Google for additional information is reduced.
  • We use our NER modules to provide product data in the form of structured attributes to reduce the cognitive overhead of reading and understanding product information.
  • We have conventions by which human classifiers can provide feedback when they find issues that may be representative of systemic problematic patterns; how this heuristic system interacts with the machine learning model is covered in detail here.
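The “search space reduction” idea from the first bullet above reduces, mechanically, to turning model confidences over the full code space into a short multiple-choice list. A minimal sketch, with made-up scores over hypothetical 6-digit codes:

```python
def top_k_candidates(scores, k=3):
    """Reduce the full HS code space to a short multiple-choice list for a
    human classifier, keeping only the model's k highest-scoring codes."""
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [code for code, _ in ranked[:k]]

# Hypothetical model confidences over candidate 6-digit codes.
model_scores = {"940161": 0.52, "940169": 0.31, "940179": 0.09, "611020": 0.02}
print(top_k_candidates(model_scores))  # ['940161', '940169', '940179']
```

The human then picks from (or rejects) the shortlist rather than scouring the full taxonomy, which is where the reduction in classification time comes from.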

tl;dr: HTS classification is a complex challenge. Any system that aims to tackle this problem in an automated fashion needs to learn how to crawl for supplemental information where required, transform unstructured data into structured form, understand the “language” of commerce (which can be quite different from natural English language), be equipped with the right architecture that can use the HTS taxonomy itself, and finally, learn how to work harmoniously with expert human classifiers.