Technology

Data + AI + Humans.
The secret sauce behind the tech stack at Semantics3.

Retail Catalog Feeds

Retail catalog feeds from top retailers across the ecommerce world, made possible by:

Content Extraction: Supervised and unsupervised systems (patent pending) that can systematically spider through websites and extract meaningful information from HTML.

WhisperCrawl: Scalable crawling infrastructure that crawls and processes terabytes of data on a daily basis.

Attribute Suite

The suite includes:

Extraction: Generates structured attributes from unstructured text, HTML and images.

Normalization: Maps attribute values to a standardized unified representation.

Inference: Imputes and estimates structured attributes by association, even if the data isn't explicitly stated in the input.
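
To make the normalization step concrete, here is a minimal sketch that maps free-form weight strings to a single unit (grams). It is an illustration only and not the actual Attribute Suite implementation; the function name `normalize_weight`, the unit table and the choice of grams as the target unit are all assumptions.

```python
import re
from typing import Optional

# Hypothetical lookup table: unit alias -> grams per unit.
_GRAMS_PER_UNIT = {
    "g": 1.0, "gram": 1.0, "grams": 1.0,
    "kg": 1000.0, "kilogram": 1000.0, "kilograms": 1000.0,
    "oz": 28.349523125, "ounce": 28.349523125, "ounces": 28.349523125,
    "lb": 453.59237, "lbs": 453.59237, "pound": 453.59237, "pounds": 453.59237,
}

# A number, optional whitespace, a unit word, and an optional trailing period.
_WEIGHT_RE = re.compile(r"^\s*([0-9]+(?:\.[0-9]+)?)\s*([a-zA-Z]+)\.?\s*$")


def normalize_weight(raw: str) -> Optional[float]:
    """Return the weight in grams (rounded to 2 decimals), or None if unparseable."""
    match = _WEIGHT_RE.match(raw)
    if match is None:
        return None
    value, unit = match.groups()
    factor = _GRAMS_PER_UNIT.get(unit.lower())
    if factor is None:
        return None  # Unknown unit: leave the value for a human or another rule.
    return round(float(value) * factor, 2)
```

With this sketch, inputs such as "2 kg", "500g" and "1.5 lbs" all resolve to a number of grams, while values it cannot parse return `None` rather than a guess.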

Universal Product Catalog

A curated database of hundreds of millions of products, aggregated from thousands of retailers across the web.

The Catalog is kept up to date by daily crawls that both refresh existing data and discover new SKUs. Quality is maintained by an elaborate pipeline that combines algorithms, statistics and humans in the loop.

This data is maintained both as a standard database and as a knowledge graph.

HS Classification

Automatic assignment of HS (Harmonized System) codes to products. The system is built on real-world decisions made by professionals on the ground. It is also tuned to understand Schedule descriptions and CROSS Rulings.

Smart/Expanded Crawl

Given a URL, UPC, ASIN, model number or even just a keyword, this system is capable of intelligently launching search missions across the web to identify product information relevant to the request. This helps automatically enrich incomplete datasets.
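
As a rough illustration of the first step such a system might need, the sketch below guesses which kind of identifier a request contains. It is not the actual Smart/Expanded Crawl logic: the function names are hypothetical, the UPC check uses the standard UPC-A check-digit rule, and the ASIN and model-number patterns are simplified heuristics assumed for the example.

```python
import re
from urllib.parse import urlparse


def is_valid_upc_a(code: str) -> bool:
    """Validate a 12-digit UPC-A code using its check digit."""
    if re.fullmatch(r"[0-9]{12}", code) is None:
        return False
    digits = [int(ch) for ch in code]
    # Positions 1, 3, ..., 11 are weighted by 3; positions 2, 4, ..., 10 by 1.
    total = 3 * sum(digits[0:11:2]) + sum(digits[1:11:2])
    return (10 - total % 10) % 10 == digits[11]


def classify_query(query: str) -> str:
    """Return one of: 'url', 'upc', 'asin', 'model_number', 'keyword'."""
    q = query.strip()
    parsed = urlparse(q)
    if parsed.scheme in ("http", "https") and parsed.netloc:
        return "url"
    if is_valid_upc_a(q):
        return "upc"
    # Assumed heuristic: 10 uppercase alphanumerics starting with "B".
    if re.fullmatch(r"B[0-9A-Z]{9}", q):
        return "asin"
    # Assumed heuristic: a single token mixing letters and digits.
    if (
        re.fullmatch(r"[A-Za-z0-9][A-Za-z0-9\-_/]*", q)
        and any(ch.isdigit() for ch in q)
        and any(ch.isalpha() for ch in q)
    ):
        return "model_number"
    return "keyword"
```

Anything that fails every pattern falls through to "keyword", so free-text requests are still handled instead of being rejected.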

Vector Search & Deduplication

Algorithmic pipeline that helps identify and eliminate duplicates from catalogs. This is built on the back of a Product2Vector search engine that enables similarity search at scale across hundreds of millions of products.
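
The idea of vector-based deduplication can be sketched as follows. This is a toy, brute-force version and not the Product2Vector engine itself: the vectors are assumed to be precomputed product embeddings, and the function names and the 0.95 threshold are hypothetical. At a scale of hundreds of millions of products, the pairwise comparison here would presumably be replaced by an approximate nearest-neighbor index.

```python
import math
from typing import Dict, List, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def deduplicate(vectors: Dict[str, Sequence[float]], threshold: float = 0.95) -> List[str]:
    """Greedily keep a product only if it is not too similar to any product kept so far."""
    kept: List[str] = []
    for product_id, vec in vectors.items():
        if all(cosine_similarity(vec, vectors[k]) < threshold for k in kept):
            kept.append(product_id)
    return kept
```

Given three products where the second vector is nearly parallel to the first, `deduplicate` keeps the first and third and drops the second as a likely duplicate.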

Ranking & Historical Data

This system mines demand signals such as rank, rating and review count from ecommerce websites. This helps reverse-engineer the demand for a SKU on a particular platform.

Our historical databases, built over the years, span tens of billions of data points.

Ecommerce Categorization

AI-based classification of ecommerce goods to standardized taxonomies.

Thanks to transfer learning, it works not just with popular taxonomies but also with custom trees.

Every ecommerce company has to reinvent the product data science stack.

Get started right now.