
Semantic Product Search

Priyanka Nigam, Yiwei Song, Vijai Mohan, Vihan Lakshman, Weitian (Allen) Ding, Ankit Shingavi, Choon Hui Teo, Hao Gu, Bing Yin
Amazon, Palo Alto, California, USA
arXiv:1907.00937 [cs.IR] (1 Jul 2019)

@misc{nigam2019semantic,
   title={Semantic Product Search},
   author={Nigam, Priyanka and Song, Yiwei and Mohan, Vijai and Lakshman, Vihan and Ding, Weitian (Allen) and Shingavi, Ankit and Teo, Choon Hui and Gu, Hao and Yin, Bing},
   year={2019},
   eprint={1907.00937},
   archivePrefix={arXiv},
   primaryClass={cs.IR}
}


We study the problem of semantic matching in product search, that is, given a customer query, retrieve all semantically related products from the catalog. Pure lexical matching via an inverted index falls short in this respect due to several factors: a) lack of understanding of hypernyms, synonyms, and antonyms, b) fragility to morphological variants (e.g. "woman" vs. "women"), and c) sensitivity to spelling errors. To address these issues, we train a deep learning model for semantic matching using customer behavior data. Much of the recent work on large-scale semantic search using deep learning focuses on ranking for web search. In contrast, semantic matching for product search presents several novel challenges, which we elucidate in this paper. We address these challenges by a) developing a new loss function that has an inbuilt threshold to differentiate between random negative examples, impressed but not purchased examples, and positive examples (purchased items), b) using average pooling in conjunction with n-grams to capture short-range linguistic patterns, c) using hashing to handle out-of-vocabulary tokens, and d) using a model-parallel training architecture to scale across 8 GPUs. We present compelling offline results that demonstrate at least 4.7% improvement in Recall@100 and 14.5% improvement in mean average precision (MAP) over baseline state-of-the-art semantic search methods using the same tokenization method. Moreover, we present results and discuss learnings from online A/B tests which demonstrate the efficacy of our method.
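
For illustration, below is a minimal Python/NumPy sketch of a hinge-style loss with built-in thresholds in the spirit of the description above: purchased items are pushed above a high similarity threshold, impressed-but-not-purchased items are kept in a middle band, and random negatives are pushed below a low threshold. This is not code from the paper; the function names and threshold values are assumptions chosen for the example, and the random vectors merely stand in for learned query and product embeddings.

# Illustrative sketch only; thresholds and names are assumptions, not the
# authors' reported values.
import numpy as np

def cosine(q, p):
    # Row-wise cosine similarity between query and product embeddings.
    q_norm = q / np.linalg.norm(q, axis=-1, keepdims=True)
    p_norm = p / np.linalg.norm(p, axis=-1, keepdims=True)
    return np.sum(q_norm * p_norm, axis=-1)

def three_part_hinge_loss(sim, label,
                          pos_threshold=0.9,
                          impressed_low=0.2, impressed_high=0.55,
                          neg_threshold=0.2):
    # label: 2 = purchased, 1 = impressed but not purchased, 0 = random negative.
    # Purchased pairs are penalized if similarity falls below pos_threshold,
    # impressed-only pairs if they leave the [impressed_low, impressed_high] band,
    # and random negatives if they rise above neg_threshold.
    loss = np.zeros_like(sim)
    pos = label == 2
    imp = label == 1
    neg = label == 0
    loss[pos] = np.maximum(0.0, pos_threshold - sim[pos])
    loss[imp] = (np.maximum(0.0, sim[imp] - impressed_high)
                 + np.maximum(0.0, impressed_low - sim[imp]))
    loss[neg] = np.maximum(0.0, sim[neg] - neg_threshold)
    return loss.mean()

# Toy usage: three query-product pairs with different behavioral labels.
rng = np.random.default_rng(0)
queries = rng.normal(size=(3, 8))
products = rng.normal(size=(3, 8))
labels = np.array([2, 1, 0])  # purchased, impressed-only, random negative
print(three_part_hinge_loss(cosine(queries, products), labels))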