A parallel accelerator for semantic search
NEC Laboratories America, Inc., Princeton, NJ, USA
IEEE 9th Symposium on Application Specific Processors (SASP), 2011
@inproceedings{majumdar2011parallel,
title={A parallel accelerator for semantic search},
author={Majumdar, A. and Cadambi, S. and Chakradhar, S.T. and Graf, H.P.},
booktitle={Application Specific Processors (SASP), 2011 IEEE 9th Symposium on},
pages={122--128},
organization={IEEE}
}
Semantic text analysis is a technique used in advertisement placement, cognitive databases and search engines. With increasing amounts of data and stringent response-time requirements, improving the underlying implementation of semantic analysis becomes critical. To this end, we look at Supervised Semantic Indexing (SSI), a recently proposed algorithm for semantic analysis. SSI ranks a large number of documents based on their semantic similarity to a text query. For each query, it computes millions of dot products on unstructured data, generates a large intermediate result, and then performs ranking. SSI underperforms on both state-of-the-art multi-cores and GPUs. Its performance scalability on multi-cores is hampered by their limited support for fine-grained data parallelism. GPUs, though they beat multi-cores by running thousands of threads, cannot handle large intermediate data because of their small on-chip memory. Motivated by this, we present an FPGA-based hardware accelerator for semantic analysis. As a key feature, the accelerator combines hundreds of simple processing elements (PEs) with in-memory processing to simultaneously generate and process (consume) the large intermediate data. It also supports "dynamic parallelism" – a feature that configures the PEs differently for full utilization of the available processing logic after the FPGA is programmed. Our FPGA prototype is 10-13x faster than a 2.5 GHz quad-core Xeon, and 1.5-5x faster than a 240-core 1.3 GHz Tesla GPU, despite operating at a modest frequency of 125 MHz.
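The score-then-rank flow described in the abstract can be illustrated with a small software sketch. This is not the paper's implementation or the actual SSI scoring function: it assumes documents and the query are already dense numeric vectors, uses a plain dot product as the similarity score, and the function names (`rank_materialized`, `rank_streaming`) are hypothetical. The first function stores every score before ranking, which is the large intermediate result the abstract mentions. The second consumes each score as it is produced and keeps only the current top k, loosely mirroring the idea of generating and processing intermediate data at the same time.

```python
import heapq
import random


def dot(u, v):
    """Dot product of two equal-length numeric vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_materialized(query, docs, k):
    """Score every document first, then rank.

    The `scores` list is the large intermediate result: one entry
    per document, all held in memory before ranking starts.
    """
    scores = [dot(query, d) for d in docs]
    order = sorted(range(len(docs)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def rank_streaming(query, docs, k):
    """Consume each score as soon as it is produced.

    Only the k best (score, -index) pairs are kept in a min-heap,
    so the full score list is never stored. Using -index makes
    ties resolve in favour of the lower document index.
    """
    heap = []
    for i, d in enumerate(docs):
        item = (dot(query, d), -i)
        if len(heap) < k:
            heapq.heappush(heap, item)
        elif item > heap[0]:
            heapq.heapreplace(heap, item)
    return [-neg_i for _, neg_i in sorted(heap, reverse=True)]


if __name__ == "__main__":
    random.seed(0)
    dim, n_docs, k = 8, 1000, 5
    docs = [[random.random() for _ in range(dim)] for _ in range(n_docs)]
    query = [random.random() for _ in range(dim)]
    print(rank_streaming(query, docs, k))
```

Both functions return the same document indices; the streaming variant needs memory proportional to k rather than to the number of documents, which is the property the abstract attributes to combining processing elements with in-memory processing.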
July 30, 2011 by hgpu