Accelerating Web Search using GPUs

hgpu.org » Applications » Computer science » Accelerating Web Search using GPUs

Accelerating Web Search using GPUs

Rimon Tadros

The University of British Columbia

The University of British Columbia, 2015

@article{tadros2015accelerating,

title={Accelerating web search using GPUs},

author={Tadros, Rimon},

year={2015},

publisher={University of British Columbia}

}

Download (PDF)

View

Source

2506

views

The amount of content on the Internet is growing rapidly as well as the number of the online Internet users. As a consequence, web search engines need to increase their computing capabilities and data continually while maintaining low search latency and without a significant rise in the cost per query. To serve this larger numbers of online users, web search engines utilize a large distributed system in the data centers. They partition their data across several hundred of thousands of independent commodity servers called Index Serving Nodes (ISNs). These ISNs work together to serve search queries as a single coherent system in a distributed manner. The choice of a high number of commodity servers vs. a smaller number of supercomputers is due to the need for scalability, high availability/reliability, performance, and cost efficiency. For the web search engines to serve a larger data, the web search engines can be scaled either vertically or horizontally [21]. Vertical scaling enables ranking more documents per query within a single node by employing machines with higher single thread and throughput performance with bigger and faster memory. Horizontal scaling supports a larger index by adding more index serving nodes at the cost of increased synchronization, aggregation overhead, and power consumption. This thesis evaluates the potential for achieving better vertical scaling by using Graphics processing unit (GPUs) to reduce the documents ranking latency per query at a reasonable initial cost increase. It achieves this by exploiting the parallelism in ranking the numerous potential documents that match a query to offload to the GPU. We evaluate this approach using hundreds of rankers from a commercial web search engine on real production data. Our results show an 8.8x harmonic mean reduction in the latency and 2x power efficiency when ranking 10000 web documents per query for a variety of rankers using C++AMP and a commodity GPU.

Tags: C++ AMP, Computer science, nVidia, nVidia GeForce GTX 660 Ti, nVidia GeForce GTX 750 Ti, Search, Thesis

September 8, 2015 by hgpu

No votes yet.

Please wait...

Your response

You must be logged in to post a comment.

high performance computing on graphics processing units: hgpu.org