high performance computing on graphics processing units: hgpu.org

Data Stream Classification using Random Feature Functions and Novel Method Combinations

Diego Marron, Jesse Read, Albert Bifet, Nacho Navarro

Department of Computer Architecture, Universitat Politecnica de Catalunya, Spain

arXiv:1511.00971 [cs.LG], (3 Nov 2015)

Big Data streams are becoming faster, bigger, and more commonplace. In this scenario, Hoeffding trees are an established method for classification. Several extensions exist, including high-performing ensemble setups such as online and leveraging bagging. $k$-nearest neighbours is also a popular choice, with most extensions dealing with its inherent performance limitations over a potentially infinite stream. At the same time, gradient descent methods are becoming increasingly popular, owing in part to the successes of deep learning. Although deep neural networks can learn incrementally, they have so far proved too sensitive to hyper-parameter options and initial conditions to be considered an effective 'off-the-shelf' data-stream solution. In this work, we look at combinations of Hoeffding trees, nearest neighbour, and gradient descent methods with a streaming preprocessing approach in the form of a random feature functions filter for additional predictive power. We further extend the investigation to implementing methods on GPUs, which we test on some large real-world datasets, and show the benefits of using GPUs for data-stream learning due to their high scalability. Our empirical evaluation yields positive results for the novel approaches that we experiment with, highlights important issues, and sheds light on promising future directions in approaches to data-stream classification.
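To make the idea of a random feature functions filter in front of an incremental learner more concrete, here is a minimal pure-Python sketch. It is not the authors' implementation: the abstract does not specify the form of the filter, so the fixed Gaussian projection with a ReLU activation, the online logistic regression standing in for the gradient descent method, the synthetic XOR-like stream, the test-then-train evaluation loop, and all class and function names are assumptions chosen for illustration.

```python
import math
import random


class RandomFeatureFilter:
    """Assumed filter: fixed random projection plus ReLU; weights are never trained."""

    def __init__(self, n_inputs, n_features, seed=0):
        rng = random.Random(seed)
        self.weights = [[rng.gauss(0.0, 1.0) for _ in range(n_inputs)]
                        for _ in range(n_features)]
        self.biases = [rng.gauss(0.0, 1.0) for _ in range(n_features)]

    def transform(self, x):
        # Map one instance into the random feature space.
        out = []
        for w, b in zip(self.weights, self.biases):
            a = sum(wi * xi for wi, xi in zip(w, x)) + b
            out.append(max(0.0, a))
        return out


class OnlineLogisticRegression:
    """Binary logistic regression updated one instance at a time by SGD."""

    def __init__(self, n_features, lr=0.05):
        self.w = [0.0] * n_features
        self.b = 0.0
        self.lr = lr

    def predict_proba(self, z):
        s = sum(wi * zi for wi, zi in zip(self.w, z)) + self.b
        s = max(-30.0, min(30.0, s))  # clamp to keep exp() well behaved
        return 1.0 / (1.0 + math.exp(-s))

    def predict(self, z):
        return 1 if self.predict_proba(z) >= 0.5 else 0

    def learn_one(self, z, y):
        # Gradient of the log loss with respect to the linear score.
        err = self.predict_proba(z) - y
        self.w = [wi - self.lr * err * zi for wi, zi in zip(self.w, z)]
        self.b -= self.lr * err


def xor_stream(n, seed=1):
    """Hypothetical synthetic stream: label is 1 when the two signs differ."""
    rng = random.Random(seed)
    for _ in range(n):
        x = [rng.uniform(-1, 1), rng.uniform(-1, 1)]
        yield x, int((x[0] > 0) != (x[1] > 0))


def prequential_accuracy(n=5000, n_features=50):
    """Test-then-train loop: predict each instance first, then learn from it."""
    filt = RandomFeatureFilter(2, n_features)
    clf = OnlineLogisticRegression(n_features)
    correct = 0
    for x, y in xor_stream(n):
        z = filt.transform(x)
        correct += int(clf.predict(z) == y)
        clf.learn_one(z, y)
    return correct / n


if __name__ == "__main__":
    print(f"prequential accuracy: {prequential_accuracy():.3f}")
```

The filter is stateless after construction, so in this sketch it can sit in front of any incremental learner that accepts a feature vector; swapping the logistic regression for a nearest-neighbour or tree-based learner would only change the `clf` object.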

Tags: Computer science, CUDA, Deep learning, Nearest neighbour, Neural networks, nVidia, Tesla K40

November 8, 2015 by hgpu
