The Boundary Forest Algorithm for Online Supervised and Unsupervised Learning

Charles Mathy; Nate Derbinsky; Jose Bento; Jonathan Rosenthal; Jonathan Yedidia

doi:10.1609/aaai.v29i1.9622

Authors

Charles Mathy, Disney Research Boston
Nate Derbinsky, Wentworth Institute of Technology
Jose Bento, Boston College
Jonathan Rosenthal, Disney Research Boston
Jonathan Yedidia, Disney Research Boston

DOI:

https://doi.org/10.1609/aaai.v29i1.9622

Keywords:

Machine learning, Real time, tree-based search, Classification, Regression, Retrieval

Abstract

We describe a new instance-based learning algorithm, called the Boundary Forest (BF) algorithm, that can be used for supervised and unsupervised learning. The algorithm builds a forest of trees whose nodes store previously seen examples. It can be shown data points one at a time and updates itself incrementally, hence it is naturally online. Few instance-based algorithms have this property while also being fast, which the BF is. This is crucial for applications where one needs to respond to input data in real time. The number of children of each node is not set beforehand but is obtained from the training procedure, which makes the algorithm very flexible with regard to the data manifolds it can learn. We test its generalization performance and speed on a range of benchmark datasets and detail the settings in which it outperforms the state of the art. Empirically, we find that training time scales as O(DN log(N)) and testing time as O(D log(N)), where D is the dimensionality and N the amount of data.
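The abstract does not spell out the traversal or update rule, so the following is only an illustrative sketch of the idea of trees whose nodes store examples and grow online. Several details are assumptions made for this example and are not taken from the text above: a greedy descent to the closest child under Euclidean distance, storing a new example only when the tree misclassifies it, one tree rooted at each of the first few examples, and a plain majority vote across trees. The class and method names (`BoundaryTree`, `BoundaryForest`, `train`, `predict`) are hypothetical.

```python
import math
from collections import Counter


class BoundaryTree:
    """Toy tree whose nodes store previously seen (point, label) examples."""

    def __init__(self, point, label):
        self.points = [tuple(point)]
        self.labels = [label]
        self.children = [[]]  # children[i] holds the child node ids of node i

    def _nearest_node(self, query):
        # Assumed rule: greedy descent. Compare the current node with its
        # children and move to the closest child until the current node wins.
        current = 0
        while True:
            candidates = [current] + self.children[current]
            best = min(candidates, key=lambda i: math.dist(self.points[i], query))
            if best == current:
                return current
            current = best

    def predict(self, query):
        return self.labels[self._nearest_node(query)]

    def train(self, point, label):
        # Assumed rule: store the example only if the tree gets it wrong,
        # so the number of children per node is decided by the data.
        node = self._nearest_node(point)
        if self.labels[node] != label:
            self.points.append(tuple(point))
            self.labels.append(label)
            self.children.append([])
            self.children[node].append(len(self.points) - 1)


class BoundaryForest:
    """Toy forest: each of the first n_trees examples roots its own tree."""

    def __init__(self, n_trees=3):
        self.n_trees = n_trees
        self.trees = []

    def train(self, point, label):
        for tree in self.trees:  # incremental update, one example at a time
            tree.train(point, label)
        if len(self.trees) < self.n_trees:
            self.trees.append(BoundaryTree(point, label))

    def predict(self, query):
        if not self.trees:
            raise ValueError("the forest has not seen any examples yet")
        votes = Counter(tree.predict(query) for tree in self.trees)
        return votes.most_common(1)[0][0]  # assumed: simple majority vote


# Made-up toy data: two well-separated clusters in two dimensions.
data = [((0.0, 0.0), "a"), ((5.0, 5.0), "b"), ((0.5, 0.2), "a"),
        ((4.8, 5.3), "b"), ((0.1, 0.9), "a"), ((5.5, 4.6), "b")]
forest = BoundaryForest(n_trees=3)
for point, label in data:
    forest.train(point, label)
print(forest.predict((0.3, 0.3)), forest.predict((5.1, 5.0)))
```

In this sketch each query costs one distance computation per candidate node along a single root-to-leaf path, and each distance costs O(D). That is the intuition behind a per-query cost that grows with tree depth rather than with N, although the toy code makes no claim to reproduce the reported scaling.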
