Accelerated Combinatorial Search for Outlier Detection with Provable Bound on Sub-Optimality
DOI:
https://doi.org/10.1609/aaai.v35i14.17475Keywords:
Heuristic Search, Dimensionality Reduction/Feature Selection, Anomaly/Outlier Detection, Unsupervised & Self-Supervised LearningAbstract
Outliers negatively affect the accuracy of data analysis. In this paper we are concerned with their influence on the accuracy of Principal Component Analysis (PCA). Algorithms that attempt to detect outliers and remove them from the data prior to applying PCA are sometimes called Robust PCA, or Robust Subspace Recovery algorithms. We propose a new algorithm for outlier detection that combines two ideas. The first is "chunk recursive elimination" that was used effectively to accelerate feature selection, and the second is combinatorial search, in a setting similar to A*. Our main result is showing how to combine these two ideas. One variant of our algorithm is guaranteed to compute an optimal solution according to some natural criteria, but its running time makes it impractical for large datasets. Other variants are much faster and come with provable bounds on sub-optimality. Experimental results show the effectiveness of the proposed approach.Downloads
Published
2021-05-18
How to Cite
Wan, G., & Schweitzer, H. (2021). Accelerated Combinatorial Search for Outlier Detection with Provable Bound on Sub-Optimality. Proceedings of the AAAI Conference on Artificial Intelligence, 35(14), 12436-12444. https://doi.org/10.1609/aaai.v35i14.17475
Issue
Section
AAAI Technical Track on Search and Optimization