Robustness Can Be Cheap: A Highly Efficient Approach to Discover Outliers under High Outlier Ratios

Siqi Wang; En Zhu; Xiping Hu; Xinwang Liu; Qiang Liu; Jianping Yin; Fei Wang

doi:10.1609/aaai.v33i01.33015313

Authors

Siqi Wang, National University of Defense Technology
En Zhu, National University of Defense Technology
Xiping Hu, Chinese Academy of Sciences
Xinwang Liu, National University of Defense Technology
Qiang Liu, National University of Defense Technology
Jianping Yin, National University of Defense Technology
Fei Wang, Cornell University

DOI:

https://doi.org/10.1609/aaai.v33i01.33015313

Abstract

Efficiently detecting outliers in massive data with a high outlier ratio is challenging, yet the problem has not been explicitly discussed. In this setting, existing methods either suffer from poor robustness or require expensive computation. This paper proposes a Low-rank based Efficient Outlier Detection (LEOD) framework that achieves favorable robustness against high outlier ratios at much lower computational cost. Three aspects of LEOD are worth highlighting: (1) The framework exploits the low-rank structure embedded in the similarity matrix and treats inliers and outliers equally on the basis of this structure, which makes it possible to obtain satisfactory robustness at low computational cost; (2) A novel re-weighting algorithm is derived as a new general solution to the constrained eigenvalue problem, which is a major bottleneck of the optimization process. In place of the high space and time complexity (O((2n)²) and O((2n)³), respectively) required by the classic solution, the algorithm needs only O(n) space and optimizes faster in the experiments; (3) An alternative formulation is proposed to further accelerate the solution process, for which a cheap closed-form solution can be obtained. Experiments show that LEOD achieves strong robustness under outlier ratios from 20% to 60%, while being up to 100 times more memory efficient and 1000 times faster than the previous counterpart that attains comparable performance. The code of LEOD is publicly available at https://github.com/demonzyj56/LEOD.
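
To make the space-complexity gap in aspect (2) concrete, the following minimal sketch compares the memory needed to store one dense 2n × 2n matrix, as implied by the O((2n)²) figure for the classic solution, with that of a few length-n vectors, as implied by O(n). It is back-of-the-envelope arithmetic only and not part of LEOD: the helper names, the 8-byte float size, and the choice of four working vectors are assumptions made for illustration, since the abstract does not state the constants.

```python
def dense_matrix_bytes(n: int, itemsize: int = 8) -> int:
    """Bytes needed for one dense (2n x 2n) matrix of `itemsize`-byte floats."""
    side = 2 * n
    return side * side * itemsize


def vector_workspace_bytes(n: int, num_vectors: int = 4, itemsize: int = 8) -> int:
    """Bytes needed for `num_vectors` length-n vectors.

    `num_vectors` is an assumed constant; the abstract only states O(n).
    """
    return num_vectors * n * itemsize


if __name__ == "__main__":
    # Compare the two footprints for a few (hypothetical) dataset sizes.
    for n in (1_000, 10_000, 100_000):
        dense = dense_matrix_bytes(n)
        linear = vector_workspace_bytes(n)
        print(
            f"n={n:>7}: dense 2n x 2n = {dense / 1e9:10.3f} GB, "
            f"O(n) workspace = {linear / 1e6:8.3f} MB"
        )
```

Under these assumptions, a dataset of n = 100,000 points would need about 320 GB for the dense matrix alone, while the linear workspace stays in the megabyte range, which is why quadratic storage becomes the limiting factor on massive data.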
