More Accurate Learning of k-DNF Reference Classes
In machine learning, predictors trained on a given data distribution are usually guaranteed to perform well, on average, on further examples from the same distribution. Achieving this average-case guarantee may often involve disregarding or diminishing predictive power on atypical examples. In more extreme cases, a data distribution may be a mixture of heterogeneous populations, each individually “atypical,” and the kind of simple predictors we can train may be unable to fit all of these populations simultaneously. In such cases, we may wish to make predictions for an atypical point by selecting a suitable reference class for that point: a subset of the data that is “more similar” to the given query point in an appropriate sense. Closely related tasks also arise in applications such as diagnosis and explaining the output of classifiers. We present new algorithms for computing k-DNF reference classes and establish much stronger approximation guarantees for their error rates.
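To make the notion of a k-DNF reference class concrete, the following minimal sketch may help. It uses only the standard definition of a k-DNF (a disjunction of terms, each a conjunction of at most k literals) and treats the reference class as the set of examples that satisfy the formula, which must include the query point. The representation, the function names, the toy data, and the majority-vote predictor are all illustrative assumptions; this is not the algorithm developed in this work, which concerns how to find such a formula.

```python
from typing import List, Sequence, Tuple

# Hypothetical representation, chosen for illustration only.
Literal = Tuple[int, bool]   # (attribute index, required Boolean value)
Term = Tuple[Literal, ...]   # conjunction of literals
KDNF = List[Term]            # disjunction of terms
Example = Tuple[Sequence[bool], bool]  # (attributes, label)


def is_k_dnf(formula: KDNF, k: int) -> bool:
    """Check that every term uses at most k literals."""
    return all(len(term) <= k for term in formula)


def satisfies(formula: KDNF, x: Sequence[bool]) -> bool:
    """A point satisfies a DNF if it satisfies at least one term."""
    return any(all(x[i] == v for i, v in term) for term in formula)


def reference_class(formula: KDNF, data: List[Example]) -> List[Example]:
    """Subset of the data selected by the formula."""
    return [(x, y) for x, y in data if satisfies(formula, x)]


def majority_error(subset: List[Example]) -> float:
    """Error rate of a constant majority-vote predictor (a stand-in
    for whatever simple predictor is trained on the subset)."""
    if not subset:
        raise ValueError("empty reference class")
    labels = [y for _, y in subset]
    majority = 2 * sum(labels) >= len(labels)
    return sum(1 for y in labels if y != majority) / len(labels)


# Toy data (made up for this sketch): three Boolean attributes and a label.
data: List[Example] = [
    ((True, True, False), True),
    ((True, True, True), True),
    ((True, False, False), False),
    ((False, True, False), False),
    ((False, False, True), True),
    ((False, False, False), False),
]

query = (True, True, False)
formula: KDNF = [((0, True), (1, True))]  # x0 AND x1, a 2-DNF with one term

assert is_k_dnf(formula, k=2)
assert satisfies(formula, query)  # the class must contain the query point

subset = reference_class(formula, data)
print(len(subset), majority_error(subset), majority_error(data))
```

On this toy data the constant predictor is wrong on half of the full data set but makes no errors on the two examples selected by the formula, which is the kind of gap a good reference class is meant to exploit.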