An Information-Theoretic Quantification of Discrimination with Exempt Features
The needs of a business (e.g., hiring) may require the use of certain features that are so critical that any discrimination arising from them should be exempted. In this work, we propose a novel information-theoretic decomposition of the total discrimination (in a counterfactual sense) into a non-exempt component, which quantifies the part of the discrimination that cannot be accounted for by the critical features, and an exempt component, which quantifies the remaining discrimination. Our decomposition enables selective removal of the non-exempt component if desired. We arrive at this decomposition through examples and counterexamples, which first lead us to a set of desirable properties that any measure of non-exempt discrimination should satisfy. We then demonstrate that our proposed quantification of non-exempt discrimination satisfies all of these properties. The decomposition leverages a body of work from information theory called Partial Information Decomposition (PID). We also obtain an impossibility result showing that no observational measure of non-exempt discrimination can satisfy all of the desired properties. This leads us to relax our goals and examine alternative observational measures that satisfy only some of these properties. Finally, we perform a case study using one observational measure to show how one might train a model while exempting discrimination due to critical features.
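To make the notion of an observational measure concrete, the following is a minimal sketch of one candidate quantity computed purely from a joint distribution. The abstract does not say which observational measure is used, so the choice of conditional mutual information I(Z; Ŷ | X_c) is an assumption made here for illustration, as are the symbols Z (protected attribute), Ŷ (model output), and X_c (critical features), the function name, and the toy distributions.

```python
import math
from collections import defaultdict


def conditional_mutual_information(joint):
    """Return I(Z; Yhat | Xc) in bits.

    `joint` maps (z, yhat, xc) tuples to probabilities that sum to 1.
    This is an illustrative observational quantity, not necessarily
    the measure proposed in the paper.
    """
    p_x = defaultdict(float)   # marginal of Xc
    p_zx = defaultdict(float)  # marginal of (Z, Xc)
    p_yx = defaultdict(float)  # marginal of (Yhat, Xc)
    for (z, y, x), p in joint.items():
        p_x[x] += p
        p_zx[(z, x)] += p
        p_yx[(y, x)] += p

    cmi = 0.0
    for (z, y, x), p in joint.items():
        if p > 0:
            # p(z,y|x) / (p(z|x) p(y|x)) = p(z,y,x) p(x) / (p(z,x) p(y,x))
            cmi += p * math.log2(p * p_x[x] / (p_zx[(z, x)] * p_yx[(y, x)]))
    return cmi


# Toy case 1 (hypothetical): Yhat = Xc = Z, so the output depends on Z
# only through the critical feature.
through_critical = {(0, 0, 0): 0.5, (1, 1, 1): 0.5}

# Toy case 2 (hypothetical): Yhat = Z, with Xc an independent fair bit,
# so the critical feature explains none of the dependence on Z.
direct = {(z, z, x): 0.25 for z in (0, 1) for x in (0, 1)}

cmi_through_critical = conditional_mutual_information(through_critical)
cmi_direct = conditional_mutual_information(direct)
```

In the first toy case the quantity is zero, and in the second it is one bit. These two cases only illustrate the intuition of exempting dependence that passes through critical features. In line with the impossibility result stated above, a purely observational quantity of this kind should not be expected to satisfy every desired property.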