CrowdMask: Using Crowds to Preserve Privacy in Crowd-Powered Systems via Progressive Filtering
Keywords: crowdsourcing, human computation, privacy
Crowd-powered systems leverage human intelligence to go beyond the capabilities of automated systems, but they also introduce privacy and security concerns because unknown people must view the data that the system processes. While automated approaches cannot robustly filter private information from this data, people can do so, provided the risk of their viewing the data is mitigated. We present a crowd-powered approach to masking private content that divides the data into small segments and distributes them to crowd workers, so that individual workers can identify potentially private content without being able to fully view it themselves. We introduce a novel pyramid workflow for segmentation that uses segments at multiple levels of granularity to overcome problems with fixed-size approaches. We implement our approach in CrowdMask, a system that shows workers images with potentially sensitive content in progressively larger, more identifiable segments and masks portions of the image as soon as a risk is identified. Our experiments with 4134 Mechanical Turk workers show that CrowdMask can effectively mask private content in images without revealing sensitive content to constituent workers, while still enabling future systems to use the filtered result.
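As a rough illustration of the progressive filtering idea described above, the following Python sketch walks an image through tiles of increasing size and masks a tile as soon as it is flagged. It is a minimal sketch under stated assumptions, not CrowdMask's actual implementation: the function names, the square grid, the doubling of tile size between levels, and the `is_risky` callback (a hypothetical stand-in for a crowd worker's judgment on one segment) are all introduced here for illustration.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom); right/bottom exclusive
Mask = List[List[bool]]          # mask[y][x] is True where the pixel is hidden


def tiles(width: int, height: int, size: int) -> List[Box]:
    """Split a width x height image into tiles of side `size`.

    Tiles on the right and bottom edges may be smaller.
    """
    return [
        (x, y, min(x + size, width), min(y + size, height))
        for y in range(0, height, size)
        for x in range(0, width, size)
    ]


def progressive_mask(
    width: int,
    height: int,
    base_size: int,
    levels: int,
    is_risky: Callable[[Box, Mask], bool],  # hypothetical crowd judgment
) -> Mask:
    """Mask flagged tiles, moving from small segments to larger ones.

    Assumption: the tile side doubles at each level. Regions masked at an
    earlier level stay hidden when larger segments are shown later.
    """
    mask: Mask = [[False] * width for _ in range(height)]
    size = base_size
    for _ in range(levels):
        for box in tiles(width, height, size):
            left, top, right, bottom = box
            cells = [(x, y) for y in range(top, bottom) for x in range(left, right)]
            if all(mask[y][x] for x, y in cells):
                continue  # nothing left to show in this tile
            if is_risky(box, mask):
                for x, y in cells:
                    mask[y][x] = True  # hide the tile as soon as it is flagged
        size *= 2
    return mask
```

In this sketch, content that straddles the boundaries of the smallest tiles can go unflagged at the first level and still be caught once a larger tile shows enough of it, which is the kind of failure of fixed-size segmentation that a multi-level workflow is meant to address.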