POLAR: A Per-User Association Test in Embedding Space
DOI: https://doi.org/10.1609/icwsm.v20i1.42636
Abstract
Most intrinsic association probes operate at the word, sentence, or corpus level, obscuring author-level variation. We present COMPASS (COsine-Based Measure of Per-User Association with Semantic Subspaces), a per-user lexical association test that runs in the embedding space of a lightly adapted masked language model. Authors are represented by private deterministic tokens; COMPASS projects these vectors onto curated lexical axes and reports standardized effects with permutation p-values and Benjamini-Hochberg control. On a balanced bot-human Twitter benchmark, COMPASS cleanly separates LLM-driven bots from organic accounts; on an extremist forum, it quantifies strong alignment with slur lexicons and reveals rightward drift over time. The method is modular to new attribute sets and provides concise, per-author diagnostics for computational social science.
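The pipeline named in the abstract (cosine projection onto a lexical axis, a standardized effect, a permutation p-value, and Benjamini-Hochberg control) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the exact statistic, so the effect size below is a WEAT-style difference of mean cosines divided by the pooled standard deviation. The function names, the two-pole axis layout, and the toy vectors are all hypothetical.

```python
import math
import random
from statistics import mean, stdev


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def user_effect(user_vec, pole_a, pole_b):
    """ASSUMED statistic: WEAT-style standardized difference of mean cosines."""
    sims_a = [cosine(user_vec, w) for w in pole_a]
    sims_b = [cosine(user_vec, w) for w in pole_b]
    return (mean(sims_a) - mean(sims_b)) / stdev(sims_a + sims_b)


def permutation_p(user_vec, pole_a, pole_b, n_perm=2000, seed=0):
    """Two-sided p-value from shuffling which words belong to which pole."""
    rng = random.Random(seed)
    sims = [cosine(user_vec, w) for w in pole_a + pole_b]
    n_a = len(pole_a)
    observed = abs(mean(sims[:n_a]) - mean(sims[n_a:]))
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(sims)
        if abs(mean(sims[:n_a]) - mean(sims[n_a:])) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly positive.
    return (hits + 1) / (n_perm + 1)


def benjamini_hochberg(pvals, alpha=0.05):
    """Return a reject flag per hypothesis under the BH step-up procedure."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0  # largest rank whose p-value is under its BH threshold
    for rank, i in enumerate(order, start=1):
        if pvals[i] <= alpha * rank / m:
            k = rank
    reject = [False] * m
    for i in order[:k]:
        reject[i] = True
    return reject


# Toy 2-D vectors standing in for author-token and lexicon embeddings.
user = [1.0, 0.0]
pole_a = [[1.0, 0.0], [1.0, 0.1], [0.9, 0.2]]
pole_b = [[0.0, 1.0], [0.1, 1.0], [0.2, 0.9]]
effect = user_effect(user, pole_a, pole_b)
p_value = permutation_p(user, pole_a, pole_b, n_perm=500)
flags = benjamini_hochberg([0.9, 0.001, 0.04, 0.02], alpha=0.05)
```

In a per-user setting, one p-value would be computed per author (or per author and axis) and the whole list passed to `benjamini_hochberg`, so that the false discovery rate is controlled across authors rather than per test.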
Published
2026-05-25
How to Cite
Bento, P., Buzelin, A., Chagas, A., Aquino, Y., Estanislau, V., Malaquias, S., … Meira Jr., W. (2026). POLAR: A Per-User Association Test in Embedding Space. Proceedings of the International AAAI Conference on Web and Social Media, 20(1), 250–261. https://doi.org/10.1609/icwsm.v20i1.42636
Section
Full Papers