Proceedings of the International AAAI Conference on Web and Social Media

Decentralised Moderation for Interoperable Social Networks: A Conversation-Based Approach for Pleroma and the Fediverse

2024-05-28T04:59:22-07:00

The recent development of decentralised and interoperable social networks (such as the "fediverse") creates new challenges for content moderators. This is because millions of posts generated on one server can easily "spread" to another, even if the recipient server has very different moderation policies. An obvious solution would be to leverage moderation tools to automatically tag (and filter) posts that contravene moderation policies, e.g. related to toxic speech. Recent work has exploited the conversational context of a post to improve this automatic tagging, e.g. using the replies to a post to help classify if it contains toxic speech. This has shown particular potential in environments with large training sets that contain complete conversations. This, however, creates challenges in a decentralised context, as a single conversation may be fragmented across multiple servers. Thus, each server only has a partial view of an entire conversation because conversations are often federated across servers in a non-synchronized fashion. To address this, we propose a decentralised conversation-aware content moderation approach suitable for the fediverse. Our approach employs a graph deep learning model (GraphNLI) trained locally on each server. The model is trained on local data and combines post and conversational information captured through random walks to detect toxicity. We evaluate our approach with data from Pleroma, a major decentralised and interoperable micro-blogging network containing 2 million conversations. When trained exclusively on local post information, our model effectively detects toxicity on larger instances (0.8837 macro-F1). Yet, we show that this approach does not perform well on smaller instances that do not possess sufficient local training data. Thus, in cases where a server contains insufficient data, we strategically retrieve information (posts or model parameters) from other servers to reconstruct larger conversations and improve results. With this, we show that we can attain a macro-F1 of 0.8826. Our approach has considerable scope to improve moderation in decentralised and interoperable social networks such as Pleroma or Mastodon.
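
For readers unfamiliar with the metric reported in this abstract, macro-F1 is the unweighted mean of the per-class F1 scores, so the minority (toxic) class counts as much as the majority class. The following is a minimal sketch of that computation only; the function names and the toy labels are illustrative assumptions and are not taken from the paper.

```python
def f1_for_class(y_true, y_pred, cls):
    """F1 score for a single class, treating `cls` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == cls and p == cls)
    fp = sum(1 for t, p in pairs if t != cls and p == cls)
    fn = sum(1 for t, p in pairs if t == cls and p != cls)
    denom = 2 * tp + fp + fn
    # By convention here, a class with no true or predicted members scores 0.
    return 2 * tp / denom if denom else 0.0


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all labels that occur."""
    classes = sorted(set(y_true) | set(y_pred))
    return sum(f1_for_class(y_true, y_pred, c) for c in classes) / len(classes)


# Hypothetical toy labels: 1 = toxic, 0 = non-toxic.
y_true = [1, 1, 0, 0]
y_pred = [1, 0, 0, 0]
score = macro_f1(y_true, y_pred)
```

On these toy labels the toxic class has an F1 of 2/3 and the non-toxic class an F1 of 0.8, so the macro average sits between the two rather than being dominated by whichever class is more frequent.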

Analyzing the Stance of Facebook Posts on Abortion Considering State-Level Health and Social Compositions

2024-05-28T04:59:24-07:00

Abortion remains one of the most controversial topics, especially after the overturning of the Roe v. Wade ruling in the United States. Previous literature showed that the illegality of abortion could have serious consequences, as women might seek unsafe pregnancy terminations leading to increased maternal mortality rates and negative effects on their reproductive health. Therefore, the stances of the abortion-related Facebook posts were analyzed at the state level in the United States from May 4 until June 30, 2022, right after the Supreme Court’s decision was disclosed. In more detail, a pre-trained Transformer architecture-based model was fine-tuned on a manually labeled training set to obtain a stance detection model suitable for the collected dataset. Afterward, we employed appropriate statistical tests to examine the relationships between public opinion regarding abortion, abortion legality, political leaning, and factors measuring the overall population’s health, health knowledge, and vulnerability per state. We found that infant mortality rate, political affiliation, abortion rates, and abortion legality are associated with stances toward abortion at the state level in the US. While aligned with existing literature, these findings indicate how public opinion, laws, and women’s and infants’ health are related, as well as how these relationships can be demonstrated by using social media data.

Users’ Behavioral and Emotional Response to Toxicity in Twitter Conversations

2024-05-28T04:59:25-07:00

Prior works have shown connections between online toxicity attacks, such as harassment, cyberbullying, and hate speech, and the subsequent increase in offline violence, as well as negative psychological effects on victims. These correlations are primarily identified through user studies conducted via virtual environments, simulations, and questionnaires. However, no work has investigated how, in practice and authentically, people react to online toxicity both emotionally, showing anger, anxiety, and sadness, and behaviorally in terms of engaging with and responding to toxicity instigators, considering conversations as a whole and the relation between emotions and behaviors. This data-driven study investigates the effect of toxicity on Twitter users' behaviors and emotions considering confounding factors, such as account identifiability, activity, and conversation's structure and topic. We collected about 80K Twitter conversations and identified those with and without toxic replies. Performing statistical tests along with propensity score matching, we investigated the causal association of receiving toxicity and users' responses. We found that authors of conversations with toxic replies are more likely to engage in conversations, reply in a toxic way, and unfollow toxicity instigators. In terms of users' emotional responses, we found that sadness and anger after the first toxic reply are more likely to increase as the amount of toxicity increases. These findings not only emphasize the negative emotional and behavioral effects of online toxicity on social media users but also, as demonstrated in this paper, can be utilized to build prediction models for users' reactions, which could then aid the implementation of proactive detection and intervention measures helping users in such situations.

From Isolation to Desolation: Investigating Self-Harm Discussions in Incel Communities

2024-05-28T04:59:27-07:00

Incel communities have recently attracted the public's interest mainly due to their high degree of extreme views and involvement in real-world violence. A common theme in Incel communities is self-harm discussions. Despite this, beyond small-scale qualitative analyses of self-harm discussions in Incel communities, we lack a large-scale quantitative understanding of how Incels discuss self-harm and how it differs from mainstream communities. In this work, we aim to demystify self-harm discussions in Incel communities using a data-driven approach and understand how Incels differ from mainstream communities. We use a dataset of 6.4M posts from 18 Incel subreddits and 2.4M posts from an Incel forum, as well as 5.8M posts from two mainstream subreddits discussing mental health. Using word embedding approaches, temporal analyses, topic modeling, and qualitative analysis, we shed light on self-harm discussions in Incel and mainstream communities and their evolution over time. We find substantial differences in the language related to self-harm deployed among the communities; we find that Incels use niche terms related to self-harm, which is not the case in mainstream communities. We observe that over time, language related to self-harm evolves considerably more among Incels than in mainstream communities. Also, we observe that negative perception of their physical appearance is the most recurrent theme in self-harm conversations for Incels, which does not feature in mainstream communities. Finally, by analyzing social factors, we find that Substance abuse is the social factor most closely associated with self-harm in Incel and mainstream communities and that Physical Appearance, over time, is becoming increasingly closely related to self-harm discussions in Incel communities.

Consequences of Conflicts in Online Conversations

2024-05-28T04:59:28-07:00

Interpersonal conflicts occur frequently in both offline and online groups, with conditions for conflict especially ripe online. This research attempts to understand the consequences of online group conflict and reporting it to group administrators, both for the protagonists in the conflict and observers. If group conflict is aversive, then group members should reduce their group participation after observing conflict. Theories of imitation and behavioral mimicry suggest that even onlookers will exhibit more conflict and negative language after observing conflict conversations in their group. In contrast, theories of deterrence suggest that both the instigator of the conflict and onlookers will reduce their conflict and onlookers might even increase their engagement if conflicts are reported to group administrators. The current study uses de-identified and aggregated data from Facebook group conversations and Mahalanobis distance matching to test these ideas. Results are consistent with the hypothesis that conflict in group conversations reduces engagement within the group and increases the amount of conflict and the negativity of language users express in the group. However, inconsistent with deterrence theories, conflict and language negativity increase and group engagement decreases when conflict is reported to group administrators.

Curated and Asymmetric Exposure: A Case Study of Partisan Talk during COVID on Twitter

2024-05-28T04:59:30-07:00

Social media has been at the center of discussions about political polarization in the United States. However, scholars are actively debating both the scale of political polarization online, and how important online polarization is to the offline world. One question at the center of this debate is what interactions across parties look like online, and in particular 1) whether increasing the number of such interactions is likely to increase or reduce polarization, and 2) what technological affordances may make it more likely that these cross-party interactions benefit, rather than detract from, existing political challenges. The present work aims to provide insights into the latter; that is, we focus on providing a better understanding of how a set of 400,000 partisan users on a particular social media platform, Twitter, used the platform's affordances to interact within and across parties in a large dataset of tweets about COVID in 2021. Our findings suggest that Republican use of cross-party interaction was both more potent and potentially more strategic during COVID, that cross-party interaction was driven heavily by a small set of users and conversations, and that there exist non-obvious indirect pathways to cross-party exposure when different modes of interaction are chained together (especially retweets of quotes). These findings have implications beyond Twitter, we believe, in understanding how affordances of platforms can help to shape partisan exposure and interaction.

MultiFOLD: Multi-source Domain Adaption for Offensive Language Detection

2024-05-28T04:59:31-07:00

Automatic offensive language detection remains challenging, and is a crucial part of preserving the openness of digital spaces, which are an integral part of our everyday experience. The ever-growing forms of offensive online content make traditional supervised approaches harder to scale due to the financial and psychological costs incurred by collecting human annotations. In this work, we propose a domain adaptation framework for offensive language detection, MultiFOLD, which learns and adapts from multiple existing data sets (or source domains) to an unlabeled target domain. Under the hood, a curriculum learning algorithm is employed that kicks off learning with the instances most similar to the target domain while gradually expanding to more distant instances. The proposed model is trained with a standard task-specific loss and a domain adversarial objective which aims to minimize the language distinctions across the multiple sources and the target, allowing the classifier to distinguish offensiveness rather than domain. Our experiments on six publicly available data sets demonstrate the effectiveness of MultiFOLD. Relative improvement in F1 of 0.5% (WOAH) to 29.7% (ICWSM) is found across five out of the six datasets compared to the state-of-the-art domain adaptation baseline BERT-DAA, resulting in an average of 6% relative F1-score gain.
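
The gains in this abstract are stated as relative improvements, i.e. the change in F1 divided by the baseline F1, not as differences in percentage points. A minimal sketch of that arithmetic follows; the function name and the example scores are hypothetical and are not results from the paper.

```python
def relative_gain_percent(new_score, baseline_score):
    """Relative improvement of new_score over baseline_score, in percent."""
    if baseline_score <= 0:
        raise ValueError("baseline_score must be positive")
    return (new_score - baseline_score) / baseline_score * 100.0


# Hypothetical scores: a rise from 0.60 to 0.66 F1 is 6 points absolute,
# but a 10% relative gain because it is measured against the baseline.
gain = relative_gain_percent(0.66, 0.60)
```

The distinction matters when comparing datasets: the same absolute gain yields a larger relative improvement where the baseline score is low.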

Orphan Articles: The Dark Matter of Wikipedia

2024-05-28T04:59:32-07:00

With 60M articles in more than 300 language versions, Wikipedia is the largest platform for open and freely accessible knowledge. While the available content has been growing continuously at a rate of around 200K new articles each month, very little attention has been paid to the discoverability of the content. One crucial aspect of discoverability is the integration of hyperlinks into the network so the articles are visible to readers navigating Wikipedia. To understand this phenomenon, we conduct the first systematic study of orphan articles, which are articles without any incoming links from other Wikipedia articles, across 319 different language versions of Wikipedia. We find that a surprisingly large extent of content, roughly 15% (8.8M) of all articles, is de facto invisible to readers navigating Wikipedia, and thus, rightfully term orphan articles as the dark matter of Wikipedia. We also provide causal evidence through a quasi-experiment that adding new incoming links to orphans (de-orphanization) leads to a statistically significant increase in their visibility in terms of the number of pageviews. We further highlight the challenges faced by editors for de-orphanizing articles, demonstrate the need to support them in addressing this issue, and provide potential solutions for developing automated tools based on cross-lingual approaches. Overall, our work not only unravels a key limitation in the link structure of Wikipedia and quantitatively assesses its impact but also provides a new perspective on the challenges of maintenance associated with content creation at scale in Wikipedia.
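
The definition used in this abstract, an article with no incoming links from other articles, can be illustrated on a toy link graph. The sketch below assumes a hypothetical mapping from each article title to the titles it links to; it ignores redirects, namespaces, and other details a real Wikipedia analysis would have to handle, and it is not the authors' pipeline.

```python
def find_orphans(outlinks):
    """Return articles that receive no links from any *other* article.

    `outlinks` maps each article title to the list of titles it links to.
    """
    linked = set()
    for source, targets in outlinks.items():
        for target in targets:
            if target != source:  # a self-link is not a link from another article
                linked.add(target)
    return sorted(title for title in outlinks if title not in linked)


# Hypothetical toy graph: D links out but nothing links to it,
# and E only links to itself.
toy_graph = {
    "A": ["B"],
    "B": ["A", "C"],
    "C": [],
    "D": ["A"],
    "E": ["E"],
}
orphans = find_orphans(toy_graph)
```

In this toy graph, D and E are orphans: both exist and may link out, yet a reader following links from other articles can never reach them.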

You Are a Bot! – Studying the Development of Bot Accusations on Twitter

2024-05-28T04:59:34-07:00

The characterization and detection of bots with their presumed ability to manipulate society on social media platforms have been subject to many research endeavors over the last decade. In the absence of ground truth data (i.e., accounts that are labeled as bots by experts or self-declare their automated nature), researchers interested in the characterization and detection of bots may want to tap into the wisdom of the crowd. But how many people need to accuse another user of being a bot before we can assume that the account is most likely automated? And more importantly, are bot accusations on social media at all a valid signal for the detection of bots? Our research presents the first large-scale study of bot accusations on Twitter and shows how the term bot became an instrument of dehumanization in social media conversations since it is predominantly used to deny the humanness of conversation partners. Consequently, bot accusations on social media should not be naively used as a signal to train or test bot detection models.

Temporal Network Analysis of Email Communication Patterns in a Long Standing Hierarchy

2024-05-28T04:59:36-07:00

An important concept in organisational behaviour is how hierarchy affects the voice of individuals, whereby members of a given organisation exhibit differing power relations based on their hierarchical position. Although there have been prior studies of the relationship between hierarchy and voice, they tend to focus on more qualitative small-scale methods and do not account for structural aspects of the organisation. This paper develops large-scale computational techniques utilising temporal network analysis to measure the effect that organisational hierarchy has on communication patterns throughout an organisation, focusing on the structure of pairwise interactions between individuals. To this end, we focus on one major organisation as a case study: the Internet Engineering Task Force (IETF), a major technical standards development organisation for the Internet. A particularly useful feature of the IETF is a transparent hierarchy, where participants take on explicit roles (e.g., Area Directors, Working Group Chairs), and because its processes are open we have visibility into the communication of people at different hierarchy levels over a long time period. Exploiting this, we utilise a temporal network dataset of 989,911 email interactions among 23,741 participants to study how hierarchy impacts communication patterns. We show that the middle levels of the IETF are growing in terms of their dominance in communications. Higher levels consistently experience a higher proportion of incoming communication than lower levels, with higher levels initiating more communications too. We find that, overall, communication tends to flow "up" the hierarchy more than "down". Finally, we find that communication with higher-levels is associated with future communication more than for lower-levels, which we interpret as "facilitation". We conclude by discussing the implications this has on patterns within the wider IETF and the impact our analysis can have for other organisations.

InstaSynth: Opportunities and Challenges in Generating Synthetic Instagram Data with ChatGPT for Sponsored Content Detection

2024-05-28T04:59:39-07:00

Large Language Models (LLMs) raise concerns about lowering the cost of generating texts that could be used for unethical or illegal purposes, especially on social media. This paper investigates the promise of such models to help enforce legal requirements related to the disclosure of sponsored content online. We investigate the use of LLMs for generating synthetic Instagram captions with two objectives: The first objective (fidelity) is to produce realistic synthetic datasets. For this, we implement content-level and network-level metrics to assess whether synthetic captions are realistic. The second objective (utility) is to create synthetic data useful for sponsored content detection. For this, we evaluate the effectiveness of the generated synthetic data for training classifiers to identify undisclosed advertisements on Instagram. Our investigations show that the objectives of fidelity and utility may conflict and that prompt engineering is a useful but insufficient strategy. Additionally, we find that while individual synthetic posts may appear realistic, collectively they lack diversity, topic connectivity, and realistic user interaction patterns.

The Persuasive Power of Large Language Models

2024-05-28T04:59:40-07:00

The increasing capability of Large Language Models to act as human-like social agents raises two important questions in the area of opinion dynamics. First, whether these agents can generate effective arguments that could be injected into the online discourse to steer the public opinion. Second, whether artificial agents can interact with each other to reproduce dynamics of persuasion typical of human social systems, opening up opportunities for studying synthetic social systems as faithful proxies for opinion dynamics in human populations. To address these questions, we designed a synthetic persuasion dialogue scenario on the topic of climate change, where a 'convincer' agent generates a persuasive argument for a 'skeptic' agent, who subsequently assesses whether the argument changed its internal opinion state. Different types of arguments were generated to incorporate different linguistic dimensions underpinning psycho-linguistic theories of opinion change. We then asked human judges to evaluate the persuasiveness of machine-generated arguments. Arguments that included factual knowledge, markers of trust, expressions of support, and conveyed status were deemed most effective according to both humans and agents, with humans reporting a marked preference for knowledge-based arguments. Our experimental framework lays the groundwork for future in-silico studies of opinion dynamics, and our findings suggest that artificial agents have the potential of playing an important role in collective processes of opinion formation in online social media.

Socio-Linguistic Characteristics of Coordinated Inauthentic Accounts

2024-05-28T04:59:42-07:00

Online manipulation is a pressing concern for democracies, but the actions and strategies of coordinated inauthentic accounts, which have been used to interfere in elections, are not well understood. We analyze a five million-tweet multilingual dataset related to the 2017 French presidential election, when a major information campaign led by Russia called "#MacronLeaks" took place. We utilize heuristics to identify coordinated inauthentic accounts and detect attitudes, concerns and emotions within their tweets, collectively known as socio-linguistic characteristics. We find that coordinated accounts retweet other coordinated accounts far more than expected by chance, while being exceptionally active just before the second round of voting. Concurrently, socio-linguistic characteristics reveal that coordinated accounts share tweets promoting a candidate at three times the rate of non-coordinated accounts. Coordinated account tactics also varied in time to reflect news events and rounds of voting. Our analysis highlights the utility of socio-linguistic characteristics to inform researchers about tactics of coordinated accounts and how these may feed into online social manipulation.

Opinion Market Model: Stemming Far-Right Opinion Spread Using Positive Interventions

2024-05-28T04:59:43-07:00

Online extremism has severe societal consequences, including normalizing hate speech, user radicalization, and increased social divisions. Various mitigation strategies have been explored to address these consequences. One such strategy uses positive interventions: controlled signals that add attention to the opinion ecosystem to boost certain opinions. To evaluate the effectiveness of positive interventions, we introduce the Opinion Market Model (OMM), a two-tier online opinion ecosystem model that considers both inter-opinion interactions and the role of positive interventions. The size of the opinion attention market is modeled in the first tier using the multivariate discrete-time Hawkes process; in the second tier, opinions cooperate and compete for market share, given limited attention using the market share attraction model. We demonstrate the convergence of our proposed estimation scheme on a synthetic dataset. Next, we test OMM on two learning tasks, applying to two real-world datasets to predict attention market shares and uncover latent relationships between online items. The first dataset comprises Facebook and Twitter discussions containing moderate and far-right opinions about bushfires and climate change. The second dataset captures popular VEVO artists' YouTube and Twitter attention volumes. OMM outperforms the state-of-the-art predictive models on both datasets and captures latent cooperation-competition relations. We uncover (1) self- and cross-reinforcement between far-right and moderate opinions on the bushfires and (2) pairwise artist relations that correlate with real-world interactions such as collaborations and long-lasting feuds. Lastly, we use OMM as a testbed for positive interventions and show how media coverage modulates the spread of far-right opinions.
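
The second tier described in this abstract relies on a market share attraction model. In its generic textbook form, such a model gives each competitor a share equal to its attraction divided by the total attraction of all competitors. The sketch below shows only that normalisation step; the attraction values and opinion labels are hypothetical, and the code does not reproduce OMM's actual parameterisation or its Hawkes-process first tier.

```python
def market_shares(attractions):
    """Convert non-negative attraction values into shares that sum to 1."""
    if any(value < 0 for value in attractions.values()):
        raise ValueError("attractions must be non-negative")
    total = sum(attractions.values())
    if total <= 0:
        raise ValueError("at least one attraction must be positive")
    return {name: value / total for name, value in attractions.items()}


# Hypothetical attraction values for two competing opinions.
shares = market_shares({"moderate": 3.0, "far_right": 1.0})
```

Because shares are normalised against a common total, raising the attraction of one opinion necessarily lowers the share of the others, which is how competition for limited attention is expressed in this family of models.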

SLaNT: A Semi-supervised Label Noise-Tolerant Framework for Text Sentiment Analysis

2024-05-28T04:59:45-07:00

The exponential growth of user-generated comment data on social media platforms has greatly promoted research on text sentiment analysis. However, the presence of conflicting sentiments within user comments, known as 'user comments with noisy labels', poses a significant challenge to the reliability of sentiment analysis models. Many current approaches address this issue by either discarding noisy samples or assigning small weights to them during training, but these strategies can lead to sample wastage and reduced model robustness. In this paper, we present SLaNT, a novel semi-supervised label noise-tolerant framework specifically designed for text sentiment analysis. SLaNT employs a four-module pipeline that includes Noisy Data Identification, Data Augmentation, Noisy Data Relabeling, and Re-training. Notably, SLaNT introduces an early stopping strategy to efficiently identify noisy samples. Additionally, to mitigate confirmation bias during the relabeling of noisy data, a unique co-relabeling strategy based on ensemble learning is integrated into SLaNT. Experimental results on four text user comment datasets demonstrate that SLaNT significantly outperforms four selected strong baselines.

The Quiet Power of Social Media: Impact on Fish-Oil Purchases in Iceland during COVID-19

2024-05-28T04:59:46-07:00

The rise of social media has revolutionized communication and the sharing of information and interests, with a significant impact on purchasing behavior. Consumers increasingly rely on social media for product recommendations and reviews, often finding themselves "accidentally influenced" by other users' posts and advice. This study examines the impact of social media in Iceland during the COVID-19 pandemic when there was a surge of posts giving dietary advice to prevent or treat the virus or its symptoms. One example is the rise and fall of fish oil advice. Using netnography and a large-scale dataset from one of the most popular supermarket chains in Iceland, we apply data science methods to analyze sales data, Google search trends, and Twitter posts from 2019 and 2020 to understand the impact of the online world on purchasing behavior in the offline world. Our results show the massive power of social media on people's purchasing behavior, particularly during a pandemic, and provide a comparison of consumer behavior before and during COVID-19.

Detection and Discovery of Misinformation Sources Using Attributed Webgraphs

2024-05-28T04:59:47-07:00

Website reliability labels underpin almost all research in misinformation detection. However, misinformation sources often exhibit transient behavior, which makes many such labeled lists obsolete over time. We demonstrate that Search Engine Optimization (SEO) attributes provide strong signals for predicting news site reliability. We introduce a novel attributed webgraph dataset with labeled news domains and their connections to outlinking and backlinking domains. We demonstrate the success of graph neural networks in detecting news site reliability using these attributed webgraphs, and show that our baseline news site reliability classifier outperforms current SoTA methods on the PoliticalNews dataset, achieving an F1 score of 0.96. Finally, we introduce and evaluate a novel graph-based algorithm for discovering previously unknown misinformation news sources.

Understanding Community Resilience: Quantifying the Effects of Sudden Popularity via Algorithmic Curation

2024-05-28T04:59:49-07:00

The sudden popularity communities gain via algorithmically-curated "trending" or "hot" social media feeds can be beneficial or disruptive. On one hand, increased attention often brings new users and promotes community growth. On the other hand, the unexpected influx of newcomers can burden already overworked moderation teams. To examine the impact of sudden popularity, we studied 6,306 posts that reached Reddit's front page (a feed called r/popular that millions of users browse daily) and the effects of sudden popularity within 1,320 subreddits. We find that on average, r/popular posts have 45 times the comments, 42 times the removed comments, and 70 times the number of newcomers compared to posts from the same community that did not reach r/popular. Additionally, r/popular posts led to a peak 85% median increase in the subreddit's comment rate, and these effects lingered for about 12 hours. Our regression analysis shows that stricter moderation and previous r/popular appearances were associated with shorter and less intense effects on the community. By quantifying the differential effects of sudden popularity, we provide recommendations for moderators to promote stability and community resilience in the face of unexpected disruptions.

How Audit Methods Impact Our Understanding of YouTube’s Recommendation Systems

2024-05-28T04:59:50-07:00

Computational audits of social media websites have generated data that forms the basis of our understanding of the problematic behaviors of algorithmic recommendation systems. Focusing on YouTube, this paper demonstrates that conducting audits to make specific inferences about the underlying content recommendation system is more methodologically challenging than one might expect. Obtaining scientifically valid results requires considering many methodological decisions, and each of these decisions incurs costs. For example, should an auditor use logged-in YouTube accounts while gathering recommendations to ensure more accurate inferences from the collected data? We systematically explore the impact of this and many other decisions and make important discoveries about the methodological choices that impact YouTube’s recommendations. Assessed together, our research suggests auditing configurations that can be used by researchers and auditors to reduce economic and computing costs, without sacrificing inference quality and accuracy.

Intermedia Agenda Setting during the 2016 and 2020 U.S. Presidential Elections

2024-05-28T04:59:52-07:00

Intermedia agenda setting (IAS) theory suggests that different news sources can influence each other's agenda. While this theory has been well-established in existing literature, whether it still holds in today's high-choice media environment, which includes news producers of different credibility and ideology dispositions, is an open question. Through two case studies, the 2016 and 2020 U.S. presidential elections, we show that media are still largely aligned, especially in broad topics they choose to cover, and that the level of alignment along the credibility dimension is comparable to that along the ideology dimension. Furthermore, we find that the coverage of the Republican candidate is better aligned across different media types than that of the Democratic candidate, and that media divergence has increased along both dimensions from 2016 to 2020. Finally, we demonstrate that high-credibility media still plays a dominant role in the IAS process, yet with a cautious warning of its declining IAS power for the Democratic candidate over the course of four years.

“I Am 30F and Need Advice!”: A Mixed-Method Analysis of the Effects of Advice-Seekers’ Self-Disclosure on Received Replies

2024-05-28T04:59:54-07:00

In community question answering sites, users can easily make a post to ask questions or seek advice. Others volunteer replies to these posts to provide answers of varying quality, detail, and helpfulness. In the advice-seeking process, self-disclosure enables posters to provide a relatable context for their requests but comes at a cost of greater identifiability. We focus on the "r/Advice" Reddit community and present a mixed-method study on how self-disclosure of advice-seekers shapes the prevalence and detail of the feedback received. We focus particularly on age and gender disclosure as both are reliably detected and normatively considered in the context of giving advice. We use both hurdle negative binomial regression models and discourse analysis to examine the relationship between self-disclosure and the replies received and explore themes related to disclosure. The results show that advice-seekers' age or gender disclosure correlates with more replies and more helpful replies, but the effects of age and gender disclosure are not additive. We also find both reciprocity and homophily effects in disclosure as reply-givers are more likely to self-disclose when the advice-seeker does so. The lack of additive effects alongside the thematic analysis suggests disclosure practices are used to elicit sufficient credibility or basis for empathy, whereas too much or too little disclosure creates uncertainty or inhibits the applicability of the received advice.

#LetsTalk: Understanding Social Media Usage of Self-Harm Users in India

2024-05-28T04:59:56-07:00

Mental health concerns, such as depression, pose significant challenges for support systems in effectively identifying affected individuals. On the other hand, people suffering from depression often find it easier to discuss their condition on social media than in face-to-face interactions. Additionally, the development of distressing conditions typically arises from a multitude of factors accumulated over time rather than a singular event. To gain a fine-grained understanding of these facets, in this work, we perform a longitudinal analysis of the tweeting behaviour of Indian users who post content related to self-harm. We categorise users based on their posting frequency and examine various aspects including their social network, bio descriptions, tweeting preferences, temporal variations and cognitive indicators. By elucidating these nuances, we aim to contribute insights that could aid in the early detection of mental health issues and prompt timely intervention from support networks.

Community Needs and Assets: A Computational Analysis of Community Conversations

2024-05-28T04:59:57-07:00

A community needs assessment is a tool used by non-profits and government agencies to quantify the strengths and issues of a community, allowing them to allocate their resources better. Such approaches are transitioning towards leveraging social media conversations to analyze the needs of communities and the assets already present within them. However, manual analysis of exponentially increasing social media conversations is challenging. There is a gap in the present literature in computationally analyzing how community members discuss the strengths and needs of the community. To address this gap, we introduce the task of identifying, extracting, and categorizing community needs and assets from conversational data using sophisticated natural language processing methods. To facilitate this task, we introduce the first dataset about community needs and assets consisting of 3,511 conversations from Reddit, annotated using crowdsourced workers. Using this dataset, we evaluate an utterance-level classification model compared to sentiment classification and a popular large language model (in a zero-shot setting), where we find that our model outperforms both baselines at an F1 score of 94% compared to 49% and 61% respectively. Furthermore, we observe through our study that conversations about needs have negative sentiments and emotions, while conversations about assets focus on location and entities.

The Effects of Group Sanctions on Participation and Toxicity: Quasi-experimental Evidence from the Fediverse

2024-05-28T04:59:58-07:00

Online communities often overlap and coexist, despite incongruent norms and approaches to content moderation. When communities diverge, decentralized and federated communities may pursue group-level sanctions, including defederation (disconnection) to block communication between members of specific communities. We investigate the effects of defederation in the context of the Fediverse, a set of decentralized, interconnected social networks with independent governance. Mastodon and Pleroma, the most popular software powering the Fediverse, allow administrators on one server to defederate from another. We use a difference-in-differences approach and matched controls to estimate the effects of defederation events on participation and message toxicity among affected members of the blocked and blocking servers. We find that defederation causes a drop in activity for accounts on the blocked servers, but not on the blocking servers. Also, we find no evidence of an effect of defederation on message toxicity.
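
The difference-in-differences idea mentioned in the abstract above can be illustrated with a minimal sketch. It compares the change in an outcome for affected accounts against the change for control accounts over the same period. The function name and the toy activity counts below are hypothetical and chosen only for illustration; the study's actual estimation (including its matched controls) is not reproduced here.

```python
from statistics import mean


def did_estimate(treated_pre, treated_post, control_pre, control_post):
    """Basic difference-in-differences estimate.

    Returns (change in treated group) minus (change in control group),
    using simple group means before and after the event.
    """
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


# Hypothetical weekly post counts per account (illustrative values only).
treated_pre = [10, 12, 8]    # accounts on a blocked server, before the event
treated_post = [6, 7, 5]     # the same accounts, after the event
control_pre = [9, 11, 10]    # comparison accounts, before
control_post = [9, 10, 11]   # comparison accounts, after

effect = did_estimate(treated_pre, treated_post, control_pre, control_post)
print(effect)
```

Subtracting the control group's change removes trends shared by both groups, so the remaining difference is attributed to the event, under the usual assumption that the two groups would otherwise have followed parallel trends.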

Emergent Influence Networks in Good-Faith Online Discussions

2024-05-28T04:59:59-07:00

Town hall-type debates are increasingly moving online, irrevocably transforming public discourse. Yet, we know relatively little about crucial social dynamics that determine which arguments are more likely to be successful. This study investigates the impact of one's position in the discussion network created via responses to others' arguments on one's persuasiveness in unfacilitated online debates. We propose a novel framework for measuring the relationship between network position and persuasiveness, using a combination of social network analysis and machine learning. Complementing existing studies investigating the effect of linguistic aspects on persuasiveness, we show that the user's position in a discussion network is associated with their persuasiveness online. Moreover, the recognition of successful persuasion is linked to an increase in dominant network position. Our findings offer important insights into the complex social dynamics of online discourse and provide practical insights for organizations and individuals seeking to understand the interplay between influential positions in a discussion network and persuasive strategies in digital spaces.

Classifying Conspiratorial Narratives at Scale: False Alarms and Erroneous Connections

2024-05-28T05:00:01-07:00

Online discussions frequently involve conspiracy theories, which can contribute to the proliferation of belief in them. However, not all discussions surrounding conspiracy theories promote them, as some are intended to debunk them. Existing research has relied on simple proxies or focused on a constrained set of signals to identify conspiracy theories, which limits our understanding of conspiratorial discussions across different topics and online communities. This work establishes a general scheme for classifying discussions related to conspiracy theories based on authors' perspectives on the conspiracy belief, which can be expressed explicitly through narrative elements, such as the agent, action, or objective, or implicitly through references to known theories, such as chemtrails or the New World Order. We leverage human-labeled ground truth to train a BERT-based model for classifying online conspiracy theories, which we then compare to the Generative Pre-trained Transformer machine (GPT) for detecting online conspiratorial content. Despite GPT's known strengths in its expressiveness and contextual understanding, our study revealed significant flaws in its logical reasoning, while also demonstrating comparable strengths from our classifiers. We present the first large-scale classification study using posts from the most active conspiracy-related Reddit forums and find that only one-third of the posts are classified as positive. This research sheds light on the potential applications of large language models in tasks demanding nuanced contextual comprehension.

Stories That Heal: Characterizing and Supporting Narrative for Suicide Bereavement

2024-05-28T05:00:02-07:00

Clinical group bereavement therapy often promotes narrative sharing as a therapeutic intervention to facilitate grief processing. Increasingly, people turn to social media to express stories of loss and seek support surrounding bereavement experiences, specifically, the loss of loved ones from suicide. This paper reports the results of a computational linguistic analysis of narrative expression within an online suicide bereavement support community. We identify distinctive characteristics of narrative posts (compared to non-narrative posts) in linguistic style. We then develop and validate a machine-learning model for tagging narrative posts at scale and demonstrate the utility of applying this machine-learning model to a more general grief support community. Through comparison, we validate our model's narrative tagging accuracy and compare the proportion of narrative posts between the two communities we have analyzed. Narrative posts make up about half of all posts in these two grief communities, demonstrating the importance of narrative posts to grief support online. Finally, we consider how the narrative tagging tool presented in this study can be applied to platform design to more effectively support people sharing narratives of grief in online grief support spaces.

Fair or Fare? Understanding Automated Transcription Error Bias in Social Media and Videoconferencing Platforms

2024-05-28T05:00:03-07:00

As remote work and learning increase in popularity, individuals, especially those with hearing impairments or who speak English as a second language, may depend on automated transcriptions to participate in business, school, entertainment, or basic communication. In this work, we investigate the automated transcription accuracy of seven popular social media and videoconferencing platforms with respect to some personal characteristics of their users, including gender, age, race, first language, speech rate, F0 frequency, and speech readability. We performed this investigation on a new corpus of 194 hours of English monologues by 846 TED talk speakers. Our results show the presence of significant bias, with transcripts less accurate for speakers that are male or non-native English speakers. We also observe differences in accuracy among platforms for different types of speakers. These results indicate that, while platforms have improved their automatic captioning, much work remains to make captions accessible for a wider variety of speakers and listeners.

#TeamFollowBack: Detection & Analysis of Follow Back Accounts on Social Media

2024-05-28T05:00:05-07:00

Follow back accounts inflate their follower counts by engaging in reciprocal followings. Such accounts manipulate the public and the algorithms by appearing more popular than they really are. Despite their potential harm, no studies have analyzed such accounts at scale. In this study, we present the first large-scale analysis of follow back accounts. We formally define follow back accounts and employ a honeypot approach to collect a dataset of such accounts on X (formerly Twitter). We discover and describe 12 communities of follow back accounts from 12 different countries, some of which exhibit a clear political agenda. We analyze the characteristics of follow back accounts and report that they are newer, more engaging, and have more followings and followers. Finally, we propose a classifier for such accounts and report that models employing profile metadata and the ego network have some success, although achieving high recall is challenging. Our study advances the understanding of follow back accounts and their discovery in the wild.

Cascade-Based Randomization for Inferring Causal Effects under Diffusion Interference

2024-05-28T05:00:06-07:00

The presence of interference, where the outcome of an individual may depend on the treatment assignment and behavior of neighboring nodes, can lead to biased causal effect estimation. Current approaches to network experiment design focus on limiting interference through cluster-based randomization, in which clusters are identified using graph clustering, and cluster randomization dictates the node assignment to treatment and control. However, cluster-based randomization approaches perform poorly when interference propagates in cascades, whereby the response of individuals to treatment propagates to their multi-hop neighbors. When we have knowledge of the cascade seed nodes, we can leverage this interference structure to mitigate the resulting causal effect estimation bias. With this goal, we propose a cascade-based network experiment design that initiates treatment assignment from the cascade seed nodes and propagates the assignment to their multi-hop neighbors to limit interference during cascade growth and thereby reduce the overall causal effect estimation error. Our extensive experiments on real-world and synthetic datasets demonstrate that our proposed framework outperforms the existing state-of-the-art approaches in estimating causal effects in network data.

A Crisis of Civility? Modeling Incivility and Its Effects in Political Discourse Online

2024-05-28T05:00:10-07:00

Growing concerns have been raised about the detrimental effects of uncivil comments on the web towards democracy. However, there is still a lack of understanding about online incivility's nuanced and complicated nature and its impact on conversation development and user behaviors. This work aims to fill that research gap by modeling incivility and its relationship to political discussions. We develop a comprehensive and fine-grained taxonomy that characterizes incivility with vulgarity, name-calling (inter-personal and third-party attacks), aspersion, and stereotypes, and then apply the framework to quantify the level of each incivility category in over 40 million comments from Reddit. Using large-scale quantitative analysis, we investigate the types of interactions and contexts in which incivility is more likely to occur, model how incivility shapes subsequent conversations, and examine user engagement patterns and behavioral changes after exposure to incivility. Our findings show that conversations that start out uncivil tend to become more uncivil in responses, and exposure to different incivility categories has differing effects on community members' engagement. We conclude with the implications of our research in assisting the design and moderation of online political communities.

Reliability Analysis of Psychological Concept Extraction and Classification in User-Penned Text

2024-05-28T05:00:11-07:00

The social NLP research community has witnessed a recent surge in computational advancements in mental health analysis, aimed at building responsible AI models for the complex interplay between language use and self-perception. Such responsible AI models aid in quantifying psychological concepts from user-penned texts on social media. Thinking beyond the low-level (classification) task, we advance the existing binary classification dataset towards a higher-level task of reliability analysis through the lens of explanations, posing it as one of the safety measures. We annotate the LoST dataset to capture nuanced textual cues that suggest the presence of low self-esteem in the posts of Reddit users. We further state that the NLP models developed for determining the presence of low self-esteem focus more on three types of textual cues: (i) Trigger: words that trigger mental disturbance, (ii) LoST indicators: text indicators emphasizing low self-esteem, and (iii) Consequences: words describing the consequences of mental disturbance. We implement existing classifiers to examine the attention mechanism in pre-trained language models (PLMs) for a domain-specific psychology-grounded task. Our findings suggest the need to shift the focus of PLMs from Trigger and Consequences to a more comprehensive explanation, emphasizing LoST indicators while determining low self-esteem in Reddit posts.

Unraveling the Dynamics of Television Debates and Social Media Engagement: Insights from an Indian News Show

2024-05-28T05:00:13-07:00

The relationship between television shows and social media has become increasingly intertwined in recent years. Social media platforms, particularly Twitter, have emerged as significant sources of public opinion and discourse on topics discussed in television shows. In India, news debates leverage the popularity of social media to promote hashtags and engage users in discussions and debates on a daily basis. This paper focuses on the analysis of one of India's most prominent and widely-watched TV news debate shows: 'Arnab Goswami -- The Debate'. The study examines the content of the show by analyzing the hashtags used to promote it and the social media data corresponding to these hashtags. The goal is to understand the composition of the audience engaged in social media discussions related to the show. The findings reveal that the show exhibits a strong bias towards the ruling Bharatiya Janata Party (BJP), with over 60% of the debates featuring either pro-BJP or anti-opposition content. Social media support for the show primarily comes from BJP supporters. Notably, BJP leaders and influencers play a significant role in promoting the show on social media, leveraging their existing networks and resources to artificially trend specific hashtags. Furthermore, the study uncovers a reciprocal flow of information between the TV show and social media. We find evidence that the show's choice of topics is linked to social media posts made by party workers, suggesting a dynamic interplay between traditional media and online platforms. By exploring the complex interaction between television debates and social media support, this study contributes to a deeper understanding of the evolving relationship between these two domains in the digital age. The findings hold implications for media researchers and practitioners, offering insights into the ways in which social media can influence traditional media and vice versa.

Rank, Pack, or Approve: Voting Methods in Participatory Budgeting

2024-05-28T05:00:14-07:00

Participatory budgeting is a popular method to engage residents in budgeting decisions by local governments. The Stanford Participatory Budgeting platform is an online platform that has been used to engage residents in more than 150 budgeting processes. We present a data set with anonymized budget opinions from these processes with K-approval, K-ranking or knapsack primary ballots. For a subset of the voters, it includes paired votes with a different elicitation method in the same process. This presents a unique data set, as the voters, projects and setting are all related to real-world decisions that the voters have an actual interest in. With data from primary ballots we find that while ballot complexity (number of projects to choose from, number of projects to select and ballot length) is correlated with a higher median time spent by voters, it is not correlated with a higher abandonment rate.

We use vote pairs with different voting methods to analyze the effect of voting methods on the cost of selected projects, more comprehensively than was previously possible. In most elections, voters selected significantly more expensive projects using K-approval than using knapsack, although we also find a small number of examples with a significant effect in the opposite direction. This effect happens at the aggregate level as well as for individual voters, and is influenced both by the implicit constraints of the voting method and the explicit constraints of the voting interface. Finally, we validate the use of K-ranking elicitation to offer a paper alternative for knapsack voting.
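
The contrast between the implicit constraints of K-approval and knapsack ballots can be made concrete with a small sketch. It assumes that a K-approval ballot may select at most K projects regardless of cost (some processes may instead require exactly K), and that a knapsack ballot may select any set of projects whose total cost fits within the budget. The project names, costs, and budget are hypothetical.

```python
def ballot_cost(selected, costs):
    """Total cost of the projects selected on a ballot."""
    return sum(costs[project] for project in selected)


def valid_k_approval(selected, k):
    """Assumed rule: at most k distinct projects, with no cost constraint."""
    return len(set(selected)) == len(selected) and len(selected) <= k


def valid_knapsack(selected, costs, budget):
    """Assumed rule: distinct projects whose total cost fits the budget."""
    return (len(set(selected)) == len(selected)
            and ballot_cost(selected, costs) <= budget)


# Hypothetical projects and budget (illustrative values only).
costs = {"park": 50, "library": 80, "bike_lanes": 40}
budget = 100

approval_ballot = ["park", "library"]      # valid under 2-approval
knapsack_ballot = ["park", "bike_lanes"]   # fits within the budget

print(valid_k_approval(approval_ballot, k=2), ballot_cost(approval_ballot, costs))
print(valid_knapsack(approval_ballot, costs, budget))
print(valid_knapsack(knapsack_ballot, costs, budget), ballot_cost(knapsack_ballot, costs))
```

In this toy case the same pair of projects is an acceptable 2-approval ballot but exceeds the knapsack budget, which is one way a cost-blind method can lead voters to select more expensive sets.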

Clock against Chaos: Dynamic Assessment and Temporal Intervention in Reducing Misinformation Propagation

2024-05-28T05:00:15-07:00

As social networks become the primary sources of information, the rise of misinformation poses a significant threat to the information ecosystem. Here, we address this challenge by proposing a dynamic system for real-time evaluation and assignment of misinformation scores to tweets, which can support the ongoing efforts to counteract the impact of misinformation on public health, public opinion, and society. We use a unique combination of Temporal Graph Network (TGN) and Recurrent Neural Networks (RNNs) to capture both structural and temporal characteristics of misinformation propagation. We further use active learning to refine the understanding of misinformation, and a dual model system to ensure the accurate grading of tweets. Our system also incorporates a temporal embargo strategy based on belief scores, allowing for comprehensive assessment of information over time. We further outline a retraining strategy to keep the model current and robust in the dynamic misinformation landscape. The evaluation results across five social media misinformation datasets show promising accuracy in identifying false information and reducing propagation by a significant margin.

Lived Experience Matters: Automatic Detection of Stigma toward People Who Use Substances on Social Media

2024-05-28T05:00:17-07:00

Stigma toward people who use substances (PWUS) is a leading barrier to seeking treatment. Further, those in treatment are more likely to drop out if they experience higher levels of stigmatization. While related concepts of hate speech and toxicity, including those targeted toward vulnerable populations, have been the focus of automatic content moderation research, stigma and, in particular, people who use substances have not. This paper explores stigma toward PWUS using a data set of roughly 5,000 public Reddit posts. We performed a crowd-sourced annotation task where workers are asked to annotate each post for the presence of stigma toward PWUS and answer a series of questions related to their experiences with substance use. Results show that workers who use substances or know someone with a substance use disorder are more likely to rate a post as stigmatizing. Building on this, we use a supervised machine learning framework that centers workers with lived substance use experience to label each Reddit post as stigmatizing. Modeling person-level demographics in addition to comment-level language results in a classification accuracy (as measured by AUC) of 0.69 -- a 17% increase over modeling language alone. Finally, we explore the linguistic cues which distinguish stigmatizing content: both PWUS and those who do not use substances agree that language around othering ("people", "they") and terms like "addict" are stigmatizing, while PWUS (as opposed to those who do not) find discussions around specific substances more stigmatizing. Our findings offer insights into the nature of perceived stigma in substance use. Additionally, these results further establish the subjective nature of such machine learning tasks, highlighting the need for understanding their social contexts.

Search Engine Revenue from Navigational and Brand Advertising

2024-05-28T05:00:18-07:00

Keyword advertising on general web search engines is a multi-billion dollar business. Keyword advertising turns contentious, however, when businesses target ads against their competitors' brand names---a practice known as "competitive poaching." To stave off poaching, companies defensively bid on ads for their own brand names. Google, in particular, has faced lawsuits and regulatory scrutiny since it altered its policies in 2004 to allow poaching. In this study, we investigate the sources of advertising revenue earned by Google, Bing, and DuckDuckGo by examining ad impressions, clicks, and revenue on navigational and brand searches. Using logs of searches performed by a representative panel of US residents, we estimate that ads on these searches account for 28--36% of Google's search revenue, while Bing earns even more. We also find that the effectiveness of these ads for advertisers varies. We conclude by discussing the implications of our findings for advertisers and regulators.

The Evolution of Occupational Identity in Twitter Biographies

2024-05-28T05:00:20-07:00

Occupational identity concerns the self-image of an individual’s affinities and socioeconomic class, and directs how a person should behave in certain ways. Understanding the establishment of occupational identity is important to study work-related behaviors. However, large-scale quantitative studies of occupational identity are difficult to perform because it is only indirectly observable. Profile biographies on social media, however, contain concise yet rich descriptions of self-identity. Analysis of these self-descriptions provides powerful insights concerning how people see themselves and how they change over time. In this paper, we present and analyze a longitudinal corpus recording the self-authored public biographies of 51.18 million Twitter users as they evolve over a six-year period from 2015 to 2021. In particular, we investigate the social approval (e.g., job prestige and salary) effects in how people self-disclose occupational identities, quantifying over-represented occupations as well as the occupational transitions w.r.t. job prestige over time. We show that self-reported jobs and job transitions are biased toward more prestigious occupations. We also present an intriguing case study about how self-reported jobs changed amid COVID-19 and the subsequent "Great Resignation" trend with the latest full year data in 2022. These results demonstrate that social media biographies are a rich source of data for quantitative social science studies, allowing unobtrusive observation of the intersections and transitions obtained in online self-presentation.

HINENI: Human Identity across the Nations of the Earth Ngram Investigator

2024-05-28T05:00:21-07:00

Self-reported biographical strings on social media profiles provide a powerful tool to study self-identity. We present HINENI, a dataset of 420 million Twitter user profiles collected over a 12-year period, partitioned into 32 distinct national cohorts, which we believe is the largest publicly available data resource for identity research. We report on the major design decisions underlying HINENI, including a new notion of sampling (k-persistence) which spans the divide between traditional cross-sectional and longitudinal approaches. We demonstrate the power of HINENI to study the relative survival rate (half-life) of different tokens, and the use of emoji analysis across national cohorts to study the effects of gender, national, and sports identities.

Partial Mobilization: Tracking Multilingual Information Flows amongst Russian Media Outlets and Telegram

2024-05-28T05:00:22-07:00

In response to disinformation and propaganda from Russian online media following the invasion of Ukraine, Russian media outlets such as Russia Today and Sputnik News were banned throughout Europe. To maintain viewership, many of these Russian outlets began to heavily promote their content on messaging services like Telegram. In this work, we study how 16 Russian media outlets interacted with and utilized 732 Telegram channels throughout 2022. Leveraging the foundational model MPNet, DP-Means clustering, and Hawkes processes, we trace how narratives spread between news sites and Telegram channels. We show that news outlets not only propagate existing narratives through Telegram but that they source material from the messaging platform. For example, across the websites in our study, between 2.3% (ura.news) and 26.7% (ukraina.ru) of articles discussed content that originated/resulted from activity on Telegram. Finally, tracking the spread of individual topics, we measure the rate at which news outlets and Telegram channels disseminate content within the Russian media ecosystem, finding that websites like ura.news and Telegram channels such as @genshab are the most effective at disseminating their content.

Machine-Made Media: Monitoring the Mobilization of Machine-Generated Articles on Misinformation and Mainstream News Websites

2024-05-28T05:00:25-07:00

As large language models (LLMs) like ChatGPT have gained traction, an increasing number of news websites have begun utilizing them to generate articles. However, not only can these language models produce factually inaccurate articles on reputable websites but disreputable news sites can utilize LLMs to mass produce misinformation. To begin to understand this phenomenon, we present one of the first large-scale studies of the prevalence of synthetic articles within online news media. To do this, we train a DeBERTa-based synthetic news detector and classify over 15.46 million articles from 3,074 misinformation and mainstream news websites. We find that between January 1, 2022, and May 1, 2023, the relative number of synthetic news articles increased by 57.3% on mainstream websites while increasing by 474% on misinformation sites. We find that this increase is largely driven by smaller, less popular websites. Analyzing the impact of the release of ChatGPT using an interrupted time series analysis, we show that while its release resulted in a marked increase in synthetic articles on small sites as well as misinformation news websites, there was not a corresponding increase on large mainstream news websites.

Making the Pick: Understanding Professional Editor Comment Curation in Online News

2024-05-28T05:00:27-07:00

Online comments within news articles are a key way people share opinions. Discovering insightful comments can, however, be challenging for readers. A solution to this problem is using comment curation, whereby professional editors select the highest quality comments manually, referred to as "editor-picks". This paper studies the growing use of professional editor-curation for user-generated comments. We focus on the New York Times as a case study, using a dataset covering 80k articles. We study the characteristics of editor-pick comments, highlighting how editor criteria vary across news sections (e.g. sports, entertainment). We find that editor-pick comments tend to be longer, more relevant to the article, positive in sentiment, and contain low toxicity. Our analysis further reveals that editors within different news sections exhibit differing criteria when they perform comment selection. Thus, we finally propose a set of models that can automatically identify good candidate editor-picks. Our ultimate goal is to reduce editor and journalistic workload, increasing productivity and the quality of curated comments.

CPL-NoViD: Context-Aware Prompt-Based Learning for Norm Violation Detection in Online Communities

2024-05-28T05:00:28-07:00

Detecting norm violations in online communities is critical to maintaining healthy and safe spaces for online discussions. Existing machine learning approaches often struggle to adapt to the diverse rules and interpretations across different communities due to the inherent challenges of fine-tuning models for such context-specific tasks. In this paper, we introduce Context-aware Prompt-based Learning for Norm Violation Detection (CPL-NoViD), a novel method that employs prompt-based learning to detect norm violations across various types of rules. CPL-NoViD outperforms the baseline by incorporating context through natural language prompts and demonstrates improved performance across different rule types. Significantly, it not only excels in cross-rule-type and cross-community norm violation detection but also exhibits adaptability in few-shot learning scenarios. Most notably, it establishes a new state-of-the-art in norm violation detection, surpassing existing benchmarks. Our work highlights the potential of prompt-based learning for context-sensitive norm violation detection and paves the way for future research on more adaptable, context-aware models to better support online community moderators.

Characterizing Information Propagation in Fringe Communities on Telegram

2024-05-28T05:00:29-07:00

Online messaging platforms are key communication tools but are vulnerable to fake news and conspiracy theories. Mainstream platforms such as Facebook are increasing content moderation of harmful and conspiratorial content. In response, users from fringe communities are migrating to alternative platforms like Telegram. These platforms offer more freedom and less intervention. Currently, Telegram is one of the leading messaging platforms hosting fringe communities. Despite its popularity, as a research community, we lack knowledge of how content spreads over this network. Motivated by the importance and impact of messaging platforms on society, we aim to measure the information propagation within fringe communities on the Telegram network, focusing on how public groups and channels exchange messages. We collect and explore about 140 million messages from 9,000 channels and groups on Telegram. We examine message forwarding and the lifetime of the messages from different aspects. Among other things, we find inequality in content creation; 6% of the users are responsible for 90% of forwarded messages. We also discover that while the forwarding feature considerably amplifies the reach of messages, the spread of content within our dataset remains largely localized. Additionally, we find that 5% of the channels are responsible for 40% of the forwarded messages in the entire dataset. Finally, our lifetime analysis shows that messages disseminated in groups with numerous active users exhibit significantly longer lifespans compared to those circulated in channels.

A Visual Approach to Tracking Emotional Sentiment Dynamics in Social Network Commentaries

2024-05-28T05:00:31-07:00

The expansion of social media has unlocked a real-time barometer of public opinion. This paper introduces a novel framework to analyze sentiment shifts in social network comment sections, a reflection of the broader public discourse over time. Leveraging a pre-trained uncased RoBERTa model, we predict emotional scores from user comments, mapping these to key sentiment trends such as Approval, Toxicity, Obscenity, Threat, Hate, Offensive, and Neutral. Our methodology employs machine learning techniques to train models on a dataset that connects emotional scores with these trends, generating trend probability scores. We utilize a bottom-up recursive algorithm to aggregate emotional scores within comment threads, enabling the prediction of trend scores using three distinct aggregation methods. The results demonstrate that our emotional prediction model achieves an AUC of 0.92, and XGBoost stands out with an F1 score exceeding 0.40. Our research elucidates the temporal evolution of online public sentiment, enhancing the understanding of digital social dynamics and offering insights for strategic online interaction, intervention, and content moderation.
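A bottom-up recursive aggregation over a comment thread could look roughly like the sketch below. The abstract does not name its three aggregation methods, so the `Comment` class, the single scalar score per comment, and the use of mean and max as combiners are assumptions chosen only to illustrate the recursion.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass
class Comment:
    """A comment with its own score and any nested replies (hypothetical structure)."""
    score: float
    replies: list["Comment"] = field(default_factory=list)


def aggregate(comment: Comment, combine=mean) -> float:
    """Aggregate scores bottom-up: every reply subtree is resolved before its parent."""
    if not comment.replies:
        return comment.score
    child_scores = [aggregate(reply, combine) for reply in comment.replies]
    # `combine` receives the parent's own score together with its resolved children.
    return combine([comment.score, *child_scores])


thread = Comment(0.2, [Comment(0.8), Comment(0.5, [Comment(0.1)])])
mean_score = aggregate(thread)       # mean-based aggregation
max_score = aggregate(thread, max)   # max-based aggregation
```

Swapping the `combine` argument changes the aggregation method without touching the traversal, which is one simple way to compare several methods on the same threads.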

ChatGPT Giving Relationship Advice – How Reliable Is It?

2024-05-28T05:00:33-07:00

In the evolving realm of natural language processing (NLP), generative AI models like ChatGPT are increasingly utilized across various applications. Among the possible purposes, many people are considering asking ChatGPT for relationship advice. However, the lack of in-depth examination of ChatGPT's response quality could be concerning when it is used for personal topics like mental health issues and intimate relationship problems. In these topics, a piece of misleading advice could cause harmful repercussions. In response to people's growing interest in using ChatGPT as a relationship advisor, our research evaluates ChatGPT's proficiency in discerning relationship advice. Specifically, we investigate its alignment with human judgements. We conducted our analysis with 13,138 Reddit posts about intimate relationship problems to examine the overall alignment. Furthermore, we investigate ChatGPT's consistency in judging intimate relationship advice by re-prompting identical queries. Our results indicate a significant disparity between ChatGPT and human judgments, with the model displaying inconsistency in its own decisions. Our findings emphasize the need for comprehensive insights into ChatGPT's mechanisms for intimacy problems and future improvements in its proficiency in helping people's relationship struggles.

Leveraging Recommender Systems to Reduce Content Gaps on Peer Production Platforms

2024-05-28T05:00:34-07:00

Peer production platforms like Wikipedia commonly suffer from content gaps. Prior research suggests recommender systems can help solve this problem, by guiding editors towards underrepresented topics. However, it remains unclear whether this approach would result in less relevant recommendations, leading to reduced overall engagement with recommended items. To answer this question, we first conducted offline analyses (Study 1) on SuggestBot, a task-routing recommender system for Wikipedia, then did a three-month controlled experiment (Study 2). Our results show that presenting users with articles from underrepresented topics increased the proportion of work done on those articles without significantly reducing overall recommendation uptake. We discuss the implications of our results, including how ignoring the article discovery process can artificially narrow recommendations on peer production platforms.

Market or Markets? Investigating Google Search’s Market Shares under Vertical Segmentation

2024-05-28T05:00:35-07:00

Is Google Search a monopoly with gatekeeping power? Regulators from the US, UK, and Europe have argued that it is, based on the assumption that Google Search dominates the market for horizontal (a.k.a. “general”) web search. Google disputes this, claiming that competition extends to all vertical (a.k.a. “specialized”) search engines, and that under this market definition it does not have monopoly power. In this study we present the first analysis of Google Search’s market share under vertical segmentation of online search. We leverage observational trace data collected from a panel of US residents that includes their web browsing history and copies of the Google Search Engine Result Pages they were shown. We observe that participants’ search sessions begin at Google more than 50% of the time in 24 out of 30 vertical market segments (which comprise almost all of our participants’ searches). Our results inform the consequential and ongoing debates about the market power of Google Search and the conceptualization of online markets in general.

A Multi-modal Prompt Learning Framework for Early Detection of Fake News

2024-05-28T05:00:36-07:00

Information spreads quickly through social media platforms, especially fake news with negative or even malicious intentions. In recent years, psychological studies have found that explicit reminders of fake news diminish its consequences. Therefore, it is crucial to identify the authenticity of news at an early stage to avoid serious consequences. However, existing methods for fake news detection either rely on auxiliary information, including users’ profiles and related event propagation networks, or require sufficient high-quality training data, which makes them unsuitable for early fake news detection in practice. An increasing share of social media news involves not only natural language content but also visual content such as images and videos, which offers a new, multi-modal view of fake news detection at an early stage. In this paper, we propose a Multi-modal Prompt Learning framework (MPL) based on the multi-modal pre-trained model CLIP for early detection of fake news. A learnable prompt module is developed to adaptively and efficiently generate prompt representations to boost the semantic context. MPL can be implemented in supervised or few-shot settings. Extensive experiments show that the proposed MPL obtains substantial performance and efficiency improvements for the early-stage fake news detection task. The results demonstrate that MPL performs considerably well compared to both the state-of-the-art supervised multi-modal models and the latest prompt-based few-shot multi-modal models. In particular, the high recall on fake news and the high precision on real news that MPL achieves compared to other baselines show that it better serves one of our motivations: providing early notification of “maybe real” or “maybe fake” when news is released.

ArguSense: Argument-Centric Analysis of Online Discourse

2024-05-28T05:00:38-07:00

How can we model arguments and their dynamics in online forum discussions? The meteoric rise of online forums presents researchers across different disciplines with an unprecedented opportunity: we have access to texts containing discourse between groups of users generated in a voluntary and organic fashion. Most prior work so far has focused on classifying individual monological comments as either argumentative or not argumentative. However, few efforts quantify and describe the dialogical processes between users found in online forum discourse: the structure and content of interpersonal argumentation. Modeling dialogical discourse requires the ability to identify the presence of arguments, group them into clusters, and summarize the content and nature of clusters of arguments within a discussion thread in the forum. In this work, we develop ArguSense, a comprehensive and systematic framework for understanding arguments and debate in online forums. Our framework consists of methods for, among other things: (a) detecting argument topics in an unsupervised manner; (b) describing the structure of arguments within threads with powerful visualizations; and (c) quantifying the content and diversity of threads using argument similarity and clustering algorithms. We showcase our approach by analyzing the discussions of four communities on the Reddit platform over a span of 21 months. Specifically, we analyze the structure and content of threads related to GMOs in forums related to agriculture or farming to demonstrate the value of our framework.

A Semantic Interpreter for Social Media Handles

2024-05-28T05:00:41-07:00

A handle is a short string of characters that identifies a user or account in a social media platform and is unique within the scope of the platform. Though usually of limited length, a handle can often be the most information-dense string in a social media user profile, potentially containing clues to the user’s name, age, location, demographics, or group affiliations. Despite this, the handle has been frequently set aside in work related to inferring user information from their social media profiles. We present a technique for semantic parsing of handles, which seeks to extract relevant information from the handle string. The technique is rule-based and relies on a set of tokenization rules and a variety of external databases (e.g., of names or places) to provide potential interpretations of handles in terms of names, locations, dates, indices, years, ages, positive/negative sentiments, and acronyms. We evaluate an implementation of the technique for English against existing corpora as well as manually evaluate parses of randomly sampled handles, showing that our method achieves good results in both tokenizing the handles (84.9% chance that the correct tokenization is in the top three parses and 97% chance that at least one of the top three parses is reasonable) and providing overall “optimistic” interpretation performance of 90.1% accuracy and 0.89 F1. We also evaluate performance on each of the semantic aspects we interpret (name, location, index, year, age, sentiment, acronym). The technique not only allows us to extract additional information about a user from their handle but also allows us to measure trends in how handles are constructed on specific social media websites. We find that 59% of the handles in our data contain at least part of a person’s name, and over 69% of the handles are indicative of the user’s gender identity in some way. While our implementation targets English, it can be easily adapted to other languages given the appropriate databases. We release both our code and annotated evaluation data to aid other researchers in validating or extending our work.
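To make the idea of rule-based handle tokenization concrete, here is a minimal sketch. It is not the authors' implementation: the splitting rules, the function names `tokenize_handle` and `label_token`, and the 1900 to 2099 year range are all assumptions, and the external name and place databases the paper relies on are omitted entirely.

```python
import re

# Hypothetical rule set: capitalised or lowercase words, acronym runs, digit runs.
TOKEN_PATTERN = re.compile(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+")


def tokenize_handle(handle: str) -> list[str]:
    """Split a handle on separators, then on case and letter/digit boundaries."""
    tokens = []
    for part in re.split(r"[_.\-]+", handle):
        tokens.extend(TOKEN_PATTERN.findall(part))
    return tokens


def label_token(token: str) -> str:
    """Assign a coarse label; a real system would consult name/place databases here."""
    if token.isdigit():
        # Assumed heuristic: four digits in 1900..2099 read as a year.
        if len(token) == 4 and 1900 <= int(token) <= 2099:
            return "year"
        return "number"
    return "word"


tokens = tokenize_handle("JohnSmith_1985")
labels = [label_token(t) for t in tokens]
```

A single deterministic split like this returns only one parse. The abstract instead describes ranking several candidate parses, which is where database lookups would be needed to choose between alternatives.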

With Flying Colors: Predicting Community Success in Large-Scale Collaborative Campaigns

2024-05-28T05:00:42-07:00

Online communities develop unique characteristics, establish social norms, and exhibit distinct dynamics among their members. Activity in online communities often results in concrete “off-line” actions with a broad societal impact (e.g., political street protests and shifting norms related to sexual misconduct). While community dynamics, information diffusion, and online collaborations have been widely studied in the past two decades, quantitative studies that measure the effectiveness of online communities in promoting their agenda are scarce. In this work, we study the correspondence between the effectiveness of a community, measured by its success level in a competitive online campaign, and the underlying dynamics between its members. To this end, we define a novel task: predicting the success level of online communities in Reddit’s r/place – a large-scale distributed experiment that required collaboration between community members. We consider an array of definitions for success level; each is geared toward different aspects of the collaborative achievement. We experiment with several hybrid models, combining various types of features. Our models significantly outperform all baseline models over all definitions of ‘success level’. Analysis of the results and the factors that contribute to the success of coordinated campaigns can provide a better understanding of the resilience or the vulnerability of communities to online social threats such as election interference or anti-science trends. We make all data used for this study publicly available for further research.

Tight Sampling in Unbounded Networks

2024-05-28T05:00:43-07:00

The default approach to deal with the enormous size and limited accessibility of many Web and social media networks is to sample one or more subnetworks from a conceptually unbounded unknown network. Clearly, the extracted subnetworks will crucially depend on the sampling scheme. Motivated by studies of homophily and opinion formation, we propose a variant of snowball sampling designed to prioritize the inclusion of entire cohesive communities rather than any kind of representativeness, breadth, or depth of coverage. The method is illustrated on a concrete example, and experiments on synthetic networks suggest that it behaves as desired.

Facebook Political Ads and Accountability: Outside Groups Are Most Negative, Especially When Hiding Donors

2024-05-28T05:00:45-07:00

The emergence of online political advertising has come with little regulation, allowing political advertisers on social media to avoid accountability. We analyze how transparency and accountability deficits caused by dark money and disappearing groups relate to the sentiment of political ads on Facebook. We obtained 430,044 ads with FEC-registered advertisers from Facebook’s ad library that ran between August and November 2018. We compare ads run by candidates, parties, and outside groups, which we classify by (1) their donor transparency (dark money or disclosed) and (2) the group's permanence (only FEC-registered in 2018 or persistent across cycles). The most negative advertising came from dark money and disappearing outside groups, which were mostly corporations or 501(c) organizations. However, only dark money was associated with a significant decrease in ad sentiment. These results suggest that accountability for political speech matters for advertising tone, especially in the context of affective polarization on social media.

Which Came First, Price or Activity? A Vicious Circle of a Blockchain-Based Social Media in the Bear Market (Extended Abstract)

2024-05-28T05:00:46-07:00

This paper explores the relationship between the user activity on Steemit---the most widely used blockchain-based social media---and the price of STEEM---the cryptocurrency that can be earned from the activity on Steemit. Users can get STEEM by writing posts or upvoting ("liking") posts on Steemit. One may expect that activities on Steemit may affect the STEEM price, or conversely, the STEEM price may affect the user activity. We measure the Steemit activity by DAU (Daily Active Users) which is calculated as the number of unique users who write posts or comments on Steemit each day. We conduct the VAR (vector autoregressive) model analysis and the bidirectional Granger causality test on the STEEM price and the Steemit DAU for three different time regimes: full, bull-market, and bear-market regimes. We show that in all regimes, the STEEM price Granger causes the Steemit DAU. Conversely, the Steemit DAU does not Granger cause the STEEM price in both the full and bull-market regimes. In contrast, in the bear-market regime, the Steemit DAU Granger causes the STEEM price. That is, the STEEM price and the Steemit DAU Granger cause each other in the bear market, which can be seen as a vicious circle. We also show that the same results hold with the Hive blockchain, which is a fork of the Steem blockchain.
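The DAU definition above (unique users who write posts or comments on a given day) can be sketched as follows. The event format, a list of `(user, timestamp)` pairs, and the function name are assumptions made for illustration; the VAR and Granger causality analysis itself is not reproduced here.

```python
from collections import defaultdict
from datetime import datetime


def daily_active_users(events):
    """Count unique users who wrote a post or comment on each calendar day.

    `events` is an iterable of (user, datetime) pairs (hypothetical input format).
    """
    users_by_day = defaultdict(set)
    for user, timestamp in events:
        users_by_day[timestamp.date()].add(user)
    # Sort by day so the result reads as a time series.
    return {day: len(users) for day, users in sorted(users_by_day.items())}


events = [
    ("alice", datetime(2022, 1, 1, 9, 0)),
    ("alice", datetime(2022, 1, 1, 18, 30)),  # same user, same day: counted once
    ("bob", datetime(2022, 1, 1, 12, 0)),
    ("alice", datetime(2022, 1, 2, 8, 15)),
]
dau = daily_active_users(events)
```

The resulting per-day series is the kind of input that would then be paired with a daily price series for the causality tests.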

Exploring Platform Migration Patterns between Twitter and Mastodon: A User Behavior Study

2024-05-28T05:00:47-07:00

A recent surge of users migrating from Twitter to alternative platforms, such as Mastodon, raised questions regarding what migration patterns are, how different platforms impact user behaviors, and how migrated users settle in the migration process. In this study, we elaborate on how we investigate these questions by collecting data on over 10,000 users who migrated from Twitter to Mastodon within the first ten weeks following the ownership change of Twitter. Our research is structured in three primary steps. First, we develop algorithms to extract and analyze migration patterns. Second, by leveraging behavioral analysis, we examine the distinct architectures of Twitter and Mastodon to learn how user behaviors correspond with the characteristics of each platform. Last, we determine how particular behavioral factors influence users to stay on Mastodon. We share our findings of user migration, insights, and lessons learned from the user behavior study.

Look Ahead Text Understanding and LLM Stitching

2024-05-28T05:00:49-07:00

This paper proposes a look ahead text understanding problem with look ahead section identification (LASI) as an example. This problem may appear in generative AI as well as human interactions, where we want to understand the direction of a developing text or conversation. We tackle the problem using transformer-based LLMs. We show that LASI is more challenging than classic section identification (SI). We argue that both bidirectional contextual information (e.g., BERT) and unidirectional predictive ability (e.g., GPT) will benefit the task. We propose two approaches to stitch together BERT and GPT. Experiments show that our approach outperforms the established models, especially when there is noise in the text (which is often the case for developing text in generative AI). Our paper sheds light on other look ahead text understanding tasks that are important to social media, such as look ahead sentiment classification, and points out the opportunities to leverage pre-trained LLMs through stitching.

Games and Beyond: Analyzing the Bullet Chats of Esports Livestreaming

2024-05-28T05:00:50-07:00

Esports, short for electronic sports, is a form of competition using video games and has attracted an audience of more than 530 million worldwide. To watch esports, people utilize online livestreaming platforms. Recently, a novel interaction method, namely "bullet chats," has been introduced on these platforms. Different from conventional comments, bullet chats are scrolling comments posted by audiences that are synchronized to the livestreaming timeline, enabling audiences to share and communicate their immediate perspectives. The real-time nature of bullet chats, therefore, brings a new perspective to esports analysis. In this paper, we conduct the first empirical study on the bullet chats for esports, focusing on one of the most popular video games, i.e., League of Legends (LoL). Specifically, we collect 21 million bullet chats of LoL from Jan. 2023 to Mar. 2023 across two mainstream platforms (Bilibili and Huya). By performing quantitative analysis, we reveal how the quantity and toxicity of bullet chats are distributed (and change) w.r.t. three aspects, i.e., the season, the team, and the match. Our findings show that teams with higher rankings tend to attract a greater quantity of bullet chats, and these chats are often characterized by a higher degree of toxicity. We then utilize topic modeling to identify topics among bullet chats. Interestingly, we find that a considerable portion of topics (14.14% on Bilibili and 22.94% on Huya) discuss themes beyond the game, including genders, entertainment stars, non-esports athletes, and so on. Besides, by further modeling topics on toxic bullet chats, we find hateful speech targeting different social groups, defined by, for example, profession or region. To the best of our knowledge, this work is the first measurement of bullet chats on esports livestreaming. We believe our study can shed light on esports research from the perspective of bullet chats.

Examining Similar and Ideologically Correlated Imagery in Online Political Communication

2024-05-28T05:00:52-07:00

This paper investigates visual media shared by US national politicians on Twitter, how a politician's variety of image types shared reflects their political position, and identifies a hazard in using standard methods for image characterization in this context. While past work has yielded valuable results on politicians' use of imagery in social media, that work has focused primarily on photographic media, which may be insufficient given the variety of visual media shared in such spaces (e.g., infographics, illustrations, or memes). Leveraging multiple popular, pretrained, deep-learning models to characterize politicians' visuals, this work uses clustering to identify eight types of visual media shared on Twitter, several of which are not photographic in nature. Results show individual politicians share a variety of these types, and the distributions of their imagery across these clusters are correlated with their overall ideological position -- e.g., liberal politicians appear to share a larger proportion of infographic-style images, and conservative politicians appear to share more patriotic imagery. Manual assessment, however, reveals that these image-characterization models often group visually similar images with different semantic meaning into the same clusters, which has implications for how researchers interpret clusters in this space and cluster-based correlations with political ideology. In particular, collapsing semantic meaning in these pretrained models may drive null findings on certain clusters of images rather than politicians across the ideological spectrum sharing common types of imagery. We end this paper with a set of researcher recommendations to prevent such issues.

The Strange Case of Jekyll and Hyde: Analysis of R/ToastMe and R/RoastMe Users on Reddit

2024-05-28T05:00:54-07:00

This study, focusing on two Reddit subcommunities of r/ToastMe and r/RoastMe, aims to (1) characterize and understand users (named Jekyll and Hyde) who simultaneously participate in two subreddits with opposing tones and purposes, (2) build predictive models detecting those Jekyll and Hyde users to assess how unique and idiosyncratic their characteristics are, and (3) investigate their motivations of participation and potential interaction between the two contrasting activities through a survey and one-on-one interviews. Our results reveal that the Jekyll and Hyde users are generally more active and popular than ordinary users. Also, they use assimilated language customized to each community’s tone. Combining these findings with their motivations unveiled through the survey and interviews, we conclude that the Jekyll and Hyde users are digitally culture-savvy, who know how to utilize online community benefits and enjoy each community’s culture by assimilating themselves into the community and observing its rules. Moreover, the users’ duality observed in this process underscores the dynamic and multifaceted nature of online personas. These findings highlight the need for a nuanced approach to understanding online behaviors and provide insights for designing healthier online environments, emphasizing the importance of clear community norms and the potential interplay of users’ activities across different communities.

Gender Gaps in Online Social Connectivity, Promotion and Relocation Reports on LinkedIn

2024-05-28T05:00:57-07:00

Online professional social networking platforms provide opportunities to expand networks strategically for job opportunities and career advancement. A large body of research shows that women’s offline networks are less advantageous than men’s. How online platforms such as LinkedIn may reflect or reproduce gendered networking behaviours, or how online social connectivity may affect outcomes differentially by gender is not well understood. This paper analyses aggregate, anonymised data from almost 10 million LinkedIn users in the UK and US information technology (IT) sector collected from the site’s advertising platform to explore how being connected to Big Tech companies (‘social connectivity’) varies by gender, and how gender, age, seniority and social connectivity shape the propensity to report job promotions or relocations. Consistent with previous studies, we find there are fewer women compared to men on LinkedIn in IT. Furthermore, female users are less likely to be connected to Big Tech companies than men. However, when we further analyse recent promotion or relocation reports, we find women are more likely than men to have reported a recent promotion at work, suggesting high-achieving women may be self-selecting onto LinkedIn. Even among this positively selected group, though, we find men are more likely to report a recent relocation. Social connectivity emerges as a significant predictor of promotion and relocation reports, with an interaction effect between gender and social connectivity indicating the payoffs to social connectivity for promotion and relocation reports are larger for women. This suggests that online networking has the potential for larger impacts for women, who experience greater disadvantage in traditional networking contexts, and calls for further research to understand differential impacts of online networking for socially disadvantaged groups.

Strategies and Attacks of Digital Militias in WhatsApp Political Groups

2024-05-28T05:00:58-07:00

WhatsApp provides a fertile ground for the large-scale dissemination of information, particularly in countries like Brazil and India. Given its increasing popularity and use for political discussions, it is paramount to ensure that WhatsApp groups are adequately protected from attackers who aim to disrupt the activity of WhatsApp groups. Motivated by this, in this work, we characterize two types of attacks that may disrupt WhatsApp groups. We look into the flooding attack, where an attacker shares a large number of usually duplicate messages within a short period, and the hijacking attack, where attackers aim to obtain complete control of the group. We collect a large dataset of 19M messages shared in 1.6K WhatsApp public political groups from Brazil and analyze them to identify and characterize flooding and hijacking attacks. Among other things, we find that approximately 7% of the groups receive flooding attacks, which are usually short-lived (usually less than four minutes), and groups can receive multiple flooding attacks, even within the same day. Also, we find that most flooding attacks are executed using stickers (62% of all flooding attacks) and that, in most cases, attackers use both flooding and hijacking attacks to obtain complete control of the WhatsApp groups. Our work aims to raise user awareness about such attacks on WhatsApp and emphasizes the need to develop effective moderation tools to assist group administrators in preventing or mitigating such attacks.
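The flooding attack described above (one sender posting a large number of messages within a short period) suggests a simple sliding-window check. The sketch below is an illustration only: the function name, the `(timestamp_seconds, sender)` input format, and both thresholds are assumptions, not the detection procedure or parameters used in the paper.

```python
from collections import deque


def find_flooding(messages, window_seconds=240, min_messages=20):
    """Return senders who posted at least `min_messages` within any `window_seconds` span.

    `messages` is an iterable of (timestamp_seconds, sender) pairs.
    Both thresholds are illustrative defaults, not values from the study.
    """
    recent = {}  # sender -> deque of that sender's timestamps inside the window
    flagged = set()
    for timestamp, sender in sorted(messages):
        window = recent.setdefault(sender, deque())
        window.append(timestamp)
        # Drop timestamps that have fallen out of the sliding window.
        while timestamp - window[0] > window_seconds:
            window.popleft()
        if len(window) >= min_messages:
            flagged.add(sender)
    return flagged


burst = [(i, "x") for i in range(25)]          # 25 messages in 25 seconds
steady = [(i * 100, "y") for i in range(25)]   # one message every 100 seconds
flooders = find_flooding(burst + steady)
```

A fuller version might also compare message content, since the abstract notes that flooding usually involves duplicate messages and, most often, stickers.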

Assessing the Impact of Online Harassment on Youth Mental Health in Private Networked Spaces

2024-05-28T05:01:00-07:00

Online harassment negatively impacts mental health, with victims expressing increased concerns such as depression, anxiety, and even increased risk of suicide, especially among youth and young adults. Yet, research has mainly focused on building automated systems to detect harassment incidents based on publicly available social media trace data, overlooking the impact of these negative events on the victims, especially in private channels of communication. Looking to close this gap, we examine a large dataset of private message conversations from Instagram shared and annotated by youth aged 13-21. We apply trained classifiers from online mental health to analyze the impact of online harassment on indicators pertinent to mental health expressions. Through a robust causal inference design involving a difference-in-differences analysis, we show that harassment results in greater expression of mental health concerns in victims up to 14 days following the incidents, while controlling for time, seasonality, and topic of conversation. Our study provides new benchmarks to quantify how victims perceive online harassment in the immediate aftermath of when it occurs. We make social justice-centered design recommendations to support harassment victims in private networked spaces. We caution that some of the paper's content could be triggering to readers.

A Cross-Platform Topic Analysis of the Nazi Narrative on Twitter and Telegram during the 2022 Russian Invasion of Ukraine

2024-05-28T05:01:01-07:00

To influence the information landscape preceding and during the military invasion of Ukraine in February 2022, Russia initiated a disinformation campaign portraying Ukraine as a Nazi state. This study aims to compare discussions related to this campaign on Twitter and Telegram. The analysis reveals that the Nazis and Ukraine narrative was constant on Twitter but only emerged on Telegram after the invasion in channels that had previously focused on a broader set of conspiracy theories. Beyond the examination of Russian disinformation in this case study, the paper introduces an innovative methodology for constructing topic networks from social media data. This approach expands upon traditional topic modeling by incorporating the network properties of social media data to establish directed networks that characterize the interplay between conversation topics. Through this methodology, we gain the ability to observe topical evolutions, providing fresh insights into the disinformation campaign and its efficacy in shaping discussions around the Russian invasion on social media.

SensitivAlert: Image Sensitivity Prediction in Online Social Networks Using Transformer-Based Deep Learning Models

2024-05-28T05:01:03-07:00

Billions of images are shared daily on social networks. When shared with an inappropriate audience, user-generated images can, however, compromise users' privacy and may have severe consequences, such as dismissals. To address this issue, different solutions were proposed, ranging from graphical user interfaces to Deep Learning (DL) models to alert users based on image sensitivity prediction. Although these models show promising results, they are evaluated on datasets relying on small participant samples. To address this limitation, we first introduce SensitivAlert, a dataset that re-annotates the previously annotated images from two existing datasets, but using a German-speaking cohort of 907 participants. We then leverage it to classify images according to two sensitivity classes---private or public---using recent transformer-based DL models. In our evaluation, we first consider consensus-based generic models using our dataset as benchmark based on image content itself and its associated user tags. Moreover, we show that our fine-tuned models trained on our dataset better reflect users' image privacy conceptions. We finally focus on individual user's privacy estimation by investigating three approaches: (1) a generic approach based on participants' consensus for fine-tuning, (2) a user-wise approach based on user's privacy preferences only, and (3) a hybrid approach that combines individual preferences with consensus-based preferences. Our results finally show that the generic and hybrid approaches outperform the user-wise one for most users, thus ensuring the feasibility of image privacy prediction preferences at the individuals' level.

Watch Your Language: Investigating Content Moderation with Large Language Models

2024-05-28T05:01:04-07:00

Large language models (LLMs) have exploded in popularity due to their ability to perform a wide array of natural language tasks. Text-based content moderation is one LLM use case that has received recent enthusiasm; however, there is little research investigating how LLMs can help in content moderation settings. In this work, we evaluate a suite of commodity LLMs on two common content moderation tasks: rule-based community moderation and toxic content detection. For rule-based community moderation, we instantiate 95 subcommunity specific LLMs by prompting GPT-3.5 with rules from 95 Reddit subcommunities. We find that GPT-3.5 is effective at rule-based moderation for many communities, achieving a median accuracy of 64% and a median precision of 83%. For toxicity detection, we evaluate a range of LLMs (GPT-3, GPT-3.5, GPT-4, Gemini Pro, LLAMA 2) and show that LLMs significantly outperform currently widespread toxicity classifiers. However, we also found that increases in model size add only marginal benefit to toxicity detection, suggesting a potential performance plateau for LLMs on toxicity detection tasks. We conclude by outlining avenues for future work in studying LLMs and content moderation.

Socially-Motivated Music Recommendation

2024-05-28T05:01:06-07:00

Extensive literature spanning psychology, sociology, and musicology has sought to understand the motivations for why people listen to music, including both individually and socially motivated reasons. Music's social functions, while present throughout the world, may be particularly important in collectivist societies, but music recommender systems generally target individualistic functions of music listening. In this study, we explore how a recommender system focused on social motivations for music listening might work by addressing a particular motivation: the desire to listen to music that is trending in one’s community. We frame a recommendation task suited to this desire and propose a corresponding evaluation metric to address the timeliness of recommendations. Using listening data from Spotify, we construct a simple, heuristic-based approach to introduce and explore this recommendation task. Analyzing the effectiveness of this approach, we discuss what we believe is an overlooked trade-off between the precision and timeliness of recommendations, as well as considerations for modeling users' musical communities. Finally, we highlight key cultural differences in the effectiveness of this approach, underscoring the importance of incorporating a diverse cultural perspective in the development and evaluation of recommender systems.

Stance Detection with Collaborative Role-Infused LLM-Based Agents

2024-05-28T05:01:07-07:00

Stance detection automatically detects the stance in a text towards a target, vital for content analysis in web and social media research. Despite their promising capabilities, LLMs encounter challenges when directly applied to stance detection. First, stance detection demands multi-aspect knowledge, from deciphering event-related terminologies to understanding the expression styles in social media platforms. Second, stance detection requires advanced reasoning to infer authors' implicit viewpoints, as stances are often subtly embedded rather than overtly stated in the text. To address these challenges, we design a three-stage framework COLA (short for Collaborative rOle-infused LLM-based Agents) in which LLMs are designated distinct roles, creating a collaborative system where each role contributes uniquely. Initially, in the multidimensional text analysis stage, we configure the LLMs to act as a linguistic expert, a domain specialist, and a social media veteran to get a multifaceted analysis of texts, thus overcoming the first challenge. Next, in the reasoning-enhanced debating stage, for each potential stance, we designate a specific LLM-based agent to advocate for it, guiding the LLM to detect logical connections between text features and stance, tackling the second challenge. Finally, in the stance conclusion stage, a final decision maker agent consolidates prior insights to determine the stance. Our approach avoids extra annotated data and model training and is highly usable. We achieve state-of-the-art performance across multiple datasets. Ablation studies validate the effectiveness of each role design in handling stance detection. Further experiments have demonstrated the explainability and the versatility of our approach. Our approach excels in usability, accuracy, effectiveness, explainability and versatility, highlighting its value.

The Geography of Information Diffusion in Online Discourse on Europe and Migration

2024-05-28T05:01:08-07:00

The online diffusion of information related to Europe and migration has been little investigated from an external point of view. However, this is a very relevant topic, especially if users have had no direct contact with Europe and their perception of it depends solely on information retrieved online. In this work we analyse the information circulating online about Europe and migration after retrieving a large amount of data from social media (Twitter), to gain new insights into the topics, magnitude, and dynamics of its diffusion. We combine network analysis of retweets and hashtags with geolocation of users, thus linking data to geography and allowing analysis from an “outside Europe” perspective, with a special focus on Africa. We also introduce a novel approach based on cross-lingual quotes, i.e. when content in one language is commented on and retweeted in another language, assuming these interactions are a proxy for connections between very distant communities. Results show that the majority of online discussions occur at a national level, especially when discussing migration. Language (English) is pivotal for information to become transnational and reach far. Transnational information flow is strongly unbalanced, with content mainly produced in Europe and amplified outside. Conversely, Europe-based accounts tend to be self-referential when they discuss migration-related topics. Football is the most exported topic from Europe worldwide. Moreover, important nodes in the communities discussing migration-related topics include accounts of official institutions and international agencies, together with journalists, news outlets, commentators and activists.

Understanding the Features of Text-Image Posts and Their Received Social Support in Online Grief Support Communities

2024-05-28T05:01:10-07:00

People in grief can create posts with text and images to disclose themselves and seek social support in online grief support communities. Existing work largely focuses on understanding the social support received by text-only posts but often overlooks posts with attached images in grief communities. In this paper, we first computationally characterize the textual (e.g., theme), visual (e.g., color), and text-image coherence (i.e., semantic and sentiment coherence) features of text-image posts in a grief support community. Then, we conduct regression analyses to systematically examine the effects of these features on the informational, emotional, esteem, and network support the posts receive. We find that attaching a selfie image to the post positively predicts received informational and emotional support, while a social image in a post is a positive predictor of network and esteem support. A post is also likely to get more social support if its text describes the visible content or tells a story depicted in the image, or if the perceived emotions in the text and image do not conflict. These results supplement existing research on mental health communities and provide actionable insights into assisting grieving people in seeking social support online.

How to Train Your YouTube Recommender to Avoid Unwanted Videos

2024-05-28T05:01:13-07:00

YouTube provides features for users to indicate disinterest when presented with unwanted recommendations, such as the “Not interested” and “Don’t recommend channel” buttons. These buttons are intended to allow the user to correct “mistakes” made by the recommendation system. Yet, relatively little is known about the empirical efficacy of these buttons. Neither is much known about users’ awareness of and confidence in them. To address these gaps, we simulated YouTube users with sock puppet agents. Each agent first executed a “stain phase”, where it watched many videos of an assigned topic; it then executed a “scrub phase”, where it tried to remove recommendations from the assigned topic. Each agent repeatedly applied a single scrubbing strategy, either indicating disinterest in one of the videos visited in the stain phase (disliking it or deleting it from the watch history), or indicating disinterest in a video recommended on the homepage (clicking the “not interested” or “don’t recommend channel” button or opening the video and clicking the dislike button). We found that the stain phase significantly increased the fraction of the recommended videos dedicated to the assigned topic on the user’s homepage. For the scrub phase, using the “Not interested” button worked best, significantly reducing such recommendations in all topics tested, on average removing 88% of them. Neither the stain phase nor the scrub phase, however, had much effect on videopage recommendations (those given to users while they watch a video). We also ran a survey (N = 300) asking adult YouTube users in the US whether they were aware of and used these buttons before, as well as how effective they found these buttons to be. We found that 44% of participants were not aware that the “Not interested” button existed. Those who were aware of it often used it to remove unwanted recommendations (82.8%) and found it to be modestly effective (3.42 out of 5).

How Does Empowering Users with Greater System Control Affect News Filter Bubbles?

2024-05-28T05:01:14-07:00

While recommendation systems enable users to find articles of interest, they can also create "filter bubbles" by presenting content that reinforces users' pre-existing beliefs. Users are often unaware that the system placed them in a filter bubble and, even when aware, they often lack direct control over it. To address these issues, we first design a political news recommendation system augmented with an enhanced interface that exposes the political and topical interests the system inferred from user behavior. This allows the user to adjust the recommendation system to receive more articles on a particular topic or articles presenting a particular political stance. We then conduct a user study to compare our system to a traditional interface and find that the transparent approach helped users realize that they were in a filter bubble. Additionally, the enhanced system led to less extreme news for most users but also allowed others to move the system to more extremes. Similarly, while many users moved the system from extreme liberal/conservative to the center, this came at the expense of reducing the political diversity of the articles shown. These findings suggest that, while the proposed system increased awareness of the filter bubbles, it had heterogeneous effects on news consumption depending on user preferences.

Measuring Causal Effects of Civil Communication without Randomization

2024-05-28T05:01:16-07:00

Understanding the causal effects of civility is critical when analyzing online social communication, yet measuring causality is difficult. A/B tests and other randomized experiments are the gold standard for establishing causal effects but they are inapplicable in this setting due to 1) the inability to control civility levels in an experiment, and more importantly, 2) ethical constraints on intentionally randomizing civility levels. We develop a novel quasi-experimental approach to quantify the causal effect of civility in online communities on the Roblox social 3D platform without requiring explicit randomization. This method uses residual stochasticity in the "matchmaking" assignment of users to servers as a quasi-randomization mechanism in observational historical data. We find that assigning a user to a server with higher levels of civil communication could increase engagement time by as much as 1.5% in particular experiences. Given the 4.8B person hours spent monthly on the platform, this implies a potential increase of over 8,000 person years of social interaction every month. Furthermore, this effect is mis-estimated by non-causal methods. Quasi-experimental approaches promise new avenues for measuring the causal impact of user behavior in online communities without adversely affecting users through randomized experiments.
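The person-year figure quoted in the abstract above can be checked with a short back-of-the-envelope calculation. The sketch below assumes that the upper-bound 1.5% effect applies to all 4.8B monthly person hours and that a person year means 365 days of 24 hours; neither assumption is stated in the abstract, and the function name is purely illustrative.

```python
# Back-of-the-envelope check of the person-year figure in the abstract.
MONTHLY_PERSON_HOURS = 4.8e9   # "4.8B person hours spent monthly" (from the abstract)
MAX_ENGAGEMENT_LIFT = 0.015    # "as much as 1.5%" (from the abstract)
HOURS_PER_YEAR = 24 * 365      # assumption: a person year is a full calendar year


def extra_person_years(monthly_hours: float, lift: float) -> float:
    """Convert a relative lift in monthly engagement hours into person years."""
    return monthly_hours * lift / HOURS_PER_YEAR


if __name__ == "__main__":
    years = extra_person_years(MONTHLY_PERSON_HOURS, MAX_ENGAGEMENT_LIFT)
    print(f"Extra person years per month: {years:.0f}")
```

Under these assumptions the result is a little above 8,000 person years per month, which is consistent with the "over 8,000" stated in the abstract.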

Topic Shifts as a Proxy for Assessing Politicization in Social Media

2024-05-28T05:01:17-07:00

Politicization is a social phenomenon studied by political science, characterized by the extent to which ideas and facts are given a political tone. A range of topics, such as climate change, religion and vaccines, have been subject to increasing politicization in the media and on social media platforms. In this work, we propose a computational method for assessing politicization in online conversations based on topic shifts, i.e., the degree to which people switch topics in online conversations. The intuition is that topic shifts from a non-political topic to politics are a direct measure of politicization (making something political), and that the more people switch conversations to politics, the more they perceive politics as playing a vital role in their daily lives. A fundamental challenge that must be addressed when one studies politicization in social media is that, a priori, any topic may be politicized. Hence, any keyword-based method, or even machine learning approaches that rely on topic labels to classify topics, are expensive to run and potentially ineffective. Instead, we learn from a seed of political keywords and use Positive-Unlabeled (PU) Learning to detect political comments in reaction to non-political news articles posted on Twitter, YouTube, and TikTok during the 2022 Brazilian presidential elections. Our findings indicate that all platforms show evidence of politicization, as discussions around topics adjacent to politics, such as the economy, crime and drugs, tend to shift to politics. Even for the least politicized topics, the rate at which discussions shift to politics increased in the lead-up to the elections and after other political events in Brazil, which is further evidence of politicization. The code is available at https://github.com/marceloslo/Topic-Shifts-as-a-Proxy-for-Assessing-Politicization-in-Social-Media.

Othering and Low Status Framing of Immigrant Cuisines in US Restaurant Reviews and Large Language Models

2024-05-28T05:01:19-07:00

Identifying implicit attitudes toward food can help mitigate social prejudice due to food’s pervasive role as a marker of ethnic identity. Stereotypes about food are representational harms that may contribute to racialized discourse and negatively impact economic outcomes for restaurants. Understanding the presence of representational harms in online corpora in particular is important, given the increasing use of large language models (LLMs) for text generation and their tendency to reproduce attitudes in their training data. Through careful linguistic analyses, we evaluate social theories about attitudes toward immigrant cuisine in a large-scale study of framing differences in 2.1M English language Yelp reviews. Controlling for factors such as restaurant price and neighborhood racial diversity, we find that immigrant cuisines are more likely to be othered using socially constructed frames of authenticity (e.g., authentic, traditional), and that non-European cuisines (e.g., Indian, Mexican) in particular are described as more exotic compared to European ones (e.g., French). We also find that non-European cuisines are more likely to be described as cheap and dirty, even after controlling for price, and even among the most expensive restaurants. Finally, we show that reviews generated by LLMs reproduce similar framing tendencies, pointing to the downstream retention of these representational harms. Our results corroborate social theories of gastronomic stereotyping, revealing racialized evaluative processes and linguistic strategies through which they manifest.

Computational Assessment of Hyperpartisanship in News Titles

2024-05-28T05:01:20-07:00

The growing trend of partisanship in news reporting can have a negative impact on society. Assessing the level of partisanship in news headlines is particularly crucial, as they are easily accessible and frequently provide a condensed summary of the article's opinions or events. Therefore, they can significantly influence readers' decision to read the full article, making them a key factor in shaping public opinion. We first adopt a human-guided machine learning framework to develop, in an active learning manner, a new dataset for hyperpartisan news title detection with 2,200 manually labeled and 1.8 million machine-labeled titles that were posted from 2014 to the present by nine representative media organizations across three media bias groups: Left, Central, and Right. A fine-tuned transformer-based language model achieves an overall accuracy of 0.84 and an F1 score of 0.78 on an external validation set. Next, we conduct a computational analysis to quantify the extent and dynamics of partisanship in news titles. While some aspects are as expected, our study reveals new or nuanced differences between the three media groups. We find that overall the Right media tends to use proportionally more hyperpartisan titles. Around the 2016 Presidential Election, the proportions of hyperpartisan titles increased across all media bias groups, with the Left media exhibiting the most significant relative increase. Using logistic regression models and Shapley values, we identify three major topics, namely foreign issues, political systems, and societal issues, that are suggestive of hyperpartisanship in news titles. Through an analysis of the topic distribution, we find that societal issues gradually gain more attention from all media groups. We further apply a lexicon-based language analysis tool to the titles of each topic and quantify the linguistic distance between any pair of the three media groups, uncovering three distinct patterns.
Code and data are available at https://github.com/VIStA-H/Hyperpartisan-News-Titles.

Keeping Up with the Winner! Targeted Advertisement to Communities in Social Networks

2024-05-28T05:01:22-07:00

When a new product enters a market already dominated by an existing product, will it survive along with this dominant product? Most existing works have shown the coexistence of two competing products spreading/being adopted on overlaid graphs with the same set of users. However, when it comes to the survival of a weaker product on the same graph, it has been established that the stronger one dominates the market and wipes out the other. This paper makes a step towards narrowing this gap so that a new/weaker product can also survive along with its competitor with a positive market share. Specifically, we identify a locally optimal set of users to induce a community that is targeted with advertisement by the product-launching company under a given budget constraint. To this end, we model the system as competing Susceptible-Infected-Susceptible (SIS) epidemics and employ perturbation techniques to quantify and attain a positive market share in a cost-efficient manner. Our extensive simulation results with a real-world graph dataset show that with our choice of target users, a new product can establish itself with a positive market share, whereas it would otherwise be dominated and eventually wiped out of the competitive market under the same budget constraint.
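The winner-takes-all baseline that this abstract starts from can be illustrated with a small simulation. The sketch below is not the paper's method: it integrates a commonly used mean-field approximation of two competing SIS processes on a single graph with forward Euler steps, and it does not implement the targeted advertising or the perturbation analysis. The ring graph, the rates `beta` and `delta`, and all function names are illustrative assumptions.

```python
from typing import List, Sequence, Tuple


def ring_graph(n: int) -> List[List[int]]:
    """Adjacency lists of an undirected ring with n nodes (illustrative topology)."""
    return [[(i - 1) % n, (i + 1) % n] for i in range(n)]


def simulate_competing_sis(
    neighbors: Sequence[Sequence[int]],
    beta: Tuple[float, float],    # adoption (infection) rates of products 1 and 2
    delta: Tuple[float, float],   # abandonment (recovery) rates of products 1 and 2
    x0: float = 0.1,              # initial adoption probability of product 1 per node
    y0: float = 0.1,              # initial adoption probability of product 2 per node
    dt: float = 0.05,
    steps: int = 4000,
) -> Tuple[float, float]:
    """Euler-integrate mean-field competing SIS dynamics; return average market shares."""
    n = len(neighbors)
    x = [x0] * n
    y = [y0] * n
    for _ in range(steps):
        new_x, new_y = [], []
        for i in range(n):
            susceptible = 1.0 - x[i] - y[i]  # a node holds at most one product
            pressure_x = sum(x[j] for j in neighbors[i])
            pressure_y = sum(y[j] for j in neighbors[i])
            dx = beta[0] * susceptible * pressure_x - delta[0] * x[i]
            dy = beta[1] * susceptible * pressure_y - delta[1] * y[i]
            new_x.append(x[i] + dt * dx)
            new_y.append(y[i] + dt * dy)
        x, y = new_x, new_y
    return sum(x) / n, sum(y) / n


if __name__ == "__main__":
    strong, weak = simulate_competing_sis(
        ring_graph(6), beta=(0.5, 0.3), delta=(0.5, 0.5)
    )
    print(f"stronger product share: {strong:.3f}, weaker product share: {weak:.6f}")
```

With these illustrative parameters the product with the higher adoption rate settles at a positive share while the weaker one decays toward zero, which is the situation the targeted-advertisement approach in the abstract is meant to change.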

Throw Your Hat in the Ring (of Wikipedia): Exploring Urban-Rural Disparities in Local Politicians' Information Supply

2024-05-28T05:01:23-07:00

In this era of digital politics, understanding the factors that influence the supply of political information is important. This study investigates the relationship between socio-economic status and the political information supplied on Wikipedia. To this end, it employs a dataset of politicians who ran for local elections in Japan over approximately 20 years and discovers that the creation and revision of local politicians' pages are associated with socio-economic factors such as the employment ratio by industry and age distribution. We find that the majority of the suppliers of politicians' information are unregistered and, compared to registered users, primarily interested in politicians' pages. Additional analysis reveals that users who supply information about politicians before and after an election are more active on Wikipedia than the average user. The findings presented imply that the information supply on Wikipedia, which relies on voluntary contributions, may reflect regional socio-economic disparities.

Sensemaking about Contraceptive Methods across Online Platforms

2024-05-28T05:01:25-07:00

Selecting a birth control method is a complex healthcare decision. While birth control methods provide important benefits, they can also cause unpredictable side effects and be stigmatized, leading many people to seek additional information online, where they can privately find reviews, advice, hypotheses, and experiences of other birth control users. However, the relationships between their healthcare concerns, sensemaking activities, and online settings are not well understood. We gather texts about birth control shared on Twitter and Reddit—popular communities with different affordances, moderation, and audiences—to study where and how birth control is discussed online. Using a combination of topic modeling and hand annotation, we identify and characterize the dominant sensemaking practices across these platforms, and we create lexica to draw comparisons across birth control methods and side effects. We use these to measure variations from survey reports of side effect experiences, highlighting topics that social media users discuss more than expected online. Our findings characterize how online platforms are used to make sense of difficult healthcare choices, including analyzing risks, calculating timing and dosages, hypothesizing about causes of side effects, and storytelling about painful experiences. We contribute both to understanding unmet needs of birth control users and to exploring context-specific patterns in social media discussions.

Don’t Break the Chain: Measuring Message Forwarding on WhatsApp

2024-05-28T05:01:26-07:00

WhatsApp has evolved into a popular communication tool, facilitating the exchange of billions of multimedia messages globally. With its large public groups and forwarding features, the platform has enabled messages to go viral, rapidly disseminating across the WhatsApp network. This has also brought WhatsApp to a central position in spreading misinformation campaigns, prompting the company to implement measures to counter bulk message dissemination, such as limiting simultaneous forwards and flagging viral content. Despite these measures, there remains a gap in our understanding of how forwarded messages function within this ecosystem and of the effectiveness of the restrictions in containing the spread of viral content. In this study, we analyze approximately 10 million messages from 1,101 public WhatsApp groups dedicated to political discussion in Brazil, focusing on forwarded content. We investigate the structure of message forwarding, assess the reach of the Forwarded Many Times (FMT) labeling mechanism, and evaluate the platform's ability to detect and flag duplicated media. Our findings reveal that forwarded messages constitute a substantial portion of the content shared in public WhatsApp groups. Moreover, we discover that the measures implemented by WhatsApp to restrict the dissemination of such messages can be easily circumvented, allowing users to intentionally bypass the architecture of the system and share media beyond the imposed limits. Notably, we identify that 59% of duplicated content flagged as FMT by WhatsApp does not receive the corresponding flag and find evidence of misinformation circulating virally in those groups. This research provides valuable insights into the dynamics of forwarded messages on WhatsApp and highlights the need for more effective strategies to combat the spread of viral content within the platform.

News Media and Violence against Women: Understanding Framings of Stigma

2024-05-28T05:01:29-07:00

Discussions of Violence Against Women (VAW) in publicly accessible forums like online news media can influence the perceptions of people and organizations. Language reinforcing stigma around VAW can result in negative consequences such as unethical representation of survivors and trivialization of the act of violence. In this work, we study the presence of stigmatized framings in news media and how it differs based on media attributes like regionality, political leaning, veracity, and latent communities of news sources. We also investigate the interactions between VAW-based stigma and 14 issue-generic policies used to describe political communications. We found that articles from national, right-leaning, and conspiratorial news sources contain more stigma compared to their counterparts. Furthermore, alignment of articles to the issue-generic policies offers the highest explanation for the presence of stigma in news articles. We discuss implications for institutions to improve safe reporting guidelines on VAW.

Reviewing War: Unconventional User Reviews as a Side Channel to Circumvent Information Controls

2024-05-28T05:01:31-07:00

During the first days of the 2022 Russian invasion of Ukraine, Russia's media regulator blocked access to many global social media platforms and news sites, including Twitter, Facebook, and the BBC. To bypass the information controls set by Russian authorities, pro-Ukrainian groups explored unconventional ways to reach out to the Russian population, such as posting war-related content in the user reviews of Russian businesses available on Google Maps or Tripadvisor. This paper provides a first analysis of this new phenomenon by analyzing the unconventional strategies used to avoid state censorship in the Russian Federation during the conflict. Specifically, we analyze reviews posted on these platforms from the beginning of the war to September 2022. We measure the channeling of war-related messages through user reviews on Tripadvisor and Google Maps. Our analysis of the content posted on these services reveals that users leveraged these platforms to seek and exchange humanitarian and travel advice, but also to disseminate disinformation and polarized messages. Finally, we analyze the response of platforms in terms of content moderation and their impact.

Wildlife Product Trading in Online Social Networks: A Case Study on Ivory-Related Product Sales Promotion Posts

2024-05-28T05:01:32-07:00

Wildlife trafficking (WLT) has evolved into a pressing global concern, as traffickers increasingly utilize online platforms such as e-commerce websites and social networks to expand their illicit trade. This paper addresses the pivotal challenge of detecting and recognizing promotional behaviors related to the sale of wildlife products within online social networks—a critical step in combating these environmentally detrimental activities. To confront these illicit operations effectively, our research undertakes the following key initiatives: 1. Data Collection and Labeling: We employ a network-based approach to gather a scalable dataset pertaining to wildlife product trading. Through a human-in-the-loop machine learning process, this dataset is meticulously labeled, distinguishing between positive class samples containing wildlife product selling posts and hard-negatives representing regular posts misclassified as potential WLT posts, subsequently rectified by human annotators. 2. Machine Learning Framework Development: We present a robust framework that benchmarks machine learning results on the collected dataset. This framework autonomously identifies suspicious wildlife selling posts and accounts, effectively harnessing the multi-modal nature of online social networks. 3. In-depth Analysis of Trading Behaviors: Our research delves into a comprehensive analysis of trading posts, illuminating the systematic and organized selling behaviors prevalent in the current landscape. By providing detailed insights into the nature of these behaviors, we contribute valuable information for understanding and countering illegal wildlife product trading. Moreover, we emphasize our commitment to openness and collaboration by making our code and dataset openly available, thereby fostering cooperative efforts towards the development of more effective strategies in combating illegal wildlife trafficking.

Auditing Algorithmic Explanations of Social Media Feeds: A Case Study of TikTok Video Explanations

2024-05-28T05:01:34-07:00

In recent years, user feeds on social media platforms have shifted from simple, chronologically ordered content posted by their network connections (i.e., friends) to opaque, algorithmically ordered and curated content. This shift has led to regulations that require platforms to offer end users greater transparency and control over their algorithmic recommendation-based feeds. In response, social media platforms such as TikTok have recently started explaining why specific videos are recommended to end users. However, we still lack a good understanding of how these explanations are generated and whether they offer the desired transparency to end users. In this work, we audit explanations provided on short-format videos on TikTok. We collect a large dataset of short-format videos and explanations provided by TikTok (when available) using automated sockpuppet accounts. Then, we systematically characterize the explanations, focusing on their accuracy and comprehensiveness. For our assessments, we compare the provided explanations with video metadata and the behavior of our sockpuppet accounts. Our analysis shows that some generic (non-personalized) reasons are always included in explanations (e.g., "This video is popular in your country"). At the same time, we find that a large number of the provided explanations are incompatible with the behavior of our sockpuppet accounts (e.g., an account that made zero comments on the platform was presented with the explanation "You commented on similar videos" in 34% of all recommended videos). Overall, our audit of TikTok video explanations highlights the need for more accurate, fine-grained, and useful explanations for end users. We will make our code and dataset available to assist the research community.

Understanding Coordinated Communities through the Lens of Protest-Centric Narratives: A Case Study on #CAA Protest

2024-05-28T05:01:36-07:00

Social media platforms, particularly Twitter, have emerged as vital media for organizing online protests worldwide. During protests, users on social media share different narratives, often coordinated to share collective opinions and obtain widespread reach. In this paper, we focus on the communities formed during a protest and the collective narratives they share, using the protest against the enactment of the Citizenship Amendment Act (#CAA) by the Indian Government as a case study. Since the #CAA protest led to divergent discourse in the country, we first classify the users into opposing stances, i.e., protesters (who opposed the Act) and counter-protesters (who supported it), in an unsupervised manner. Next, we identify the coordinated communities in the opposing stances and examine the collective narratives they share. We use content-based metrics to identify user coordination, including hashtags, mentions, and retweets. Our results suggest that mentions are the strongest metric for coordination across the opposing stances. Next, we decipher the collective narratives in the opposing stances using an unsupervised narrative detection framework and find call-to-action, on-ground activity, grievance-sharing, questioning, and skepticism narratives in the protest tweets. We analyze the strength of the different coordinated communities using network measures, and perform inauthentic activity analysis on the most coordinated communities on both sides. Our findings also suggest that the coordinated communities that were highly inauthentic showed the highest clustering coefficient, indicating a greater extent of coordination.

Measuring Moral Dimensions in Social Media with Mformer

2024-05-28T05:01:37-07:00

The ever-growing textual records of contemporary social issues, often discussed online with moral rhetoric, present both an opportunity and a challenge for studying how moral concerns are debated in real life. Moral foundations theory is a taxonomy of intuitions widely used in data-driven analyses of online content, but current computational tools to detect moral foundations suffer from the incompleteness and fragility of their lexicons and from poor generalization across data domains. In this paper, we fine-tune a large language model to measure moral foundations in text based on datasets covering news media and long- and short-form online discussions. The resulting model, called Mformer, outperforms existing approaches on the same domains by 4–12% in AUC and further generalizes well to four commonly used moral text datasets, improving by up to 17% in AUC. We present case studies using Mformer to analyze everyday moral dilemmas on Reddit and controversies on Twitter, showing that moral foundations can meaningfully describe people’s stance on social issues and such variations are topic-dependent. Pretrained model and datasets are released publicly. We posit that Mformer will help the research community quantify moral dimensions for a range of tasks and data domains, and eventually contribute to the understanding of moral situations faced by humans and machines.

How to Improve Representation Alignment and Uniformity in Graph-Based Collaborative Filtering?

2024-05-28T05:01:39-07:00

Collaborative filtering (CF) is a prevalent technique utilized in recommender systems (RSs), and has been extensively deployed in various real-world applications. A recent study in CF focuses on improving the quality of representations from the perspective of alignment and uniformity on the hyperspheres for enhanced recommendation performance. It promotes alignment to increase the similarity between representations of interacting users and items, and enhances uniformity to have more uniformly distributed user and item representations within their respective hyperspheres. However, although alignment and uniformity are enforced by two different optimized objectives, respectively, they jointly constitute the supervised signals for model training. Models trained with only supervised signals in labeled data can inevitably overfit the noise introduced by label sampling variance, even with i.i.d. datasets. This overfitting to noise further compromises the model's generalizability and performance on unseen testing data. To address this issue, in this study, we aim to mitigate the effect caused by the sampling variance in labeled training data to improve representation generalizability from the perspective of alignment and uniformity. Representations with more generalized alignment and uniformity further lead to improved model performance on testing data. Specifically, we model the data as a user-item interaction bipartite graph, and apply a graph neural network (GNN) to learn the user and item representations. This graph modeling approach allows us to integrate self-supervised signals into the RS, by performing self-supervised contrastive learning on the user and item representations from the perspective of label-irrelevant alignment and uniformity. Since the representations are less dependent on label supervision, they can capture more label-irrelevant data structures and patterns, leading to more generalized alignment and uniformity. 
We conduct extensive experiments on three benchmark datasets to demonstrate the superiority of our framework (i.e., improved performance and faster convergence speed). Our code: https://github.com/zyouyang/AUPlus
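
As an illustration of the two quantities discussed above, the following minimal sketch computes alignment and uniformity for embeddings normalized onto the unit hypersphere. It assumes the commonly used definitions (mean squared distance between interacting user-item pairs for alignment; log of the mean Gaussian potential over all pairs for uniformity). The function names and the temperature `t=2.0` are illustrative assumptions, not taken from the paper or its released code.

```python
import math
from itertools import combinations


def l2_normalize(vec):
    """Project a vector onto the unit hypersphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def sq_dist(a, b):
    """Squared Euclidean distance between two vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def alignment(user_vecs, item_vecs):
    """Mean squared distance between embeddings of interacting (user, item) pairs.

    user_vecs[k] and item_vecs[k] are assumed to form one observed interaction.
    Lower values mean interacting users and items lie closer together.
    """
    pairs = [(l2_normalize(u), l2_normalize(i)) for u, i in zip(user_vecs, item_vecs)]
    return sum(sq_dist(u, i) for u, i in pairs) / len(pairs)


def uniformity(vecs, t=2.0):
    """Log of the mean Gaussian potential over all distinct pairs.

    Lower (more negative) values mean the embeddings are spread more evenly.
    """
    normed = [l2_normalize(v) for v in vecs]
    potentials = [math.exp(-t * sq_dist(a, b)) for a, b in combinations(normed, 2)]
    return math.log(sum(potentials) / len(potentials))
```

In this sketch, a training objective would typically minimize a weighted sum of `alignment` and the `uniformity` of the user and item sets; how the self-supervised signals enter that objective is specific to the paper and is not reproduced here.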

The Manifestation of Affective Polarization on Social Media: A Cross-Platform Supervised Machine Learning Approach

2024-05-28T05:01:41-07:00

This project explores how affective polarization, hostility towards people’s political adversaries, manifests on social media. Whereas prior attempts have relied on sentiment analysis and bag-of-words approaches, we use supervised machine learning to capture the nuances of affective polarization in text on social media. Specifically, we fine-tune BERT to build a classifier that identifies expressions of affective polarization in posts shared on Facebook or Twitter during the first six months of the COVID-19 pandemic (n = 8,603,695). Focusing on this context allows us to study how affective polarization evolved on social media as the COVID-19 issue went from unfamiliar to highly political. We explore the temporal dynamics of affective polarization on Facebook and Twitter using ARIMA models and an outlier analysis of the first few months of the pandemic. Further, we examine the interplay between affective polarization and virality across the two platforms. The findings have important implications for those seeking to (1) capture affective polarization in text, and (2) understand how affective polarization manifests on social media. These implications are discussed.

Party Prediction for Twitter

2024-05-28T05:01:42-07:00

A large number of studies on social media compare the behaviour of users from different political parties. As a basic step, they employ a predictive model for inferring their political affiliation. The accuracy of this model can change the conclusions of a downstream analysis significantly, yet the choice between different models seems to be made arbitrarily. In this paper, we provide a comprehensive survey and an empirical comparison of the current party prediction practices and propose several new approaches which are competitive with or outperform state-of-the-art methods, yet require less computational resources. Party prediction models rely on the content generated by the users (e.g., tweet texts), the relations they have (e.g., who they follow), or their activities and interactions (e.g., which tweets they like). We examine all of these and compare their signal strength for the party prediction task. This paper lets the practitioner select from a wide range of data types that all give strong performance. Finally, we conduct extensive experiments on different aspects of these methods, such as data collection speed and transfer capabilities, which can provide further insights for both applied and methodological research.

Where Did the President Visit Last Week? Detecting Celebrity Trips from News Articles

2024-05-28T05:01:43-07:00

Celebrities’ whereabouts are of pervasive importance. For instance, where politicians go, how often they visit, and who they meet, come with profound geopolitical and economic implications. Although news articles contain travel information of celebrities, it is not possible to perform large-scale and network-wise analysis due to the lack of automatic itinerary detection tools. To design such tools, we have to overcome difficulties from the heterogeneity among news articles: 1) One single article can be noisy, with irrelevant people and locations, especially when the articles are long. 2) Though it may be helpful if we consider multiple articles together to determine a particular trip, the key semantics are still scattered across different articles intertwined with various noises, making it hard to aggregate them effectively. 3) Over 20% of the articles refer to celebrity trips indirectly, instead of using the exact celebrity names or location names, leading to large portions of trips escaping regular detection algorithms. We model text content across articles related to each candidate location as a graph to better associate essential information and cancel out the noises. Besides, we design a special pooling layer based on an attention mechanism and node similarity, reducing irrelevant information from longer articles. To make up for the missing information resulting from indirect mentions, we construct knowledge sub-graphs for named entities (person, organization, facility, etc.). Specifically, we dynamically update embeddings of event entities like the G7 summit from news descriptions since the properties (date and location) of the event change each time, which is not captured by pre-trained event representations. The proposed CeleTrip jointly trains these modules and outperforms all baseline models, achieving 82.53% in the F1 metric.
By open-sourcing the first tool and a carefully curated dataset for such a task, we hope to facilitate relevant research in celebrity itinerary mining as well as the social and political analysis built upon the extracted trips.

Online Social Behavior Enhanced Detection of Political Stances in Tweets

2024-05-28T05:01:46-07:00

Public opinion plays a pivotal role in politics, influencing political leaders' decisions, shaping election outcomes, and impacting policy-making processes. In today's digital age, the abundance of political discourse available on social media platforms has become an invaluable resource for analyzing public opinion. This paper focuses on the task of detecting political stances in the context of the 2020 US presidential election. To facilitate this research, we curate a substantial dataset sourced from Twitter, annotated using hashtags as indicators of political polarity. In our approach, we construct a bipartite graph that explicitly models user-tweet interactions, which provides a comprehensive contextual understanding of the election. To effectively leverage the wealth of user behavioral information encoded in this graph, we adopt graph convolution and introduce a novel skip aggregation mechanism. This mechanism enables tweet nodes to aggregate information from their second-order neighbors, which are also tweet nodes due to the graph's bipartite nature. Our experimental results demonstrate that our proposed model outperforms a range of competitive baseline models. Furthermore, our in-depth analyses highlight the importance of user behavioral information and the effectiveness of skip aggregation.
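
The idea that a tweet's second-order neighbors in a bipartite user-tweet graph are themselves tweets can be sketched as follows. This is a minimal illustration, not the paper's model: it assumes plain mean aggregation over two-hop tweet neighbors, and the function name `skip_aggregate` and the dictionary-based inputs are hypothetical.

```python
def skip_aggregate(tweet_feats, tweet_users):
    """Aggregate features from each tweet's two-hop (tweet) neighbors.

    tweet_feats: dict mapping tweet id -> list of floats (feature vector)
    tweet_users: dict mapping tweet id -> set of user ids that interacted with it
    Returns a dict mapping tweet id -> mean feature vector of the other tweets
    reachable via a shared user. Tweets without such neighbors keep their own
    features (an assumption made for this sketch).
    """
    # Invert the bipartite edges: user -> tweets the user interacted with.
    user_tweets = {}
    for tweet, users in tweet_users.items():
        for user in users:
            user_tweets.setdefault(user, set()).add(tweet)

    aggregated = {}
    for tweet, users in tweet_users.items():
        # Two hops: tweet -> user -> tweet.
        neighbors = set()
        for user in users:
            neighbors |= user_tweets[user]
        neighbors.discard(tweet)

        if not neighbors:
            aggregated[tweet] = list(tweet_feats[tweet])
            continue

        dim = len(tweet_feats[tweet])
        aggregated[tweet] = [
            sum(tweet_feats[n][d] for n in neighbors) / len(neighbors)
            for d in range(dim)
        ]
    return aggregated
```

In a graph convolution setting, the aggregated vector would usually be combined with the tweet's own representation and passed through learned weights; those details are left out here because the abstract does not specify them.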

Narratives of Collective Action in YouTube’s Discourse on Veganism

2024-05-28T05:01:48-07:00

Narratives can be powerful tools for inspiring action on pressing societal issues such as climate change. While social science theories offer frameworks for understanding the narratives that arise within collective movements, these are rarely applied to the vast data available from social media platforms, which play a significant role in shaping public opinion and mobilizing collective action. This gap in the empirical evaluation of online narratives limits our understanding of their relationship with public response. In this study, we focus on plant-based diets as a form of pro-environmental action and employ Natural Language Processing to operationalize a theoretical framework of moral narratives specific to the vegan movement. We apply this framework to narratives found in YouTube videos promoting environmental initiatives such as Veganuary, Meatless March, and No Meat May. Our analysis reveals that several narrative types, as defined by the theory, are empirically present in the data. To identify narratives with the potential to elicit positive public engagement, we used text processing to estimate the proportion of comments supporting collective action across narrative types. Video narratives advocating social fight, whether through protest or through efforts to convert others to the cause, are associated with a stronger sense of collective action in the respective comments. These narrative types also demonstrate increased semantic coherence and alignment between the message and public response, markers typically associated with successful collective action. Our work offers new insights into the complex factors that influence the emergence of collective action, thereby informing the development of effective communication strategies within social movements.

Characterizing Political Campaigning with Lexical Mutants on Indian Social Media

2024-05-28T05:01:49-07:00

Increasingly, online platforms are becoming popular arenas of political amplification in India. With known instances of pre-organized coordinated operations, researchers are questioning the legitimacy of political expression and its consequences on the democratic processes in India. In this paper, we study an evolved form of political amplification by first identifying and then characterizing political campaigns with lexical mutations. By lexical mutation, we mean content that is reframed, paraphrased, or altered while preserving the same underlying message. Using multilingual embeddings and network analysis, we detect over 3.8K political campaigns with text mutations spanning multiple languages and social media platforms in India. By further assessing the political leanings of accounts repeatedly involved in such amplification campaigns, we contribute a broader understanding of how political amplification is used across various political parties in India. Moreover, our temporal analysis of the largest amplification campaigns suggests that political campaigning can evolve as temporally ordered arguments and counter-arguments between groups with competing political interests. Overall, our work contributes insights into how lexical mutations can be leveraged to bypass the platform manipulation policies and how such competing campaigning can provide an exaggerated sense of political divide on Indian social media.
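
One simple way to combine embeddings with network analysis, as described above, is to link posts whose embeddings are highly similar and treat each connected component as a candidate campaign. The sketch below assumes precomputed embedding vectors, cosine similarity, and a threshold of 0.9; all three are illustrative assumptions rather than the paper's actual pipeline.

```python
import math


def cosine(a, b):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def group_mutations(embeddings, threshold=0.9):
    """Group posts into candidate campaigns of lexical mutants.

    Posts i and j are linked when their cosine similarity is at least
    `threshold`; connected components of the resulting similarity graph are
    returned as sorted lists of post indices.
    """
    n = len(embeddings)
    parent = list(range(n))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i + 1, n):
            if cosine(embeddings[i], embeddings[j]) >= threshold:
                parent[find(i)] = find(j)

    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())
```

The pairwise loop is quadratic in the number of posts, so a study at the scale described would need an approximate nearest-neighbor index or blocking step instead.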

Curious Rhythms: Temporal Regularities of Wikipedia Consumption

2024-05-28T05:01:51-07:00

Wikipedia, in its role as the world's largest encyclopedia, serves a broad range of information needs. Although previous studies have noted that Wikipedia users' information needs vary throughout the day, there is to date no large-scale, quantitative study of the underlying dynamics. The present paper fills this gap by investigating temporal regularities in daily consumption patterns in a large-scale analysis of billions of timezone-corrected page requests mined from English Wikipedia's server logs, with the goal of investigating how context and time relate to the kind of information consumed. First, we show that even after removing the global pattern of day-night alternation, the consumption habits of individual articles maintain strong diurnal regularities. Then, we characterize the prototypical shapes of consumption patterns, finding a particularly strong distinction between articles preferred during the evening/night and articles preferred during working hours. Finally, we investigate topical and contextual correlates of Wikipedia articles' access rhythms, finding that article topic, reader country, and access device (mobile vs. desktop) are all important predictors of daily attention patterns. These findings shed new light on how humans seek information on the Web by focusing on Wikipedia as one of the largest open platforms for knowledge and learning, emphasizing Wikipedia's role as a rich knowledge base that fulfills information needs spread throughout the day, with implications for understanding information seeking across the globe and for designing appropriate information systems.

Community Notes vs. Snoping: How the Crowd Selects Fact-Checking Targets on Social Media

2024-05-28T05:01:53-07:00

Deploying links to professional fact-checking websites (so-called “snoping”) is a common misinformation intervention technique that can be used by social media users to refute misleading claims made by others. However, the real-world effect of snoping may be limited as it suffers from low visibility and distrust towards professional fact-checkers. As a remedy, X (formerly known as Twitter) recently launched its community-based fact-checking system “Community Notes” on which fact-checks are carried out by actual X users and directly shown on the fact-checked posts. Yet, an understanding of how fact-checking via Community Notes differs from regular snoping is largely absent. In this study, we empirically analyze differences in how contributors to Community Notes and Snopers select their targets when fact-checking social media posts. For this purpose, we collect and holistically analyze two unique datasets from X: (a) 25,912 community-created fact-checks from X's Community Notes platform, and (b) 52,505 “snopes” that debunk posts via fact-checking replies that link to professional fact-checking websites. We find that Notes contributors and Snopers focus on different targets when fact-checking social media content. For instance, Notes contributors tend to fact-check posts from larger accounts with higher social influence and are relatively less likely to emphasize the accuracy of non-misleading posts. Fact-checking targets of Notes contributors and Snopers rarely overlap; however, those overlapping exhibit a high level of agreement in the fact-checking assessment. Moreover, we demonstrate that Snopers fact-check social media posts at a higher speed. Altogether, our findings imply that different fact-checking approaches – carried out on the same social media platform – can result in vastly different social media posts getting fact-checked. 
This has important implications for future research on misinformation, which should not rely on a single fact-checking approach when compiling misinformation datasets. From a practical perspective, our findings imply that different fact-checking approaches complement each other and may help social media providers to optimize strategies to combat misinformation on their platforms.

How COVID-19 Has Impacted the Anti-vaccine Discourse: A Large-Scale Twitter Study Spanning Pre-COVID and Post-COVID Era

2024-05-28T05:01:54-07:00

The debate around vaccines has been going on for decades, but the COVID-19 pandemic showed how crucial it is to understand and mitigate anti-vaccine sentiments. While the pandemic may be over, it is still important to understand how the pandemic affected the anti-vaccine discourse, and whether the arguments against non-COVID vaccines have also changed due to the pandemic. This study attempts to answer these questions through a large-scale study of anti-vaccine posts on Twitter. Almost all prior works that utilized social media to understand anti-vaccine opinions considered only the three broad stances of Anti-Vax, Pro-Vax, and Neutral. There has not been any effort to identify the specific reasons/concerns behind the anti-vax sentiments (e.g., side-effects, conspiracy theories, political reasons) on social media at scale. In this work, we propose two novel methods for classifying tweets into 11 different anti-vax concerns -- a discriminative approach (entailment-based) and a generative approach (based on instruction tuning of LLMs) -- which outperform several strong baselines. We then apply this classifier on anti-vaccine tweets posted over a 5-year period (Jan 2018 - Jan 2023) to understand how the COVID-19 pandemic has impacted the anti-vaccine concerns among the masses. We find that the pandemic has made the anti-vaccine discourse far more complex than in the pre-COVID times, and increased the variety of concerns being voiced. Alarmingly, we find that concerns about COVID vaccines are now being projected onto the non-COVID vaccines, thus making more people hesitant in taking them in post-COVID times.

Diversity and Inclusion in the Sharing Economy: An Airbnb Case Study

2024-05-28T05:01:56-07:00

The sharing economy model is a contested concept: on one hand, its proponents have praised it as an enabler of fair marketplaces, with all participants receiving equal opportunities; on the other hand, its detractors have criticised it for actually exacerbating preexisting societal inequalities. In this paper, we propose a scalable quantitative method to measure participants' diversity and inclusion in such marketplaces, with the aim of offering evidence to ground this debate. We apply the method to the case of the Airbnb hospitality service for the city of London, UK. Our findings reveal that diversity is high for gender, but not so for age and ethnicity. As for inclusion, we find strong signals of homophily in terms of gender, age and ethnicity, thus suggesting that under-represented groups have significantly fewer opportunities to gain from this market model. Interestingly, the sentiment associated with same-group (homophilic) interactions is just as positive as that associated with heterophilic ones, even after controlling for the Airbnb property's type, price and location. This suggests that increased diversity and inclusion are desirable not only for moral but also for economic and market reasons.

A Deep Dive into the Disparity of Word Error Rates across Thousands of NPTEL MOOC Videos

2024-05-28T05:01:57-07:00

Automatic speech recognition (ASR) systems are designed to transcribe spoken language into written text and find utility in a variety of applications including voice assistants and transcription services. However, it has been observed that state-of-the-art ASR systems, which deliver impressive benchmark results, struggle with speakers of certain regions or demographics due to variation in their speech properties. In this work, we describe the curation of a massive speech dataset of 8740 hours consisting of ~9.8K technical lectures in the English language along with their transcripts delivered by instructors representing various parts of Indian demography. The dataset is sourced from the very popular NPTEL MOOC platform. We use the curated dataset to measure the existing disparity in YouTube Automatic Captions and OpenAI Whisper model performance across the diverse demographic traits of speakers in India. While there exists disparity due to gender, native region, age and speech rate of speakers, disparity based on caste is non-existent. We also observe statistically significant disparity across the disciplines of the lectures. These results indicate the need for more inclusive and robust ASR systems and more representational datasets for disparity evaluation in them.
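
For readers unfamiliar with the metric in the title, word error rate (WER) is the word-level edit distance between a reference transcript and the system output, divided by the number of reference words. The sketch below is a minimal implementation under simple assumptions: whitespace tokenization and no text normalization (casing, punctuation), which real evaluations usually apply first. The function name is illustrative.

```python
def word_error_rate(reference, hypothesis):
    """Compute WER = (substitutions + deletions + insertions) / reference length.

    Both arguments are plain strings, tokenized on whitespace.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(min(
                prev[j] + 1,                      # deletion of a reference word
                curr[j - 1] + 1,                  # insertion of a hypothesis word
                prev[j - 1] + substitution_cost,  # match or substitution
            ))
        prev = curr
    return prev[-1] / len(ref)
```

Comparing the mean of this quantity across speaker groups (for example by gender or native region) is one straightforward way to express the kind of disparity the abstract reports; the paper's exact statistical procedure is not described here.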

Theme-Driven Keyphrase Extraction to Analyze Social Media Discourse

2024-05-28T05:01:58-07:00

Social media platforms are vital resources for sharing self-reported health experiences, offering rich data on various health topics. Despite advancements in Natural Language Processing (NLP) enabling large-scale social media data analysis, a gap remains in applying keyphrase extraction to health-related content. Keyphrase extraction is used to identify salient concepts in social media discourse without being constrained by predefined entity classes. This paper introduces a theme-driven keyphrase extraction framework tailored for social media, a pioneering approach designed to capture clinically relevant keyphrases from user-generated health texts. Themes are defined as broad categories determined by the objectives of the extraction task. We formulate this novel task of theme-driven keyphrase extraction and demonstrate its potential for efficiently mining social media text for the use case of treatment for opioid use disorder. This paper leverages qualitative and quantitative analysis to demonstrate the feasibility of extracting actionable insights from social media data and efficiently extracting keyphrases using minimally supervised NLP models. Our contributions include the development of a novel data collection and curation framework for theme-driven keyphrase extraction and the creation of SuboxoPhrase, the first dataset of its kind comprising human-annotated keyphrases from a Reddit community. We also identify the scope of minimally supervised NLP models to extract keyphrases from social media data efficiently. Lastly, we found that a large language model (ChatGPT) outperforms unsupervised keyphrase extraction models, showcasing its efficacy in this task.

Does It Matter Who Said It? Exploring the Impact of Deepfake-Enabled Profiles on User Perception towards Disinformation

2024-05-28T05:02:00-07:00

Recently, deepfake techniques have been adopted by real-world adversaries to fabricate believable personas (posing as experts or insiders) in disinformation campaigns to promote false narratives and deceive the public. In this paper, we investigate how fake personas influence the user perception of the disinformation shared by such accounts. Using Twitter as an exemplary platform, we conduct a user study (N=417) where participants read tweets of fake news with (and without) the presence of the tweet authors' profiles. Our study examines and compares three types of fake profiles: deepfake profiles, profiles of relevant organizations, and simple bot profiles. Our results highlight the significant impact of deepfake and organization profiles on increasing the perceived information accuracy of and engagement with fake news. Moreover, deepfake profiles are rated as significantly more real than other profile types. Finally, we observe that users may like/reply/share a tweet even though they believe it was inaccurate (e.g., for fun or truth-seeking), which could further disseminate false information. We then discuss the implications of our findings and directions for future research.

Stranger Danger! Cross-Community Interactions with Fringe Users Increase the Growth of Fringe Communities on Reddit

2024-05-28T05:02:02-07:00

Fringe communities promoting conspiracy theories and extremist ideologies have thrived on mainstream platforms, raising questions about the mechanisms driving their growth. Here, we hypothesize and study a possible mechanism: new members may be recruited through fringe-interactions: the exchange of comments between members and non-members of fringe communities. We apply text-based causal inference techniques to study the impact of fringe-interactions on the growth of three prominent fringe communities on Reddit: r/Incel, r/GenderCritical, and r/The_Donald. Our results indicate that fringe-interactions attract new members to fringe communities. Users who receive these interactions are up to 4.2 percentage points (pp) more likely to join fringe communities than similar, matched users who do not. This effect is influenced by 1) the characteristics of communities where the interaction happens (e.g., left vs. right-leaning communities) and 2) the language used in the interactions. Interactions using toxic language have a 5pp higher chance of attracting newcomers to fringe communities than non-toxic interactions. We find no effect when repeating this analysis by replacing fringe (r/Incel, r/GenderCritical, and r/The_Donald) with non-fringe communities (r/climatechange, r/NBA, r/leagueoflegends), suggesting this growth mechanism is specific to fringe communities. Overall, our findings suggest that curtailing fringe interactions may reduce the growth of fringe communities on mainstream platforms.

TUBERAIDER: Attributing Coordinated Hate Attacks on YouTube Videos to Their Source Communities

2024-05-28T05:02:04-07:00

Alas, coordinated hate attacks, or raids, are becoming increasingly common online. In a nutshell, these are perpetrated by a group of aggressors who organize and coordinate operations on a platform (e.g., 4chan) to target victims on another community (e.g., YouTube). In this paper, we focus on attributing raids to their source community, paving the way for moderation approaches that take the context (and potentially the motivation) of an attack into consideration. We present TUBERAIDER, an attribution system achieving over 75% accuracy in detecting and attributing coordinated hate attacks on YouTube videos. We instantiate it using links to YouTube videos shared on 4chan's /pol/ board, r/The_Donald, and 16 Incels-related subreddits. We use a peak detector to identify a rise in the comment activity of a YouTube video, which signals that an attack may be occurring. We then train a machine learning classifier based on the community language (i.e., TF-IDF scores of relevant keywords) to perform the attribution. We test TUBERAIDER in the wild and present a few case studies of actual aggression attacks identified by it to showcase its effectiveness.
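
The peak-detection step described above can be illustrated with a rolling z-score over a video's comment counts. This is a minimal sketch under stated assumptions, not the detector used by TUBERAIDER: the window length, the threshold of 3 standard deviations, and the `min_sigma` floor are all hypothetical parameters.

```python
from statistics import mean, pstdev


def detect_peaks(counts, window=5, z_threshold=3.0, min_sigma=1.0):
    """Return indices where comment activity spikes relative to recent history.

    counts: comments per time bin (e.g., per hour) for one video.
    A bin is flagged when it exceeds the mean of the previous `window` bins by
    at least `z_threshold` standard deviations. `min_sigma` floors the standard
    deviation so that a flat history does not flag tiny fluctuations.
    """
    peaks = []
    for i in range(window, len(counts)):
        history = counts[i - window:i]
        mu = mean(history)
        sigma = max(pstdev(history), min_sigma)
        if (counts[i] - mu) / sigma >= z_threshold:
            peaks.append(i)
    return peaks
```

A flagged bin only signals that an attack may be occurring; in the pipeline the abstract describes, attribution to a source community is a separate step handled by a classifier over TF-IDF features.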

Unveiling the Risks of NFT Promotion Scams

2024-05-28T05:02:05-07:00

The rapid growth in popularity and hype surrounding digital assets such as art, video, and music in the form of non-fungible tokens (NFTs) has made them a lucrative investment opportunity, with NFT-based sales surpassing $25B in 2021 alone. However, the volatility and general lack of technical understanding of the NFT ecosystem have led to the spread of various scams. The success of an NFT heavily depends on its online virality. As a result, creators use dedicated promotion services to drive engagement to their projects on social media websites, such as Twitter. However, these services are also utilized by scammers to promote fraudulent projects that attempt to steal users' cryptocurrency assets, thus posing a major threat to the ecosystem of NFT sales. In this paper, we conduct a longitudinal study of 439 promotion services (accounts) on Twitter that have collectively promoted 823 unique NFT projects through giveaway competitions over a period of two months. Our findings reveal that more than 36% of these projects were fraudulent, comprising phishing, rug pull, and pre-mint scams. We also found that a majority of accounts engaging with these promotions (including those for fraudulent NFT projects) are bots that artificially inflate the popularity of the fraudulent NFT collections by increasing their likes, followers, and retweet counts. This manipulation results in significant engagement from real users, who then invest in these scams. We also identify several shortcomings in existing anti-scam measures, such as blocklists, browser protection tools, and domain hosting services, in detecting NFT-based scams. We utilize our findings to develop and open-source a machine learning classifier tool that was able to proactively detect 382 new fraudulent NFT projects on Twitter.

HyperGraphDis: Leveraging Hypergraphs for Contextual and Social-Based Disinformation Detection

2024-05-28T05:02:07-07:00

In light of the growing impact of disinformation on social, economic, and political landscapes, accurate and efficient identification methods are increasingly critical. This paper introduces HyperGraphDis, a novel approach for detecting disinformation on Twitter that employs a hypergraph-based representation to capture (i) the intricate social structures arising from retweet cascades, (ii) relational features among users, and (iii) semantic and topical nuances. Evaluated on four Twitter datasets -- focusing on the 2016 U.S. presidential election and the COVID-19 pandemic -- HyperGraphDis outperforms existing methods in both accuracy and computational efficiency, underscoring its effectiveness and scalability for tackling the challenges posed by disinformation dissemination. HyperGraphDis displays exceptional performance on a COVID-19-related dataset, achieving an impressive F1 score (weighted) of approximately 89.5%. This result represents a notable improvement of around 4% compared to the other state-of-the-art methods. Additionally, significant enhancements in computation time are observed for both model training and inference. In terms of model training, completion times are accelerated by a factor ranging from 2.3 to 7.6 compared to the second-best method across the four datasets. Similarly, during inference, computation times are 1.3 to 6.8 times faster than the state-of-the-art.

Reliability Matters: Exploring the Effect of AI Explanations on Misinformation Detection with a Warning

2024-05-28T05:02:08-07:00

To mitigate misinformation on social media, platforms such as Facebook have offered warnings to users based on the detection results of AI systems. With the evolution of AI detection systems, efforts have been devoted to applying explainable AI (XAI) to further increase the transparency of AI decision-making. Nevertheless, few factors have been considered to understand the effectiveness of a warning with AI explanations in helping humans detect misinformation. In this study, we report the results of three online human-subject experiments (N = 2,692) investigating the framing effect and the impact of an AI system’s reliability on the effectiveness of AI warning with explanations. Our findings show that the framing effect is effective for participants’ misinformation detection, whereas the AI system’s reliability is critical for humans’ misinformation detection and participants’ trust in the AI system. However, adding the explanations can potentially increase participants’ suspicions on miss errors (i.e., false negatives) in the AI system. Furthermore, more trust is shown in the AI warning without explanations condition. We conclude by discussing the implications of our findings.

A Domain Adaptive Graph Learning Framework to Early Detection of Emergent Healthcare Misinformation on Social Media

2024-05-28T05:02:09-07:00

A fundamental issue in healthcare misinformation detection is the lack of timely resources (e.g., medical knowledge, annotated data), making it challenging to accurately detect emergent healthcare misinformation at an early stage. In this paper, we develop a crowdsourcing-based early healthcare misinformation detection framework that jointly exploits the medical expertise of expert crowd workers and adapts the medical knowledge from a source domain (e.g., COVID-19) to detect misleading posts in an emergent target domain (e.g., Mpox, Polio). Two important challenges exist in developing our solution: (i) How to leverage the complex and noisy knowledge from the source domain to facilitate the detection of misinformation in the target domain? (ii) How to effectively utilize the limited amount of expert workers to correct the inapplicable knowledge facts in the source domain and adapt the corrected facts to examine the truthfulness of the posts in the emergent target domain? To address these challenges, we develop CrowdAdapt, a crowdsourcing-based domain adaptive approach that effectively identifies and adapts relevant knowledge facts from the source domain to accurately detect misinformation in the target domain. Evaluation results from two real-world case studies demonstrate the superiority of CrowdAdapt over state-of-the-art baselines in accurately detecting emergent healthcare misinformation.

The Diffusion of Causal Language in Social Networks

2024-05-28T05:02:11-07:00

Causal reasoning plays a central role in human cognition. It facilitates the ability to infer, predict, and manipulate outcomes within the environment, which in turn lays the foundation for a uniquely adaptive decision-making framework that is crucial in navigating complex problem-solving contexts. With the pervasive influence of social media platforms, these online social networks have become critical for disseminating information, shaping public beliefs, and influencing daily life. However, no study has examined the propagation of causal language within social networks. In this work, we analyze the dispersion of messages containing causal language against those without, within the milieu of a large online social network. With the entirety of messages over one complete day on Twitter along with two additional days for validation, and with our validated ensemble method for identifying causal language, our findings reveal that messages with causal language exhibit a more extensive reach than those without. Furthermore, our counterfactual analysis demonstrates that the effect of causal language on information diffusion is truly causal. Moreover, our findings indicate that messages incorporating causal language manifest a higher ability to spread to out-groups compared to those without. These novel insights reveal the unique diffusion pattern of causal language within social networks, and suggest a potential to mitigate the echo chamber effect, while causal language could serve as a bridge for diverse perspectives.

Characterizing Online Criticism of Partisan News Media Using Weakly Supervised Learning

2024-05-28T05:02:13-07:00

We propose novel methods to identify tweets that criticize partisan news sources. Prior work suggests that criticism, ridicule, and distrust of news media all play important roles in hyperpartisanship, misinformation, and filter bubble formation. Thus, understanding the prevalence and temporal dynamics of media-targeted criticism can provide us with updated tools to assess the health of the information ecosystem. There is a scarcity of labeled data for this task, and we develop a weakly supervised learning approach that leverages multiple noisy labeling functions based on both the content of the tweet as well as the historical news sharing behavior of the user. Using this classifier, we explore how tweets expressing criticism vary by user, news source, and time, finding substantial spikes in media criticism during politically polarizing events, such as the investigation into Russian interference in the 2016 U.S. elections and the 2017 "unite the right" rally in Charlottesville. This type of media-targeting criticism is also more likely to occur after users have been exposed to unreliable and hyperpartisan media.

Forecasting Political News Engagement on Social Media

2024-05-28T05:02:14-07:00

Understanding how political news consumption changes over time can provide insights into issues such as hyperpartisanship, filter bubbles, and misinformation. To investigate long-term trends of news consumption, we curate a collection of over 60M tweets from politically engaged users over seven years, annotating ~10% with mentions of news outlets and their political leaning. We then train a neural network to forecast the political lean of news articles Twitter users will engage with, considering both past news engagements as well as tweet content. Using the learned representation of this model, we cluster users to discover salient patterns of long-term news engagement. Our findings include the following: (1) hyperpartisan users are more engaged with news; (2) right-leaning users engage with contra-partisan sources more than left-leaning users; (3) topics such as immigration, COVID-19, Islamophobia, and gun control are salient indicators of engagement with low-quality news sources.

Differences in the Toxic Language of Cross-Platform Communities

2024-05-28T05:02:15-07:00

Cross-platform communities are social media communities that have a presence on multiple online platforms. One active community on both Reddit and Discord is dankmemes. Our study aims to examine differences in harmful language usage across different platforms in a community. We scrape 15 communities that are active on both Reddit and Discord. We then identify and compare differences in type and level of toxicity, in the topics of the harmful discourse, in the temporal evolution of toxicity and its attribution to users, and in the moderation strategies of communities across platforms. Our results show that most communities exhibit differences in toxicity depending on the platform. We see that toxicity is rooted in the different subcultures as well as in the way in which the platforms operate and their administrators moderate content. However, we note that in general terms Discord is significantly more toxic than Reddit. We offer a detailed analysis of the topics and types of communities in which this happens and why, which will help moderators and policymakers shape their strategies to mitigate the harm on the Web. In particular, we propose practical and effective strategies that Discord can implement to improve its platform moderation.

280 Characters to Employment: Using Twitter to Quantify Job Vacancies

2024-05-28T05:02:18-07:00

Accurate assessment of workforce needs is critical for designing well-informed economic policy and improving market efficiency. While surveys are the gold standard for estimating when and where workers are needed, they also have important limitations, most notably their substantial costs, dependence on existing and extensive surveying infrastructure, and limited temporal, geographical, and sectorial resolution. Here, we investigate the potential of social media to provide a complementary signal for estimating labor market demand. We introduce a novel statistical approach for extracting information about the location and occupation advertised in job vacancies posted on Twitter. We then construct an aggregate index of labor market demand by occupational class in every major U.S. city from 2015 to 2022, which we evaluate against two sources of official statistics and an index from a large aggregator of online job postings. We find that the newly constructed index is strongly correlated with official statistics and, in some cases, advantageous compared to statistics from job aggregators. Moreover, we demonstrate that our index can robustly improve the prediction of official statistics across occupations and states.

Submodular Optimization beyond Nonnegativity: Adaptive Seed Selection in Incentivized Social Advertising

2024-05-28T05:02:19-07:00

Social advertising, also known as social promotion, is a method of promoting products or ideas through the use of influential individuals, known as "seeds," on online social networks. Advertisers and platforms are the main players in this ecosystem, with platforms selling viral engagements, such as "likes," to advertisers by inserting ads into the feeds of seeds. Seeds are given monetary incentives by the platform in exchange for their participation in the campaign, and when a follower of a seed engages with an ad, the platform receives payment from the advertiser. Specifically, at the beginning of a campaign, the advertiser submits a budget to the platform and this budget can be used for two purposes: recruiting seeds and paying for the viral engagements generated by the seeds. Note that the first part of payment goes to the seeds and the latter one is the actual revenue collected by the platform. The challenge for the platform is to select a group of seeds that will generate the most revenue within the budget constraints set by the advertiser. This problem is challenging as the objective function can be non-monotone and may take on negative values. This makes traditional methods of submodular optimization and influence maximization inapplicable. We study this problem under both non-adaptive and adaptive settings, and propose effective solutions for each scenario.

Climbing the Influence Tiers on TikTok: A Multimodal Study

2024-05-28T05:02:21-07:00

Corporate social media analysts break influencers into five tiers of increasing importance: Nano, Micro, Mid, Macro, and Mega. We perform a comprehensive study of TikTok influencers with two goals: (i) what factors distinguish influencers in each of these tiers from the adjacent tier(s)? (ii) of the features influencers can directly control ("actionable" features), which ones are most impactful to reach the next tier? We build and release a novel TikTok dataset featuring over 230K videos from 5000 influencers - 1000 from each tier. The dataset includes video details such as likes, facial action units, emotions, and music information derived from Spotify. Access to the videos is facilitated through provided URLs and hydration code. To find the most important features that distinguish influencers in a tier from those in the next tier up, we thoroughly analyze traditional features (e.g., profile information) and text, audio, and video features using statistical methods and ablation testing. Our classifiers achieve F1-scores over 80%. The most impactful actionable features are traditional and video features, including enhancing video pleasure, quality, and emphasizing facial expressions. Finally, we collect and release a YouTube Shorts dataset to conduct a comparative analysis, aiming to identify similarities and differences between the two platforms.

Xenophobia Meter: Defining and Measuring Online Sentiment toward Foreigners on Twitter

2024-05-28T05:02:22-07:00

Xenophobia, a form of hatred directed at foreigners, immigrants, and sometimes even people who are just perceived as foreigners, has been flooding social media in recent political climates. In order to capture language related to foreigners and those perceived as foreigners (F&PAF), we present the 7-scale Xenophobia Meter, ranging from anti-F&PAF to pro-F&PAF sentiments, with examples and application rationale. We also publish a dataset of over 7,000 tweets labeled according to this meter, from 11 U.S.-based accounts that are at the forefront of defining the rhetoric related to immigration and policy. We apply a number of models to automatically identify xenophobic and F&PAF-related language. We also present findings from qualitative interviews with human annotators about their labeling experiences. While we find that xenophobia is a complex social phenomenon for both humans and machine learning algorithms to identify, we hope that our work inspires researchers, policymakers, and the public to learn about xenophobia and to make efforts to shift the rhetoric and policies toward allyship, equity and inclusion.

Disappearing without a Trace: Coverage, Community, Quality, and Temporal Dynamics of Wikipedia Articles on Endangered Brazilian Indigenous Languages

2024-05-28T05:02:23-07:00

Nearly half of Brazil's 180 Indigenous languages face extinction within the next 20 years. What's more concerning is that most of these languages lack a single scientific article describing them, which means they could disappear without leaving any documented evidence of their existence. This work investigates the state of articles about those languages in Wikipedia, both in the English and Portuguese versions, regarded here as indicative of the minimum world-level trace of the previous existence of these languages. Our study shows that over 30% of these languages do not have a single Wikipedia article describing them. It also highlights that the Portuguese and English editing communities are not only distinct, but have different practices, achieving similar levels of quality through different temporal dynamics. These results, although encouraging, suggest that any effort to enhance coverage comprehensiveness in both Wikipedias should consider different strategies for engaging each editing community.

Unraveling User Coordination on Telegram: A Comprehensive Analysis of Political Mobilization during the 2022 Brazilian Presidential Election

2024-05-28T05:02:25-07:00

Social media has gained importance as a channel to influence people's behavior and decisions, affecting not only the online world but also real-life (offline) events. This is especially evident in Brazil, where platforms like Telegram have been instrumental in disseminating political content rapidly and widely. However, the potential coordinated use of Telegram for promoting specific political narratives at critical times, such as the 2022 Brazilian elections, remains an area that requires further investigation. This study aims to investigate this phenomenon, focusing on the first and second rounds of voting and the January 8th riots. To this end, we conducted a comprehensive analysis of 620,000 messages from 256 Telegram groups, focusing on the dynamics of message dissemination and user interactions. Using network backbone extraction and text analysis methods, we identified key users who may be orchestrating the distribution of content. Our findings suggest that these individuals play a central role in the network's topology, relaying messages to a broader audience on dominant topics of discussion that reflect Brazil's political landscape during this turbulent period. This study not only highlights the growing influence of messaging apps on political mobilization but also contributes to our understanding of digital communication strategies in modern electoral contexts, emphasizing the need for further research in this field.

Tracking Fringe and Coordinated Activity on Twitter Leading Up to the US Capitol Attack

2024-05-28T05:02:26-07:00

The aftermath of the 2020 US Presidential Election witnessed an unprecedented attack on the democratic values of the country through the violent insurrection at Capitol Hill on January 6th, 2021. The attack was fueled by the proliferation of conspiracy theories and misleading claims about the integrity of the election pushed by political elites and fringe communities on social media. In this study, we explore the evolution of fringe content and conspiracy theories on Twitter in the seven months leading up to the Capitol attack. We examine the suspicious coordinated activity carried out by users sharing fringe content, finding evidence of common adversarial manipulation techniques ranging from targeted amplification to manufactured consensus. Further, we map out the temporal evolution of, and the relationship between, fringe and conspiracy theories, which eventually coalesced into the rhetoric of a stolen election, with the hashtag #stopthesteal, alongside QAnon-related narratives. Our findings further highlight how social media platforms offer fertile ground for the widespread proliferation of conspiracies during and in the aftermath of major societal events, which can potentially lead to offline coordinated actions and organized violence.

Multilingual Serviceability Model for Detecting and Ranking Help Requests on Social Media during Disasters

2024-05-28T05:02:28-07:00

Social media users expect quick and high-quality responses from emergency services when seeking help. However, these organizations face difficulties in detecting and prioritizing critical requests due to the overwhelming amount of information on social media and their limited human resources to tackle it during mass emergencies or disaster events. The situation is exacerbated when users communicate in different or native languages, which can be expected during disasters. While recent studies have focused on characterizing and automatically detecting help requests on social media, they focused on non-behavioral features and monolingual data, primarily in English. Thus, a key gap exists in analyzing multilingual requests on social media for public services. In this paper, we introduce a knowledge distillation framework called MulTMR (Multiple Teachers Model for detecting and Ranking), which combines the power of both task-related and behavior-guided models as diverse teachers for training a student model to efficiently detect serviceable request messages across languages and regions on social media during natural disaster events. We demonstrate that the presented framework can enhance performance (with an AUC improvement of up to 10%) in various scenarios of multilingual test data. Our results, which were validated on real-world data collected in three languages during ten disasters across seven countries, indicate the use of behavior-guided teacher models in MulTMR can increase attention to relevant indicators of serviceability characteristics. The application of the MulTMR framework through a streaming data analytics tool could reduce the cognitive load on personnel within social media teams of emergency services. Further, its application could inform how to leverage human behavior characteristics in creating automated models for social media analytics to design systems in other public service domains beyond emergency management.

Improving Quantification with Minimal In-Domain Annotations: Beyond Classify and Count

2024-05-28T05:02:30-07:00

Quantification is the task of estimating the class distribution in a given collection. With the growing availability of classification models, the use of classifiers for quantification has become increasingly popular, carrying the promise of eliminating the need for manual annotation. However, the naive classify and count approach presents clear limitations, especially evident in the face of domain discrepancies. In this work, we introduce two novel quantification methods, called CPCC and BCC, which can adapt to new target datasets with a small number of annotated in-domain samples (N = 100). To explore their real-world applicability, we apply our methods to a range of quantification tasks in the realm of hateful and offensive language, where they perform markedly better than classify and count and other existing methods.
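
As a minimal illustration of the baseline this abstract argues against, the sketch below contrasts naive classify and count with the classic adjusted-count correction from the quantification literature. It is not an implementation of CPCC or BCC, which the abstract does not specify. The function names, the example predictions, and the true/false positive rates (imagined here as estimated from a small annotated in-domain sample) are all assumptions made for illustration.

```python
def classify_and_count(predictions):
    """Naive estimate: the fraction of items the classifier labels positive (1)."""
    return sum(predictions) / len(predictions)


def adjusted_count(predictions, tpr, fpr):
    """Correct the naive estimate using the classifier's error rates.

    tpr and fpr are assumed to be estimated on a small annotated
    in-domain sample; this is the textbook adjusted-count correction,
    not the CPCC or BCC methods named in the abstract.
    """
    raw = classify_and_count(predictions)
    if tpr == fpr:
        # The classifier carries no signal, so no correction is possible.
        return raw
    estimate = (raw - fpr) / (tpr - fpr)
    # Clip to the valid range for a proportion.
    return min(1.0, max(0.0, estimate))


# Hypothetical target collection: 40 of 100 items predicted positive.
target_predictions = [1] * 40 + [0] * 60
naive = classify_and_count(target_predictions)
corrected = adjusted_count(target_predictions, tpr=0.8, fpr=0.2)
print(naive, corrected)
```

With these hypothetical error rates, the naive estimate of 0.4 is revised downward to roughly one third, which shows how an imperfect classifier can bias the raw count when the target domain differs from the training domain.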

Leveraging Psychiatric Scale for Suicide Risk Detection on Social Media

2024-05-28T05:02:31-07:00

The objective of suicide risk detection on social media is to identify individuals who may attempt suicide and determine their suicide risk level based on their online behavior. Although data-driven learning models have been used to predict suicide risk levels, these models often lack theoretical support and explanation from psychiatric research. To address this issue, we propose the incorporation of professional psychiatric scales into research to provide theoretical support and explanations for our model. Our proposed Scale-based Neural Network (SNN) architecture aims to extract content associated with scales from the posting history of social media users to predict their suicide risk level. Additionally, our approach provides scale-based explanations for the model's predictions. Experimental results demonstrate that our proposed method outperforms several strong baseline methods and highlights the potential of combining psychiatric scales and computational techniques to improve suicide risk detection.

Making Online Communities ‘Better’: A Taxonomy of Community Values on Reddit

2024-05-28T05:02:33-07:00

Many researchers studying online communities seek to make them better. However, beyond a small set of widely-held values, such as combating misinformation and abuse, determining what ‘better’ means can be challenging, as community members may disagree, values may be in conflict, and different communities may have differing preferences as a whole. In this work, we present the first study that elicits values directly from members across a diverse set of communities. We survey 212 members of 627 unique subreddits and ask them to describe their values for their communities in their own words. Through iterative categorization of 1,481 responses, we develop and validate a comprehensive taxonomy of community values, consisting of 29 subcategories within nine top-level categories, enabling principled, quantitative study of community values by researchers. Using our taxonomy, we reframe existing research problems, such as managing influxes of new members, as tensions between different values, and we identify understudied values, such as those regarding content quality and community size. We call for greater attention to vulnerable community members' values, and we make our codebook public for use in future research.

Calibrate-Extrapolate: Rethinking Prevalence Estimation with Black Box Classifiers

2024-05-28T05:02:35-07:00

In computational social science, researchers often use a pre-trained, black box classifier to estimate the frequency of each class in unlabeled datasets. A variety of prevalence estimation techniques have been developed in the literature, each yielding an unbiased estimate if certain stability assumptions hold. This work introduces a framework to rethink the prevalence estimation process as calibrating the classifier outputs against ground truth labels to obtain the joint distribution of a base dataset and then extrapolating to the joint distribution of a target dataset. We call this framework "Calibrate-Extrapolate". It clarifies what stability assumptions must hold for a prevalence estimation technique to yield accurate estimates. In the calibration phase, the techniques assume only a stable calibration curve between a calibration dataset and the full base dataset. This allows for the classifier outputs to be used for disproportionate random sampling, thus improving the efficiency of calibration. In the extrapolation phase, some techniques assume a stable calibration curve while some assume stable class-conditional densities. We discuss the stability assumptions from a causal perspective. By specifying base and target joint distributions, we can generate simulated datasets, as a way to build intuitions about the impacts of assumption violations. This also leads to a better understanding of how the classifier's predictive power affects the accuracy of prevalence estimates: the greater the predictive power, the lower the sensitivity to violations of stability assumptions in the extrapolation phase. We illustrate the framework with an application that estimates the prevalence of toxic comments on news topics over time on Reddit, Twitter/X, and YouTube, using Jigsaw's Perspective API as a black box classifier. Finally, we summarize several pieces of practical advice for prevalence estimation.
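
The two phases described in this abstract can be sketched roughly as follows, under the stable-calibration-curve assumption only. This is an illustrative reading of the abstract, not the authors' implementation: the equal-width score binning, the function names, and the toy scores and labels are all assumptions.

```python
from collections import defaultdict


def score_bin(score, n_bins):
    """Map a classifier score in [0, 1] to an equal-width bin index."""
    return min(int(score * n_bins), n_bins - 1)


def fit_calibration_curve(scores, labels, n_bins=2):
    """Calibration phase: estimate P(label = 1 | score bin) from labeled data."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for score, label in zip(scores, labels):
        b = score_bin(score, n_bins)
        totals[b] += 1
        positives[b] += label
    return {b: positives[b] / totals[b] for b in totals}


def extrapolate_prevalence(curve, target_scores, n_bins=2):
    """Extrapolation phase: average calibrated probabilities over the target set.

    Only meaningful if the calibration curve is stable between base and
    target datasets. Raises KeyError if a target score falls in a bin
    that was never observed during calibration.
    """
    probs = [curve[score_bin(s, n_bins)] for s in target_scores]
    return sum(probs) / len(probs)


# Hypothetical labeled calibration sample (scores from a black box classifier).
calib_scores = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
calib_labels = [0, 0, 1, 1, 1, 0]
curve = fit_calibration_curve(calib_scores, calib_labels)

# Hypothetical unlabeled target scores.
prevalence = extrapolate_prevalence(curve, [0.2, 0.6, 0.9, 0.95])
print(curve, prevalence)
```

Techniques that instead assume stable class-conditional densities would replace the extrapolation step with a different estimator; the sketch covers only the calibration-curve variant.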

Morality in the Mundane: Categorizing Moral Reasoning in Real-Life Social Situations

2024-05-28T05:02:36-07:00

Moral reasoning reflects how people acquire and apply moral rules in particular situations. With social interactions increasingly happening online, social media provides an unprecedented opportunity to assess in-the-wild moral reasoning. We investigate the commonsense aspects of morality empirically using data from a Reddit subcommunity (i.e., a subreddit), r/AmITheAsshole, where an author describes their behavior in a situation and seeks comments about whether that behavior was appropriate. A commenter judges and provides reasons for whether an author’s or others’ behaviors were wrong. We focus on the novel problem of understanding the moral reasoning implicit in user comments about the propriety of an author’s behavior. Specifically, we explore associations between the common elements of the indicated rationale and the extractable social factors. Our results suggest that a moral response depends on the author’s gender and the topic of a post. Typical situations and behaviors include expressing anger and using sensible words (e.g., f-ck, hell, and damn) in work-related situations. Moreover, we find that commonly expressed reasons also depend on commenters’ interests.

Landscape of Large Language Models in Global English News: Topics, Sentiments, and Spatiotemporal Analysis

2024-05-28T05:02:38-07:00

Generative AI has exhibited considerable potential to transform various industries and public life. The role of news media coverage of generative AI is pivotal in shaping public perceptions and judgments about this significant technological innovation. This paper provides in-depth analysis and rich insights into the temporal and spatial distribution of topics, sentiment, and substantive themes within global news coverage focusing on the latest emerging technology—generative AI. We collected a comprehensive dataset of English news articles (January 2018 to November 2023, N = 24,827) through ProQuest databases. For topic modeling, we employed the BERTopic technique and combined it with qualitative coding to identify semantic themes. Subsequently, sentiment analysis was conducted using the RoBERTa-base model. Analysis of temporal patterns in the data reveals notable variability in coverage across key topics—business, corporate technological development, regulation and security, and education—with spikes in articles coinciding with major AI developments and policy discussions. Sentiment analysis shows a predominantly neutral to positive media stance, with the business-related articles exhibiting more positive sentiment, while regulation and security articles receive a reserved, neutral to negative sentiment. Our study offers a valuable framework to investigate global news discourse and evaluate news attitudes and themes related to emerging technologies.

On the Role of Large Language Models in Crowdsourcing Misinformation Assessment

2024-05-28T05:02:40-07:00

The proliferation of online misinformation significantly undermines the credibility of web content. Recently, crowd workers have been successfully employed to assess misinformation to address the limited scalability of professional fact-checkers. An alternative approach to crowdsourcing is the use of large language models (LLMs). These models, however, are also imperfect. In this paper, we investigate the scenario of crowd workers working in collaboration with LLMs to assess misinformation. We perform a study where we ask crowd workers to judge the truthfulness of statements under different conditions: with and without LLM labels and explanations. Our results show that crowd workers tend to overestimate truthfulness when exposed to LLM-generated information. Crowd workers are misled by wrong LLM labels, but, on the other hand, their self-reported confidence is lower when they make mistakes due to relying on the LLM. We also observe diverse behaviors among crowd workers when the LLM is presented, indicating that leveraging LLMs can be considered a distinct working strategy.

From Posts to Pavement, or Vice Versa? The Dynamic Interplay between Online Activism and Offline Confrontations

2024-05-28T05:02:41-07:00

This study examines the relationship between social media discourse and offline confrontations in social movements, focusing on the "Black Lives Matter" (BLM) protests following George Floyd's death in 2020. While social media's role in facilitating social movements is well-documented, its relationship with offline confrontations remains understudied. To bridge this gap, we curated a dataset comprising 108,443 Facebook posts and 1,406 offline BLM protest events. Our analysis categorized online media framing into "consonance" (alignment) and "dissonance" (misalignment) with the perspectives of different involved parties. Our findings indicate a reciprocal relationship between online activism support and offline confrontational occurrences. Online support for BLM, in particular, was associated with less property damage and fewer confrontational protests in the days that followed. Conversely, offline confrontations amplified online support for the protesters. By illuminating this dynamic, we highlight the multifaceted influence of social media on social movements. Not only does it serve as a platform for information dissemination and mobilization, but it also plays a pivotal role in shaping public discourse about offline confrontations.

TeC: A Novel Method for Text Clustering with Large Language Models Guidance and Weakly-Supervised Contrastive Learning

2024-05-28T05:02:42-07:00

Text clustering has become an important branch in unsupervised learning methods and has been widely used in social media. Recently, Large Language Models (LLMs) represent a significant advancement in the field of AI. Therefore, some works have been dedicated to improving the clustering performance of embedding models with feedback from LLMs. However, current approaches hardly take into consideration the cluster label information between text instances when fine-tuning embedding models, leading to the problem of cluster collision. To tackle this issue, this paper proposes TeC, a novel method operating through teaching and correcting phases. In these phases, LLMs take on the role of teachers, guiding embedding models as students to enhance their clustering performance. The teaching phase imparts guidance on cluster label information to embedding models by querying LLMs in a batch-wise manner and utilizes a proposed weakly-supervised contrastive learning loss to fine-tune embedding models based on the provided cluster label information. Subsequently, the correcting phase refines clustering outcomes obtained by the teaching phase by instructing LLMs to correct cluster assignments of low-confidence samples. The extensive experimental evaluation of six text datasets across three different clustering tasks shows the superior performance of our proposed method over existing state-of-the-art approaches.

Detection and Categorization of Needs during Crises Based on Twitter Data

2024-05-28T05:02:44-07:00

The Ukraine-Russia conflict has brought sizable detrimental impact to the global energy, food, finance, and manufacturing industries, as well as to many affected people. In this paper, we use Twitter (now X) to automatically identify who needs what from text data and how the types of needs that we categorized and standardized evolved throughout this conflict. Our findings suggest that Ukraine expresses a need for weapons, Russia for land, Europe for gas, and America for leadership. The majority of needs expressed on Twitter during this conflict are related to the categories transportation, military, health & medical, financial and money, energy, and essential items (food, water, shelter, non-food items). Stated needs changed as the conflict escalated or fell into stalemate. Needs also varied depending on the tweet's location, with tweets from Ukraine's neighboring countries being related to food and medicine, while tweets from non-neighboring countries stated needs for clothing and tents. Tweets written in Ukrainian and Russian shared similar need terms, such as medicines and kits, compared to English tweets, which expressed needs such as ammunition and humanitarian aid. Our comparison of needs across four different disaster events, namely this conflict, an earthquake, a major hurricane, and the COVID-19 pandemic, showed how needs differ depending on the nature of the crisis and how domain-adjustment of needs categories is necessary. We contribute to the crisis informatics literature by (1) validating a methodology for using tweets to study the demand and supply of things that different stakeholders need during crisis events and (2) testing, comparing, and improving the fit of widely used need classification schemas for studying crises from different domains.

Evaluating and Improving Value Judgments in AI: A Scenario-Based Study on Large Language Models’ Depiction of Social Conventions

2024-05-28T05:02:45-07:00

The adoption of generative AI technologies is swiftly expanding. Services employing both linguistic and multimodal models are evolving, offering users increasingly precise responses. Consequently, human reliance on these technologies is expected to grow rapidly. With the premise that people will be impacted by the output of AI, we explored approaches to help AI produce better output. Initially, we evaluated how contemporary AI services competitively meet user needs, then examined society's depiction as mirrored by Large Language Models (LLMs). We conducted a query experiment, asking about social conventions in various countries and eliciting one-word responses. We compared the LLMs' value judgments with public data and suggested a model of decision-making in value-conflicting scenarios which could be adopted for future machine value judgments. This paper advocates for a practical approach to using AI as a tool for investigating other remote worlds. This research has significance in implicitly rejecting the notion of AI making value judgments and instead arguing for a more critical perspective on the environment that defers judgemental capabilities to individuals. We anticipate this study will empower anyone, regardless of their capacity, to receive safe and accurate value judgment-based outputs effectively.

Hate Cannot Drive Out Hate: Forecasting Conversation Incivility following Replies to Hate Speech

2024-05-28T05:02:47-07:00

User-generated counter hate speech is a promising means to combat hate speech, but questions about whether it can stop incivility in follow-up conversations linger. We argue that effective counter hate speech stops incivility from emerging in follow-up conversations; counter hate that elicits more incivility is counterproductive. This study introduces the task of predicting the incivility of conversations following replies to hate speech. We first propose a metric to measure conversation incivility based on the number of civil and uncivil comments as well as the unique authors involved in the discourse. Our metric approximates human judgments more accurately than previous metrics. We then use the metric to evaluate the outcomes of replies to hate speech. A linguistic analysis uncovers the differences in the language of replies that elicit follow-up conversations with high and low incivility. Experimental results show that forecasting incivility is challenging. We close with a qualitative analysis shedding light on the most common errors made by the best model.

Influencer Marketing Augmented Personalized Assortment Planning: A Two-Stage Optimization Problem

2024-05-28T05:02:49-07:00

Assortment optimization presents a significant challenge for online retail platforms. Its primary objective is to create an optimal selection of products from a vast array of substitutes, which will be displayed to customers with the aim of maximizing expected revenue. The purchase behavior of customers is typically influenced by a choice model that determines the probability of purchasing each product from a given assortment. This paper extends traditional assortment optimization by introducing the integration of influencer marketing, a practice that involves enlisting influencers to promote products and enhance their appeal to customers. While conventional assortment optimization assumes fixed product attractiveness, our model enables platforms to strategically enhance the attractiveness of selected products through influencer marketing, thereby increasing revenue potential. Consequently, we present a novel problem formulation encompassing assortment and influencer marketing planning. Leveraging recent advancements in submodular optimization, we develop effective and efficient solutions for this joint optimization problem.

DoubleH: Twitter User Stance Detection via Bipartite Graph Neural Networks

2024-05-28T05:02:51-07:00

Given the development and abundance of social media, studying the stance of social media users is a challenging and pressing issue. Social media users express their stance by posting tweets and retweeting. Therefore, the homogeneous relationship between users and the heterogeneous relationship between users and tweets are relevant for the stance detection task. Recently, graph neural networks (GNNs) have developed rapidly and have been applied to social media research. In this paper, we crawl a large-scale dataset of the 2020 US presidential election and automatically label all users using manually tagged hashtags. Subsequently, we propose a bipartite graph neural network model, DoubleH, which aims to better utilize homogeneous and heterogeneous information in user stance detection tasks. Specifically, we first construct a bipartite graph based on posting and retweeting relations for two kinds of nodes, namely users and tweets. We then iteratively update each node's representation by extracting and separately processing heterogeneous and homogeneous information in the node's neighbors. Finally, the representations of user nodes are used for user stance classification. Experimental results show that DoubleH outperforms the state-of-the-art methods on popular benchmarks. Further analysis illustrates the model's utilization of information and demonstrates stability and efficiency at different numbers of layers.

Node Attribute Prediction with Weighted and Directed Edges on Single and Multilayer Networks

2024-05-28T05:02:52-07:00

With the rapid development of digital platforms, users can now interact in endless ways from writing business reviews and comments to sharing information with their friends and followers. As a result, organizations have numerous digital social networks available for graph learning problems with little guidance on how to select the right graph or how to combine multiple edge types. For example, while user-to-user interactions are directed in nature, many graph learning approaches use the undirected version of the network. In this paper, we introduce edge direction, edge weight, and multi-relational data for node prediction tasks. We first adapt an existing node attribute prediction method for binary prediction, LINK-Naive Bayes, to account for both edge direction and weights on single-layer networks. We compare predictive performance metrics across various node attribute prediction tasks for an ads click prediction task on Facebook and for a publicly available dataset from the Open Graph Benchmark (OGB). We observe meaningful predictive performance improvements when incorporating edge direction and weight, and performance that's competitive with the OGB Leaderboard. We then introduce an approach called MultiLayerLINK-NaiveBayes that can combine multiple network layers during training and observe superior performance over the single-layer results. Ultimately, whether edge direction, edge weights, and multi-layers are practically useful will depend on the particular setting. Our approach enables practitioners to quickly combine multiple layers and edge types.

ProtoRectifier: A Prototype Rectification Framework for Efficient Cross-Domain Text Classification with Limited Labeled Samples

2024-05-28T05:02:54-07:00

During the past few years, with the advent of large-scale pre-trained language models (PLMs), there has been a significant advancement in cross-domain text classification with limited labeled samples. However, most existing approaches still face the problem of excessive computation overhead. While some non-pretrained language models can reduce the computation overhead, the performance could sharply drop off. To resolve few-shot learning problems on resource-limited devices with satisfactory performance, we propose a prototype rectification framework, ProtoRectifier, based on pre-trained model distillation and episodic meta-learning strategy. Specifically, a representation refactor based on DistilBERT is developed to mine text semantics. Meanwhile, a novel prototype rectification approach (i.e., Mean Shift Rectification) is put forward by making full use of the pseudo labeled query samples, so that the prototype of each category can be updated during the meta-training phase without introducing additional time overhead. Experiments on multiple real-world datasets demonstrate that ProtoRectifier outperforms the state-of-the-art baselines, not only achieving high cross-domain classification accuracy but also reducing the computation overhead significantly.

Discovering Collective Narratives Shifts in Online Discussions

2024-05-28T05:02:58-07:00

Narratives are the foundation of human cognition and decision making. Because narratives play a crucial role in societal discourses and the spread of misinformation, and because of the pervasive use of social media, the narrative dynamics on social media can have profound societal impact. Yet, a systematic and computational understanding of online narratives faces the critical challenge of scale and dynamics: how can we reliably and automatically extract narratives from massive amounts of text? How do narratives emerge, spread, and die? Here, we propose a systematic narrative discovery framework that fills this gap by combining change point detection, semantic role labeling (SRL), and automatic aggregation of narrative fragments into narrative networks. We evaluate our model with synthetic and empirical data: two Twitter corpora about COVID-19 and the 2017 French Election. Results demonstrate that our approach can recover major narrative shifts that correspond to major events.

Characterizing Fake News Targeting Corporations

2024-05-28T05:03:00-07:00

Misinformation proliferates in the online sphere, with evident impacts on the political and social realms, influencing democratic discourse and posing risks to public health and safety. The corporate world is also a prime target for fake news dissemination. While recent studies have attempted to characterize corporate misinformation and its effects on companies, their findings often suffer from limitations due to qualitative or narrative approaches and a narrow focus on specific industries. To address this gap, we conducted an analysis utilizing social media quantitative methods and crowd-sourcing studies to investigate corporate misinformation across a diverse array of industries within the S&P 500 companies. Our study reveals that corporate misinformation encompasses topics such as products, politics, and societal issues. We discovered companies affected by fake news also get reputable news coverage but less social media attention, leading to heightened negativity in social media comments, diminished stock growth, and increased stress mentions among employee reviews. Additionally, we observe that a company is not targeted by fake news all the time, but there are particular times when a critical mass of fake news emerges. These findings hold significant implications for regulators, business leaders, and investors, emphasizing the necessity to vigilantly monitor the escalating phenomenon of corporate misinformation.

Emoji Promotes Developer Participation and Issue Resolution on GitHub

2024-05-28T05:03:02-07:00

Although remote working has been increasingly adopted during the pandemic, many are concerned by the low efficiency of remote working. Missing in text-based communication are non-verbal cues such as facial expressions and body language, which hinders effective communication and negatively impacts work outcomes. Prevalent on social media platforms, emojis, as alternative non-verbal cues, are gaining popularity in virtual workspaces as well. In this paper, we study how emoji usage influences developer participation and issue resolution in virtual workspaces. To this end, we collect GitHub issues for a one-year period and apply causal inference techniques to measure the causal effect of emojis on the outcome of issues, controlling for confounders such as issue content, repository, and author information. We find that emojis can significantly reduce the resolution time of issues and attract more user participation. We also compare the heterogeneous effect on different types of issues. These findings deepen our understanding of the developer communities, and they provide design implications on how to facilitate interactions and broaden developer participation.

A Study of Partisan News Sharing in the Russian Invasion of Ukraine

2024-05-28T05:03:03-07:00

Since the Russian invasion of Ukraine, a large volume of biased and partisan news has been spread via social media platforms. As this may lead to wider societal issues, we argue that understanding how partisan news sharing impacts users' communication is crucial for better governance of online communities. In this paper, we perform a measurement study of partisan news sharing. We aim to characterize the role of such sharing in influencing users' communications. Our analysis covers an eight-month dataset across six Reddit communities related to the Russian invasion. We first perform an analysis of the temporal evolution of partisan news sharing. We confirm that the invasion stimulates discussion in the observed communities, accompanied by an increased volume of partisan news sharing. Next, we characterize users' response to such sharing. We observe that partisan bias plays a role in narrowing its propagation. More biased media is less likely to be spread across multiple subreddits. However, we find that partisan news sharing attracts more users to engage in the discussion, by generating more comments. We then built a predictive model to identify users likely to spread partisan news. The prediction is challenging though, with 61.57% accuracy on average. Our centrality analysis on the commenting network further indicates that the users who disseminate partisan news possess lower network influence in comparison to those who propagate neutral news.

Understanding and Improving Content Moderation in Web3 Platforms

2024-05-28T05:03:05-07:00

There have been numerous recent attempts to “decentralize” social media platforms, loosely referred to as Web3. Such ideas, often underpinned by blockchain solutions, offer decentralized equivalents of well-known services (e.g., forums, social networks, video sharing sites, microblogs). One particularly challenging function to implement in such a design is content moderation, due to the lack of central control. Consequently, they often rely on user-controlled moderation, whereby each user must create their own personal block list to filter out content they do not wish to see. This paper presents a first study of user-controlled moderation on one exemplar Web3 social microblogging platform called memo.cash. Based on a dataset covering 391K posts, we study the factors that lead users to “mute” each other. We find that the most crucial factor is the platform action count, rather than the presence of things like hate speech. We also show that the followership network plays a pivotal role in determining their visibility on the platform, further influencing their muting behavior. This leads us to design tooling to automate the muting process on a per-user basis. We model this as a recommendation problem, and experiment with a number of state-of-the-art recommender engines. We show that our system can generate effective personalized mute lists for users.

TweetIntent@Crisis: A Dataset Revealing Narratives of Both Sides in the Russia-Ukraine Crisis

2024-05-28T05:03:08-07:00

This paper introduces TweetIntent@Crisis, a novel Twitter dataset centered on the Russia-Ukraine crisis. Comprising over 17K tweets from government-affiliated accounts of both nations, the dataset is meticulously annotated to identify underlying intents and detailed intent-related information. Our analysis demonstrates the dataset's capability in revealing fine-grained intents and nuanced narratives within the tweets from both parties involved in the crisis. We aim for TweetIntent@Crisis to provide the research community with a valuable tool for understanding and analyzing granular media narratives and their impact in this geopolitical conflict.

The LGBTQ+ Minority Stress on Social Media (MiSSoM) Dataset: A Labeled Dataset for Natural Language Processing and Machine Learning

2024-05-28T05:03:10-07:00

Minority stress is the leading theoretical construct for understanding LGBTQ+ health disparities. As such, there is an urgent need to develop innovative policies and technologies to reduce minority stress. To spur technological innovation, we created the largest labeled datasets on minority stress using natural language from subreddits related to sexual and gender minority people. A team of mental health clinicians, LGBTQ+ health experts, and computer scientists developed two datasets: (1) the publicly available LGBTQ+ Minority Stress on Social Media (MiSSoM) dataset and (2) the advanced request-only version of the dataset, LGBTQ+ MiSSoM+. Both datasets have seven labels related to minority stress, including an overall composite label and six sublabels. LGBTQ+ MiSSoM (N = 27,709) includes both human- and machine-annotated labels and comes preprocessed with features (e.g., topic models, psycholinguistic attributes, sentiment, clinical keywords, word embeddings, n-grams, lexicons). LGBTQ+ MiSSoM+ includes all the characteristics of the open-access dataset, but also includes the original Reddit text and sentence-level labeling for a subset of posts (N = 5,772). Benchmark supervised machine learning analyses revealed that features of the LGBTQ+ MiSSoM datasets can predict overall minority stress quite well (F1 = 0.869). Benchmark performance in predicting the other labels, namely prejudiced events (F1 = 0.942), expected rejection (F1 = 0.964), internalized stigma (F1 = 0.952), identity concealment (F1 = 0.971), gender dysphoria (F1 = 0.947), and minority coping (F1 = 0.917), was excellent. Descriptive analyses, ethical considerations, limitations, and possible use cases are provided.

IsamasRed: A Public Dataset Tracking Reddit Discussions on Israel-Hamas Conflict

2024-05-28T05:03:12-07:00

The conflict between Israel and Palestinians significantly escalated after the October 7, 2023 Hamas attack, capturing global attention. To understand the public discourse on this conflict, we present a meticulously compiled dataset, IsamasRed, comprising nearly 400,000 conversations and over 8 million comments from Reddit, spanning from August 2023 to November 2023. We introduce an innovative keyword extraction framework leveraging a large language model to effectively identify pertinent keywords, ensuring a comprehensive data collection. Our initial analysis of the dataset, examining topics, controversy, and emotional and moral language trends over time, highlights the emotionally charged and complex nature of the discourse. This dataset aims to enrich the understanding of online discussions, shedding light on the complex interplay between ideology, sentiment, and community engagement in digital spaces.

A Multilingual Similarity Dataset for News Article Frame

2024-05-28T05:03:13-07:00

Understanding the writing frame of news articles is vital for addressing social issues, and thus has attracted notable attention in the field of communication studies. Yet, assessing such news article frames remains a challenge due to the absence of a concrete and unified standard dataset that considers the comprehensive nuances within news content. To address this gap, we introduce an extended version of a large labeled news article dataset with 16,687 new labeled pairs. Leveraging the pairwise comparison of news articles, our method removes the need for manual identification of frame classes in traditional news frame analysis studies. Overall, we introduce the most extensive cross-lingual news article similarity dataset available to date, with 26,555 labeled news article pairs across 10 languages. Each data point has been meticulously annotated according to a codebook detailing eight critical aspects of news content, under a human-in-the-loop framework. Application examples demonstrate its potential in unearthing country communities within global news coverage, exposing media bias among news outlets, and quantifying the factors related to news creation. We envision that this news similarity dataset will broaden our understanding of the media ecosystem in terms of news coverage of events and perspectives across countries, locations, languages, and other social constructs. By doing so, it can catalyze advancements in social science research and applied methodologies, thereby exerting a profound impact on our society.

Language-Agnostic Modeling of Wikipedia Articles for Content Quality Assessment across Languages

2024-05-28T05:03:15-07:00

Wikipedia is the largest web repository of free knowledge. Volunteer editors devote time and effort to creating and expanding articles in more than 300 language editions. As content quality varies from article to article, editors also spend substantial time rating articles with specific criteria. However, keeping these assessments complete and up-to-date is largely impossible given the ever-changing nature of Wikipedia. To overcome this limitation, we propose a novel computational framework for modeling the quality of Wikipedia articles. State-of-the-art approaches to model Wikipedia article quality have leveraged machine learning techniques with language-specific features. In contrast, our framework is based on language-agnostic structural features extracted from the articles, a set of universal weights, and a language version-specific normalization criterion. Therefore, we ensure that all language editions of Wikipedia can benefit from our framework, even those that do not have their own quality assessment scheme. Using this framework, we have built datasets with the feature values and quality scores of all revisions of all articles in the existing language versions of Wikipedia. We provide a descriptive analysis of these resources and a benchmark of our framework. In addition, we discuss possible downstream tasks to be addressed with these datasets, which are released for public use.

AiGen-FoodReview: A Multimodal Dataset of Machine-Generated Restaurant Reviews and Images on Social Media

2024-05-28T05:03:16-07:00

Online reviews in the form of user-generated content (UGC) significantly impact consumer decision-making. However, the pervasive issue of not only human fake content but also machine-generated content challenges UGC's reliability. Recent advances in Large Language Models (LLMs) may pave the way to fabricate indistinguishable fake generated content at a much lower cost. Leveraging OpenAI's GPT-4-Turbo and DALL-E-2 models, we craft AiGen-FoodReview, a multimodal dataset of 20,144 restaurant review-image pairs divided into authentic and machine-generated. We explore unimodal and multimodal detection models, achieving 99.80% multimodal accuracy with FLAVA. We use attributes from readability and photographic theories to score reviews and images, respectively, demonstrating their utility as handcrafted features in scalable and interpretable detection models with comparable performance. This paper contributes by open-sourcing the dataset and releasing fake review detectors, recommending its use in unimodal and multimodal fake review detection tasks, and evaluating linguistic and visual features in synthetic versus authentic data.

GMHP7k: A Corpus of German Misogynistic Hatespeech Posts

2024-05-28T05:03:18-07:00

We provide a German corpus consisting of 7,061 posts authored by users of social media platforms. A group of volunteers annotated each post according to hatespeech and misogynistic/misogynous hatespeech in a binary fashion. The interrater reliability over all annotators according to Fleiss’ Kappa is 0.6409 for hatespeech and 0.8258 for misogynistic hatespeech. Furthermore, baseline measurements with machine learning based text classification with BERT are presented. Initial experiments with the corpus achieve macro average F1-scores up to 0.79 for hatespeech and 0.75 for misogynistic hatespeech. The dataset of the corpus on German Misogynistic Hatespeech Posts (GMHP7k) is publicly available.
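
For readers unfamiliar with the interrater statistic quoted above, the following is a minimal sketch of the standard Fleiss' kappa computation. It is not the authors' code; the function name `fleiss_kappa` and the input layout (one row per post, one column per label, each cell holding the number of annotators who chose that label) are assumptions made for illustration, and the sketch assumes every post was rated by the same number of annotators.

```python
from typing import Sequence


def fleiss_kappa(counts: Sequence[Sequence[int]]) -> float:
    """Fleiss' kappa for a table where counts[i][j] is the number of
    raters who assigned item i to category j (hypothetical layout)."""
    n_items = len(counts)
    n_raters = sum(counts[0])
    if n_items == 0 or n_raters < 2:
        raise ValueError("need at least one item and two raters per item")
    if any(sum(row) != n_raters for row in counts):
        raise ValueError("every item must have the same number of raters")

    n_categories = len(counts[0])

    # Share of all assignments that went to each category.
    p_cat = [
        sum(row[j] for row in counts) / (n_items * n_raters)
        for j in range(n_categories)
    ]

    # Observed agreement per item: fraction of rater pairs that agree.
    p_item = [
        (sum(c * c for c in row) - n_raters) / (n_raters * (n_raters - 1))
        for row in counts
    ]

    p_observed = sum(p_item) / n_items
    p_expected = sum(p * p for p in p_cat)

    if p_expected == 1.0:
        # All raters used a single category; kappa is undefined, so we
        # follow a common convention and report perfect agreement.
        return 1.0
    return (p_observed - p_expected) / (1.0 - p_expected)
```

With binary labels, as in the annotation scheme described above, each row would have two columns, for example `[3, 0]` when three annotators all chose the first label.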

NELA-PS: A Dataset of Pink Slime News Articles for the Study of Local News Ecosystems

2024-05-28T05:03:20-07:00

Pink slime news outlets automatically produce low-quality, often partisan content that is framed as authentic local news. Given that local news is trusted by Americans and is increasingly shutting down due to financial distress, pink slime news outlets have the potential to exploit local information voids. Yet, there are gaps in the understanding of pink slime production practices and tactics, particularly over time. Hence, to support future research in this area, we built a dataset of over 7.9M articles from 1093 pink slime sources over 2.5 years. This dataset is publicly available at https://doi.org/10.7910/DVN/YHWTFC.

Put Your Money Where Your Mouth Is: Dataset and Analysis of Real World Habit Building Attempts

2024-05-28T05:03:22-07:00

The pursuit of habit building is challenging, and most people struggle with it. Research on successful habit formation is mainly based on small human trials focusing on the same habit for all the participants, as conducting long-term heterogeneous habit studies can be logistically expensive. With the advent of self-help, there has been an increase in online communities and applications that are centered around habit building and logging. Habit building applications can provide large-scale data on real-world habit building attempts and unveil the commonalities among successful ones. We collect public data on stickk.com, which allows users to track progress on habit building attempts called commitments. A commitment can have an external referee, regular check-ins about the progress, and a monetary stake in case of failure. Our data consists of 742,923 users and 397,456 commitments. In addition to the dataset, rooted in theories like Fresh Start Effect, Accountability, and Loss Aversion, we ask questions about how commitment properties like start date, external accountability, monetary stake, and pursuing multiple habits together affect the odds of success. We found that people tend to start habits on temporal landmarks, but that does not affect the probability of their success. Practices like accountability and stakes are not often used but are strong determinants of success. Commitments of 6 to 8 weeks in length, weekly reporting with an external referee, and a monetary amount at stake tend to be most successful. Finally, around 40% of all commitments are attempted simultaneously with other goals. Simultaneous attempts of pursuing commitments may fail early, but if pursued through the initial phase, they are statistically more successful than building one habit at a time.

Online News Coverage of Critical Race Theory Controversies: A Dataset of Annotated Headlines

2024-05-28T05:03:23-07:00

In this paper, we introduce an annotated dataset of 11,704 unique U.S. news headlines related to critical race theory and its controversies from August 2020 through December 2022. Annotations generated by GPT-4 specify the headline stance and the primary actor in the headline. GPT-4 annotations performed well on the validation dataset, with weighted average F-scores of 0.8339 for headline stance annotations and 0.7625 for primary actor annotations. Along with the annotated headlines and URLs to the full article, we augment the dataset with metrics that are relevant to future research on political polarization, news frame analysis, and regional news coverage. The dataset includes partisan audience bias scores by news source domain, tags for mentions of U.S. states in the article body, and exposure and engagement metrics for articles shared on Reddit. Among other preliminary descriptive analyses, we find that the most prevalent headline stance in our headlines dataset is anti-CRT (43.06%), and the most prevalent primary actor in our headlines dataset is political influencers (56.56%). This paper describes the data collection methodology, preliminary descriptive analysis, and possible uses of the dataset for future research in political science, computational social sciences, and natural language processing. Our dataset and replication code are available on Zenodo at zenodo.org/doi/10.5281/zenodo.10516190

The Koo Dataset: An Indian Microblogging Platform with Global Ambitions

2024-05-28T05:03:26-07:00

Increasingly, alternative platforms are playing a key role in the social media ecosystem. Koo, a microblogging platform based in India, has emerged as a major new social network hosting high profile politicians from several countries (India, Brazil, Nigeria) and many internationally renowned celebrities. This paper presents the largest publicly available Koo dataset, spanning from the platform’s founding in early 2020 to September 2023, providing detailed metadata for 72M posts, 75M comments, 40M shares, 284M likes and 1.4M user profiles. Along with the release of the dataset, we provide an overview of the platform including a discussion of the news ecosystem on the platform, hashtag usage, and user engagement. Our results highlight the pivotal role that new platforms play in shaping online communities in emerging economies and the Global South, connecting local politicians and public figures with their followers. With Koo’s ambition to become the town hall for diverse non-English speaking communities, our dataset offers new opportunities for studying social media beyond a Western context.

Sentibank: A Unified Resource of Sentiment Lexicons and Dictionaries

2024-05-28T05:03:28-07:00

Sentiment analysis is critical across computational social science domains, but faces challenges in interpretability. Rule-based methods relying on expert lexicons enable transparency, yet applying them is hindered by resource fragmentation and lack of validation. This paper introduces sentibank, a large-scale unified database consolidating 15 original sentiment dictionaries and 43 preprocessed dictionaries, spanning 7 genres and 6 domains.

iDRAMA-Scored-2024: A Dataset of the Scored Social Media Platform from 2020 to 2023

2024-05-28T05:03:29-07:00

Online web communities often face bans for violating platform policies, encouraging their migration to alternative platforms. This migration, however, can result in increased toxicity and unforeseen consequences on the new platform. In recent years, researchers have collected data from many alternative platforms, indicating coordinated efforts leading to offline events, conspiracy movements, hate speech propagation, and harassment. Thus, it becomes crucial to characterize and understand these alternative platforms. To advance research in this direction, we collect and release a large-scale dataset from Scored -- an alternative Reddit platform that sheltered banned fringe communities, for example, c/TheDonald (a prominent right-wing community) and c/GreatAwakening (a conspiratorial community). Over four years, we collected approximately 57M posts from Scored, with at least 58 communities identified as migrating from Reddit and over 950 communities created since the platform's inception. Furthermore, we provide sentence embeddings of all posts in our dataset, generated through a state-of-the-art model, to further advance the field in characterizing the discussions within these communities. We aim to provide these resources to facilitate researchers' investigations without the need for extensive data collection and processing efforts.

MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection

2024-05-28T05:03:31-07:00

Hate speech represents a pervasive and detrimental form of online discourse, often manifested through an array of slurs, from hateful tweets to defamatory posts. As such speech proliferates, it connects people globally and poses significant social, psychological, and occasionally physical threats to targeted individuals and communities. Current computational linguistic approaches for tackling this phenomenon rely on labelled social media datasets for training. To unify efforts, our study addresses the critical need for a comprehensive meta-collection, advocating for an extensive dataset to help counteract this problem effectively. We scrutinized over 60 datasets, selectively integrating the pertinent ones into MetaHate. This paper offers a detailed examination of existing collections, highlighting their strengths and limitations. Our findings contribute to a deeper understanding of the existing datasets, paving the way for training more robust and adaptable models. These enhanced models are essential for effectively combating the dynamic and complex nature of hate speech in the digital realm.

A Dataset to Assess Microsoft Copilot Answers in the Context of Swiss, Bavarian and Hessian Elections

2024-05-28T05:03:32-07:00

This study describes a dataset that enables the assessment of the emerging challenges posed by Generative Artificial Intelligence when doing Active Retrieval Augmented Generation (RAG), especially when summarizing trustworthy sources on the Internet. As a case study, we focus on Microsoft Copilot, an innovative software that integrates Large Language Models (LLMs) and Search Engines (SE), making advanced AI accessible to the general public. The core contribution of this paper is the presentation of the largest public database to date of RAG responses to user prompts, collected during the 2023 electoral campaigns in Switzerland, Bavaria and Hesse. This dataset was compiled with the assistance of a group of experts who posed realistic voter questions and conducted fact-checking of Microsoft Copilot's responses. It contains prompts and answers in English, German, French and Italian. All data were collected during the electoral campaign, between 21 August 2023 and 2 October 2023. The paper makes available the full set of 5,561 pairs of prompts and answers, including the URLs referenced in the answers. In addition to the dataset itself, we provide 1,374 answers labelled by a group of experts who rated the accuracy of the answers in providing factual information, showing that almost one out of three times the chatbot responded with either factually incorrect information or completely nonsensical answers. This resource is intended to facilitate further research into the performance of LLMs in the context of elections, defined as a "high-risk scenario" by the Digital Services Act (DSA) Article 34(1)(c).

SocialDrought: A Social and News Media Driven Dataset and Analytical Platform towards Understanding Societal Impact of Drought

2024-05-28T05:03:34-07:00

Drought poses significant challenges to sustainability across various sectors in our society, leading to substantial consequences on agriculture, environments, ecosystems, public health, and socioeconomic stability. While prior work has studied the impacts of drought using professionally measured data sources, the societal perspectives of drought impacts remain largely under-explored. In this work, we present SocialDrought, a novel and comprehensive dataset to facilitate research on the societal impacts of drought. In particular, SocialDrought consists of three major components: 1) over 1.5 million social media posts, 2) over 1,400 news articles collected and verified by domain experts, and 3) over 31,000 meteorological records from the U.S. Drought Monitor about drought severity. In addition, we also introduce an online analytical platform that enables interactive and real-time data exploration to gain timely insights into the societal impacts of drought. Our interdisciplinary dataset integrates both conventional meteorological data and unconventional social and news media data to provide a holistic understanding of drought impacts. SocialDrought opens new opportunities to study the societal impacts of drought through the lens of social and news media.

EnronSR: A Benchmark for Evaluating AI-Generated Email Replies

2024-05-28T05:03:36-07:00

Human-to-human communication is no longer just mediated by computers; it is increasingly generated by them, including on popular communication platforms such as Gmail, Facebook Messenger, LinkedIn, and others. Yet, little is known about the differences between human- and machine-generated responses in complex social settings. Here, we present EnronSR, a novel benchmark dataset that is based on the Enron email corpus and contains both naturally occurring human- and AI-generated email replies for the same set of messages. This resource enables the benchmarking of novel language-generation models in a public and reproducible manner, and facilitates a comparison against the strong, production-level baseline of Google Smart Reply used by millions of people. Moreover, we show that when language models produce responses they could align more closely with human replies in terms of when responses should be offered, their length, sentiment, and semantic meaning. We further demonstrate the utility of this benchmark in a case study of GPT-3, showing significantly better alignment with human responses than Smart Reply, albeit providing no guarantees for quality or safety.

Analyzing Mentions of Death in COVID-19 Tweets

2024-05-28T05:03:39-07:00

Many researchers have analyzed the potential of using tweets for epidemiology in general and for nowcasting COVID-19 trends in particular. Here, we focus on a subset of tweets that mention a personal, COVID-related death. We show that focusing on this set improves the correlation with official death statistics in six countries, while also picking up on mortality trends specific to different age groups and socio-economic groups. Furthermore, qualitative analysis reveals how politicized many of the mentioned deaths are. To help others reproduce and build on our work, we release a dataset of annotated tweets for academic research.

Tube2Vec: Social and Semantic Embeddings of YouTube Channels

2024-05-28T05:03:40-07:00

Research using YouTube data often explores social and semantic dimensions of channels and videos. Typically, analyses rely on laborious manual annotation of content and content creators, often found by low-recall methods such as keyword search. Here, we explore an alternative approach, Tube2Vec, using latent representations (embeddings) obtained via machine learning. Using a large dataset of YouTube links shared on Reddit, we create embeddings that capture social sharing behavior, video metadata (title, description, etc.), and YouTube's video recommendations. We evaluate these embeddings using crowdsourcing and existing datasets, finding that recommendation embeddings excel at capturing both social and semantic dimensions, although social-sharing embeddings better correlate with existing partisan scores. We share embeddings capturing the social and semantic dimensions of 44,000 YouTube channels for the benefit of future research on YouTube. https://github.com/epfl-dlab/youtube-embeddings.

Scientific Appearance in Telegram

2024-05-28T05:03:41-07:00

This paper examines the influence of scientific appearance (SA) on post dissemination and analyses a dataset of important actors in Germany, specifically those involved in the dissemination of disinformation on the social media platform Telegram. SA is identified through textual elements such as predefined keywords or digital object identifiers (DOIs). Characteristics and behaviours of actors with and without SA are compared using metadata such as forward counts and original posts. The additional content analysis provides insights into SA's usage and impact. The findings indicate that SA may influence the dissemination of posts and demonstrate how different methods can be applied for studying social media platforms.
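
The identification step described above (keywords or DOIs in the post text) can be illustrated with a minimal sketch. It is not the authors' implementation: the keyword list `SA_KEYWORDS`, the function name `has_scientific_appearance`, and the DOI regular expression are all assumptions made for illustration, since the abstract does not specify them.

```python
import re

# Hypothetical keyword list; the paper's actual predefined keywords
# are not given in the abstract.
SA_KEYWORDS = ("study", "peer-reviewed", "journal", "researchers")

# A commonly used approximation of the DOI syntax: the prefix "10.",
# a 4-9 digit registrant code, a slash, and a non-whitespace suffix.
# This is an assumption, not the pattern used in the paper.
DOI_PATTERN = re.compile(r"\b10\.\d{4,9}/[^\s\"<>]+")


def has_scientific_appearance(text: str) -> bool:
    """Return True if the post contains an SA keyword or a DOI-like string."""
    lowered = text.lower()
    has_keyword = any(keyword in lowered for keyword in SA_KEYWORDS)
    has_doi = DOI_PATTERN.search(text) is not None
    return has_keyword or has_doi


if __name__ == "__main__":
    posts = [
        "See https://doi.org/10.1000/xyz123 for details",
        "A new study shows surprising results",
        "Good morning everyone",
    ]
    for post in posts:
        print(has_scientific_appearance(post), post)
```

Substring matching is deliberately simple here; a real pipeline would likely need language-specific (German) keywords and tokenisation.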

Generative AI in Crowdwork for Web and Social Media Research: A Survey of Workers at Three Platforms

2024-05-28T05:03:43-07:00

Crowdsourcing plays an important role in Web and social media research, from data annotation to online experiments and user surveys. With the emergence of Generative AI (GenAI), researchers are considering how models and tools such as GPT might replace crowdwork. Many have already evaluated GPT on annotation tasks. However, it is less clear how GenAI might impact other types of tasks, or to what extent crowdworkers have already incorporated it into their work processes. Thus, we asked crowdworkers directly regarding their use of GenAI, via a survey at two points in time, across three commercial platforms. We found evidence that workers' self-reported use of GenAI did not change over time, but rather, was strongly correlated with the platform on which they operate, with MTurk workers using GenAI much more often than those operating at Clickworker and Prolific. As most respondents reported that survey completion is their "usual type of task", we discuss the implication of the use of GenAI in user surveys, via specific examples of ICWSM research.

Human vs. LMMs: Exploring the Discrepancy in Emoji Interpretation and Usage in Digital Communication

2024-05-28T05:03:44-07:00

Leveraging Large Multimodal Models (LMMs) to simulate human behaviors when processing multimodal information, especially in the context of social media, has garnered immense interest due to its broad potential and far-reaching implications. Emojis, as one of the most unique aspects of digital communication, are pivotal in enriching and often clarifying the emotional and tonal dimensions. Yet, there is a notable gap in understanding how these advanced models, such as GPT-4V, interpret and employ emojis in the nuanced context of online interaction. This study intends to bridge this gap by examining the behavior of GPT-4V in replicating human-like use of emojis. The findings reveal a discernible discrepancy between human and GPT-4V behaviors, likely due to the subjective nature of human interpretation and the limitations of GPT-4V's English-centric training, suggesting cultural biases and inadequate representation of non-English cultures.