WildLLM: Uncovering Hidden Wildlife Trafficking on Social Media with Augmented and Fine-Tuned LLMs

Pavan Antala; Guanyi Mou; Kyumin Lee

doi:10.1609/icwsm.v20i1.42629

Authors

Pavan Antala, Worcester Polytechnic Institute
Guanyi Mou, Worcester Polytechnic Institute
Kyumin Lee, Worcester Polytechnic Institute

Abstract

Identifying potential wildlife trafficking posts on social media is challenging due to severe class imbalance, subtle linguistic variation, and the deliberate use of euphemisms. Previous work introduced a benchmark dataset and evaluated text-based models, such as BERT and RoBERTa, on it. However, that approach was limited by a highly class-imbalanced dataset, depended on smaller pretrained models, and lacked augmentation strategies to capture neutral terms and disguised trade intent. These limitations leave open the question of whether large language models (LLMs), when augmented and fine-tuned using parameter-efficient methods, can offer stronger performance in detecting wildlife product trading (WLT) posts. To fill these gaps, this paper proposes WildLLM, a text-based framework that uses LLMs with parameter-efficient fine-tuning to detect WLT-related posts on social media (e.g., Twitter/X), with a particular focus on ivory as a case study. WildLLM comprises three core strategies: (1) WLT-focused data augmentation via multiple LLMs to alleviate class imbalance and enhance diversity; (2) improving the realism of augmented data through in-context learning and prompt engineering; and (3) fine-tuning two LLMs (LLaMA-3.1-8B and Qwen2.5-7B) using Low-Rank Adaptation (LoRA) with optimized hyperparameters. In our experiments, the proposed approach consistently outperformed five baselines, achieving 0.854 MCC and 0.927 Macro F1, a 21% improvement in MCC and an 8.7% improvement in Macro F1 over the best-performing baseline. These results were consistent across both Llama and Qwen backbone models, demonstrating WildLLM's generalization capability regardless of the backbone LLM used. The results indicate that integrating parameter-efficient fine-tuning with meticulously crafted augmentation produces a more effective text-based model for identifying WLT-related posts online.
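
The abstract reports MCC and Macro F1, two metrics commonly preferred over accuracy when classes are severely imbalanced. The following is a minimal illustrative sketch of how both can be computed for a binary task from confusion counts. It is not the authors' evaluation code; the function names and the toy labels are assumptions introduced here for illustration only.

```python
from math import sqrt


def confusion_counts(y_true, y_pred, positive=1):
    """Return (tp, tn, fp, fn) for a binary task; `positive` marks the WLT class (assumed label)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p == positive)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p != positive)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive and p == positive)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive and p != positive)
    return tp, tn, fp, fn


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined as 0.0 when the denominator is zero."""
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def f1(tp, fp, fn):
    """F1 for one class; defined as 0.0 when the class is never predicted nor present."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def macro_f1(tp, tn, fp, fn):
    """Unweighted mean of the positive-class F1 and the negative-class F1."""
    f1_pos = f1(tp, fp, fn)
    # For the negative class the roles swap: tn are its hits, fn its false alarms, fp its misses.
    f1_neg = f1(tn, fn, fp)
    return (f1_pos + f1_neg) / 2


# Toy imbalanced example (hypothetical data): one positive post among ten,
# and a degenerate classifier that always predicts the negative class.
y_true = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
y_pred = [0] * 10
counts = confusion_counts(y_true, y_pred)
accuracy = (counts[0] + counts[1]) / len(y_true)
print(accuracy, mcc(*counts), macro_f1(*counts))
```

In this toy case the always-negative classifier is right on nine of ten posts, yet its MCC is zero and its Macro F1 is below 0.5, which illustrates why these metrics are more informative than accuracy on imbalanced data.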
