Evaluating the Efficacy of Prompting Techniques for Debiasing Language Model Outputs (Student Abstract)

Authors

  • Shaz Furniturewala, Birla Institute of Technology and Science Pilani, Pilani
  • Surgan Jandial, MDSR Labs, Adobe
  • Abhinav Java, MDSR Labs, Adobe
  • Simra Shahid, MDSR Labs, Adobe
  • Pragyan Banerjee, Birla Institute of Technology and Science Pilani, Pilani
  • Balaji Krishnamurthy, MDSR Labs, Adobe
  • Sumit Bhatia, MDSR Labs, Adobe
  • Kokil Jaidka, National University of Singapore

DOI

https://doi.org/10.1609/aaai.v38i21.30443

Keywords

LLM, Debiasing, Fairness, Zero-shot, Text Generation

Abstract

Achieving fairness in Large Language Models (LLMs) remains a persistent challenge: these models inherit biases from their training data, and those biases can then surface in downstream applications. Whether structured prompting techniques can elicit debiased text generation from LLMs has yet to be explored systematically. In this work, we designed an evaluation framework to test the efficacy of different prompting techniques for debiasing text along different dimensions. We aim to devise a general structured prompting approach to achieving fairness that generalizes well across texts and LLMs.
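The abstract does not specify the prompting techniques evaluated, so the following is only a hypothetical sketch of the general setup it describes: wrapping a prompt in a structured debiasing instruction and comparing a bias metric on the plain versus prefixed generations. The names `generate`, `debiased_generate`, `evaluate`, and the prefix text are all illustrative assumptions, not the authors' framework.

```python
# Hypothetical sketch of a prompting-based debiasing comparison.
# Nothing here is taken from the paper; it only illustrates the idea of
# testing a structured prompt against a plain one under some bias metric.

DEBIAS_PREFIX = (
    "Answer the following without relying on stereotypes about gender, "
    "race, religion, or other protected attributes:\n\n"
)

def generate(prompt: str) -> str:
    """Placeholder for any LLM completion call (API or local model)."""
    raise NotImplementedError("plug in a real model here")

def debiased_generate(prompt: str) -> str:
    """Apply the structured debiasing prefix before generation."""
    return generate(DEBIAS_PREFIX + prompt)

def evaluate(prompts, bias_score):
    """Compare a scalar bias metric (e.g., a stereotype-classifier score)
    on plain vs. prefix-debiased generations for each prompt."""
    return [
        {
            "prompt": p,
            "plain": bias_score(generate(p)),
            "debiased": bias_score(debiased_generate(p)),
        }
        for p in prompts
    ]
```

In this framing, a prompting technique "works" to the extent that the debiased score improves on the plain score consistently across prompts and across models; the paper's stated aim of generalization corresponds to that consistency.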

Published

2024-03-24

How to Cite

Furniturewala, S., Jandial, S., Java, A., Shahid, S., Banerjee, P., Krishnamurthy, B., Bhatia, S., & Jaidka, K. (2024). Evaluating the Efficacy of Prompting Techniques for Debiasing Language Model Outputs (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 38(21), 23492-23493. https://doi.org/10.1609/aaai.v38i21.30443