Assessing Vulnerabilities in State-of-the-Art Large Language Models Through Hex Injection (Student Abstract)

Da Cheng Gu; Wei Liu

doi:10.1609/aaai.v39i28.35257

Authors

Da Cheng Gu, University of Technology Sydney
Wei Liu, University of Technology Sydney

Abstract

State-of-the-art large language models (LLMs) are designed with robust safeguards to prevent the disclosure of harmful information and dangerous procedures. However, "jailbreaking" techniques can circumvent these protections by exploiting vulnerabilities in the models. This paper introduces a novel method, Hex Injection, which leverages a specific weakness in LLMs' ability to decode encoded text to uncover concealed dangerous instructions. Hex Injection distinguishes itself from traditional methods by combining encoded instructions with plaintext prompts to reveal unsafe content more effectively. Our approach involves encoding potentially malicious prompts in hexadecimal and integrating them with plaintext prompts. We observe a 94% average success rate (ASR) with a combination of plaintext, encoded, and role-play prompts for Llama 3 and 3.1 models, and an 86% ASR for the Gemma 2 model. This research not only advances the understanding of LLM security but also offers valuable insights for improving safety mechanisms in artificial intelligence systems.
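
For readers unfamiliar with the encoding step mentioned in the abstract, the following is a minimal sketch of what hexadecimal encoding of text looks like, using a benign string. It is not the authors' code: the helper names to_hex and from_hex are hypothetical, and the choice of UTF-8 as the byte encoding is an assumption, since the abstract does not specify one.

```python
def to_hex(text: str) -> str:
    """Return the UTF-8 bytes of `text` as a lowercase hexadecimal string.

    Hypothetical helper for illustration; UTF-8 is an assumption.
    """
    return text.encode("utf-8").hex()


def from_hex(hex_text: str) -> str:
    """Decode a hexadecimal string back to text.

    bytes.fromhex ignores ASCII whitespace, so "48 65" and "4865" are
    treated the same.
    """
    return bytes.fromhex(hex_text).decode("utf-8")


if __name__ == "__main__":
    benign = "Hello, world"
    encoded = to_hex(benign)
    print(encoded)            # 48656c6c6f2c20776f726c64
    print(from_hex(encoded))  # Hello, world
```

The round trip shows that hexadecimal is a reversible re-representation of the same bytes rather than a form of encryption, which may help explain why a safeguard that inspects only the surface text of a prompt could treat the encoded and plaintext forms differently.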

Published

2025-04-11

How to Cite

Gu, D. C., & Liu, W. (2025). Assessing Vulnerabilities in State-of-the-Art Large Language Models Through Hex Injection (Student Abstract). Proceedings of the AAAI Conference on Artificial Intelligence, 39(28), 29377–29378. https://doi.org/10.1609/aaai.v39i28.35257

Issue

Vol. 39 No. 28: IAAI-25, EAAI-25, AAAI-25 Student Abstracts, Undergraduate Consortium and Demonstrations

Section

AAAI Student Abstract and Poster Program
