‘Bad Likert Judge’ Jailbreak Breaches OpenAI Security Measures

New Jailbreak Technique Poses Threat to Cybersecurity in Large Language Models

A recently discovered jailbreak technique, dubbed the "Bad Likert Judge" attack, poses a significant risk to large language models (LLMs) such as those developed by OpenAI, Google, and Microsoft. Researchers from Palo Alto Networks’ Unit 42 found that the technique can help malicious actors bypass a model’s safety guardrails and generate harmful content.

The Bad Likert Judge attack relies on the Likert scale, a psychometric rating scale commonly used in surveys to measure agreement or disagreement with a statement. The attacker prompts the LLM to act as a judge and score the harmfulness of responses on this scale, then asks it to generate example responses that correspond to the different scores; the example written for the highest score can contain the harmful content. Framing the request as an evaluation task makes the model more likely to produce harmful content than a direct attack prompt does. Across six leading LLMs, the researchers observed a 60% increase in the attack success rate compared with plain attack prompts.
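An increase in attack success rate (ASR) can be read as a percentage-point change or as a relative change, and the figure above does not say which. The following is a minimal Python sketch of how ASR is typically computed and how the two readings differ. The `TrialResult` schema, the technique labels, and all trial counts are assumptions made up for illustration; they are not data from the Unit 42 report.

```python
from dataclasses import dataclass


@dataclass
class TrialResult:
    """Outcome of one jailbreak attempt, as labeled by an evaluator (hypothetical schema)."""
    technique: str   # e.g. "baseline" or "likert_judge" (illustrative labels)
    succeeded: bool  # True if the response was judged harmful


def attack_success_rate(results, technique):
    """ASR = successful attempts / total attempts for one technique."""
    trials = [r for r in results if r.technique == technique]
    if not trials:
        raise ValueError(f"no trials recorded for {technique!r}")
    return sum(r.succeeded for r in trials) / len(trials)


def asr_change(baseline_asr, attack_asr):
    """Return (percentage-point change, relative change in percent or None)."""
    point_change = (attack_asr - baseline_asr) * 100
    if baseline_asr == 0:
        return point_change, None  # relative change is undefined from a zero baseline
    relative_change = (attack_asr - baseline_asr) / baseline_asr * 100
    return point_change, relative_change


# Made-up data for illustration only: 2 of 20 baseline attempts succeed,
# 14 of 20 attempts using the technique succeed.
results = (
    [TrialResult("baseline", i < 2) for i in range(20)]
    + [TrialResult("likert_judge", i < 14) for i in range(20)]
)
baseline = attack_success_rate(results, "baseline")
attack = attack_success_rate(results, "likert_judge")
points, relative = asr_change(baseline, attack)
print(f"baseline ASR {baseline:.0%}, attack ASR {attack:.0%}")
print(f"change: {points:.0f} percentage points, {relative:.0f}% relative")
```

With these invented numbers, the same shift is about 60 percentage points but about a 600% relative increase, which is why it matters which reading a reported figure uses.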

The harmful outputs the researchers tested for ranged from content promoting hate and bigotry to explicit sexual material and guidance on manufacturing illegal weapons. Other categories included the generation of malicious software and the leakage of sensitive system prompts.

As jailbreak attempts grow more sophisticated, security experts note that although most LLMs operate safely under normal usage conditions, the computational limits of these models can be exploited. An attacker can craft a sequence of prompts that gradually steers the LLM toward unsafe responses, overwhelming its safety mechanisms.

To mitigate these risks, the researchers recommend robust content-filtering systems, which in their tests reduced the attack success rate by an average of 89.2%. The findings underline the need for stronger safeguards as LLMs are integrated into more everyday applications.
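Content filtering of this kind usually wraps the model call and checks both the incoming prompt and the outgoing response. The output check matters for this attack because the prompt can look like a harmless rating task while the response carries the harmful text. Below is a minimal Python sketch under stated assumptions: `is_flagged` is a toy stand-in for a real moderation classifier, `BLOCKED_TERMS` is a placeholder list, and `generate` is whatever function calls the model. None of these names come from the Unit 42 research.

```python
from typing import Callable

# Placeholder blocklist (assumption). A real deployment would call a trained
# moderation classifier rather than match fixed strings.
BLOCKED_TERMS = ("unsafe-example-term",)

REFUSAL = "This request or response was blocked by the content filter."


def is_flagged(text: str) -> bool:
    """Toy stand-in for a moderation classifier: case-insensitive term match."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def filtered_generate(prompt: str, generate: Callable[[str], str]) -> str:
    """Filter the prompt before the model call and the response after it."""
    if is_flagged(prompt):
        return REFUSAL
    response = generate(prompt)
    if is_flagged(response):
        # The prompt passed the input check, but the output did not.
        return REFUSAL
    return response
```

For example, `filtered_generate("Rate these answers", model_fn)` returns `REFUSAL` whenever the hypothetical `model_fn` produces text that `is_flagged` catches, even though the prompt itself passed the input check.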
