‘Bad Likert Judge’ Jailbreak Bypasses OpenAI Security Measures

New Jailbreak Technique Poses Threat to OpenAI and Other Large Language Models

A recently discovered jailbreak technique, dubbed the "Bad Likert Judge" attack, poses a significant risk to large language models (LLMs) such as those developed by OpenAI, Google, and Microsoft. Researchers from Palo Alto Networks’ Unit 42 found that the technique can let malicious actors bypass the models' safety guardrails and generate harmful content.

The Bad Likert Judge attack relies on the Likert scale, a psychometric rating scale commonly used in questionnaires to measure agreement or disagreement with a statement. The attacker first prompts the LLM to act as a judge that scores the harmfulness of responses on such a scale, then asks it to generate example responses that match the points on the scale. The example written for the highest score can contain the harmful content the attacker was after, which makes the approach more effective than standard attack methods. Researchers observed a 60% increase in the attack success rate across six leading LLMs compared with regular prompting techniques.

The harmful content elicited ranged from material promoting hate and bigotry to explicit sexual material and guidance on manufacturing illegal weapons. Other significant threats include the generation of malicious software and the leakage of sensitive system prompts.

As jailbreak attempts grow more sophisticated, security experts note that although most LLMs behave safely under normal usage conditions, their computational limits can be exploited. Attackers can craft a sequence of prompts that steers the model step by step toward unsafe responses, overwhelming its safety mechanisms.

To mitigate these risks, the researchers recommend robust content-filtering systems, which reduced the attack success rate by an average of 89.2%. The findings underline the urgent need for stronger safeguards as LLMs are integrated into more everyday applications.
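
As a rough illustration of the mitigation described above, the Python sketch below wraps a model call with a filter on both the incoming prompt and the outgoing response, and includes a small helper for computing an attack success rate. It is a minimal sketch under stated assumptions, not the researchers' setup: the names `BLOCKED_TERMS`, `is_flagged`, `guarded_generate`, and `attack_success_rate` are hypothetical, and the keyword check stands in for a trained classifier or moderation service.

```python
from typing import Callable, Sequence

# Hypothetical placeholder terms. A real deployment would rely on a trained
# classifier or a moderation service rather than simple keyword matching.
BLOCKED_TERMS = ("build a weapon", "write malware")

REFUSAL = "This request or response was blocked by the content filter."


def is_flagged(text: str) -> bool:
    """Return True if the text contains any blocked term (case-insensitive)."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def guarded_generate(prompt: str, model: Callable[[str], str]) -> str:
    """Call the model only if the prompt passes, and filter its response too."""
    if is_flagged(prompt):
        return REFUSAL
    response = model(prompt)
    # The response is checked as well, because a multi-step jailbreak may use
    # prompts that look harmless on their own.
    if is_flagged(response):
        return REFUSAL
    return response


def attack_success_rate(outcomes: Sequence[bool]) -> float:
    """Fraction of attack attempts that produced harmful output."""
    if not outcomes:
        return 0.0
    return sum(outcomes) / len(outcomes)
```

Here `model` is any function that maps a prompt string to a response string, so the wrapper can be tried with a stub before connecting a real LLM client.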
