Microsoft develops Spotlighting to protect AI systems from attacks

It will significantly reduce the success rate of attacks

News

3 min. read

Published on April 15, 2024

by Kazim Ali Alvi

published on April 15, 2024

Share this article

Readers help support Windows Report. We may get a commission if you buy through our links.

The last few years have witnessed tremendous AI integration, with Microsoft leading the charge. At the same time, the Redmond-based tech giant is taking steps to minimize threats and protect AI-based systems. In a bid to achieve that, Microsoft developed Spotlighting!

Spotlighting is actually a family of techniques that reduces the success rate of attacks on AI systems from 20% to under the detection threshold without affecting performance. Microsoft describes Spotlighting as

Spotlighting (also known as data marking) makes the external data clearly separable from instructions by the LLM, with different marking methods offering a range of quality and robustness tradeoffs that depend on the model in use.

Spotlighting helps against Poisoned content, a type of attack that uses seemingly harmless content to exploit vulnerabilities in the AI system. For instance, an email which, when summarised, would issue instructions to the AI system to search for critical information and share it.

In such cases, Microsoft’s Spotlighting prevents LLMs from reading hidden content that contains instructions for an attack, thus protecting the AI system.

Microsoft discovers a new attack type, Crescendo

Crescendo or multiturn LLM jailbreak is an attack capable of bypassing existing security filters and can affect most of the popular LLMs, although it poses no privacy or security risks to the end users or AI systems.

Microsoft’s official blog describes Crescendo as,

At its core, Crescendo tricks LLMs into generating malicious content by exploiting their own responses. By asking carefully crafted questions or prompts that gradually lead the LLM to a desired outcome, rather than asking for the goal all at once, it is possible to bypass guardrails and filters—this can usually be achieved in fewer than 10 interaction turns.

The Redmon-based tech giant made changes to the native chatbot, Microsoft Copilot, to prevent it from falling prey to Crescendo. This includes introducing additional filtering and security layers, namely, Multiturn prompt filter, AI Watchdog, and Advanced research.

The findings were also shared with other AI companies. You can read more about Crescendo in Microsoft’s research paper.

AI, while a groundbreaking innovation, poses a wide array of threats, both to end users and organizations. Microsoft’s President, Brad Smith, expressed concerns about AI in a recent interview and called for regulations and a safety brake.

AI is also behind sophisticated cyberattacks that are difficult to detect and can cause significant damage. Microsoft believes AI is the best way to fight AI-backed threats, and it appears to be the case at present!

What do you think about Microsoft’s Spotlighting and if it could mitigate risks? Share with our readers in the comments section.

More about the topics: artificial intelligence, microsoft

Kazim Ali Alvi

Windows Hardware Expert

Kazim has always been fond of technology, be it scrolling through the settings on his iPhone, Android device, or Windows PC. He's specialized in hardware devices, always ready to remove a screw or two to find out the real cause of a problem. Long-time Windows user, Kazim is ready to provide a solution for your every software & hardware error on Windows 11, Windows 10 and any previous iteration. He's also one of our experts in Networking & Security.

User forum

0 messages

Sort by:

Microsoft discovers a new attack type, Crescendo

Leave a Reply Cancel reply