Microsoft develops Spotlighting to protect AI systems from attacks

It will significantly reduce the success rate of attacks

Reading time icon 3 min. read


Readers help support Windows Report. We may get a commission if you buy through our links. Tooltip Icon

Read our disclosure page to find out how can you help Windows Report sustain the editorial team. Read more

microsoft spotlighting for AI systems

The last few years have witnessed tremendous AI integration, with Microsoft leading the charge. At the same time, the Redmond-based tech giant is taking steps to minimize threats and protect AI-based systems. In a bid to achieve that, Microsoft developed Spotlighting!

Spotlighting is actually a family of techniques that reduces the success rate of attacks on AI systems from 20% to under the detection threshold without affecting performance. Microsoft describes Spotlighting as

Spotlighting (also known as data marking) makes the external data clearly separable from instructions by the LLM, with different marking methods offering a range of quality and robustness tradeoffs that depend on the model in use.
Image source: Microsoft

Spotlighting helps against Poisoned content, a type of attack that uses seemingly harmless content to exploit vulnerabilities in the AI system. For instance, an email which, when summarised, would issue instructions to the AI system to search for critical information and share it.

In such cases, Microsoft’s Spotlighting prevents LLMs from reading hidden content that contains instructions for an attack, thus protecting the AI system.

Microsoft discovers a new attack type, Crescendo

Crescendo or multiturn LLM jailbreak is an attack capable of bypassing existing security filters and can affect most of the popular LLMs, although it poses no privacy or security risks to the end users or AI systems.

Microsoft’s official blog describes Crescendo as,

At its core, Crescendo tricks LLMs into generating malicious content by exploiting their own responses. By asking carefully crafted questions or prompts that gradually lead the LLM to a desired outcome, rather than asking for the goal all at once, it is possible to bypass guardrails and filters—this can usually be achieved in fewer than 10 interaction turns.

The Redmon-based tech giant made changes to the native chatbot, Microsoft Copilot, to prevent it from falling prey to Crescendo. This includes introducing additional filtering and security layers, namely, Multiturn prompt filter, AI Watchdog, and Advanced research.

Image source: Microsoft

The findings were also shared with other AI companies. You can read more about Crescendo in Microsoft’s research paper.

AI, while a groundbreaking innovation, poses a wide array of threats, both to end users and organizations. Microsoft’s President, Brad Smith, expressed concerns about AI in a recent interview and called for regulations and a safety brake.

AI is also behind sophisticated cyberattacks that are difficult to detect and can cause significant damage. Microsoft believes AI is the best way to fight AI-backed threats, and it appears to be the case at present!

What do you think about Microsoft’s Spotlighting and if it could mitigate risks? Share with our readers in the comments section.

More about the topics: artificial intelligence, microsoft

User forum

0 messages