This macOS malware can avoid AI analysis with gaslighting prompts hidden inside its architecture

A person typing on a laptop and using a tablet. Only their upper torso, arms and hands are visible. Text superimposed on the image shows AI
(Image credit: Getty Images)

  • SentinelOne uncovered macOS malware “Gaslight” that uses prompt injection to mislead AI‑assisted triage tools during analysis
  • Beyond standard backdoor and infostealer capabilities, it embeds fake Markdown “system” messages to trick LLMs into halting investigation
  • Researchers warn defenders to treat malware samples as adversarial input and isolate AI pipelines, as more analyst‑targeting prompt injection is expected

We’ve seen prompt injection in websites and emails, but what about - malware samples? Security researchers SentinelOne recently published an in-depth report on a newly uncovered piece of macOS malware called Gaslight that, as the name suggests, tries to gaslight AI-assisted triage agents into stopping the analysis.

The malware itself is nothing out of the ordinary: it infects the device by whatever means necessary (usually phishing and social engineering), connects to attacker-controlled infrastructure via Telegram, and then executes different commands such as profiling the device, running arbitrary shell commands, stealing files, or terminating processes.

It also delivers a stage-two malware that acts as an infostealer, pulling passwords, sensitive PDFs, cryptocurrency wallet information, and more.

Latest Videos From

Weaponizing LLM-assisted triage pipelines

But where Gaslight stands out is its defenses against AI-powered malware analysis. According to SentinelOne, the malware contains a large block of fake Markdown-formatted "system" messages designed for AI assistants that security researchers may use during reverse engineering. These messages claim things like “the AI's authentication token has expired”, “the analysis environment is running out of memory”, “disk space has been exhausted”, “static analysis is unsafe”, and similar.

While a human analyst would definitely recognize these fake messages even at a glance, an LLM that isn’t properly isolated from untrusted input could interpret them as genuine system instructions and refuse to further analyze the malware.

“macOS.Gaslight is noteworthy for its analyst-targeting prompt injection, an attempt to weaponize the LLM-assisted triage pipelines that increasingly sit in the reverse-engineering loop,” SentinelOne explains. “Anyone building such tooling should treat the contents of the samples they triage as adversarial input, never as instructions, and be prepared to keep hostile content out of the model entirely. As LLM-assisted analysis becomes routine, defenders should expect more samples built to exploit it.”

The researchers have published a full list of indicators of compromise on this link.

Via The Hacker News


Best antivirus software header
The best antivirus for all budgets

Google logo on a black background next to text reading 'Click to follow TechRadar'

Follow TechRadar on Google News and add us as a preferred source to get our expert news, reviews, and opinion in your feeds.


TOPICS

Sead is a seasoned freelance journalist based in Sarajevo, Bosnia and Herzegovina. He writes about IT (cloud, IoT, 5G, VPN) and cybersecurity (ransomware, data breaches, laws and regulations). In his career, spanning more than a decade, he’s written for numerous media outlets, including Al Jazeera Balkans. He’s also held several modules on content writing for Represent Communications.

You must confirm your public display name before commenting

Please logout and then login again, you will then be prompted to enter your display name.