This macOS malware can avoid AI analysis with gaslighting prompts hidden inside its architecture

A person typing on a laptop and using a tablet. Only their upper torso, arms and hands are visible. Text superimposed on the image shows AI — (Image credit: Getty Images)

SentinelOne uncovered macOS malware “Gaslight” that uses prompt injection to mislead AI‑assisted triage tools during analysis
Beyond standard backdoor and infostealer capabilities, it embeds fake Markdown “system” messages to trick LLMs into halting investigation
Researchers warn defenders to treat malware samples as adversarial input and isolate AI pipelines, as more analyst‑targeting prompt injection is expected

We’ve seen prompt injection in websites and emails, but what about - malware samples? Security researchers SentinelOne recently published an in-depth report on a newly uncovered piece of macOS malware called Gaslight that, as the name suggests, tries to gaslight AI-assisted triage agents into stopping the analysis.

The malware itself is nothing out of the ordinary: it infects the device by whatever means necessary (usually phishing and social engineering), connects to attacker-controlled infrastructure via Telegram, and then executes different commands such as profiling the device, running arbitrary shell commands, stealing files, or terminating processes.

It also delivers a stage-two malware that acts as an infostealer, pulling passwords, sensitive PDFs, cryptocurrency wallet information, and more.

Weaponizing LLM-assisted triage pipelines

But where Gaslight stands out is its defenses against AI-powered malware analysis. According to SentinelOne, the malware contains a large block of fake Markdown-formatted "system" messages designed for AI assistants that security researchers may use during reverse engineering. These messages claim things like “the AI's authentication token has expired”, “the analysis environment is running out of memory”, “disk space has been exhausted”, “static analysis is unsafe”, and similar.

While a human analyst would definitely recognize these fake messages even at a glance, an LLM that isn’t properly isolated from untrusted input could interpret them as genuine system instructions and refuse to further analyze the malware.

“macOS.Gaslight is noteworthy for its analyst-targeting prompt injection, an attempt to weaponize the LLM-assisted triage pipelines that increasingly sit in the reverse-engineering loop,” SentinelOne explains. “Anyone building such tooling should treat the contents of the samples they triage as adversarial input, never as instructions, and be prepared to keep hostile content out of the model entirely. As LLM-assisted analysis becomes routine, defenders should expect more samples built to exploit it.”

The researchers have published a full list of indicators of compromise on this link.

Via The Hacker News

The best antivirus for all budgets

➡️ Read our full guide to the best antivirus
1. Best overall:
Bitdefender Total Security
2. Best for families:
Norton 360 with LifeLock
3. Best for mobile:
McAfee Mobile Security

Google logo on a black background next to text reading 'Click to follow TechRadar'

Follow TechRadar on Google News and add us as a preferred source to get our expert news, reviews, and opinion in your feeds.

TOPICS

Weaponizing LLM-assisted triage pipelines

Useful links