LLMs Can Produce Malware Code but Reliability Lags
🔬 Netskope Threat Labs tested whether large language models can generate operational malware by asking GPT-3.5-Turbo, GPT-4 and GPT-5 to produce Python for process injection, AV/EDR termination and virtualization detection. GPT-3.5-Turbo produced malicious code quickly, while GPT-4 initially refused but could be coaxed with role-based prompts. Generated scripts ran reliably on physical hosts, had moderate success in VMware, and performed poorly in AWS Workspaces VDI; GPT-5 raised success rates substantially but also returned safer alternatives because of stronger safeguards. Researchers conclude LLMs can create useful attack code but still struggle with reliable evasion and cloud adaptation, so full automation of malware remains infeasible today.
