_{^{Photo by Ismail Salad Osman Hajji dirir on Unsplash}}

The Huge Promise of LLMs in Cybersecurity

Large Language Models (LLMs) trained on vast quantities of data can make security operations teams smarter. LLMs provide in-line suggestions and guidance on response, audits, posture management, and more. Most security teams are experimenting with or using LLMs to reduce manual toil in workflows.

Image: LLMs for Cybersecurity

For example, an LLM can query an employee via email if they meant to share a document that was proprietary and process the response with a recommendation for a security practitioner. An LLM can also be tasked with translating requests to look for supply chain attacks on open source modules and spinning up agents focused on specific conditions — new contributors to widely used libraries, improper code patterns — with each agent primed for that specific condition.

“LLMs hold tremendous promise to reduce the workload of cybersecurity teams and to improve their capabilities.”

CISOs and their teams constantly struggle to keep up with the rising tide of new cyberattacks. According to Qualys, the number of CVEs reported in 2023 hit a new record of 26,447. That’s up more than 5X from 2013.

This challenge has only become more taxing as the attack surface of the average organization grows larger with each passing year. AppSec teams must secure and monitor many more software applications. Cloud computing, APIs, multi-cloud and virtualization technologies have added additional complexity. With modern CI/CD tooling and processes, application teams can ship more code, faster, and more frequently. Microservices have both splintered monolithic app into numerous APIs and attack surface and also punched many more holes in global firewalls for communication with external services or customer devices.

Advanced LLMs hold tremendous promise to reduce the workload of cybersecurity teams and to improve their capabilities. AI-powered coding tools have widely penetrated software development. Github research found that 92% of developers are using or have used AI tools for code suggestion and completion. Most of these “copilot” tools have some security capabilities. In fact, programmatic disciplines with relatively binary outcomes such as coding (code will either pass or fail unit tests) are well suited for LLMs.

Enhanced Analysis

LLMs can process massive amounts of security data (logs, alerts, threat intelligence) to identify patterns and correlations invisible to humans. They can do this across languages, around the clock, and across numerous dimensions simultaneously. This opens new opportunities for security teams. LLMs can burn down a stack of alerts in near real-time, flagging the ones that are most likely to be severe. Through reinforcement learning, the analysis should improve over time.

Automation

LLMs can automate security team tasks that normally require conversational back and forth. For example, when a security team receives an IoC and needs to ask the owner of an endpoint if they had in fact signed into a device or if they are located somewhere outside their normal work zones, the LLM can perform these simple operations and then follow up with questions as required and links or instructions.

Continuous Learning and Tuning

Unlike previous machine learning systems for security policies and comprehension, LLMs can learn on the fly by ingesting human ratings of its response and by retuning on newer pools of data that may not be contained in internal log files. In fact, using the same underlying foundational model, cybersecurity LLMs can be tuned for different teams and their needs, workflows, or regional or vertical-specific tasks. This also means that the entire system can instantly be as smart as the model, with changes propagating quickly across all interfaces.

However, as a new technology with a short track record, LLMs have serious risks. Worse, understanding the full extent of those risks is challenging because LLM outputs are not 100% predictable or programmatic.

Risk of LLMs for Cybersecurity

Attackers can craft malicious prompts specifically to produce misleading or harmful outputs. This type of attack can exploit the LLM’s tendency to generate content based on the prompts it receives. In cybersecurity use cases, prompt injection might be most risky as a form of insider attack or attack by an unauthorized user who uses prompts to permanently alter system outputs by skewing model behavior.

The training data LLMs rely on can be intentionally corrupted, compromising their decision-making. In cybersecurity settings, where organizations are likely using models trained by tool providers, data poisoning might occur during the tuning of the model for the specific customer and use case.

As mentioned previously, LLMs may generate factually incorrect, illogical, or even malicious responses due to misunderstandings of prompts or underlying data flaws. In cybersecurity use cases, hallucinations can result in critical errors that cripple threat intelligence, vulnerability triage and remediation, and more.

Cybersecurity Image: John Morello

Security processes and AI policies around LLM use and workflows will become more critical as these systems become more common across cybersecurity operations and research. Making sure those processes are complied with, and are measured and accounted for in governance systems, will prove crucial to ensuring that CISOs can provide sufficient GRC (Governance, Risk and Compliance) coverage to meet new mandates like the Cybersecurity Framework 2.0.