The security of LLMs
LLMs are an increasingly popular technology: they serve many purposes and bring clear benefits in automating tasks and improving workflows. As a result, many organizations add them to their IT systems, and many software products include LLM components to enable natural language features such as chatbots or report generation.
However, LLMs have their own vulnerabilities that attackers can exploit. Security experts must be aware of these vulnerabilities to advise on and implement the secure integration of LLMs.
The Open Worldwide Application Security Project (OWASP) is an international organization dedicated to cybersecurity, mainly web application security. Given the rising popularity of LLMs, OWASP has published a Top 10 list of security threats for applications that use LLMs:
- Prompt injection: LLMs are queried by providing a textual prompt and receiving a textual response. The response generation process usually includes guardrails that keep the model from producing inappropriate responses or leaking sensitive information. However, attackers have shown that a specially crafted prompt can make an LLM ignore these guardrails and produce an unwanted response.
- Insecure output handling: The output of an LLM can be plain natural language, but it can also take a structured form (for example, HTML, SQL, or shell commands) that downstream components may execute. If that output is passed on unchecked, it can be used to run malicious code and compromise the IT system the LLM is integrated into; the first sketch after this list shows the standard mitigation of treating model output as untrusted input.
- Training data poisoning: LLMs are trained on large-scale data, and in many use cases, this data is gathered from public sources. There is a risk that the training data can be manipulated so that the model produces results that benefit attackers.
- Model denial of service: LLMs are large models that can require substantial computational resources to operate and serve responses to a potentially large user base. Attackers can deliberately send a flood of requests, or unusually expensive ones, to overwhelm the system running an LLM and cause disruptions; per-client rate limiting, as in the second sketch after this list, is a common first line of defense.
- Supply chain vulnerabilities: LLM functionality relies on various other software components, which can be vulnerable to malicious updates or other changes. Furthermore, systems that include LLMs as components can be tricked into loading inauthentic models.
- Sensitive information disclosure: LLMs are trained on large volumes of data, which can easily include Personally Identifiable Information (PII) or other sensitive organizational information. Without proper guardrails, this information can surface in generated responses, possibly to users who shouldn't have permission to see it.
- Insecure plugin design: LLMs are often added to existing software as plugins to enable chatbots and various text generation capabilities. However, since a plugin accepts user input and its output can be forwarded to other components, poor plugin design can introduce vulnerabilities that allow exploits such as remote code execution or SQL injection (the parameterized query in the first sketch after this list blocks the latter).
- Excessive agency: LLM responses are often impressive and can be phrased in whatever form is desired. However, making an LLM a fully autonomous component should be considered carefully, as problems such as hallucinations can undermine its reliability and reduce user trust.
- Overreliance: Building on the previous point, we need to consider when and under what conditions we can rely on LLM responses. There needs to be sufficient awareness of LLM functionality and limitations, as well as enough testing, to be highly confident that LLMs are reliable in a particular use case.
- Model theft: LLMs can be trained on confidential data, and their use by unauthorized parties can create problems for organizations. Therefore, LLM models must be protected from unauthorized access and from being copied or moved to locations where attackers can use them.
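As an illustration of the insecure output handling and insecure plugin design points above, the following Python sketch treats model output the way any untrusted user input should be treated: escaping it before it reaches a browser and parameterizing queries before it reaches a database. The function names and table schema are hypothetical.

```python
import html
import sqlite3

def render_llm_answer(answer: str) -> str:
    """Escape LLM output before embedding it in an HTML page,
    exactly as you would for any untrusted user input."""
    return f"<div class='llm-answer'>{html.escape(answer)}</div>"

def store_llm_answer(conn: sqlite3.Connection, user_id: int, answer: str) -> None:
    """Use a parameterized query so that SQL fragments in the
    model's output cannot alter the statement itself."""
    conn.execute(
        "INSERT INTO answers (user_id, text) VALUES (?, ?)",
        (user_id, answer),
    )
    conn.commit()
```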
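For the denial-of-service point, here is a minimal sketch of per-client rate limiting with a token bucket, plus a cap on prompt length as a crude proxy for request cost. The limits, the client identifier scheme, and the `llm` callable are all assumptions made for illustration.

```python
import time

class TokenBucket:
    """Minimal token-bucket limiter: allows bursts of up to `capacity`
    requests, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: int):
        self.rate = rate
        self.capacity = capacity
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False

MAX_PROMPT_CHARS = 4_000  # illustrative cap on per-request cost
buckets: dict[str, TokenBucket] = {}

def handle_request(client_id: str, prompt: str, llm) -> str:
    """`llm` stands in for whatever callable reaches the model backend."""
    bucket = buckets.setdefault(client_id, TokenBucket(rate=1.0, capacity=5))
    if not bucket.allow():
        return "429: too many requests"
    if len(prompt) > MAX_PROMPT_CHARS:
        return "413: prompt too long"
    return llm(prompt)
```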
Since the introduction of LLMs, there have been security concerns about the threats that come with this technology, but new defensive approaches have also been developed to counter them. One example is the AI firewall, a type of product that analyzes the input to and output from an LLM to defend against some of the aforementioned threats, such as prompt injection or the extraction of sensitive data. Conversely, some threats, such as supply chain vulnerabilities and denial of service, were well known before the introduction of LLMs, and established defensive approaches can be applied to LLM-based architectures as well.
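To make the input/output filtering idea behind an AI firewall concrete, here is a minimal sketch that assumes the model is reachable as a plain Python callable; the deny-list and PII patterns are illustrative stand-ins for the much richer detection that real products apply.

```python
import re

# Hypothetical deny-list of phrasings associated with prompt-injection attempts.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|previous|prior) instructions", re.I),
    re.compile(r"you are now in developer mode", re.I),
]

# Simple PII detectors (illustrative only).
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),        # US SSN-like pattern
    re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),  # email address
]

def screen_prompt(prompt: str) -> str:
    """Reject prompts that match known injection phrasings."""
    if any(p.search(prompt) for p in INJECTION_PATTERNS):
        raise ValueError("prompt rejected by input filter")
    return prompt

def screen_response(response: str) -> str:
    """Redact PII-like substrings from the model's response."""
    for p in PII_PATTERNS:
        response = p.sub("[REDACTED]", response)
    return response

def guarded_query(llm, prompt: str) -> str:
    """`llm` is any callable mapping a prompt to a response (assumption)."""
    return screen_response(llm(screen_prompt(prompt)))
```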