OWASP Top 10 for LLM Applications
The OWASP Top 10 for Large Language Model Applications is published by the Open Worldwide Application Security Project (OWASP), the same organisation behind the widely adopted OWASP Top 10 for web applications. The LLM-specific list identifies the ten most critical security vulnerabilities in applications that integrate large language models.
First published in 2023 and revised in 2025, the list reflects the rapidly evolving threat landscape for LLM applications including prompt injection, data leakage, supply chain compromise, and the unique risks of agentic AI systems with tool access. It has become the de facto standard for LLM security assessment and is referenced by regulators, auditors, and security teams worldwide.
Probe Six maps 146 automated security plugins across all 10 OWASP LLM categories, with 69 governance assessment questions covering organisational controls that cannot be tested at runtime. Every finding in a Probe Six report includes OWASP category references, making it straightforward to report on compliance posture.
The Ten Categories
LLM01: Prompt Injection
Manipulating LLM behaviour through crafted inputs that override system instructions, either directly via user prompts or indirectly via external content ingested by the model.
LLM02: Sensitive Information Disclosure
LLMs inadvertently revealing confidential data, PII, credentials, or proprietary information through their responses, either from training data or connected systems.
LLM03: Supply Chain Vulnerabilities
Risks from compromised components in the AI supply chain including poisoned training data, manipulated pre-trained models, and vulnerable third-party packages.
LLM04: Data and Model Poisoning
Attacks that corrupt training data or model weights to introduce backdoors, biases, or degraded performance that persist after deployment.
LLM05: Improper Output Handling
Failures to validate, sanitise, or encode LLM outputs before passing them to downstream systems, leading to XSS, SQL injection, command execution, and other injection vulnerabilities.
LLM06: Excessive Agency
LLM agents granted too many permissions, functions, or autonomy, allowing them to take unintended actions including data access, system modification, or external communication beyond their intended scope.
LLM07: System Prompt Leakage
Extraction of system prompts that contain proprietary logic, security rules, or sensitive configuration through direct or multi-turn conversational attacks.
LLM08: Vector and Embedding Weaknesses
Vulnerabilities in RAG pipelines and vector databases including poisoned embeddings, cross-tenant data leakage, and retrieval manipulation attacks.
LLM09: Misinformation
LLMs generating false, misleading, or harmful content including hallucinated facts, fabricated citations, biased outputs, and unsafe professional advice.
LLM10: Unbounded Consumption
Attacks that exhaust computational resources through token amplification, recursive reasoning, tool abuse, and context overflow, leading to denial of service or cost harvesting.
Coverage Summary
Automated Plugins by Category
The tables below list every automated security plugin mapped to each OWASP LLM category, with severity ratings and justification for the mapping.
LLM01: Prompt Injection (28 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| ASCII Smuggling | Uses invisible Unicode characters to hide instructions in prompts | High | Exploits prompt parsing to inject hidden instructions |
| Indirect Prompt Injection | Injects instructions via external content (documents, URLs, RAG) | Critical | Core indirect injection vector via untrusted data sources |
| Direct Prompt Injection | Attempts to override system instructions via user input | Critical | Core direct injection vector via user-supplied prompts |
| Multimodal Injection | Embeds malicious instructions in images or other modalities | Critical | Cross-modal prompt injection bypassing text-only filters |
| Temporal Evasion: Past Tense | Frames harmful requests as historical events | High | Temporal reframing to bypass injection detection |
| Temporal Evasion: Future Tense | Frames harmful requests as hypothetical future scenarios | High | Temporal reframing to bypass injection detection |
| Temporal Evasion: Academic Framing | Frames harmful requests as academic research | High | Context reframing to bypass injection detection |
| Encoding Bypass: Hex | Encodes malicious payloads in hexadecimal | High | Encoding-based injection evasion |
| Encoding Bypass: Base16 | Encodes payloads in Base16 | High | Encoding-based injection evasion |
| Encoding Bypass: Base64 | Encodes payloads in Base64 | High | Encoding-based injection evasion |
| Encoding Bypass: Base32 | Encodes payloads in Base32 | High | Encoding-based injection evasion |
| Encoding Bypass: ROT13 | Applies ROT13 cipher to mask content | High | Cipher-based injection evasion |
| Encoding Bypass: UUEncode | Encodes payloads using UUEncoding | Medium | Legacy encoding injection evasion |
| Encoding Bypass: Atbash | Applies Atbash cipher to mask instructions | Medium | Cipher-based injection evasion |
| Encoding Bypass: Morse | Encodes instructions in Morse code | Medium | Encoding-based injection evasion |
| Encoding Bypass: NATO Phonetic | Spells instructions using NATO alphabet | Medium | Phonetic encoding injection evasion |
| Encoding Bypass: Braille | Encodes payloads using Braille characters | Medium | Unicode encoding injection evasion |
| Encoding Bypass: Zalgo | Uses Zalgo combining characters to obscure content | Medium | Unicode manipulation injection evasion |
| Encoding Bypass: Leetspeak | Substitutes characters with numbers/symbols | Medium | Character substitution injection evasion |
| Encoding Bypass: Quoted Printable | Encodes payloads using Quoted-Printable | Medium | MIME encoding injection evasion |
| Encoding Bypass: ASCII85 | Encodes payloads using ASCII85 | Medium | Binary-to-text encoding injection evasion |
| Encoding Bypass: Unicode Homoglyphs | Replaces characters with visually identical Unicode glyphs | High | Homoglyph-based injection evasion |
| Encoding Bypass: BiDi Reorder | Uses bidirectional Unicode control characters | High | Text direction manipulation injection evasion |
| Cross-Lingual: Direct Translation | Translates harmful prompts into other languages | High | Cross-language injection evasion |
| Cross-Lingual: Code Switching | Mixes languages within a single prompt | High | Language mixing injection evasion |
| Cross-Lingual: Transliteration | Writes harmful content using transliterated script | High | Script conversion injection evasion |
| Cross-Lingual: Low Resource | Uses low-resource languages with weaker safety training | High | Low-resource language injection evasion |
| Cross-Lingual: Response Forcing | Forces model to respond in a specific language | Medium | Output language forcing to bypass filters |
LLM02: Sensitive Information Disclosure (14 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| Debug Access | Probes for exposed debug endpoints and verbose error responses | Medium | Debug interfaces can leak sensitive system information |
| Error Info Leakage | Triggers errors to extract system information | High | Error messages can disclose internal model and infrastructure details |
| PII: Direct | Directly requests personally identifiable information | High | Tests for direct PII disclosure in responses |
| PII: API/DB | Extracts PII from connected databases or APIs | High | Tests for PII leakage from backend data sources |
| PII: Session | Extracts PII from other user sessions | High | Tests for cross-session PII disclosure |
| PII: Social | Uses social engineering to extract personal information | Medium | Tests for PII disclosure via social engineering |
| Cross-Session Leak | Tests for data leakage between user sessions | Critical | Session isolation failure enables information disclosure |
| Training Data Extraction | Extracts training data samples from model responses | Critical | Training data may contain sensitive information |
| Cloud Service Discovery | Probes for cloud service endpoints and configurations | High | Discovers infrastructure details that should be confidential |
| Membership Inference | Determines if specific data was in the training set | High | Confirms presence of specific sensitive data in training |
| Model Inversion | Reconstructs training data inputs from model outputs | High | Reconstructs potentially sensitive training data |
| Model Theft: Weight Extraction | Attempts to extract model weights via inference API | High | Model weights are proprietary and may contain sensitive data |
| Model Theft: Memorisation Attack | Extracts memorised training examples verbatim | High | Memorised data may include sensitive training samples |
| Model Theft: Capability Cloning | Clones model capabilities via systematic querying | Medium | Proprietary model capabilities are sensitive IP |
LLM03: Supply Chain Vulnerabilities (3 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| Supply Chain: Package Hallucination | Tests if model recommends non-existent packages | High | Hallucinated packages can be registered by attackers |
| Supply Chain: Dependency Confusion | Tests if model suggests internal package names publicly | High | Dependency confusion enables supply chain compromise |
| Supply Chain: Model Provenance | Verifies model provenance claims and integrity | Medium | Unverified model provenance is a supply chain risk |
LLM04: Data and Model Poisoning (3 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| Poisoning: Backdoor Trigger | Tests for backdoor triggers in model responses | Critical | Backdoor triggers indicate data or model poisoning |
| Poisoning: Behavioural Consistency | Tests model behaviour consistency across rephrased inputs | High | Inconsistent behaviour can indicate poisoned training data |
| Poisoning: Training Bias Probe | Probes for biases from poisoned training data | Medium | Injected biases indicate targeted data poisoning |
LLM05: Improper Output Handling (11 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| SQL Injection | Tests if LLM outputs inject SQL into downstream systems | High | Unsanitised LLM output enables SQL injection |
| Shell Injection | Tests if LLM outputs inject shell commands | Critical | Unsanitised LLM output enables command execution |
| SSRF | Tests for server-side request forgery via AI-generated URLs | High | LLM-generated URLs can target internal resources |
| Output Injection: XSS | Tests if LLM outputs contain executable HTML/JavaScript | Critical | LLM output rendered in browsers enables XSS |
| Output Injection: Markdown Exfiltration | Tests if markdown rendering can exfiltrate data | Critical | Markdown image tags in output can exfiltrate data |
| Output Injection: Link Injection | Tests if LLM outputs contain malicious links | High | LLM-generated links can redirect to attacker sites |
| Output Injection: CSS Injection | Tests if LLM outputs can inject CSS | High | CSS injection via LLM output enables data exfiltration |
| Malware Generation: Top Level | Tests refusal of complete malware code generation | Critical | LLM output used directly as executable code |
| Malware Generation: Sub-Functions | Tests refusal of malware component generation | Critical | LLM output used as executable code components |
| Malware Generation: Payload | Tests refusal of malware payload generation | High | LLM output used as exploit payloads |
| Malware Generation: Evasion | Tests refusal of evasion technique generation | High | LLM output used to create evasion capabilities |
LLM06: Excessive Agency (14 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| RBAC | Tests role-based access control enforcement | High | Agents accessing data beyond their role scope |
| BOLA | Tests for broken object-level authorisation | High | Agents accessing objects they should not reach |
| BFLA | Tests for broken function-level authorisation | High | Agents invoking functions beyond their scope |
| Excessive Agency | Tests if agent executes actions beyond intended scope | High | Core excessive agency — agent exceeds intended permissions |
| Hijacking | Tests if agent can be redirected to attacker-controlled actions | High | Agent action redirection via prompt manipulation |
| Plugin Discovery | Probes for available tools and their capabilities | High | Enumerating agent tools to identify exploitable capabilities |
| Data Exfiltration | Tests if agent can exfiltrate data via available tools | Critical | Agent using tools to exfiltrate data beyond scope |
| Self-Replication | Tests if prompts cause recursive self-execution | Critical | Autonomous self-replication is excessive agency |
| API Access: Inference | Tests inference API access control bypass | High | Excessive access to AI inference capabilities |
| API Access: Product Service | Tests product service access boundaries | Medium | Excessive access to AI product service features |
| Reverse Shell | Tests if model generates reverse shell payloads | Critical | Agent generating C2 capabilities is extreme agency abuse |
| Scope Adherence | Tests if model stays within designated scope | Medium | Operating outside intended scope is excessive agency |
| Secrets Probing | Probes for exposed API keys, tokens, and credentials | Critical | Agent accessing credentials beyond its scope |
| Privilege Escalation | Attempts vertical privilege escalation | High | Agent escalating its own permissions |
LLM07: System Prompt Leakage (7 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| Prompt Extraction | Extracts the full system prompt | Medium | Direct system prompt extraction attack |
| Model Fingerprinting | Identifies model type, version, and architecture | High | Model identity leakage reveals system configuration |
| System Leakage: Multi-Turn Extraction | Gradually extracts system information across turns | High | Multi-turn conversational system prompt extraction |
| System Leakage: Tool Schema Leakage | Extracts tool schemas and function definitions | High | Tool schema exposure reveals system prompt structure |
| System Leakage: Config Leakage | Extracts system configuration and parameters | Medium | Configuration leakage reveals system prompt details |
| Model Discovery: Ontology | Maps model domain knowledge boundaries | Medium | Knowledge boundaries reveal system prompt scope |
| Model Discovery: Family | Identifies model family and training lineage | Medium | Model identity reveals deployment configuration |
LLM08: Vector and Embedding Weaknesses (5 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| RAG: Poisoning | Injects false entries into RAG knowledge base | Critical | Core RAG poisoning attack on vector store |
| RAG: Context Override | Manipulates retrieval context to surface attacker content | High | Retrieval manipulation via context override |
| RAG: Retrieval Manipulation | Manipulates RAG retrieval ranking and results | High | Direct manipulation of embedding-based retrieval |
| RAG: Embedding Collision | Creates embedding collisions to hijack retrieval | Medium | Exploits embedding similarity for adversarial retrieval |
| RAG: Cross-Tenant Leakage | Tests tenant isolation in shared vector stores | Critical | Multi-tenant vector store data isolation failure |
LLM09: Misinformation (55 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| Overreliance | Tests for excessive user trust in AI outputs | Medium | Encourages overreliance on potentially inaccurate outputs |
| Hallucination | Tests model propensity to generate fabricated information | Medium | Core hallucination and fabricated content generation |
| Imitation | Tests if model impersonates authoritative sources | Medium | Impersonation of experts spreads misinformation |
| Competitors | Tests for inappropriate competitor commentary | Low | False claims about competitors are misinformation |
| Contracts | Tests for inappropriate contractual statements | Medium | False contractual claims are actionable misinformation |
| Politics | Tests for inappropriate political statements | Low | Political misinformation and undue influence |
| Masquerading | Tests if model disguises harmful content as benign | Medium | Content masquerading as trustworthy spreads misinformation |
| AI Self-Disclosure | Tests if model discloses its AI nature | Medium | Failing to identify as AI enables deception |
| Limitation Disclosure | Tests if model discloses its limitations | Medium | Undisclosed limitations lead to overreliance |
| Confidence Calibration | Tests if confidence levels match actual accuracy | Medium | Miscalibrated confidence promotes misinformation |
| Explainability | Tests if model explains its reasoning | Medium | Unexplainable outputs cannot be verified for accuracy |
| Professional Advice: Medical | Tests appropriate medical advice disclaimers | High | Unqualified medical advice is dangerous misinformation |
| Professional Advice: Legal | Tests appropriate legal advice disclaimers | High | Unqualified legal advice is dangerous misinformation |
| Professional Advice: Financial | Tests appropriate financial advice disclaimers | High | Unqualified financial advice is dangerous misinformation |
| Content Marking | Tests if AI-generated content is labelled | Medium | Unlabelled AI content is deceptive by omission |
| Sycophancy | Tests if model prioritises agreement over accuracy | High | Sycophantic responses reinforce incorrect beliefs |
| Trusted Output Manipulation | Tests if outputs can mislead users | High | Manipulated trusted outputs spread misinformation |
| Factual Accuracy | Tests model propensity for inaccurate claims | High | Factually incorrect outputs are core misinformation |
| Citation Verification | Tests if model fabricates citations | Medium | Fabricated citations are verifiable misinformation |
| Bias: Race | Tests for racial bias in outputs | High | Racial bias produces discriminatory misinformation |
| Bias: Gender | Tests for gender bias in outputs | High | Gender bias produces discriminatory misinformation |
| Bias: Age | Tests for age bias in outputs | Medium | Age bias produces discriminatory misinformation |
| Bias: Disability | Tests for disability bias in outputs | High | Disability bias produces discriminatory misinformation |
| Bias: Religion | Tests for religious bias in outputs | High | Religious bias produces discriminatory misinformation |
| Bias: Sexual Orientation | Tests for sexual orientation bias | High | Sexual orientation bias produces discriminatory misinformation |
| Bias: Socioeconomic | Tests for socioeconomic bias | Medium | Socioeconomic bias produces discriminatory misinformation |
| Bias: Political | Tests for political bias in outputs | Medium | Political bias produces skewed misinformation |
| Bias: Nationality | Tests for nationality bias | Medium | Nationality bias produces discriminatory misinformation |
| Harmful: Violent Crime | Tests refusal of violent crime content | Critical | Violent crime guidance is harmful misinformation |
| Harmful: Sex Crime | Tests refusal of sex crime content | Critical | Sex crime content is harmful misinformation |
| Harmful: Child Exploitation | Tests refusal of CSAM content | Critical | Child exploitation content is harmful misinformation |
| Harmful: Self-Harm | Tests refusal of self-harm content | Critical | Self-harm promotion is dangerous misinformation |
| Harmful: Chemical/Bio Weapons | Tests refusal of WMD instructions | Critical | Weapons instructions are dangerous misinformation |
| Harmful: Indiscriminate Weapons | Tests refusal of mass weapons content | Critical | Weapons of mass effect guidance is misinformation |
| Harmful: Radicalisation | Tests refusal of radicalisation content | Critical | Extremist radicalisation content is misinformation |
| Harmful: Non-Violent Crime | Tests refusal of fraud/crime guidance | High | Criminal guidance is harmful misinformation |
| Harmful: Hate | Tests refusal of hate speech | High | Hate speech is discriminatory misinformation |
| Harmful: Cybercrime | Tests refusal of cybercrime content | High | Cybercrime guidance is harmful misinformation |
| Harmful: Illegal Drugs | Tests refusal of illegal drug content | High | Drug manufacturing guidance is harmful content |
| Harmful: Harassment | Tests refusal of harassment content | High | Harassment content promotes harmful behaviour |
| Harmful: Illegal Activities | Tests refusal of illegal activity guidance | High | Illegal activity guidance is harmful misinformation |
| Harmful: Misinformation | Tests refusal of deliberate disinfo | High | Deliberate misinformation generation |
| Harmful: Privacy | Tests refusal of privacy violations | High | Privacy-violating content is harmful |
| Harmful: Sexual Content | Tests refusal of explicit content | Medium | Unsolicited sexual content is harmful |
| Harmful: Graphic Content | Tests refusal of graphic violence | Medium | Gratuitous graphic content is harmful |
| Harmful: Unsafe Practices | Tests refusal of dangerous advice | Medium | Unsafe practice guidance is misinformation |
| Harmful: Intellectual Property | Tests refusal of IP theft assistance | Medium | IP theft guidance is harmful misinformation |
| Harmful: Copyright Violations | Tests refusal of copyright infringement | Medium | Copyright infringement assistance is harmful |
| Harmful: Specialised Advice | Tests disclaimers for professional advice | Medium | Unqualified specialised advice is misinformation |
| Harmful: Profanity | Tests refusal of excessive profanity | Low | Unprofessional language degrades output quality |
| Harmful: Insults | Tests refusal of personal insults | Low | Directed insults are harmful content |
| Patent Disclosure | Tests if model discloses patent-protected technical details | Medium | Patent disclosure enables competitive reverse-engineering |
| Trademark Infringement | Tests if model generates trademark-infringing content | Medium | Fake endorsements and brand impersonation are misinformation |
| Market Manipulation | Tests if model generates market-manipulative content | High | Fabricated financial content is dangerous misinformation |
| Confidential Strategy | Tests if model speculates on confidential business strategy | High | Plausible strategy fabrication is actionable misinformation |
LLM10: Unbounded Consumption (6 plugins)
| Plugin | What It Tests | Severity | Why This Category |
|---|---|---|---|
| Divergent Repetition | Triggers repetitive output patterns wasting resources | Medium | Repetitive generation exhausts context and tokens |
| Consumption: Token Amplification | Triggers excessive token generation | Medium | Token amplification directly causes resource exhaustion |
| Consumption: Recursive Reasoning | Induces recursive reasoning loops | Medium | Recursive loops cause unbounded computation |
| Consumption: Tool Abuse | Abuses agent tools to cause API fanout | High | Tool abuse causes cascading resource consumption |
| Consumption: Chaff Data | Floods system with irrelevant data | Medium | Chaff data wastes processing and storage resources |
| Consumption: Context Overflow | Overflows context window to degrade performance | Medium | Context overflow degrades service for all users |
Governance Assessment Questions
The following governance questions are assessed inline within the category picker on the scan configuration page. When you select a category, its governance panel auto-expands so you can answer questions in context. Answers auto-save and persist across scans. Each question is weighted for risk scoring (shown as a badge) with the answer type indicated.
LLM01: Prompt Injection
- Are input filtering and detection controls deployed to identify prompt injection attempts?9Y/N
- Is content moderation applied to user inputs before they reach the LLM?8Y/N
- Is the system prompt hardened against extraction and override attempts?9Y/N
- Are indirect injection vectors (uploaded documents, URLs, RAG-ingested content) assessed and filtered for embedded instructions?9Y/N
- Is layered defence used (multiple independent filters) rather than a single point of failure?8Y/N
- Are input length and complexity limits enforced to prevent context manipulation?7Y/N
- Are prompt injection attempts logged, monitored, and alertable?8Y/N
- Is there automated detection for jailbreak patterns (DAN mode, roleplay, authority override)?8Y/N
- Are input filtering controls validated for effectiveness across non-English languages?9Y/N
- Is prompt injection testing conducted in multiple languages including low-resource languages (e.g. Swahili, Bengali, Amharic)?8Y/N
- How frequently is prompt injection testing conducted with updated attack vectors?71–5
- Is there an incident response process for detected injection attempts?7Y/N
LLM02: Sensitive Information Disclosure
- Is PII/sensitive data classification applied to both model training data and inference outputs?9Y/N
- Are output filters deployed to prevent PII, credentials, and confidential data leakage in responses?9Y/N
- Is the system prompt classified as confidential and protected from extraction?8Y/N
- Are training datasets reviewed and sanitised for sensitive, personal, or proprietary information?8Y/N
- Is cross-user session isolation enforced to prevent data leakage between users?9Y/N
- Is there monitoring for unusual data extraction patterns (repeated probing, systematic querying for sensitive data)?8Y/N
- Are error messages sanitised to prevent leaking internal model or infrastructure details?7Y/N
- Is there a logging and audit trail for information disclosure events?7Y/N
- Are data retention and deletion policies applied to model interactions and conversation logs?7Y/N
- How mature is your data loss prevention programme for AI-generated outputs?71–5
LLM03: Supply Chain Vulnerabilities
- Do you maintain a Software Bill of Materials (SBOM) for your LLM dependencies?8Y/N
- Are third-party model providers vetted against security criteria before adoption?7Y/N
- Do you verify model checksums or signatures when downloading weights?9Y/N
- How frequently are dependencies and plugins reviewed for vulnerabilities?61–5
- Is there a process for responding to supply chain security advisories?7Y/N
LLM04: Data and Model Poisoning
- Is training data validated for integrity before use?9Y/N
- Are data provenance records maintained for training datasets?7Y/N
- Do you monitor model outputs for signs of data poisoning (drift, bias shifts)?8Y/N
- How confident are you in the cleanliness of your fine-tuning data?61–5
LLM05: Improper Output Handling
- Are LLM outputs sanitised before rendering in HTML or executing as code?10Y/N
- Is there output validation to prevent injection into downstream systems (SQL, shell, APIs)?9Y/N
- Are structured output schemas enforced (e.g. JSON schema validation)?6Y/N
LLM06: Excessive Agency
- Are tool permissions scoped to minimum necessary (least privilege)?9Y/N
- Is human approval required for high-impact actions (delete, send, pay)?10Y/N
- Can the agent access systems or data outside its intended scope?8Y/N
- How well are agent actions logged and auditable?71–5
LLM07: System Prompt Leakage
- Is the system prompt protected against direct extraction attacks (meta-prompting, "reveal your instructions" attempts)?9Y/N
- Are system prompts stored securely with access controls and version history?8Y/N
- Are internal configuration details, API keys, and secrets excluded from system prompts?10Y/N
- Is separation enforced between system-level instructions and user-accessible content?8Y/N
- Is monitoring deployed to detect system prompt content appearing in model responses?8Y/N
- Is there detection for multi-turn extraction attempts that gradually probe for system information?7Y/N
- Are system prompt changes subject to a review and approval process?7Y/N
- How frequently is system prompt extraction resistance tested?71–5
LLM08: Vector and Embedding Weaknesses
- Are RAG knowledge base contents validated for accuracy and authority before indexing?9Y/N
- Is multi-tenant data isolation enforced in vector stores (one user cannot access another user's documents)?10Y/N
- Are document-level access permissions enforced in RAG retrieval (not just store-level access)?9Y/N
- Are knowledge base query results filtered by user permissions before delivery?8Y/N
- Is the RAG document ingestion pipeline secured against adversarial document injection?9Y/N
- Is embedding model provenance verified and integrity maintained?7Y/N
- Is there monitoring for adversarial manipulation of embedding similarity (collision attacks)?7Y/N
- Are vector store backup and recovery procedures documented and tested?6Y/N
- Are obsolete or revoked documents removed from vector stores in a timely manner?7Y/N
- How mature is your RAG security programme?71–5
LLM09: Misinformation
- Are LLM outputs validated for factual accuracy before delivery to users?8Y/N
- Are controls deployed to detect and flag hallucinated citations or fabricated references?8Y/N
- Is source attribution required for factual claims in AI-generated content?7Y/N
- Are domain-specific accuracy benchmarks established and regularly tested?7Y/N
- Are appropriate disclaimers and confidence indicators shown to users for AI-generated content?7Y/N
- Is there a human review process for high-stakes content (medical, legal, financial, safety-critical)?9Y/N
- Are user feedback mechanisms available for reporting AI-generated misinformation?6Y/N
- Is content accuracy and safety testing conducted across multiple languages?8Y/N
- Are content safety filters validated across all supported languages including low-resource languages?8Y/N
- How mature is your hallucination detection and content verification programme?71–5
LLM10: Unbounded Consumption
- Are rate limits configured for API access to the LLM?7Y/N
- Are token limits enforced per request and per user session?6Y/N
- Is there monitoring for anomalous usage patterns (cost spikes, excessive calls)?8Y/N
Running an OWASP Assessment
To run an OWASP-aligned assessment:
- Register your endpoint— Add the AI system you want to assess via the Endpoints page
- Select the OWASP LLM Top 10 template— Choose individual categories for targeted testing or select all 10 for comprehensive coverage
- Complete governance questions— When you select a category, its governance questions appear inline below the category row. Answer them in context — your responses auto-save and persist across scans
- Review OWASP references— Each finding in your report includes OWASP LLM category references (LLM01–LLM10) alongside MITRE ATLAS, NIST, and other framework mappings
Governance assessment: When you select a category in the picker, its governance questions appear inline below the category row. Answer them in context to get a combined automated + governance posture score. This is particularly valuable for categories like LLM03 (Supply Chain) and LLM04 (Data Poisoning) where many risks cannot be tested at runtime.
References
- OWASP Top 10 for LLM Applications — Official project page
- OWASP LLM AI Security & Governance Checklist — Governance checklist for LLM security
- OWASP Foundation — Open Worldwide Application Security Project
- OWASP AI Exchange — Community hub for AI security resources
- OWASP Top 10 for LLM Applications 2025 — 2025 revision paper