System Profile & Governance
When you add an endpoint to Probe Six, you can provide two types of additional context: a System Profile describing how your AI system is built, and a Governance Assessment covering your organisation's security policies and controls. Both are optional but highly recommended.
Why does this matter? The more Probe Six knows about your system, the more targeted and useful your security reports become. For example, if your system uses tools and function calling, Probe Six can focus on excessive agency risks. If your system handles personal data, data protection tests receive higher priority. Without this context, scans still work — but the results are more generic.
Where to Find These
Both the System Profile and Governance Assessment appear on your endpoint's detail page. Navigate to Endpoints from the sidebar, then click View on any endpoint. Scroll down past the endpoint configuration to find the System Profile card followed by the Governance Assessment card.
Auto-Save
All changes are saved automatically as you fill in each field. You don't need to click a save button — just make your changes and move on. A small spinner appears briefly in the card header while saving.
System Profile
The System Profile describes the technical architecture of your AI system across 10 sections. Each section is a collapsible accordion — click to expand it and fill in the fields. A green tick appears next to completed sections, and the percentage in the top-right corner shows your overall completeness.
Tip: You don't need to complete every section. Fill in the ones that apply to your system. Even partial information helps improve scan accuracy.
1. System Type & Context
What kind of AI system is this? Select from the dropdown and add a brief description.
System Type
Choose the category that best describes your system. Options include Chatbot, Code Assistant, Customer Support, Content Generation, Data Analysis, Search & Retrieval, Autonomous Agent, Multi-Agent System, API Service, or Other.
Custom System Type
Only shown if you select 'Other'. Describe what your system does in a few words.
Description
A brief summary of your system's purpose and what it does. Up to 1,000 characters.
Example: For an internal HR chatbot, select Customer Support, then describe it as: "Internal HR assistant that answers employee questions about policies, benefits, and leave requests. Connected to our HR knowledge base via RAG."
2. Model & Training
Which AI model powers your system, and has it been customised?
Base Model
The name of the foundation model. For example: GPT-4, Claude 3.5 Sonnet, Llama 3 70B, Gemini Pro, or Mistral Large.
Is Fine-Tuned?
Tick this if the model has been fine-tuned on your own data, rather than using the base model as-is.
Fine-Tuning Description
Only shown if fine-tuned. Describe the type of data and the purpose of fine-tuning. Up to 2,000 characters.
Example: Base Model: "GPT-4o". If you fine-tuned on customer support transcripts, tick Is Fine-Tuned and add: "Fine-tuned on 50,000 de-identified customer support conversations to improve response accuracy for product-specific queries."
3. Tool & Agent Config
Does your system use tools or function calling? This includes any capability where the model can trigger actions beyond generating text — such as searching the web, querying a database, or calling an API.
Uses Tools / Function Calling
Tick this if your model can invoke external tools or functions.
Tool Names
A comma-separated list of tools the model can use. For example: web-search, calculator, file-read.
Can Execute Code?
Tick this if the model can run code (e.g. Python, JavaScript) as part of its response.
Can Access Network?
Tick this if the model can make HTTP requests or access external services.
Can Access Filesystem?
Tick this if the model can read or write files on a server or storage system.
Example: For a coding assistant that can run Python and search documentation, tick Uses Tools, enter "code-interpreter, docs-search" as tool names, and tick Can Execute Code.
4. RAG & Knowledge Base
RAG (Retrieval-Augmented Generation) means your model can search through your own documents or data before answering. If your system pulls information from a knowledge base, database, or document store, this section applies.
Uses RAG / Knowledge Base
Tick this if your system retrieves information from external sources to inform its responses.
Knowledge Base Type
How your knowledge base is structured. Options: Vector DB (e.g. Pinecone, Weaviate), Keyword Search (e.g. Elasticsearch), Graph DB (e.g. Neo4j), Hybrid (combines multiple approaches), or Other.
Data Classification
The sensitivity level of the data in your knowledge base. Options: Public (open information), Internal (company-internal, not sensitive), Confidential (sensitive business data), or Restricted (highly sensitive, regulated data such as PII or financial records).
Example: For a system that answers questions using your company's internal wiki, tick Uses RAG / Knowledge Base, select Vector DB, and set Data Classification to Internal.
5. System Prompt
The system prompt is the set of instructions given to the model before it starts responding to users. It defines the model's behaviour, personality, and boundaries. Sharing it here helps Probe Six test for prompt extraction vulnerabilities.
Has System Prompt
Tick this if your model uses a system prompt (most production systems do).
System Prompt Text
Paste your full system prompt here. Up to 10,000 characters. This is stored securely and only used to improve scan accuracy.
Summary
Optional short summary of what the prompt instructs the model to do. Useful if the full prompt is long.
Example: Summary: "Customer-facing support bot. Must refuse to discuss competitors, must not share internal pricing, and must escalate billing disputes to a human agent."
Privacy note: Your system prompt is encrypted and stored securely. It is never shared with other users or used for any purpose other than enhancing your security assessment.
6. Output Handling
How are the model's responses used after they're generated? This helps Probe Six assess risks like cross-site scripting (XSS) or injection attacks.
Renders HTML
Tick this if model output is displayed as HTML in a web page (not just plain text).
Renders Markdown
Tick this if model output is rendered as Markdown (e.g. bold, headers, code blocks).
Executes Generated Code
Tick this if the system runs code that the model generates (e.g. a code sandbox).
Outputs to Database
Tick this if model responses are written to a database (e.g. saving chat history, logging).
Outputs to API
Tick this if model responses are sent to another service or API (e.g. triggering workflows).
Example: A chat interface that displays responses with Markdown formatting and saves conversations to a database would tick Renders Markdown and Outputs to Database.
7. Consumption Controls
What safeguards are in place to prevent your AI system from being abused or overloaded? These controls help protect against denial-of-service and cost attacks.
Has Rate Limiting
Tick this if there are limits on how many requests a user or application can make per minute/hour.
Has Token Limits
Tick this if there are maximum token limits per request or per session (a token is a chunk of text, roughly three-quarters of a word).
Max Tokens Per Request
Only shown if token limits are enabled. Enter the maximum number of tokens allowed in a single request.
Has Content Filtering
Tick this if input or output content filtering is enabled (e.g. profanity filters, topic blocklists).
Example: If your API limits users to 60 requests per minute and caps each response at 4,096 tokens, tick Has Rate Limiting and Has Token Limits, then enter 4096 for Max Tokens Per Request.
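The two controls in that example — a per-user request rate and a per-request token cap — are often implemented with a token-bucket limiter. The sketch below is illustrative only (it is not Probe Six code, and the class and constant names are hypothetical), assuming a 60-requests-per-minute rate and the 4,096-token cap from the example:

```python
import time


class TokenBucket:
    """Minimal per-user rate limiter: holds up to `capacity` request
    tokens, refilled continuously at `rate` tokens per second."""

    def __init__(self, capacity: int, rate: float):
        self.capacity = capacity
        self.rate = rate
        self.tokens = float(capacity)
        self.last = time.monotonic()

    def allow(self) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


MAX_TOKENS_PER_REQUEST = 4096  # mirrors the Max Tokens Per Request field


def check_request(bucket: TokenBucket, requested_tokens: int) -> bool:
    """Reject a request if it exceeds the token cap or the rate limit."""
    return requested_tokens <= MAX_TOKENS_PER_REQUEST and bucket.allow()
```

60 requests per minute corresponds to `TokenBucket(capacity=60, rate=1.0)`: a full bucket absorbs a short burst, and the refill rate enforces the sustained limit.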
8. Agent Architecture
If your system operates as an autonomous agent — meaning it can take actions, make decisions, and interact with other systems on its own — this section captures how it's structured.
Is an Agent
Tick this if the system operates autonomously, making decisions and taking actions without a human reviewing each step.
Orchestration Pattern
How the agent(s) are organised. Options: Single Agent (one AI doing everything), Supervisor (one AI oversees others), Peer-to-Peer (agents collaborate equally), Hierarchical (layered chain of command), Swarm (many agents coordinating dynamically), or Other.
Agent Count
How many agents are in the system. Enter 1 for a single agent.
Has Memory
Tick this if the agent remembers context from previous interactions (e.g. conversation history, task state).
Can Delegate to Other Agents
Tick this if the agent can hand off tasks to other AI agents.
Example: For a customer service agent that uses a supervisor model to route queries to specialised sub-agents (billing, technical support, returns), tick Is an Agent, select Supervisor as the orchestration pattern, enter 4 for agent count, and tick Can Delegate to Other Agents.
9. Auth & Identity
How is access to your AI system controlled? This helps Probe Six assess risks around unauthorised access and data exposure.
Has User Authentication
Tick this if users must log in or provide credentials to use the system.
Has Role-Based Access Control
Tick this if different users have different permission levels (e.g. admin, editor, viewer).
Handles Personal Data
Tick this if the system processes personally identifiable information (PII) such as names, emails, or addresses.
Data Residency Region
Only shown if the system handles personal data. Where is the data stored? For example: EU, US, UK, or a specific country.
Example: A healthcare chatbot that requires login, has admin and patient roles, and stores patient information in EU data centres would tick all three checkboxes and enter "EU" for data residency.
10. Monitoring & Governance
What oversight and compliance measures are in place for your AI system?
Has Audit Logging
Tick this if all interactions with the system are logged for review (e.g. who asked what, when).
Has Content Moderation
Tick this if there is a moderation layer that reviews or filters model inputs and outputs.
Has Human-in-the-Loop
Tick this if a human reviews or approves certain model actions before they are carried out.
Compliance Frameworks
A comma-separated list of compliance frameworks your system adheres to. For example: SOC2, ISO 27001, GDPR, HIPAA, PCI DSS.
Example: For a financial services chatbot with full audit trails and content moderation, tick Has Audit Logging and Has Content Moderation, then enter "SOC2, PCI DSS, GDPR" as compliance frameworks.
Governance Assessment
The Governance Assessment is a set of 89 questions across 15 security categories that evaluate your organisation's policies and controls for AI security. Unlike the automated scan (which tests what your model does), the governance assessment captures what your organisation has in place to manage AI risk.
Think of it this way: The scan tests your model's behaviour under attack. The governance assessment evaluates whether you have the policies, processes, and controls to prevent, detect, and respond to those attacks in the real world.
How It Works
Expand a category — Click on any category to see its questions.
Answer the questions — Most questions are Yes/No. Some use a 1-5 scale. Click your answer and it saves automatically.
Watch your posture score — The percentage in the top-right corner updates as you answer. This is your governance posture score.
Complete all categories — A green tick appears next to each category when all its questions are answered.
Posture Score
Your governance posture score (0–100%) reflects how well your organisation's security controls align with industry frameworks. It is calculated as a weighted average of your answers — higher-impact questions contribute more to the score. Answering "Yes" to a question means you have that control in place. A higher score indicates stronger governance maturity.
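The exact weights Probe Six assigns are not published, but the weighted-average idea can be sketched as follows (an illustrative example only — the function names, weights, and normalisation are assumptions, not the product's actual formula):

```python
def scale_to_value(rating: int) -> float:
    """Normalise a 1-5 maturity rating onto 0-1 (1 -> 0.0, 5 -> 1.0)."""
    return (rating - 1) / 4


def posture_score(answers: dict) -> float:
    """Illustrative weighted-average posture score (0-100).

    `answers` maps a question id to (value, weight), where value is
    1.0 for "Yes", 0.0 for "No", or a 1-5 rating normalised to 0-1.
    Higher-impact questions carry larger weights, so they move the
    score more -- the weighted-average behaviour the docs describe.
    """
    total_weight = sum(weight for _, weight in answers.values())
    if total_weight == 0:
        return 0.0
    weighted_sum = sum(value * weight for value, weight in answers.values())
    return round(100 * weighted_sum / total_weight, 1)
```

For example, a "Yes" on a weight-3 question, a "No" on a weight-1 question, and a 5/5 rating on a weight-2 question would score 100 × (3 + 0 + 2) / 6 ≈ 83.3.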
Question Types
Yes / No
The most common type. Click Yes if you have the control or process in place, No if you don't. Be honest — the assessment is for your benefit.
Scale (1–5)
A confidence or maturity rating. 1 means the control doesn't exist or you have no confidence, 5 means it's fully implemented and verified. Hover over the help icon next to these questions for guidance on what each number means.
The 15 Categories
Questions are grouped into 15 categories drawn from two industry frameworks: OWASP Top 10 for LLMs (5 categories, 19 questions) and MITRE ATLAS (10 categories, 70 questions). Below is what each category covers and what the questions are asking.
Supply Chain Vulnerabilities (LLM03)
OWASP Top 10 for LLMs · 5 questions
This category assesses whether you manage the security of external components your AI system depends on — model providers, third-party plugins, libraries, and training data sources. Supply chain attacks target these dependencies to compromise your system indirectly.
What the questions ask: Do you track your AI dependencies (SBOM)? Do you vet model providers? Do you verify model checksums? How often do you review for vulnerabilities? Do you have a response process for security advisories?
Answering "Yes" means: You actively maintain a list of all AI-related software and model dependencies, verify their integrity when updating, and have a process for responding when a vulnerability is discovered. If you use a model provider like OpenAI or AWS Bedrock and trust their security without additional verification, the honest answer to most of these is "No".
Data and Model Poisoning (LLM04)
OWASP Top 10 for LLMs · 4 questions
This category covers risks from tampered or malicious training data. If an attacker can influence the data your model learns from, they can subtly change its behaviour — introducing bias, backdoors, or incorrect responses.
What the questions ask: Do you validate training data integrity? Do you track where your data comes from? Do you monitor for signs of poisoning (unexpected bias or drift)? How confident are you in the quality of your fine-tuning data?
Answering "Yes" means: You have processes to ensure your training data hasn't been tampered with. If you use a pre-trained model without fine-tuning, you can still answer based on whether you validated the model's training provenance. The confidence scale (1–5) is about how certain you are that no malicious data has entered your pipeline.
Improper Output Handling (LLM05)
OWASP Top 10 for LLMs · 3 questions
This category focuses on what happens after the model generates a response. If model output is rendered as HTML, executed as code, or passed to other systems without proper sanitisation, attackers can use the model to inject malicious content.
What the questions ask: Do you sanitise outputs before rendering them as HTML or executing them? Do you validate outputs to prevent injection into databases, shells, or APIs? Do you enforce structured output schemas (e.g. JSON validation)?
Answering "Yes" means: If your model generates a response containing <script> tags and your application strips or escapes them before displaying to the user, that's output sanitisation. If your model's output goes directly into a SQL query without parameterisation, you should answer "No" to the injection question.
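Both controls described above can be shown in a few lines using only the Python standard library (a minimal sketch — the function names and table schema are hypothetical, but `html.escape` and parameterised `sqlite3` queries are the standard mechanisms):

```python
import html
import sqlite3


def render_safely(model_output: str) -> str:
    """Escape model output before inserting it into an HTML page, so a
    generated <script> tag is displayed as text rather than executed."""
    return html.escape(model_output)


def save_safely(conn: sqlite3.Connection, model_output: str) -> None:
    """Write model output to the database with a parameterised query,
    never by concatenating the output into the SQL string."""
    conn.execute("INSERT INTO chat_log (message) VALUES (?)", (model_output,))
```

With these in place, a response like `<script>alert(1)</script>` renders as harmless text, and a response containing SQL metacharacters is stored as data rather than interpreted as part of the query.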
Excessive Agency (LLM08)
OWASP Top 10 for LLMs · 4 questions
This category examines whether your AI system has more power than it needs. If a model can delete records, send emails, or make payments, are those capabilities properly restricted and overseen?
What the questions ask: Are tool permissions limited to the minimum necessary? Is human approval required for high-impact actions? Can the agent access systems outside its scope? How well are agent actions logged?
Answering "Yes" means: Your model's tools follow the principle of least privilege. For example, if your model can search a database, it has read-only access rather than full write access. High-impact actions like sending emails or processing payments require a human to confirm before execution.
Unbounded Consumption (LLM10)
OWASP Top 10 for LLMs · 3 questions
This category covers resource exhaustion risks. Without proper controls, an attacker could abuse your AI system to run up costs, overload your infrastructure, or deny service to legitimate users.
What the questions ask: Are rate limits configured? Are token limits enforced per request and per session? Do you monitor for unusual usage patterns (cost spikes, excessive calls)?
Answering "Yes" means: You have concrete limits in place — for example, each user is limited to 100 requests per hour and 4,096 tokens per response. You also have alerts for unusual spikes in API usage or cost.
AI Model Access & Credential Security (TA0000)
MITRE ATLAS · 6 questions
Covers ATLAS tactics TA0000 (Reconnaissance) and TA0013 (Initial Access) — credential theft, phishing, API key compromise, and unauthorised model access.
What the questions ask: Is authentication required for inference APIs? Are credentials rotated? Are queries restricted by rate or identity? Is data encrypted in transit and at rest? Is MFA enforced for privileged operations? Are credential logs monitored?
AI Artefact & Data Repository Security (TA0009)
MITRE ATLAS · 5 questions
Covers TA0009 (Collection) — artefact theft, data harvesting, and repository access controls for AI model weights, training data, and evaluation datasets.
What the questions ask: Are model artefacts access-controlled? Is repository access audited? Are bulk downloads monitored? Is there an artefact inventory? Are transfers verified for integrity?
AI Supply Chain Dependencies (TA0004)
MITRE ATLAS · 7 questions
Covers TA0004 (ML Supply Chain Compromise) — poisoned models, backdoored libraries, compromised datasets, and unverified marketplace downloads.
What the questions ask: Do you maintain an AI Bill of Materials? Are models verified with checksums before deployment? Are dependencies scanned? Are datasets vetted? Are library loads audited? Are marketplace downloads scanned for malware? How mature is your supply chain governance?
Execution & Agent Security Policies (TA0005)
MITRE ATLAS · 6 questions
Covers TA0005 (Execution) — LLM code execution, prompt injection, agent abuse, and recursive prompt attacks.
What the questions ask: Are there sandboxing policies for LLM execution? Are prompt injection controls in place? Are agent tool invocations permission-scoped? Are recursive execution safeguards present? Are AI workloads isolated? Is anomalous LLM behaviour monitored?
Persistence, Model Integrity & RAG Governance (TA0006)
MITRE ATLAS · 7 questions
Covers TA0006 (Persistence) — data poisoning, model backdoors, RAG manipulation, and fine-tuning pipeline integrity.
What the questions ask: Is training data sanitised? Are models verified against baselines? Is adversarial training applied? Are RAG contents validated? Is version control in place for model weights? Are fine-tuning pipelines protected? How confident are you in end-to-end integrity?
Evasion Detection & Input Validation (TA0007)
MITRE ATLAS · 8 questions
Covers TA0007 (Defence Evasion) — adversarial examples, jailbreaks, obfuscation, encoding attacks, and evasion of safety guardrails. This is the largest technique surface area in ATLAS.
What the questions ask: Are adversarial inputs detected? Are jailbreak and obfuscation controls in place? Are outputs validated before downstream use? Are normalisation techniques used? Is multi-layer validation present? Are confidence scores monitored? Is there a feedback loop for new attack patterns? How mature is your adversarial testing programme?
Discovery Prevention & Information Disclosure (TA0008)
MITRE ATLAS · 7 questions
Covers TA0008 (Discovery) — model fingerprinting, architecture probing, capability mapping, and information leakage through error messages or metadata.
What the questions ask: Is model metadata restricted? Are outputs limited to prevent configuration disclosure? Is systematic probing detected? Are AI endpoints protected against enumeration? Are error messages sanitised? Is fingerprinting detected? How mature are your disclosure prevention controls?
Impact Mitigation & Content Safety (TA0011)
MITRE ATLAS · 7 questions
Covers TA0011 (Impact) — denial-of-service, content manipulation, cascading failures, model degradation, and cost attacks.
What the questions ask: Are DoS controls in place? Is abnormal cost monitored? Are content safety filters deployed? Are integrity metrics tracked for drift? Is there automated rollback? Are cascading failures tested? How mature is your AI incident containment process?
Exfiltration Prevention & Data Loss Protection (TA0010)
MITRE ATLAS · 6 questions
Covers TA0010 (Exfiltration) — model stealing, training data extraction, system prompt leakage, and data loss through inference APIs.
What the questions ask: Are training data extraction controls in place? Is the system prompt protected? Are outputs monitored for data leakage? Are DLP and egress controls applied? Is model inversion detected? Are watermarking techniques used to detect stolen copies?
AI Security Training & Model Lifecycle
MITRE ATLAS · 11 questions
Cross-cutting governance covering AI security training, governance structure, incident response, threat intelligence, and the full model lifecycle from development through retirement.
What the questions ask: How mature is your AI security training? Is there a formal model lifecycle process? Do you have an AI incident response plan? Have ATLAS mitigations been assessed? Is there a security review cycle? Are risk assessments done before deployment? Is there a governance committee? Are third-party audits conducted? Is production behaviour monitored? Are decommissioned models securely disposed? How mature is your threat intelligence programme?
Best Practices
Be Honest
The governance assessment is a self-assessment — there are no wrong answers. If you don't have a control in place, answering "No" honestly is more valuable than a false "Yes". The assessment identifies gaps you can prioritise for improvement.
Update Regularly
As your system and security practices evolve, come back and update your profile and governance responses. After implementing new controls (rate limiting, content moderation, audit logging), update the relevant answers to see your posture score improve.
Involve the Right People
Some governance questions may need input from different team members. Engineering teams can answer the technical questions (tool permissions, rate limits, output handling), while security or compliance teams can address the governance questions (audit logging, compliance frameworks, human-in-the-loop processes).
Combine with Scans
For the most complete picture of your AI security posture, fill in both the System Profile and Governance Assessment, then run a scan. Your assessment report will combine automated test results with your governance context to provide actionable, prioritised remediation guidance.