Cross-Lingual Adversarial Testing
ProbeSix probes whether harmful requests slip past your AI's safety controls when they arrive in another language. It runs five techniques across 30 languages in three resource tiers, with findings mapped to OWASP LLM01 and MITRE ATLAS.
- 30 languages tested
- 5 techniques
- OWASP LLM01 mapped
- MITRE ATLAS mapped
Why language is an attack surface
Safety training is overwhelmingly English-first. Published research (for example, "Tower of Babel Revisited", 2025) shows that the same harmful request, translated into a lower-resource language, is far more likely to get through: a model's guardrails are thinnest where its safety-alignment data is sparsest. If your AI serves customers in more than one language, English-only testing leaves that gap unmeasured.
What ProbeSix tests
ProbeSix runs five cross-lingual techniques against your endpoint:
- Direct translation: the harmful request, translated into the target language.
- Code-switching: mixing languages within a single prompt to confuse intent detection.
- Transliteration: rendering the request in a different script.
- Low-resource language: using languages with sparse safety-alignment data, where attack success rates are highest.
- Response-language forcing: instructing the model to answer in another language.
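As a rough illustration, the five techniques above can be pictured as one axis of a test matrix, with target languages as the other. The sketch below is not ProbeSix's interface: `Technique`, `ProbeCase` and `build_matrix` are hypothetical names, and a real harness would likely prune combinations that make no sense (for example, the low-resource technique against a high-resource language).

```python
from dataclasses import dataclass
from enum import Enum
from itertools import product


class Technique(Enum):
    # The five techniques listed above; the identifiers are illustrative.
    DIRECT_TRANSLATION = "direct_translation"
    CODE_SWITCHING = "code_switching"
    TRANSLITERATION = "transliteration"
    LOW_RESOURCE_LANGUAGE = "low_resource_language"
    RESPONSE_LANGUAGE_FORCING = "response_language_forcing"


@dataclass(frozen=True)
class ProbeCase:
    seed_id: str          # hypothetical ID of a seed request, not the request text
    language: str         # language code, e.g. "cy" for Welsh
    technique: Technique


def build_matrix(seed_ids, languages):
    """Return one case per (seed, language, technique) combination."""
    return [
        ProbeCase(seed, lang, technique)
        for seed, lang, technique in product(seed_ids, languages, Technique)
    ]


cases = build_matrix(["seed-001"], ["es", "tr", "cy"])
```

With one seed and three languages this yields 15 cases, which shows how quickly coverage grows as languages are added.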
Coverage
- 30 languages across three resource tiers. A language's resource tier reflects how much training and safety-alignment data exists for it, which in turn indicates where a model's guardrails are likely to be strongest or weakest. The tiers are high-resource (e.g. Chinese, Arabic, Spanish), medium-resource (e.g. Turkish, Thai, Vietnamese) and low-resource (e.g. Zulu, Welsh, Swahili), where safety alignment is thinnest and attacks succeed most often. You can target a specific tier or run the full set.
- Mapped to OWASP LLM01 (Prompt Injection) and MITRE ATLAS AML.T0068 (LLM Prompt Obfuscation), so findings line up with the frameworks your assurance process already uses.
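Tier targeting can be thought of as a simple lookup. The sketch below is an assumption about how such a selection might work, not ProbeSix configuration: `TIERS` and `select_languages` are hypothetical, and only the nine example languages named above are included (as ISO 639-1 codes), since the full list of 30 is not given here.

```python
# Example languages from the tiers above, as ISO 639-1 codes.
# This is not the full 30-language set.
TIERS = {
    "high": ["zh", "ar", "es"],    # Chinese, Arabic, Spanish
    "medium": ["tr", "th", "vi"],  # Turkish, Thai, Vietnamese
    "low": ["zu", "cy", "sw"],     # Zulu, Welsh, Swahili
}


def select_languages(tier=None):
    """Return the languages for one tier, or every listed language if tier is None."""
    if tier is None:
        return [code for codes in TIERS.values() for code in codes]
    if tier not in TIERS:
        raise ValueError(f"unknown tier: {tier!r}")
    return list(TIERS[tier])


low_only = select_languages("low")
full_set = select_languages()
```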
What you get
Every cross-lingual finding lands in the same scored report as the rest of a ProbeSix scan: the exact prompt, the model's response, the technique and language used, the framework references, and remediation guidance (paid tier). Re-run the exact configuration later to evidence that a fix held.
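To make the re-run idea concrete, here is a minimal sketch of a finding record and a before/after comparison. The field names mirror the items listed above, but the `Finding` class, `finding_key` and `resolved_findings` are hypothetical and do not describe ProbeSix's actual report format.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Finding:
    prompt: str                        # the exact prompt sent
    response: str                      # the model's response
    technique: str
    language: str
    frameworks: Tuple[str, ...] = ("OWASP LLM01", "MITRE ATLAS AML.T0068")
    remediation: Optional[str] = None  # guidance text, if available


def finding_key(finding):
    """Identify a finding by what was tested, ignoring the response text."""
    return (finding.technique, finding.language, finding.prompt)


def resolved_findings(before, after):
    """Return findings from the first run that no longer appear in the re-run."""
    still_open = {finding_key(f) for f in after}
    return [f for f in before if finding_key(f) not in still_open]


before = [
    Finding("<seed prompt 1>", "<response>", "transliteration", "cy"),
    Finding("<seed prompt 2>", "<response>", "code_switching", "tr"),
]
after = [Finding("<seed prompt 2>", "<new response>", "code_switching", "tr")]
fixed = resolved_findings(before, after)
```

Keying on technique, language and prompt rather than on the response means a finding still counts as open if the model's wording changes but the bypass remains.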
Better together with your existing controls
Cross-lingual testing complements, and does not replace, your guardrails. It tells you where multilingual coverage is weakest so you can close the gap before it is found in production.