feat: add ai-lakera-guard plugin #13570

Open
janiussyafiq wants to merge 3 commits into apache:master from janiussyafiq:feat/ai-lakera-guard-pr1

@janiussyafiq

Contributor

Description

This PR adds a new plugin, ai-lakera-guard, that integrates APISIX with the Lakera Guard v2 /guard API to perform ML-based security scanning of LLM requests at the gateway — prompt injection / jailbreak, PII leakage, content-policy violations, and malicious / unknown links — so each backend LLM service no longer has to implement its own guardrails.

This is PR-1 (input guard MVP) of a planned, independently shippable series (input → output → streaming → observability), modeled closely on ai-aliyun-content-moderation.

How it works

  • Runs in the access phase at priority 1028, just below ai-proxy (1040) and ai-proxy-multi (1041), so the AI context is already populated. The plugin requires one of those proxies and returns 500 otherwise.
  • Extracts the whole request conversation via apisix.plugins.ai-protocols (all messages, with no distinction by role) and sends it to Lakera's POST /v2/guard endpoint.
  • On a flagged verdict it applies the configured action:
    • block (default) — returns a provider-compatible deny response (a valid chat-completion, or SSE for streaming requests) carrying request_failure_message, built via proto.build_deny_response, so client SDKs render the refusal as a normal completion. The status is deny_code (default 200; set a 4xx to surface blocks as HTTP errors).
    • alert — log-only shadow mode; traffic passes through.
  • Lakera errors / timeouts are governed by fail_open (fail-closed by default).
  • api_key is secret-managed via encrypt_fields + native $secret:// / $env:// resolution.
  • reveal_failure_categories optionally appends the matched detectors to the deny message; every flagged verdict logs Lakera's full per-detector breakdown and request_uuid.
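
The verdict handling described in the list above can be sketched as a small pure function. This is an illustration in Python rather than the plugin's Lua, and it is not taken from the PR's code: `decide` and `Outcome` are hypothetical names, and only `action`, `fail_open`, `deny_code`, and `request_failure_message` come from the description. The status returned when the plugin fails closed is not stated in the description, so the sketch leaves it unset.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Outcome:
    kind: str                      # "pass", "alert", "deny", or "fail_closed"
    status: Optional[int] = None   # HTTP status, only set for "deny"
    message: Optional[str] = None  # refusal text, only set for "deny"


def decide(flagged: Optional[bool], conf: dict) -> Outcome:
    """Map a Lakera verdict to a gateway outcome.

    `flagged` is None when the Lakera call failed or timed out.
    """
    if flagged is None:
        # fail_open defaults to False, i.e. fail-closed: do not forward.
        if conf.get("fail_open", False):
            return Outcome("pass")
        return Outcome("fail_closed")  # status not specified in the PR text
    if not flagged:
        return Outcome("pass")
    if conf.get("action", "block") == "alert":
        # Shadow mode: the verdict is logged, traffic passes through.
        return Outcome("alert")
    return Outcome(
        "deny",
        status=conf.get("deny_code", 200),
        message=conf.get("request_failure_message"),
    )
```

With an empty configuration a flagged request is denied with status 200, which is what lets client SDKs treat the refusal as an ordinary completion; setting `deny_code` to a 4xx value changes only the status, not the decision.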

Configuration

api_key is the only required field. Others: lakera_endpoint, project_id, direction (input only in this PR), action, fail_open, timeout, ssl_verify, reveal_failure_categories, deny_code, request_failure_message.
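
As a rough sketch of how these fields combine, the following Python snippet fills in the defaults the description states and rejects a configuration without `api_key`. It is not the plugin's schema: `normalize` and `KNOWN_DEFAULTS` are hypothetical names, the `direction` default is an assumption (the description only says input is the sole supported value), and the environment variable name is made up. Fields whose defaults the description does not give are left out.

```python
# Defaults taken from the PR description. "direction" is an assumption:
# the text only says "input" is the one value supported in this PR.
KNOWN_DEFAULTS = {
    "direction": "input",
    "action": "block",
    "fail_open": False,  # fail-closed by default
    "deny_code": 200,
}


def normalize(conf: dict) -> dict:
    """Return conf merged over the known defaults, or raise ValueError."""
    if not conf.get("api_key"):
        raise ValueError("api_key is required")
    merged = {**KNOWN_DEFAULTS, **conf}
    if merged["direction"] != "input":
        raise ValueError("only direction=input is supported in this PR")
    if merged["action"] not in ("block", "alert"):
        raise ValueError("action must be 'block' or 'alert'")
    return merged


example = normalize({
    "api_key": "$env://LAKERA_API_KEY",  # hypothetical variable name
    "deny_code": 403,                    # surface blocks as HTTP errors
})
```

Here `example` keeps the default `block` action and fail-closed behaviour while overriding only the deny status.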

Files

  • Plugin: apisix/plugins/ai-lakera-guard.lua, apisix/plugins/ai-lakera-guard/schema.lua, apisix/plugins/ai-lakera-guard/client.lua
  • Registration: apisix/cli/config.lua, conf/config.yaml.example
  • Docs: docs/en/latest/plugins/ai-lakera-guard.md, docs/en/latest/config.json
  • Tests: t/plugin/ai-lakera-guard.t, t/plugin/ai-lakera-guard-secrets.t, fixtures under t/fixtures/lakera/

Which issue(s) this PR fixes:

Part of #13291

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (new, opt-in plugin disabled by default; additive registration only)

Add the ai-lakera-guard plugin (PR-1, input guard MVP) integrating APISIX
with the Lakera Guard v2 /guard API to scan LLM request prompts for prompt
injection, PII, content-policy violations, and malicious/unknown links at
the gateway.

The plugin runs in the access phase at priority 1028, below ai-proxy /
ai-proxy-multi, which it requires. It extracts the whole request
conversation via apisix.plugins.ai-protocols and calls Lakera POST
/v2/guard. On a flagged verdict it either blocks with a provider-compatible
deny response (a valid chat-completion or SSE carrying request_failure_message,
returned with deny_code, default 200) or alerts (log-only shadow mode).
Lakera errors and timeouts are governed by fail_open (fail-closed by
default). The api_key is secret-managed via encrypt_fields and the native
$secret:// / $env:// resolution.

Signed-off-by: janiussyafiq <izzraff.js@gmail.com>
@dosubot (bot) added the labels enhancement (New feature or request), plugin, and size:XXL (This PR changes 1000+ lines, ignoring generated files) on Jun 18, 2026
Comment thread apisix/plugins/ai-lakera-guard.lua
- Makefile: install apisix/plugins/ai-lakera-guard/*.lua so the
  luarocks 'diff -rq' check no longer reports the dir as uninstalled
- t/admin/plugins.t: add ai-lakera-guard to the priority-ordered
  expected plugin list (priority 1028, between
  ai-aliyun-content-moderation 1029 and proxy-mirror 1010)

Handle requests this plugin cannot inspect (no picked ai instance, or an
unsupported protocol) via the shared ai-protocols.binding helper and a
configurable fail_mode (skip/warn/error, default skip) instead of a hard
500, matching ai-aliyun-content-moderation. This lets non-AI traffic pass
through unchecked when the plugin is bound at the Consumer/Service level.

fail_mode is distinct from fail_open, which governs Lakera API failures.

Also collapse the test routes onto a single route id (overwrite-in-place,
grouping default-config tests first) to match the convention used by the
sibling AI plugins.

- schema: add fail_mode = binding.schema_property("skip")
- access: route no-instance / unsupported-protocol through on_unsupported
- docs: document fail_mode; clarify non-ai-proxy traffic behavior
- t: fail_mode=error (500) and default skip (pass-through) coverage
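
The fail_mode behaviour in this commit can be sketched as follows. This is a Python stand-in, not the signature of the shared ai-protocols.binding helper: the function below only borrows the name `on_unsupported` from the commit message, and the logger name is hypothetical. The commit states that `error` yields a 500 and that the default `skip` passes traffic through; treating `warn` as pass-through plus a warning log is an assumption.

```python
import logging
from typing import Optional

log = logging.getLogger("ai-lakera-guard-sketch")  # hypothetical logger name


def on_unsupported(fail_mode: str = "skip") -> Optional[int]:
    """Return an HTTP status to reject with, or None to let the request through.

    Called when the request cannot be inspected (no picked AI instance,
    or an unsupported protocol). Lakera API failures are a separate
    concern handled by fail_open.
    """
    if fail_mode == "error":
        return 500
    if fail_mode == "warn":
        # Assumed behaviour: log and continue.
        log.warning("request cannot be inspected; passing through unchecked")
        return None
    if fail_mode == "skip":
        return None
    raise ValueError(f"unknown fail_mode: {fail_mode!r}")
```

Keeping this separate from `fail_open` means non-AI traffic on a Consumer- or Service-level binding can pass through by default without loosening the fail-closed handling of Lakera outages.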