
Detectors

Detectors scan text and report matches against patterns. They are used internally by builders like pii() and secrets().

PIIDetector

Detects personal information using regular expressions.

| Type | Detects | Confidence |
| --- | --- | --- |
| email | Email addresses | 0.95 |
| phone_international | International phone numbers (+1-555-123-4567) | 0.85 |
| phone_jp | Japanese phone numbers (03-1234-5678) | 0.80 |
| credit_card | Credit card numbers | 0.90 |
| my_number | Japanese My Number (12 digits) | 0.70 |
| ssn | US Social Security Numbers (XXX-XX-XXXX) | 0.90 |
| ip_address | IPv4 addresses | 0.75 |

Usage

```ts
pii().block()
pii().exclude("ip_address").block()
pii().only("email", "credit_card").block()
```
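
To make the regex-plus-confidence idea concrete, here is a minimal sketch of how such a detector could be structured. The names `PiiRule`, `PiiMatch`, `piiRules`, and `detectPii` are hypothetical, and the patterns are simplified for illustration; they are not the library's actual expressions. Only the type names and confidence values come from the table above.

```ts
// Hypothetical shapes for illustration; not the library's real types.
interface PiiRule {
  type: string;
  pattern: RegExp;
  confidence: number;
}

interface PiiMatch {
  type: string;
  value: string;
  start: number;
  confidence: number;
}

// Simplified patterns; a real detector would be stricter (e.g. octet ranges).
const piiRules: PiiRule[] = [
  { type: "email", pattern: /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/, confidence: 0.95 },
  { type: "ssn", pattern: /\b\d{3}-\d{2}-\d{4}\b/, confidence: 0.9 },
  { type: "ip_address", pattern: /\b(?:\d{1,3}\.){3}\d{1,3}\b/, confidence: 0.75 },
];

function detectPii(text: string, rules: PiiRule[] = piiRules): PiiMatch[] {
  const found: PiiMatch[] = [];
  for (const rule of rules) {
    // Fresh global regex per scan so lastIndex state never leaks between calls.
    const re = new RegExp(rule.pattern.source, "g");
    let m: RegExpExecArray | null;
    while ((m = re.exec(text)) !== null) {
      found.push({
        type: rule.type,
        value: m[0],
        start: m.index,
        confidence: rule.confidence,
      });
    }
  }
  // Report matches in the order they appear in the text.
  return found.sort((a, b) => a.start - b.start);
}
```

Under this sketch, excluding a type amounts to passing a filtered rule list, for example `detectPii(text, piiRules.filter((r) => r.type !== "ip_address"))`.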

SecretsDetector

Detects API keys, tokens, private keys, and similar patterns.

| Type | Detects | Confidence |
| --- | --- | --- |
| aws_access_key | AWS access keys (AKIA...) | 0.95 |
| aws_secret_key | AWS secret keys | 0.60 |
| github_token | GitHub tokens (ghp_, github_pat_) | 0.95 |
| slack_token | Slack tokens (xoxb-, xoxp-) | 0.95 |
| bearer_token | Bearer tokens | 0.85 |
| private_key | Private keys (BEGIN PRIVATE KEY) | 0.99 |
| api_key | Generic API key patterns (api_key=...) | 0.75 |
| google_api_key | Google API keys (AIza...) | 0.90 |
| stripe_key | Stripe keys (sk_live_, sk_test_) | 0.95 |
| generic_secret | Generic secrets (password=, token=) | 0.70 |

Usage

```ts
secrets().block()
secrets().exclude("generic_secret", "aws_secret_key").block()
secrets().only("github_token", "stripe_key").block()
```
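
The `only` and `exclude` calls above narrow which types are checked. The sketch below shows one plausible way that selection could work; `SECRET_TYPES`, `SecretType`, and `selectSecretTypes` are hypothetical names, and the behavior when both options are combined is an assumption rather than documented behavior.

```ts
// Type names taken from the table above; everything else is illustrative.
const SECRET_TYPES = [
  "aws_access_key",
  "aws_secret_key",
  "github_token",
  "slack_token",
  "bearer_token",
  "private_key",
  "api_key",
  "google_api_key",
  "stripe_key",
  "generic_secret",
] as const;

type SecretType = (typeof SECRET_TYPES)[number];

interface SecretSelection {
  only?: SecretType[];
  exclude?: SecretType[];
}

// Assumption: `only` sets the starting list, then `exclude` removes from it.
function selectSecretTypes(selection: SecretSelection = {}): SecretType[] {
  const base: SecretType[] = selection.only ?? SECRET_TYPES.slice();
  const excluded: SecretType[] = selection.exclude ?? [];
  return base.filter((t) => excluded.indexOf(t) === -1);
}
```

Dropping the two lowest-confidence types, as in the `exclude` example, leaves the eight remaining types active in this sketch.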

PromptInjectionDetector

Detects prompt injection attacks using scoring-based heuristics. Each pattern has a weight, and detection triggers when the cumulative score exceeds the threshold.

| Category | Example detections | Weight |
| --- | --- | --- |
| role_override | "ignore all instructions", "you are now" | 0.6 - 0.9 |
| system_prompt_extraction | "show me your system prompt" | 0.75 - 0.8 |
| jailbreak | "DAN", "developer mode", "unrestricted" | 0.7 - 0.9 |
| delimiter_injection | `<\|im_start\|>`, `[INST]` | 0.8 - 0.9 |
| encoded_injection | "base64 decode", "rot13" | 0.5 |
| persona_switch | "pretend to be", "roleplay" | 0.3 - 0.5 |

Usage

```ts
promptInjection().block()                 // Default threshold: 0.7
promptInjection().threshold(0.5).block()  // Stricter (more false positives)
promptInjection().threshold(0.9).block()  // Lenient (more false negatives)
```
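
The weighted scoring described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the library's implementation: the names `InjectionPattern`, `injectionPatterns`, `scoreInjection`, and `isInjection` are hypothetical, the per-pattern weights are picked from within the ranges in the table, and the sketch assumes the cumulative score is a plain sum capped at 1 with a strict "greater than" comparison.

```ts
// Hypothetical pattern list; weights chosen from the documented ranges.
interface InjectionPattern {
  category: string;
  pattern: RegExp;
  weight: number;
}

const injectionPatterns: InjectionPattern[] = [
  { category: "role_override", pattern: /ignore all instructions/i, weight: 0.9 },
  { category: "system_prompt_extraction", pattern: /show me your system prompt/i, weight: 0.8 },
  { category: "encoded_injection", pattern: /base64 decode/i, weight: 0.5 },
  { category: "persona_switch", pattern: /pretend to be/i, weight: 0.3 },
];

// Assumption: the cumulative score is the sum of matched weights, capped at 1.
function scoreInjection(text: string): number {
  let score = 0;
  for (const p of injectionPatterns) {
    if (p.pattern.test(text)) {
      score += p.weight;
    }
  }
  return Math.min(score, 1);
}

// Assumption: detection triggers only when the score strictly exceeds the threshold.
function isInjection(text: string, threshold: number = 0.7): boolean {
  return scoreInjection(text) > threshold;
}
```

In this sketch a single low-weight phrase such as "pretend to be" (0.3) stays under the default threshold of 0.7, while combining it with "base64 decode" (0.5) pushes the total past it. Lowering the threshold makes single weak signals sufficient, which is why stricter settings produce more false positives.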

ContentFilterDetector

Filters content using user-defined patterns.

Usage

```ts
contentFilter(["confidential", /classified/i, /internal only/]).block()

contentFilter(
  ["internal only", /do not distribute/i],
  { label: "internal_document" }
).warn()
```

String patterns are automatically converted to case-insensitive regular expressions.
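
A minimal sketch of that conversion, assuming string patterns are matched literally (regex metacharacters escaped) and that `RegExp` patterns are passed through with their own flags. The helper name `toPattern` is hypothetical, and whether the library escapes metacharacters is an assumption.

```ts
// Hypothetical helper; illustrates string-to-RegExp normalization only.
function toPattern(pattern: string | RegExp): RegExp {
  if (pattern instanceof RegExp) {
    // Assumption: user-supplied regexes are used as-is, flags included.
    return pattern;
  }
  // Escape regex metacharacters so the string matches literally.
  const escaped = pattern.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
  return new RegExp(escaped, "i");
}
```

Under these assumptions, the string `"internal only"` matches "Internal Only", while the regex `/internal only/` (no `i` flag) matches only the lowercase form.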