
Overview

RDK can automatically redact Personally Identifiable Information (PII) from your traces before they’re sent to the collector. This helps you maintain compliance while still getting valuable observability data.

Quick Start

Enable built-in redaction

Pass redact_pii=True to init() to enable default PII redaction:
from rdk import init

init(
    endpoint="...",
    api_key="...",
    redact_pii=True,
)

Custom redaction

For fine-grained control, build a redactor with RedactorConfig:
import re
from rdk import RedactorConfig, create_redactor, init

redactor = create_redactor(RedactorConfig(
    redact_emails=True,
    redact_phones=True,
    redact_ssn=True,
    redact_credit_cards=True,
    redact_api_keys=True,
    custom_patterns=[
        (re.compile(r"ACC-\d{8}"), "[ACCOUNT REDACTED]"),
        (re.compile(r"EMP-\d{6}"), "[EMPLOYEE_ID REDACTED]"),
    ],
))

init(
    endpoint="...",
    api_key="...",
    redactor=redactor,
)

Built-in Patterns

Type          Example               Redacted As
Email         john@example.com      [EMAIL REDACTED]
Phone         555-123-4567          [PHONE REDACTED]
SSN           123-45-6789           [SSN REDACTED]
Credit Card   4111-1111-1111-1111   [CARD REDACTED]
API Key       sk-abc123...          [API_KEY REDACTED]

RedactorConfig Fields

class RedactorConfig:
    redact_emails: bool = True
    redact_phones: bool = True
    redact_ssn: bool = True
    redact_credit_cards: bool = True
    redact_api_keys: bool = True
    custom_patterns: list[tuple[re.Pattern, str]] | None = None
    custom_redactor: Callable[[str], str] | None = None

redact_emails (boolean, default: true)
Redact email addresses.

redact_phones (boolean, default: true)
Redact phone numbers (formats: 555-123-4567, 555.123.4567, 5551234567).

redact_ssn (boolean, default: true)
Redact US Social Security Numbers (123-45-6789).

redact_credit_cards (boolean, default: true)
Redact 16-digit credit card numbers.

redact_api_keys (boolean, default: true)
Redact API keys matching common patterns (sk-, api_key, bearer ...).

custom_patterns (list[tuple[re.Pattern, str]] | None, default: None)
List of (compiled_pattern, replacement_string) tuples, applied after the built-in patterns. When left as None, no custom patterns are applied.
import re
custom_patterns=[
    (re.compile(r"ACC-\d{8}"), "[ACCOUNT REDACTED]"),
]

custom_redactor (Callable[[str], str] | None, default: None)
A function applied to each string value before the built-in patterns. Use it for custom logic that regex can’t express.
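
As an illustration of logic that a regex alone cannot express, the sketch below redacts only digit runs that pass the Luhn checksum, so order numbers that merely look like card numbers are left intact. The function name redact_valid_cards and the candidate pattern are assumptions made for this example and are not part of RDK; the only thing taken from the reference above is the Callable[[str], str] shape.

```python
import re

# Hypothetical candidate pattern: 13 to 19 digits, optionally separated
# by single spaces or hyphens. Not RDK's built-in card pattern.
_CANDIDATE = re.compile(r"\b\d(?:[ -]?\d){12,18}\b")


def _luhn_valid(digits: str) -> bool:
    """Return True if the digit string passes the Luhn checksum."""
    total = 0
    for i, ch in enumerate(reversed(digits)):
        d = int(ch)
        if i % 2 == 1:  # double every second digit, counting from the right
            d *= 2
            if d > 9:
                d -= 9
        total += d
    return total % 10 == 0


def redact_valid_cards(text: str) -> str:
    """Replace only the digit runs that pass the Luhn check."""
    def _replace(match: re.Match) -> str:
        digits = re.sub(r"[ -]", "", match.group())
        return "[CARD REDACTED]" if _luhn_valid(digits) else match.group()

    return _CANDIDATE.sub(_replace, text)
```

A function like this could be passed as RedactorConfig(custom_redactor=redact_valid_cards). Because custom_redactor runs before the built-in patterns, the built-in 16-digit card pattern would still see any numbers this function leaves in place, so it probably only makes sense together with redact_credit_cards=False.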

Utility Functions

from rdk import redact_all_pii
from rdk.filters import create_default_redactor
redact_all_pii(value) — One-shot redaction using all default patterns. Useful for sanitizing data outside of traces.
from rdk import redact_all_pii

clean = redact_all_pii({
    "message": "Call me at 555-123-4567 or john@example.com"
})
# {"message": "Call me at [PHONE REDACTED] or [EMAIL REDACTED]"}
create_default_redactor() — Returns a reusable redactor with all defaults enabled.

What Gets Redacted

Redaction is applied to:
  • Span inputs — messages sent to the LLM
  • Span outputs — responses from the LLM
  • Tool arguments and results
Redaction is not applied to:
  • Trace/span IDs
  • Timestamps
  • Token counts
  • Model names
  • Metadata keys (only values are scanned)
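
The "values only" rule for metadata can be pictured with a small recursive walker. This is a sketch of the general idea, not RDK's actual implementation: redact_values, redact_email, and the email pattern are all hypothetical names and choices made for this example.

```python
import re
from typing import Any, Callable

# Illustrative email pattern only; RDK's built-in pattern may differ.
_EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def redact_email(text: str) -> str:
    """Replace anything that looks like an email address."""
    return _EMAIL.sub("[EMAIL REDACTED]", text)


def redact_values(value: Any, redact: Callable[[str], str]) -> Any:
    """Recursively apply `redact` to string values.

    Dictionary keys and non-string values (ints, floats, None, ...) are
    returned unchanged.
    """
    if isinstance(value, str):
        return redact(value)
    if isinstance(value, dict):
        return {key: redact_values(item, redact) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(redact_values(item, redact) for item in value)
    return value


metadata = {
    "john@example.com": "contact john@example.com",  # key is left as-is
    "tokens": 42,
    "tags": ["a@b.co", "x"],
}
cleaned = redact_values(metadata, redact_email)
```

One practical consequence of this design is that PII placed in a metadata key survives redaction, so identifiers such as email addresses are better stored as values.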

Best Practices

  1. Always enable PII redaction in production
  2. Test custom patterns against realistic sample data before deploying
  3. Use custom_redactor for preprocessing (e.g., tokenization) before regex matching
