DataFog

Open-source PII detection for AI agents. Scan, redact, and guard sensitive data — locally, in milliseconds.

from datafog import sanitize

sanitize("Call Sarah Chen at 415-555-0142, SSN 234-56-7890")
# → "Call [PERSON_1] at [PHONE_1], SSN [SSN_1]"

Projects

🔒 datafog-python — The core SDK

PII detection and redaction via regex + NLP cascade. One function call. <2MB core install. 190x faster than spaCy for structured PII.

pip install datafog
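To make the regex half of that cascade concrete, here is a minimal sketch of how structured PII could be swapped for numbered placeholders. It is not DataFog's implementation: `PATTERNS` and `redact_structured` are hypothetical names, the three patterns are deliberately simplistic, and person names are left untouched because they would need the NLP stage.

```python
import re

# Illustrative patterns only; a real detector needs broader coverage and validation.
PATTERNS = {
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}-\d{3}-\d{4}\b"),
    "EMAIL": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"),
}


def redact_structured(text: str) -> tuple[str, dict[str, str]]:
    """Replace regex matches with numbered placeholders such as [PHONE_1].

    Hypothetical helper: returns the redacted text plus a mapping from
    placeholder to original value.
    """
    mapping: dict[str, str] = {}            # placeholder -> original value
    seen: dict[tuple[str, str], str] = {}   # (label, value) -> placeholder
    counters: dict[str, int] = {}           # label -> last number used

    for label, pattern in PATTERNS.items():
        def replace(match: re.Match, label: str = label) -> str:
            value = match.group(0)
            key = (label, value)
            if key not in seen:
                # First time this value appears: allocate the next number.
                counters[label] = counters.get(label, 0) + 1
                seen[key] = f"[{label}_{counters[label]}]"
                mapping[seen[key]] = value
            return seen[key]

        text = pattern.sub(replace, text)
    return text, mapping


redacted, mapping = redact_structured("Call 415-555-0142, SSN 234-56-7890")
print(redacted)  # Call [PHONE_1], SSN [SSN_1]
```

Repeated values reuse the same placeholder, so a downstream reader can still tell that two mentions refer to the same item without seeing the value itself.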

🔌 datafog-mcp — MCP privacy proxy (coming soon)

Add PII detection to any MCP server with one config change. Wraps Postgres, filesystem, Slack, and other MCP servers — intercepts tool responses before PII enters the agent's context window.

uvx datafog-mcp proxy --wrap <your-mcp-server>
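The interception idea can be sketched in plain Python, independent of the MCP protocol and of datafog-mcp itself, which is not yet released. `scrub` and `wrap_tool` are hypothetical names, and the redactor is passed in as an ordinary string-to-string function, so this shows only the shape of the technique: scrub every string in a tool's response before it reaches the agent.

```python
from typing import Any, Callable

Redactor = Callable[[str], str]


def scrub(value: Any, redact: Redactor) -> Any:
    """Recursively apply `redact` to every string value in a JSON-like structure."""
    if isinstance(value, str):
        return redact(value)
    if isinstance(value, list):
        return [scrub(item, redact) for item in value]
    if isinstance(value, dict):
        # Keys are left as-is; only values are scrubbed.
        return {key: scrub(item, redact) for key, item in value.items()}
    return value  # numbers, booleans, and None pass through unchanged


def wrap_tool(tool: Callable[..., Any], redact: Redactor) -> Callable[..., Any]:
    """Return a tool whose result is scrubbed before the caller sees it."""
    def wrapped(*args: Any, **kwargs: Any) -> Any:
        return scrub(tool(*args, **kwargs), redact)
    return wrapped
```

Because the wrapper only touches the return value, the underlying tool runs unmodified; an actual proxy would apply the same step to responses coming back over the wire.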

🧪 datafog-core — Rust engine (in development)

High-performance detection core in Rust. Will power both the Python SDK (via PyO3) and native integrations.

Use cases

Agent guardrails — Wrap LLM calls with scan_prompt() / filter_output() to catch PII before it enters or leaves your agent.

MCP privacy layer — Proxy any MCP server so tool responses are automatically scanned. Your agent reasons over [PERSON_1] instead of real names.

CI/CD scanning — datafog scan ./data catches PII in test fixtures, logs, and configs before they ship.

RAG sanitization — Scrub retrieved chunks before injecting into prompts.
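The guardrail and RAG use cases above share one pattern: scrub text on the way into the model and again on the way out. The sketch below shows that pattern with hypothetical helpers (`build_prompt`, `guarded_call`) and a pluggable `redact` function. With DataFog installed, `redact` could plausibly be the `sanitize` function shown at the top of this page, assuming it takes and returns a string as that example suggests; the exact signatures of scan_prompt() and filter_output() are not shown here, so they are not used.

```python
from typing import Callable, Iterable

Redactor = Callable[[str], str]


def build_prompt(question: str, chunks: Iterable[str], redact: Redactor) -> str:
    """Scrub each retrieved chunk and the question, then assemble the prompt."""
    context = "\n\n".join(redact(chunk) for chunk in chunks)
    return f"Context:\n{context}\n\nQuestion: {redact(question)}"


def guarded_call(llm: Callable[[str], str], question: str,
                 chunks: Iterable[str], redact: Redactor) -> str:
    """Call `llm` with a scrubbed prompt and scrub its reply before returning it."""
    prompt = build_prompt(question, chunks, redact)
    return redact(llm(prompt))
```

Whether placeholders stay consistent across separately redacted chunks depends on the redactor used; this sketch makes no attempt to align them.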

Links

🌐 datafog.ai · 📦 PyPI · 💬 Discord · 𝕏 @datafoginc

Pinned

  1. datafog-python (Public)

    Python SDK for PII detection and redaction in text and images, combining regex + NLP pipelines for production privacy workflows.

    Python · 42 stars · 11 forks

  2. fogclaw (Public)

    OpenClaw plugin for PII detection & custom entity redaction powered by DataFog

    TypeScript
