# Distil-PII: Family of PII Redaction SLMs

distil labs has released a family of specialized small language models (SLMs) for policy-aware PII redaction. The fine-tuned 1B-parameter model scores 0.81 +/- 0.02, effectively matching a frontier 600B+ parameter LLM, while running on a laptop with full data privacy.
## Available Models
| Model | HuggingFace |
|---|---|
| Distil-PII-Llama-3.2-3B-Instruct | gguf |
| Distil-PII-Llama-3.2-1B-Instruct | gguf |
| Distil-PII-gemma-3-270m-it | gguf |
| Distil-PII-SmolLM2-135M-Instruct | gguf |
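The GGUF builds can be run with local runtimes such as llama.cpp. A minimal sketch using the llama-cpp-python bindings; the model filename and the system-prompt wording are assumptions for illustration, not the official usage from the model cards:

```python
import json


def build_messages(text: str) -> list:
    """Build a chat request asking for policy-aware PII redaction.

    The system-prompt wording is illustrative; check the model card
    for the exact prompt format the models were fine-tuned on.
    """
    return [
        {
            "role": "system",
            "content": (
                "Redact all PII in the user's text. Respond with JSON "
                "containing 'redacted_text' and an 'entities' list."
            ),
        },
        {"role": "user", "content": text},
    ]


def redact_with_local_model(model_path: str, text: str) -> dict:
    """Run a GGUF build via llama-cpp-python and parse its JSON reply."""
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path=model_path, n_ctx=4096, verbose=False)
    out = llm.create_chat_completion(messages=build_messages(text))
    return json.loads(out["choices"][0]["message"]["content"])
```

Because the model runs in-process, the raw text never crosses a network boundary.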
## Performance

### Before Fine-Tuning
| Model | Score |
|---|---|
| DeepSeek 3.1 (685B) — Teacher | 0.84 +/- 0.03 |
| Llama-3.2-3B (base) | 0.03 +/- 0.02 |
| Llama-3.2-1B (base) | 0.00 +/- 0.00 |
### After Fine-Tuning
| Model | Score |
|---|---|
| Llama-3.2-3B (tuned) | 0.82 +/- 0.03 |
| Llama-3.2-1B (tuned) | 0.81 +/- 0.02 |
| Gemma-3-270M (tuned) | 0.73 +/- 0.07 |
The fine-tuned 1B model matches its 685B teacher: a 685x reduction in parameter count with comparable accuracy.
## The Task

The models perform policy-aware PII redaction: given input text, they output JSON containing the redacted text and details of each redacted entity.

### Output Schema
```json
{
  "redacted_text": "...",
  "entities": [
    {
      "value": "original PII value",
      "replacement_token": "[REDACTION_TYPE]",
      "reason": "why this was redacted"
    }
  ]
}
```
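A short sketch of consuming output in this shape, parsing the JSON and sanity-checking it against the schema; the sample values below are invented for illustration:

```python
import json

# Invented example of a model reply following the schema above.
sample_output = """
{
  "redacted_text": "Contact [NAME] at [EMAIL]",
  "entities": [
    {"value": "Jane Doe", "replacement_token": "[NAME]",
     "reason": "personal name"},
    {"value": "jane@example.com", "replacement_token": "[EMAIL]",
     "reason": "email address"}
  ]
}
"""


def parse_redaction(raw: str) -> dict:
    """Parse model output and verify the redaction is consistent."""
    result = json.loads(raw)
    assert "redacted_text" in result and "entities" in result
    for entity in result["entities"]:
        # Each original value must be gone and its token present.
        assert entity["value"] not in result["redacted_text"]
        assert entity["replacement_token"] in result["redacted_text"]
    return result


parsed = parse_redaction(sample_output)
print(len(parsed["entities"]))  # 2
```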
### Supported PII Categories (14 types)
Names, email addresses, phone numbers, physical addresses, Social Security numbers, credit card numbers, dates of birth, IP addresses, URLs, financial account numbers, medical record numbers, passport numbers, driver’s license numbers, and demographic attributes.
## Why Local PII Redaction?
Sending text containing PII to a cloud LLM for redaction defeats the purpose of data protection. Distil-PII runs entirely on your machine — the sensitive data never leaves your infrastructure.
Use cases:
- Pre-processing data before sending to external APIs
- Compliance with GDPR, HIPAA, and data residency requirements
- Edge deployment for real-time PII scrubbing
- Pipeline integration for data anonymization
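The first use case can be sketched as a pipeline step in which only redacted text ever reaches the external API. The `redact` function below is a stub standing in for the local Distil-PII call, and all names here are illustrative:

```python
def redact(text: str) -> str:
    """Stub for the local Distil-PII call; a real pipeline would run
    the GGUF model here and return its 'redacted_text' field."""
    # Hypothetical behaviour, for illustration only.
    return text.replace("jane@example.com", "[EMAIL]")


def safe_external_call(text: str, send) -> str:
    """Scrub PII locally, then hand only the redacted text to `send`."""
    cleaned = redact(text)
    return send(cleaned)


# The external API (here just an echo) never sees the raw address.
reply = safe_external_call("Reach me at jane@example.com", send=lambda t: t)
print(reply)  # Reach me at [EMAIL]
```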