OpenAI Launches Privacy Filter, a New Model That Auto-Redacts Sensitive Data in Text

Developers can run it locally, fine-tune it, and integrate it into custom privacy systems

News

Rishaj Upadhyay

News Editor


Published on April 22, 2026

Image: Unsplash/@siva_photography

Less than twenty-four hours after launching its ChatGPT Images 2.0 model, OpenAI has released another AI model, and this one is focused on privacy. The company has introduced OpenAI Privacy Filter, which it describes as a lightweight but highly capable model designed to detect and remove personally identifiable information (PII) from text. The model is aimed at developers building privacy-first AI systems, with a focus on making data protection easier to integrate across workflows.

OpenAI Privacy Filter model brings on-device PII masking

At its core, Privacy Filter is built to identify sensitive information like names, emails, phone numbers, addresses, account details, and even secrets such as API keys. It works in a single pass, which allows high-speed processing of long text inputs while still maintaining context awareness.

Unlike traditional rule-based tools, the model uses language understanding to decide what should be masked or preserved. This allows it to handle unstructured text more effectively, especially where context determines whether information is private or not. More importantly, it can run locally, meaning sensitive data does not need to be sent to external servers for processing.
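To illustrate the limitation of rule-based tools described above, here is a minimal Python sketch of a regex-only masker. The pattern, the `mask_emails` helper, and the sample sentence are all made up for this example and are not part of OpenAI's release.

```python
import re

# A simplified email pattern (an assumption for this sketch, not a full RFC 5322 matcher).
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")

def mask_emails(text: str) -> str:
    """Replace every substring that looks like an email address."""
    return EMAIL_RE.sub("[EMAIL]", text)

sample = "Write to press@example.org for media queries, or reach Dana at dana@example.com."
print(mask_emails(sample))
# Write to [EMAIL] for media queries, or reach Dana at [EMAIL].
```

The rule masks the public press address and the personal one identically, and it leaves the name "Dana" untouched. A context-aware model is meant to make those distinctions, which a fixed pattern cannot.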

OpenAI says it already uses a “fine-tuned version” internally, and the released model is optimized for real-world privacy pipelines such as training data cleanup, logging systems, and review workflows. The model supports up to 128,000 tokens of context and is designed for scalability in production environments.

How the model works and what it detects

Privacy Filter uses a token-classification approach with a fixed set of privacy labels. It processes text in one pass and then reconstructs spans using constrained decoding to ensure clean outputs. The system is designed for efficiency, with around 1.5 billion total parameters but only 50 million active at runtime.
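To make the token-classification idea concrete, here is a minimal Python sketch of turning per-token labels into labeled spans. It assumes a BIO tagging scheme (B- opens a span, I- continues it, O means outside), which the announcement does not confirm, and the `tags_to_spans` helper is hypothetical. It is a simple merge step, not OpenAI's constrained decoder.

```python
from typing import List, Optional, Tuple

Span = Tuple[int, int, str]  # (start_token, end_token_exclusive, label)

def tags_to_spans(tags: List[str]) -> List[Span]:
    """Merge per-token BIO tags into contiguous labeled spans."""
    spans: List[Span] = []
    start: Optional[int] = None
    current: Optional[str] = None
    for i, tag in enumerate(tags):
        if tag == "O":
            prefix, label = "O", None
        else:
            prefix, label = tag.split("-", 1)
        # An I- tag with the same label extends the open span.
        continues = prefix == "I" and label == current
        if current is not None and not continues:
            spans.append((start, i, current))
            start, current = None, None
        if label is not None and current is None:
            # B- opens a span; a stray I- is leniently treated as an opening tag.
            start, current = i, label
    if current is not None:
        spans.append((start, len(tags), current))
    return spans

tags = ["O", "B-private_person", "I-private_person", "O", "B-private_email"]
print(tags_to_spans(tags))
# [(1, 3, 'private_person'), (4, 5, 'private_email')]
```

Treating a stray I- tag as the start of a span is one possible repair rule. A constrained decoder would instead rule out such invalid tag sequences during decoding.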

It identifies eight categories: private_person, private_address, private_email, private_phone, private_date, private_url, account_number, and secret. Depending on context, these allow it to catch everything from credit card numbers to passwords and API keys.
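As an illustration of how detected spans might be masked, the Python sketch below swaps each span for a placeholder. The `redact` function, the (start, end, label) character-offset format, and the bracketed placeholder style are assumptions made for this example, not the model's actual output format. Only the eight label names come from the announcement.

```python
from typing import List, Tuple

# The eight label names listed in OpenAI's announcement.
LABELS = {
    "private_person", "private_address", "private_email", "private_phone",
    "private_date", "private_url", "account_number", "secret",
}

Span = Tuple[int, int, str]  # (start_char, end_char_exclusive, label)

def redact(text: str, spans: List[Span]) -> str:
    """Replace each labeled character span with a bracketed placeholder."""
    pieces = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label not in LABELS:
            raise ValueError(f"unknown label: {label}")
        if start < cursor:
            continue  # skip spans that overlap one already masked
        pieces.append(text[cursor:start])
        pieces.append(f"[{label.upper()}]")
        cursor = end
    pieces.append(text[cursor:])
    return "".join(pieces)

text = "Email Jane Doe at jane@example.com"
name_start = text.index("Jane Doe")
mail_start = text.index("jane@example.com")
spans = [
    (name_start, name_start + len("Jane Doe"), "private_person"),
    (mail_start, mail_start + len("jane@example.com"), "private_email"),
]
print(redact(text, spans))
# Email [PRIVATE_PERSON] at [PRIVATE_EMAIL]
```

Keeping the label in the placeholder preserves some meaning for downstream readers, such as a log reviewer who needs to know an email was present without seeing it.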

OpenAI reports strong benchmark results, with the model reaching an F1 score above 97 percent on corrected evaluation sets. The model also improves quickly when fine-tuned on domain-specific data, making it adaptable for enterprise use cases.

Availability

OpenAI has released Privacy Filter as an open-weight model under the Apache 2.0 license via Hugging Face and GitHub. Developers can run it locally, fine-tune it, and integrate it into custom privacy systems. While OpenAI highlights strong performance, it also notes that the model is not a complete compliance solution and may still require human oversight in sensitive environments.
