OpenAI Launches Privacy Filter, a New Model That Auto-Redacts Sensitive Data in Text
Developers can run it locally, fine-tune it, and integrate it into custom privacy systems
Less than twenty-four hours after launching its ChatGPT Images 2.0 model, OpenAI has released another AI model, this time focused on privacy. The company describes OpenAI Privacy Filter as a lightweight but highly capable model designed to detect and remove personally identifiable information (PII) from text. It is aimed at developers building privacy-first AI systems, with a focus on making data protection easier to integrate across workflows.
OpenAI Privacy Filter model brings on-device PII masking
At its core, Privacy Filter is built to identify sensitive information like names, emails, phone numbers, addresses, account details, and even secrets such as API keys. It works in a single pass, which allows high-speed processing of long text inputs while still maintaining context awareness.
Unlike traditional rule-based tools, the model uses language understanding to decide what should be masked or preserved. This allows it to handle unstructured text more effectively, especially where context determines whether information is private or not. More importantly, it can run locally, meaning sensitive data does not need to be sent to external servers for processing.
OpenAI says it already uses a “fine-tuned version” internally, and the released model is optimized for real-world privacy pipelines such as training data cleanup, logging systems, and review workflows. The model supports up to 128,000 tokens of context and is designed for scalability in production environments.
How the model works and what it detects
Privacy Filter uses a token-classification approach with a fixed set of privacy labels. It processes text in one pass and then reconstructs spans using constrained decoding to ensure clean outputs. The system is designed for efficiency, with around 1.5 billion total parameters but only 50 million active at runtime.
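To make the idea of constrained decoding over token labels more concrete, here is a minimal sketch in Python. It is not OpenAI's implementation: the BIO tagging scheme, the per-token score dictionaries, and the function names `constrained_decode` and `tags_to_spans` are all assumptions made for illustration. The sketch runs a small Viterbi search that forbids an "inside" tag from appearing without a matching "begin" tag, then merges the resulting tags into spans.

```python
from typing import Dict, List, Tuple

# Hypothetical tag set: "O" plus B-/I- tags for two of the article's labels.
# The BIO scheme itself is an assumption, not something the article confirms.
LABELS = ["private_email", "secret"]
TAGS = ["O"] + [f"{prefix}-{label}" for label in LABELS for prefix in ("B", "I")]
NEG_INF = float("-inf")


def allowed(prev: str, cur: str) -> bool:
    """BIO constraint: I-X may only follow B-X or I-X."""
    if cur.startswith("I-"):
        return prev in (f"B-{cur[2:]}", f"I-{cur[2:]}")
    return True


def constrained_decode(scores: List[Dict[str, float]]) -> List[str]:
    """Viterbi over per-token log-scores, skipping disallowed transitions."""
    if not scores:
        return []
    # A sequence may not start with an I- tag.
    best = {
        tag: NEG_INF if tag.startswith("I-") else scores[0].get(tag, NEG_INF)
        for tag in TAGS
    }
    back: List[Dict[str, str]] = []
    for step in scores[1:]:
        new_best: Dict[str, float] = {}
        pointers: Dict[str, str] = {}
        for cur in TAGS:
            prev_score, prev_tag = max(
                (best[p], p) for p in TAGS if allowed(p, cur)
            )
            new_best[cur] = prev_score + step.get(cur, NEG_INF)
            pointers[cur] = prev_tag
        best = new_best
        back.append(pointers)
    # Follow back-pointers from the best final tag.
    tag = max(best, key=lambda t: best[t])
    path = [tag]
    for pointers in reversed(back):
        tag = pointers[tag]
        path.append(tag)
    return path[::-1]


def tags_to_spans(tags: List[str]) -> List[Tuple[int, int, str]]:
    """Merge B-/I- runs into (start_token, end_token_exclusive, label)."""
    spans: List[Tuple[int, int, str]] = []
    start, label = None, None
    for i, tag in enumerate(tags + ["O"]):  # sentinel closes a trailing span
        if start is not None and not tag.startswith("I-"):
            spans.append((start, i, label))
            start, label = None, None
        if tag.startswith("B-"):
            start, label = i, tag[2:]
    return spans


# Made-up scores for four tokens. Picking the top tag per token would
# yield O, O, I-secret, I-secret, which is not a valid BIO sequence.
example_scores = [
    {"O": -0.1, "B-secret": -3.0},
    {"O": -0.1, "B-secret": -3.0},
    {"O": -3.0, "B-secret": -0.5, "I-secret": -0.1},
    {"O": -2.0, "B-secret": -2.5, "I-secret": -0.1},
]
example_tags = constrained_decode(example_scores)
example_spans = tags_to_spans(example_tags)
```

In this toy input, the constraint forces the third token to open the span with a begin tag even though the inside tag scores higher in isolation, which is the kind of cleanup a constrained decoder provides over independent per-token predictions.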
It identifies eight categories: private_person, private_address, private_email, private_phone, private_date, private_url, account_number, and secret. These allow it to catch everything from credit card numbers to passwords and API keys, depending on context.
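As an illustration of how these labels might be used once spans have been detected, the following Python sketch replaces each labeled span with a placeholder. The label names come from the list above; the `mask_text` helper, the character-offset span format, and the `[LABEL]` placeholder style are assumptions for illustration and may not match what the released model or its tooling actually outputs.

```python
from typing import List, Tuple

# The eight label names listed in the article.
PRIVACY_LABELS = {
    "private_person", "private_address", "private_email", "private_phone",
    "private_date", "private_url", "account_number", "secret",
}

# Hypothetical span format: (start_char, end_char_exclusive, label).
Span = Tuple[int, int, str]


def mask_text(text: str, spans: List[Span]) -> str:
    """Replace each labeled character span with a placeholder like [SECRET]."""
    pieces: List[str] = []
    cursor = 0
    for start, end, label in sorted(spans):
        if label not in PRIVACY_LABELS:
            raise ValueError(f"unknown label: {label}")
        if start < cursor or end > len(text) or start >= end:
            raise ValueError(f"bad or overlapping span: {(start, end, label)}")
        pieces.append(text[cursor:start])       # keep the text before the span
        pieces.append(f"[{label.upper()}]")     # substitute the placeholder
        cursor = end
    pieces.append(text[cursor:])                # keep any trailing text
    return "".join(pieces)


sample = "Email jane@example.com, key sk-123."
email_at = sample.index("jane@example.com")
key_at = sample.index("sk-123")
masked = mask_text(
    sample,
    [
        (key_at, key_at + len("sk-123"), "secret"),
        (email_at, email_at + len("jane@example.com"), "private_email"),
    ],
)
print(masked)  # Email [PRIVATE_EMAIL], key [SECRET].
```

Keeping the label in the placeholder, rather than a generic redaction marker, preserves some context for downstream uses such as log review or training data cleanup.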
OpenAI reports strong benchmark results, with an F1 score above 97 percent on corrected evaluation sets. The model also improves quickly when fine-tuned on domain-specific data, making it adaptable for enterprise use cases.
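For readers unfamiliar with the metric, F1 is the harmonic mean of precision and recall. The Python sketch below computes it over exact-match spans. The article does not say whether OpenAI scores at the token or span level, so the exact-match span comparison and the `span_f1` helper are assumptions, and the data is made up.

```python
from typing import Set, Tuple

# Hypothetical span format: (start_char, end_char_exclusive, label).
Span = Tuple[int, int, str]


def span_f1(predicted: Set[Span], gold: Set[Span]) -> float:
    """F1 over exact-match spans: 2 * P * R / (P + R)."""
    true_positives = len(predicted & gold)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(gold)
    return 2 * precision * recall / (precision + recall)


# Made-up example: two of three predicted spans match the reference.
gold_spans = {(0, 4, "private_person"), (10, 26, "private_email"), (30, 36, "secret")}
predicted_spans = {(0, 4, "private_person"), (10, 26, "private_email"), (40, 44, "secret")}
score = span_f1(predicted_spans, gold_spans)  # precision = recall = 2/3
```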
Availability
OpenAI has released Privacy Filter as an open-weight model under the Apache 2.0 license via Hugging Face and GitHub. Developers can run it locally, fine-tune it, and integrate it into custom privacy systems. While OpenAI highlights strong performance, it also notes that the model is not a complete compliance solution and may still require human oversight in sensitive environments.