Enterprise PII Detection & Redaction API
Key Features
- AI-powered Personally Identifiable Information (PII) detection
- Automatic redaction and masking of sensitive data
- OCR support for scanned documents and images
- Structured JSON responses with detected entities
- Supports PDFs, images, scanned files, and text documents
- Detects: names, phone numbers, email addresses, Aadhaar numbers, PAN numbers, passport details, credit card information…
Enterprise PII Detection & Redaction API endpoints
| Method | Endpoint | Description |
|---|---|---|
| **Redaction** | | |
| POST | processFile `/process` | Upload a document or image, extract text (Mistral Document AI) and redact PII in a single round trip. Returns both the original extracted text and the redacted text. Operator is… |
| POST | redactText `/redact` | Detect PII entities in `text` using Presidio and spaCy, then anonymize them with one of four operators: `replace`, `redact`, `mask`, `hash`. Synchronous and in-process, no… |
| **Document & PDF** | | |
| POST | extractDocumentBatch `/extract/batch` | Submit up to 20 documents for asynchronous extraction. Returns a `job_id` immediately; poll `GET /jobs/{job_id}` until status is `completed`. |
| POST | extractDocument `/extract` | Synchronously extract text from a single document (PDF, DOCX, PPTX, XLSX) using Mistral Document AI. For multiple files use `/extract/batch`. |
| **OCR** | | |
| POST | ocrImageBatch `/ocr/batch` | Submit up to 20 images for asynchronous OCR. Returns a `job_id` to poll with `GET /jobs/{job_id}`. |
| POST | ocrImage `/ocr` | Run OCR on a single image (PNG, JPG, JPEG, TIFF, WEBP). |
| **Audio / Video** | | |
| POST | transcribeMediaBatch `/transcribe/batch` | Submit multiple base64-encoded audio/video sources at once. Each source is sent to Whisper in parallel; per-file failures (bad base64, oversize, unsupported type) do not fail the… |
| POST | transcribeMedia `/transcribe` | Transcribe a single audio or video file (mp3, mp4, mpeg, mpga, m4a, wav, webm) via Whisper. Files are capped at **25 MB**. Returns a `job_id`; poll `GET /jobs/{job_id}` for the… |
| **Jobs** | | |
| GET | getJobStatus `/jobs/{job_id}` | Poll an async job (extract, OCR, or transcription). Status transitions `pending` → `processing` → `completed` or `failed`. Once completed, `results` contains one entry per… |
| **Usage** | | |
| GET | getMyUsage `/me/usage` | Same payload shape as `/usage`, but authenticated via the portal JWT instead of an API key. Useful for in-portal dashboards. |
| GET | getUsage `/usage` | Return today's and this month's request and token usage for the calling organization, along with its tier limits. |
| **Reference** | | |
| DELETE | deletePattern `/patterns/{pattern_id}` | Delete a user-defined pattern by id. Built-in patterns cannot be deleted and will return `403`. After deletion the analyzer is rebuilt on the next request. |
| GET | listBusinessUnits `/patterns/units` | List the four built-in business units used to organize patterns: `construction`, `agriculture`, `geospatial`, `hr_legal`. |
| GET | listEntities `/entities` | List every PII entity type the redactor can detect. Includes built-in entities and any active custom patterns registered for the calling organization. |
| POST | createPattern `/patterns` | Register a user-defined regex pattern under a business unit. The new pattern becomes available to `/redact` and `/process` on the next call (the analyzer is marked dirty and… |
| GET | listLanguages `/languages` | List languages the redactor can analyze. Each entry has an ISO 639-1 `code`, a human `label`, and an `installed` flag indicating whether the spaCy model for that language is… |
| POST | testPatterns `/patterns/test` | Run one or more regex patterns against a sample text and return every match. Useful for validating a pattern before calling `POST /patterns`. Nothing is saved. |
| GET | listPatterns `/patterns` | List all PII regex patterns, both built-in and user-defined. Optionally filter by business unit. |
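
The batch and transcription endpoints return a `job_id` that is polled via `GET /jobs/{job_id}` until the job reaches `completed` or `failed`. A minimal Python sketch of that polling loop follows. The helper name `wait_for_job`, the injected `fetch_status` callable, the `status` JSON key, and the interval and timeout values are all assumptions for illustration; the listing does not document the base URL, the authentication header, or the exact response schema, so the HTTP call itself is left to the caller.

```python
import time
from typing import Any, Callable, Dict

# Terminal states taken from the getJobStatus description.
TERMINAL_STATES = {"completed", "failed"}


def wait_for_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll a job until it is completed or failed, or raise TimeoutError.

    fetch_status is a hypothetical callable that performs
    GET /jobs/{job_id} and returns the decoded JSON body.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status(job_id)
        # Assumption: the job state is exposed under a "status" key.
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job.get('status')!r} after {timeout_s}s"
            )
        sleep(interval_s)


def make_fake_fetcher() -> Callable[[str], Dict[str, Any]]:
    """Stand-in for the real HTTP call; walks through the documented states."""
    states = iter(["pending", "processing", "completed"])

    def fetch(job_id: str) -> Dict[str, Any]:
        status = next(states)
        body: Dict[str, Any] = {"job_id": job_id, "status": status}
        if status == "completed":
            # Placeholder payload; the real shape of `results` is not shown here.
            body["results"] = [{"text": "..."}]
        return body

    return fetch


job = wait_for_job(
    make_fake_fetcher(), "job-123", interval_s=0.0, sleep=lambda s: None
)
print(job["status"])  # prints "completed"
```

In real use, `fetch_status` would wrap an HTTP GET with whatever credentials the provider issues. Note that the helper returns on `failed` as well as `completed`, so callers should check the returned status before reading `results`.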
Enterprise PII Detection & Redaction API pricing
| Plan | Price | Rate limit | Quotas |
|---|---|---|---|
| BASIC | Free | - | |
| PRO (Recommended) | $49 / month | 100 / minute | |
| ULTRA | $99 / month | 1000 / hour | |
| MEGA | $199 / month | 1000 / hour | |