Document Parser and Extraction
Quantxt Theia is smart data extraction software. It accurately extracts data from documents with complex and varying layouts and exports it as CSV or JSON. Use Quantxt Theia to automate data entry tasks and save your staff time. Quantxt Theia supports the PDF, JPG/JPEG, and PNG file types.
Document Parser and Extraction endpoints
| Method | Endpoint | Description |
|---|---|---|
| Extraction Vocabularies | | |
| POST | Create /dictionaries | Create one or more variations for a field name to be captured from the input documents. |
| DELETE | Delete /dictionaries/{id} | Delete an existing vocabulary by id |
| GET | List /dictionaries | Get a list of existing extraction vocabularies |
| Data Import & Export | | |
| POST | Import data files /search/file | Upload document files so they can be processed by the data extraction engine. Supported file types: .pdf, .jpg/.jpeg or .png. The server returns a UUID for each uploaded file.… |
| GET | JSON export /reports/{id}/json | Export results of data extraction in JSON format |
| GET | CSV Export /reports/{id}/csv | Export results of data extraction as a CSV file |
| Data Extraction | | |
| POST | Create Model /search/new | Create a custom extraction model. A model is simply a collection of vocabularies and tells the system: 1- What properties to look for in the documents to extract 2- What should… |
| POST | Submit Job /search/new | Takes a list of up to **30** file UUIDs and one or more dictionaries and provides extracted data. `title`: (Optional) Set a name for the job. `files`: (Required) Array of file… |
| GET | Monitor Job /search/progress/{id} | Check the status of a job submitted via **search/new**. Once **progress** reaches 100, the job is complete and results can be exported. |
| DELETE | Delete Job /search/{id} | Delete a completed or in-progress job |
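
The job workflow described in the table (submit at most 30 file UUIDs per job, then poll the progress endpoint until it reaches 100) can be sketched in Python as follows. This is a minimal sketch, not an official client: `batch_uuids`, `wait_for_job`, the `fetch_progress` callable, and the polling interval and timeout are all hypothetical names and values. The request format, authentication, and response schema are not described in this listing, so the actual HTTP call is left to the caller.

```python
import time
from typing import Callable, Iterator, List, Sequence

MAX_FILES_PER_JOB = 30  # limit stated in the Submit Job description


def batch_uuids(uuids: Sequence[str], size: int = MAX_FILES_PER_JOB) -> Iterator[List[str]]:
    """Split file UUIDs into groups small enough for one Submit Job call."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(uuids), size):
        yield list(uuids[start:start + size])


def wait_for_job(
    fetch_progress: Callable[[], int],
    poll_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Poll until progress reaches 100 or the timeout elapses.

    fetch_progress is a hypothetical callable supplied by the caller; it is
    assumed to call GET /search/progress/{id} and return the progress value.
    """
    waited = 0.0
    while True:
        progress = fetch_progress()
        if progress >= 100:
            return progress
        if waited >= timeout_seconds:
            raise TimeoutError(f"job stuck at {progress}% after {waited:.0f}s")
        sleep(poll_seconds)
        waited += poll_seconds
```

Passing the progress lookup in as a callable keeps the polling logic independent of whichever HTTP library and credentials are used, and makes it easy to exercise without network access.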
Document Parser and Extraction pricing
| Plan | Price | Rate limit | Quotas |
|---|---|---|---|
| BASIC | Free | 5 / minute | |
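
A client on the BASIC plan may want to throttle itself to stay under the listed limit of 5 per minute. The sketch below is one way to do that, under assumptions: the listing does not say whether the server counts a fixed or a sliding window, so the sliding window here is a guess, and `MinuteRateLimiter` is a hypothetical helper, not part of any Quantxt library.

```python
import time
from collections import deque
from typing import Callable, Deque


class MinuteRateLimiter:
    """Sliding-window limiter: at most `limit` calls per `window` seconds."""

    def __init__(
        self,
        limit: int = 5,          # BASIC plan rate limit from the pricing table
        window: float = 60.0,    # one minute (window semantics are an assumption)
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self.limit = limit
        self.window = window
        self._clock = clock
        self._sleep = sleep
        self._calls: Deque[float] = deque()

    def acquire(self) -> float:
        """Block until a call is allowed; return the seconds waited."""
        now = self._clock()
        # Drop timestamps that have left the window.
        while self._calls and now - self._calls[0] >= self.window:
            self._calls.popleft()
        waited = 0.0
        if len(self._calls) >= self.limit:
            # Wait until the oldest recorded call falls out of the window.
            waited = self.window - (now - self._calls[0])
            self._sleep(waited)
            now = self._clock()
            self._calls.popleft()
        self._calls.append(now)
        return waited
```

Calling `acquire()` before each API request keeps the client at or below five requests in any 60-second span, as measured on the client side.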