Document Extraction | Plextera Public API

Document extraction is the Document Insights flow for turning a document into structured fields. It is asynchronous: you create an extraction, receive an ID, and then either poll for the result or receive a webhook event.

The id returned from POST /document-insights/extractions is the extractionId you use for polling, feedback, and event correlation.

Before you start

You need:

An API key. See Authentication.
A document, either as a fileId from the Files API or as a public url.
Optionally, a hubId if you want to route directly to a known Document Insights hub.
Optionally, metadata for client correlation and automatic hub routing.

Extraction lifecycle

Status	Meaning
`QUEUED`	The extraction was accepted and is waiting to be processed.
`PROCESSING`	Document Insights is processing the document.
`COMPLETED`	Extraction finished successfully and `output` is available.
`FAILED`	Processing started but could not complete. Check `error`.
`REJECTED`	The document was rejected before or during validation. Check `error`.

Terminal statuses are COMPLETED, FAILED, and REJECTED.
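The lifecycle above can be mirrored on the client so that status checks are not scattered string comparisons. The following is a minimal sketch; the names `ExtractionStatus`, `TERMINAL_STATUSES`, and `is_terminal` are illustrative and not part of the Plextera API.

```python
from enum import Enum


class ExtractionStatus(str, Enum):
    """Status values listed in the lifecycle table."""
    QUEUED = "QUEUED"
    PROCESSING = "PROCESSING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    REJECTED = "REJECTED"


# Statuses after which the extraction no longer changes.
TERMINAL_STATUSES = frozenset({
    ExtractionStatus.COMPLETED,
    ExtractionStatus.FAILED,
    ExtractionStatus.REJECTED,
})


def is_terminal(status: str) -> bool:
    """Return True for COMPLETED, FAILED, or REJECTED.

    Raises ValueError for a status string that is not in the table above.
    """
    return ExtractionStatus(status) in TERMINAL_STATUSES
```

Raising on an unknown status is a design choice for this sketch; a client that must tolerate statuses added later may prefer to treat unknown values as non-terminal.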

Create an extraction from an uploaded file

Upload the document

curl https://api.plextera.com/api/public/v1/files \
  -H "Authorization: api-key YOUR_API_KEY" \
  -F "file=@lab-result.pdf"

Submit the extraction

curl -X POST https://api.plextera.com/api/public/v1/document-insights/extractions \
  -H "Authorization: api-key YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "fileId": "file_01JY7M4ZVX5R1P3M3Q0TA1S7ZM"
    },
    "metadata": {
      "customerDocumentId": "lab-42",
      "sourceSystem": "patient-portal"
    }
  }'

Store the returned ID

{
  "id": "69654f0bc073ef404baec649",
  "operation": "extract",
  "status": "QUEUED",
  "outputAvailable": false,
  "metadata": {
    "customerDocumentId": "lab-42",
    "sourceSystem": "patient-portal"
  }
}
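The submit-and-store steps can also be sketched in Python using only the standard library. The endpoint, header format, and body shape are taken from the curl example above; the helper names `build_create_request` and `extraction_id_from_response` are illustrative, and error handling is left out.

```python
import json
import urllib.request
from typing import Optional

BASE_URL = "https://api.plextera.com/api/public/v1"


def build_create_request(api_key: str, file_id: str,
                         metadata: Optional[dict] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates an extraction."""
    body = {"document": {"fileId": file_id}}
    if metadata:
        body["metadata"] = metadata
    return urllib.request.Request(
        f"{BASE_URL}/document-insights/extractions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"api-key {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def extraction_id_from_response(raw: bytes) -> str:
    """Pull the `id` out of the response body; this is the extractionId."""
    return json.loads(raw)["id"]


# Sending the request (requires network access and a real API key):
#   req = build_create_request("YOUR_API_KEY", "file_01JY7M4ZVX5R1P3M3Q0TA1S7ZM")
#   with urllib.request.urlopen(req) as resp:
#       extraction_id = extraction_id_from_response(resp.read())
```

Separating request construction from sending keeps the payload easy to inspect and test without calling the API.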

Create an extraction from a URL

Use a URL when the document is already available to Plextera over HTTPS.

curl -X POST https://api.plextera.com/api/public/v1/document-insights/extractions \
  -H "Authorization: api-key YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "document": {
      "url": "https://example.com/documents/lab-result.pdf",
      "fileName": "lab-result.pdf"
    },
    "metadata": {
      "customerDocumentId": "lab-42"
    }
  }'

Use fileId when your integration already uploads files to Plextera. Use url when your source system can provide a stable HTTPS download URL.
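A small helper can make the choice between the two document sources explicit. This sketch assumes that `fileId` and `url` are mutually exclusive and that `fileName` is optional for URL sources; neither is stated outright on this page, so confirm both against the API Reference. The function name `document_source` is illustrative.

```python
from typing import Optional


def document_source(file_id: Optional[str] = None,
                    url: Optional[str] = None,
                    file_name: Optional[str] = None) -> dict:
    """Return the `document` object for an extraction request.

    Assumption: exactly one of file_id or url may be supplied.
    """
    if (file_id is None) == (url is None):
        raise ValueError("provide exactly one of file_id or url")
    if file_id is not None:
        return {"fileId": file_id}
    # Client-side guard: the page describes URL sources as HTTPS downloads.
    if not url.startswith("https://"):
        raise ValueError("url must be an HTTPS URL")
    source = {"url": url}
    if file_name:
        # Assumption: fileName is optional and only meaningful with url.
        source["fileName"] = file_name
    return source
```

For example, `document_source(url="https://example.com/documents/lab-result.pdf", file_name="lab-result.pdf")` produces the `document` object used in the curl request above.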

Hub routing

You can route an extraction in two ways:

Method	How it works
Explicit hub	Send `hubId` when the client knows exactly which hub should process the document.
Automatic routing	Omit `hubId`; Plextera can use `metadata` and organization configuration to choose the hub.

{
  "document": { "fileId": "file_01JY7M4ZVX5R1P3M3Q0TA1S7ZM" },
  "hubId": "69ccad03c7574856f010eaa5",
  "metadata": {
    "documentType": "lab_result",
    "clientDocumentId": "lab-42"
  }
}

metadata keys and values must be non-empty strings. Maximum: 50 entries, 64 characters per key, and 512 characters per value.
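These limits are easy to check before a request is sent. The sketch below is a client-side pre-check only; the function name `validate_metadata` is illustrative, and it assumes that "characters" means Unicode code points as counted by Python's `len`, which the page does not specify.

```python
from typing import List

MAX_ENTRIES = 50
MAX_KEY_CHARS = 64
MAX_VALUE_CHARS = 512


def validate_metadata(metadata: dict) -> List[str]:
    """Return a list of problems; an empty list means the metadata looks valid."""
    problems = []
    if len(metadata) > MAX_ENTRIES:
        problems.append(f"too many entries: {len(metadata)} > {MAX_ENTRIES}")
    for key, value in metadata.items():
        if not isinstance(key, str) or not key:
            problems.append(f"key {key!r} must be a non-empty string")
        elif len(key) > MAX_KEY_CHARS:
            problems.append(f"key {key[:16]!r}... exceeds {MAX_KEY_CHARS} characters")
        if not isinstance(value, str) or not value:
            problems.append(f"value for {key!r} must be a non-empty string")
        elif len(value) > MAX_VALUE_CHARS:
            problems.append(f"value for {key!r} exceeds {MAX_VALUE_CHARS} characters")
    return problems
```

Returning every problem at once, rather than raising on the first, lets a caller report all issues in a single pass.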

Poll for output

Call GET /document-insights/extractions/{extractionId} until the extraction reaches a terminal status.

curl https://api.plextera.com/api/public/v1/document-insights/extractions/69654f0bc073ef404baec649 \
  -H "Authorization: api-key YOUR_API_KEY"

When status is COMPLETED, outputAvailable is true and output contains the extracted fields.

{
  "id": "69654f0bc073ef404baec649",
  "operation": "extract",
  "status": "COMPLETED",
  "outputAvailable": true,
  "output": {
    "fieldCount": 3,
    "fields": [
      {
        "id": "field_01",
        "name": "labName",
        "type": "text",
        "value": "Quest Diagnostics",
        "metadata": {
          "extracted": true,
          "confidence": 1.0,
          "page": 1,
          "placement": { "x": 43, "y": 12, "width": 22, "height": 4 }
        }
      }
    ]
  }
}

Avoid tight polling loops. Poll with a delay and stop as soon as the status is COMPLETED, FAILED, or REJECTED.
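One way to follow this advice is to poll with an exponentially growing delay and an overall timeout. The sketch below is illustrative: the delay and timeout values are arbitrary assumptions, not limits published by Plextera, and `fetch` stands for any callable that performs the GET request and returns the parsed JSON body.

```python
import time
from typing import Callable

TERMINAL = {"COMPLETED", "FAILED", "REJECTED"}


def poll_extraction(fetch: Callable[[], dict],
                    initial_delay: float = 1.0,
                    max_delay: float = 30.0,
                    timeout: float = 600.0,
                    sleep: Callable[[float], None] = time.sleep) -> dict:
    """Call fetch() until the extraction reaches a terminal status.

    The delay doubles after each attempt, capped at max_delay. Raises
    TimeoutError if the next wait would exceed the timeout budget.
    """
    delay = initial_delay
    waited = 0.0
    while True:
        extraction = fetch()
        if extraction["status"] in TERMINAL:
            return extraction
        if waited + delay > timeout:
            raise TimeoutError(f"extraction still {extraction['status']} after {waited:.0f}s")
        sleep(delay)
        waited += delay
        delay = min(delay * 2, max_delay)
```

The `sleep` parameter exists so the loop can be exercised in tests without real waiting. The caller still needs to inspect `status` on the returned object, since FAILED and REJECTED are returned rather than raised.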

Receive extraction events

If you do not want to poll, create an event subscription for:

document-insights.extraction.completed
document-insights.extraction.failed
document-insights.extraction.rejected

For completed extractions, the event payload includes the same completed extraction model as GET /document-insights/extractions/{extractionId}, including output.

See Event Subscriptions for setup, headers, signatures, and retry behavior.
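A receiver typically dispatches on the event type. The sketch below uses the three event names listed above, but the envelope field names `type` and `data` are hypothetical placeholders, as is the assumption that failed and rejected events carry an `error` value; check the Event Reference for the actual payload shape. Signature verification is omitted.

```python
EVENT_COMPLETED = "document-insights.extraction.completed"
EVENT_FAILED = "document-insights.extraction.failed"
EVENT_REJECTED = "document-insights.extraction.rejected"


def handle_event(event: dict) -> str:
    """Summarize an extraction event as a short log line."""
    event_type = event.get("type")      # hypothetical envelope field
    extraction = event.get("data", {})  # hypothetical envelope field
    extraction_id = extraction.get("id")
    if event_type == EVENT_COMPLETED:
        # Completed events include the extraction model with `output`.
        fields = extraction.get("output", {}).get("fields", [])
        return f"{extraction_id}: completed with {len(fields)} field(s)"
    if event_type in (EVENT_FAILED, EVENT_REJECTED):
        outcome = event_type.rsplit(".", 1)[-1]
        # Assumption: the payload exposes `error` as the polled model does.
        return f"{extraction_id}: {outcome}, error={extraction.get('error')!r}"
    return "ignored"
```

Returning "ignored" for unrecognized types keeps the handler tolerant of event types added to the subscription later.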

Submit feedback

Use feedback when a value is incorrect or the extraction needs review.

curl -X POST https://api.plextera.com/api/public/v1/document-insights/extractions/69654f0bc073ef404baec649/feedback \
  -H "Authorization: api-key YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "fieldId": "field_03",
    "message": "Collection date should be 2026-04-05."
  }'

message is required and can contain up to 1024 characters. fieldId is optional; include it when the feedback applies to one extracted field.
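The feedback rules can be enforced before the request is sent. This is a minimal client-side sketch; the function name `build_feedback` is illustrative, and it assumes the 1024-character limit counts Unicode code points as Python's `len` does.

```python
from typing import Optional

MAX_MESSAGE_CHARS = 1024


def build_feedback(message: str, field_id: Optional[str] = None) -> dict:
    """Return the JSON body for the feedback endpoint."""
    if not message:
        raise ValueError("message is required")
    if len(message) > MAX_MESSAGE_CHARS:
        raise ValueError(f"message exceeds {MAX_MESSAGE_CHARS} characters")
    body = {"message": message}
    if field_id is not None:
        # Include fieldId only when the feedback targets one extracted field.
        body["fieldId"] = field_id
    return body
```

Calling `build_feedback("Collection date should be 2026-04-05.", field_id="field_03")` yields the body used in the curl example above; omitting `field_id` produces feedback about the extraction as a whole.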

API Reference - Document Insights endpoints and schemas
Event Reference - Document Insights event payloads