Documents API

Endpoints for uploading, listing, retrieving, deleting, downloading, exporting, and emailing documents, and for saving revisions to OCR text. All endpoints require Bearer token authentication.


POST /v1/documents

Upload a document for OCR processing.

Content-Type: multipart/form-data

Form Fields:

Field | Type | Required | Description
----- | ---- | -------- | -----------
file | File | Yes | Document image (JPEG, PNG, or PDF). Max 25 MB.
title | String | No | Descriptive title for the document

Example (curl):

curl -X POST https://api.openesl.com/v1/documents \
-H "Authorization: Bearer $TOKEN" \
-F "file=@bol_photo.jpg;type=image/jpeg" \
-F "title=BOL - Load #4521 CHI to DAL"

Response (200):

{
"document": {
"id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"status": "queued",
"received_at": "2026-02-21T15:30:16.258193+00:00"
}
}

After upload, a background task processes the document. Poll the detail endpoint (GET /v1/documents/:id) until status is completed or failed.
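The polling step could be sketched in Python as follows. This is a minimal illustration, not an official client: `get_detail`, `wait_for_document`, the two-second interval, and the 120-second timeout are assumptions introduced here. Only the endpoint path, the Bearer header, and the status values come from this page.

```python
import json
import time
import urllib.request
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def get_detail(doc_id: str, token: str,
               base_url: str = "https://api.openesl.com") -> dict:
    """GET /v1/documents/:id and decode the JSON body."""
    req = urllib.request.Request(
        f"{base_url}/v1/documents/{doc_id}",
        headers={"Authorization": f"Bearer {token}"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def wait_for_document(
    fetch_detail: Callable[[], dict],
    interval: float = 2.0,   # assumed polling interval, not specified by the API
    timeout: float = 120.0,  # assumed upper bound on the total wait
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Call fetch_detail() until document.status is completed or failed."""
    deadline = time.monotonic() + timeout
    while True:
        detail = fetch_detail()
        if detail["document"]["status"] in TERMINAL_STATUSES:
            return detail
        if time.monotonic() >= deadline:
            raise TimeoutError("document did not finish processing in time")
        sleep(interval)
```

A caller would pass something like `lambda: get_detail(doc_id, token)` as `fetch_detail`. Taking the fetch and sleep functions as parameters keeps the loop testable without network access.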


GET /v1/documents

List documents for the authenticated user, sorted by date (newest first).

Query Parameters:

Parameter | Default | Max | Description
--------- | ------- | --- | -----------
limit | 50 | 200 | Number of documents to return
offset | 0 | -- | Number of documents to skip

Response (200):

{
"items": [
{
"id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"title": "BOL - Load #4521 CHI to DAL",
"status": "completed",
"received_at": "2026-02-21T15:30:16.258193+00:00",
"processed_at": "2026-02-21T15:30:22.158193+00:00"
}
],
"paging": {
"limit": 50,
"offset": 0,
"total": 42
}
}
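To walk the full list, a client can advance offset until it reaches paging.total. The sketch below shows one way to do that. `iter_documents` and the `fetch_page` callable are hypothetical names, and the lower bound of 1 on limit is an assumption (the page documents only the maximum of 200).

```python
from typing import Callable, Iterator


def iter_documents(
    fetch_page: Callable[[int, int], dict],
    limit: int = 50,
) -> Iterator[dict]:
    """Yield every document by advancing offset until paging.total is reached.

    fetch_page(limit, offset) is expected to return the decoded JSON body of
    GET /v1/documents?limit=...&offset=...
    """
    if not 1 <= limit <= 200:  # 200 is the documented maximum; 1 is assumed
        raise ValueError("limit must be between 1 and 200")
    offset = 0
    while True:
        page = fetch_page(limit, offset)
        items = page["items"]
        yield from items
        offset += len(items)
        # Also stop on an empty page, in case total changes while iterating.
        if not items or offset >= page["paging"]["total"]:
            return
```

Because results are sorted newest first, documents uploaded during iteration can shift later pages; a caller that needs an exact snapshot may want to de-duplicate by id.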

GET /v1/documents/:id

Get document detail including OCR results.

Response (200):

{
"document": {
"id": "f47ac10b-58cc-4372-a567-0e02b2c3d479",
"title": "BOL - Load #4521 CHI to DAL",
"status": "completed",
"received_at": "2026-02-21T15:30:16.258193+00:00",
"processed_at": "2026-02-21T15:30:22.158193+00:00"
},
"original_filename": "bol_load_4521.jpg",
"content_type": "image/jpeg",
"error_message": null,
"ocr": {
"engine": "textract",
"language": "en",
"text": "BILL OF LADING\nSTRAIGHT BILL OF LADING...",
"revised_text": null,
"revised_at": null
}
}
Note: The ocr field is null until processing completes. Only id, title, status, received_at, and processed_at are nested inside the document object; original_filename, content_type, error_message, and ocr are top-level siblings of document.
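Client code therefore has to read ocr from the top level of the response and handle the null case. The helper below is a small sketch of that. `best_text` is a hypothetical name, and preferring revised_text over text is an assumption about what a caller wants, not behavior defined by the API.

```python
from typing import Optional


def best_text(detail: dict) -> Optional[str]:
    """Return the user revision if present, else the OCR text, else None."""
    # "ocr" sits next to "document" in the response, not inside it,
    # and is null until processing completes.
    ocr = detail.get("ocr")
    if ocr is None:
        return None
    revised = ocr.get("revised_text")
    return revised if revised is not None else ocr.get("text")
```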

Document Status Values:

Status | Description
------ | -----------
queued | Waiting for processing to begin
processing | OCR engine is extracting text
completed | Text extraction finished
failed | Processing failed (see error_message)

DELETE /v1/documents/:id

Delete a document, its OCR result, and the stored file.

Response: 204 No Content


GET /v1/documents/:id/download

Download the original uploaded file.

Response: Binary file with appropriate Content-Type and Content-Disposition: attachment headers.

Returns 400 if no file is stored for this document.
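When saving the download, a client may want to derive the local filename from the Content-Disposition header. The sketch below does this with the standard library. It assumes the header carries a filename parameter, which this page does not state, so a fallback name is required. `filename_from_disposition` is a hypothetical helper.

```python
import os
from email.message import Message


def filename_from_disposition(header_value: str, fallback: str) -> str:
    """Extract a safe local filename from a Content-Disposition header value."""
    msg = Message()
    msg["Content-Disposition"] = header_value
    # Keep only the final path component so the header cannot steer
    # the write into another directory.
    name = os.path.basename(msg.get_filename() or "")
    return name or fallback
```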


GET /v1/documents/:id/export

Export OCR results as a Word (.docx) or Excel (.xlsx) file.

Query Parameters:

Parameter | Required | Values | Description
--------- | -------- | ------ | -----------
format | Yes | docx or xlsx | Export format
formatted | No | true (default) or false | Include formatting and title heading

Example:

curl -OJ "https://api.openesl.com/v1/documents/$DOC_ID/export?format=docx&formatted=true" \
-H "Authorization: Bearer $TOKEN"

Response: Binary file with appropriate MIME type.

Format | MIME Type
------ | ---------
docx | application/vnd.openxmlformats-officedocument.wordprocessingml.document
xlsx | application/vnd.openxmlformats-officedocument.spreadsheetml.sheet

Returns 400 if the document is not completed or has no OCR results.
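Building the export URL in code avoids shell-quoting problems with the query string. The following is a minimal sketch. `export_url` and `EXPORT_MIME_TYPES` are names invented for this example, and the client-side check on format is a convenience, not something the API requires.

```python
from urllib.parse import urlencode

# MIME types the export endpoint is documented to return, keyed by format.
EXPORT_MIME_TYPES = {
    "docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    "xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}


def export_url(doc_id: str, fmt: str, formatted: bool = True,
               base_url: str = "https://api.openesl.com") -> str:
    """Build the GET /v1/documents/:id/export URL with an encoded query string."""
    if fmt not in EXPORT_MIME_TYPES:
        raise ValueError(f"format must be one of {sorted(EXPORT_MIME_TYPES)}")
    query = urlencode({
        "format": fmt,
        "formatted": "true" if formatted else "false",
    })
    return f"{base_url}/v1/documents/{doc_id}/export?{query}"
```

A caller can compare the response Content-Type against `EXPORT_MIME_TYPES[fmt]` before writing the file to disk.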


POST /v1/documents/:id/email

Email the document (with the original file as an attachment) to the authenticated user's email address.

Response (202):

{
"message": "Document sent to your email."
}

The email is sent asynchronously, so a 202 response means the request was accepted, not that the message was delivered. Besides the attachment, the email contains a preview of the OCR text.

Returns 400 if the document is not completed or has no file stored.


PUT /v1/documents/:id/ocr

Save a user revision of the OCR text. The original OCR output is never modified; the revision is stored alongside it.

Request:

{
"revised_text": "BILL OF LADING\nCorrected text here..."
}

Response (200):

{
"ocr": {
"engine": "textract",
"language": "en",
"text": "BILL OF LADING\nOriginal OCR text...",
"revised_text": "BILL OF LADING\nCorrected text here...",
"revised_at": "2026-02-21T16:45:00.000000+00:00"
},
"message": "Revision saved."
}

Returns 400 if the document is not completed or has no OCR results.
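A revision request could be assembled as in the sketch below. `build_revision_request` is a hypothetical helper, and the Content-Type: application/json header is an assumption based on the JSON request body shown above; this page does not list the required headers for this endpoint.

```python
import json
import urllib.request


def build_revision_request(doc_id: str, revised_text: str, token: str,
                           base_url: str = "https://api.openesl.com"
                           ) -> urllib.request.Request:
    """Prepare (but do not send) PUT /v1/documents/:id/ocr."""
    body = json.dumps({"revised_text": revised_text}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/v1/documents/{doc_id}/ocr",
        data=body,
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",  # assumed, see lead-in
        },
    )
```

Passing the result to `urllib.request.urlopen` sends it. A 400 response surfaces as `urllib.error.HTTPError`, which a caller can catch to detect a document that is not yet completed.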