Standardize Documents

Standardize a batch of documents, either by passing a list of document IDs or by passing a dataset name. Pass a schemaId to standardize the documents against a specific schema, or leave it empty to let the AI create an ad hoc structure as it sees fit. Standardization handles lists (arrays) by splitting documents into smaller sub-documents behind the scenes; the AI will do its best to decide how and when a split is appropriate.
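As a rough illustration of the two ways to select documents, the sketch below builds a request body in Python. Only schemaId is named on this page; the field names documentIds and datasetName, the helper name, and the rule that exactly one of the two selectors must be given are assumptions made for the example, so check them against the actual request schema.

```python
from typing import Optional, Sequence


def build_standardize_body(
    document_ids: Optional[Sequence[str]] = None,
    dataset_name: Optional[str] = None,
    schema_id: Optional[str] = None,
) -> dict:
    """Build a request body that selects documents by ID list or by dataset name.

    Hypothetical helper: "documentIds" and "datasetName" are assumed field names.
    """
    # Assumption: exactly one selector is expected (IDs or a dataset name).
    if (document_ids is None) == (dataset_name is None):
        raise ValueError("Pass either document_ids or dataset_name, but not both.")

    body: dict = {}
    if document_ids is not None:
        body["documentIds"] = list(document_ids)  # assumed field name
    else:
        body["datasetName"] = dataset_name  # assumed field name

    # Leaving schemaId out corresponds to the ad hoc structure described above.
    if schema_id:
        body["schemaId"] = schema_id
    return body
```

Calling build_standardize_body(dataset_name="invoices") yields a body with no schemaId, which corresponds to letting the AI choose the structure.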

You can also specify the parameters below. By default they are left to auto, which lets the AI decide.

  1. displayMode - Controls how the AI sees the document. The options are:
    • auto - Automatically determine the best mode based on the document content.
    • spatial - Represent text in the document according to its spatial layout.
    • sections - Represent the document as a list of sections (paragraphs, tables, images, etc.) as seen in the web UX.
  2. splitMode - Controls how the AI splits the document into sub-documents. The options are:
    • auto - Automatically determine the best mode based on the document content.
    • all - Split the document into single-page sub-documents, so each page is handled separately.
    • never - Do not split the document at all, so the entire document is handled as a single unit. This can lead to poor performance for long documents, or documents with lots of dense data that needs to be extracted.
  3. effortLevel - Controls how much effort the AI puts into the standardization. The options are:
    • standard - Use the standard effort level.
    • high - Use the high effort level, which takes longer but can produce better results. Costs an additional 2 credits per page.
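The options above can be checked on the client before sending a request. The following is a minimal sketch, not part of any official SDK: the helper names are hypothetical, and it assumes that a parameter omitted from the request falls back to the service default. The allowed values and the 2-credits-per-page surcharge are taken from the list above.

```python
from typing import Optional

# Allowed values as listed on this page.
ALLOWED_VALUES = {
    "displayMode": {"auto", "spatial", "sections"},
    "splitMode": {"auto", "all", "never"},
    "effortLevel": {"standard", "high"},
}


def build_options(
    display_mode: Optional[str] = None,
    split_mode: Optional[str] = None,
    effort_level: Optional[str] = None,
) -> dict:
    """Return only the options that were set, after validating their values."""
    requested = {
        "displayMode": display_mode,
        "splitMode": split_mode,
        "effortLevel": effort_level,
    }
    options = {}
    for name, value in requested.items():
        if value is None:
            continue  # assumption: omitted parameters use the service default
        if value not in ALLOWED_VALUES[name]:
            allowed = sorted(ALLOWED_VALUES[name])
            raise ValueError(f"{name} must be one of {allowed}, got {value!r}")
        options[name] = value
    return options


def high_effort_surcharge(effort_level: Optional[str], page_count: int) -> int:
    """Extra credits for the high effort level: 2 per page, otherwise 0."""
    if page_count < 0:
        raise ValueError("page_count must be non-negative")
    return 2 * page_count if effort_level == "high" else 0
```

For a 10-page document, high_effort_surcharge("high", 10) gives 20 extra credits on top of whatever the base cost is; the base cost itself is not stated here.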