Standardization Parameters
Standardizing documents with a Schema involves a few configuration parameters. Learn what they are and how they work.
Understanding Standardization
Every document you upload goes through a Parsing stage, and may then optionally go through a Standardization stage.
- Parsing captures the raw ingredients—text, layout, tables, checkboxes, and images.
- Standardization turns that parsed content into structured data, usually based on a Schema so downstream tools always see the same fields.
Uploading a document always produces a Parsing result. Running a Standardization job on top of that result gives you validated fields that can flow into workflows and integrations.
Working With Schemas
A Schema defines the exact fields, data types, and validation rules you expect in the output. The Quick Start Guide walks through creating one, if you haven't already made one.
Almost all use cases are better served by using a schema, but if you have no idea what sort of document you're up against and just want to condense it into a useful set of fields, you may standardize without a schema.
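To make "fields, data types, and validation rules" concrete, here is a minimal sketch that assumes a JSON-Schema-style layout. The invoice fields and the `check` helper are invented for illustration, and the exact schema format DocuPipe accepts may differ; the Quick Start Guide is the authority there.

```python
# Hypothetical schema for an invoice. The field names and the
# JSON-Schema-style layout are assumptions, not DocuPipe's documented format.
invoice_schema = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total": {"type": "number", "minimum": 0},
        "paid": {"type": "boolean"},
    },
    "required": ["invoice_number", "total"],
}

PYTHON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def check(record, schema):
    """Return a list of problems; an empty list means the record fits."""
    problems = []
    for name in schema["required"]:
        if name not in record:
            problems.append(f"missing required field: {name}")
    for name, value in record.items():
        rule = schema["properties"].get(name)
        if rule is None:
            problems.append(f"unexpected field: {name}")
            continue
        # bool is a subclass of int in Python, so reject it for numbers explicitly.
        is_bool_as_number = rule["type"] == "number" and isinstance(value, bool)
        if is_bool_as_number or not isinstance(value, PYTHON_TYPES[rule["type"]]):
            problems.append(f"{name}: expected {rule['type']}")
        elif "minimum" in rule and value < rule["minimum"]:
            problems.append(f"{name}: below minimum {rule['minimum']}")
    return problems
```

A record that passes this kind of check is what "downstream tools always see the same fields" amounts to in practice: every consumer can rely on the same names and types.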
Parameter Reference
Standardization jobs expose a few controls so you can balance accuracy, speed, and credit usage.
Effort Level
Effort Level trades more credits and time for improved accuracy and reasoning:
- standard (default) is the fastest option for clean, predictable documents.
- high uses a more capable model and adds validation passes.
- extended goes deepest on long or tricky files where fields span many pages.
Display Mode
Determines how the document is presented to the AI:
- Spatial keeps approximate page layout so positional cues survive.
- Sections streams the viewer output from top to bottom, emitting Markdown tables.
- Image shares pixels alongside text, which is ideal for handwriting, signatures, or complex tables.
- Auto lets DocuPipe choose for you; stick with this unless you have a specific requirement.
Split Mode
Large files can be split so the AI works on smaller chunks. You can split a document yourself or let DocuPipe decide.
- All splits aggressively (often page-by-page). Use this for repetitive forms but avoid it when a field spans multiple pages.
- Never keeps the file intact so the AI can use full-context cues. Works best for short (1–10 page) documents.
- Auto balances both approaches and is the default for most workflows.
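The three parameters above can be gathered into one validated settings object before a job is submitted. The sketch below is illustrative only: the class, the snake_case field names, and the lowercase value spellings are assumptions rather than DocuPipe's actual request format, and `suggest_split_mode` is just a rule of thumb derived from the Split Mode descriptions.

```python
from dataclasses import dataclass, asdict

# Allowed values come from the parameter descriptions above; the lowercase
# spellings and the field names are assumptions made for illustration.
EFFORT_LEVELS = {"standard", "high", "extended"}
DISPLAY_MODES = {"spatial", "sections", "image", "auto"}
SPLIT_MODES = {"all", "never", "auto"}


def _check(name, value, allowed):
    if value not in allowed:
        raise ValueError(f"{name} must be one of {sorted(allowed)}, got {value!r}")


@dataclass(frozen=True)
class StandardizationOptions:
    effort_level: str = "standard"  # documented default
    display_mode: str = "auto"      # recommended unless you have a specific need
    split_mode: str = "auto"        # default for most workflows

    def __post_init__(self):
        _check("effort_level", self.effort_level, EFFORT_LEVELS)
        _check("display_mode", self.display_mode, DISPLAY_MODES)
        _check("split_mode", self.split_mode, SPLIT_MODES)


def suggest_split_mode(page_count, repetitive_form=False, fields_span_pages=False):
    """Rule of thumb only: short files stay whole, repetitive forms split."""
    if page_count <= 10:
        return "never"
    if repetitive_form and not fields_span_pages:
        return "all"
    return "auto"
```

For example, `asdict(StandardizationOptions(effort_level="high", split_mode=suggest_split_mode(40, repetitive_form=True)))` yields a plain dictionary that could be mapped onto whatever field names the real API expects.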
