Downloading Extraction Results

How to download your standardization results in different formats, including merging multiple documents into a single file.

After running Standardization on your documents, you can download the results in several formats. Select one or more standardizations from the dashboard and click Download to see the available options.

Download Formats

Format          What you get
CSV             One row per document in a single CSV file
JSON            All selected documents combined into one JSON file
Excel           Individual Excel files bundled in a zip
Merged Excel    All selected documents combined into a single Excel file
XML             Individual XML files bundled in a zip

All formats are available from both the dashboard and the API. See the API reference for the corresponding endpoints.

Merged Excel

The Merged Excel option combines extraction results from multiple documents into a single Excel workbook. This is useful when you have many similar documents (e.g., invoices, bank statements, insurance forms) and want all the data in one place.

How it works

  • Scalar fields (single values like "invoice_number", "date", "vendor_name") become a table where each document is one row. Columns are the field names.
  • Array fields (tables like "line_items", "transactions") are concatenated across all documents into a single sheet. Each row keeps its original data.
  • A filename column is added to every sheet, showing which document each row came from. If your schema already has a field called "filename", the column is named "docupipe_filename" instead to avoid conflicts.
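The three rules above can be sketched in plain Python. This is only an illustration of the described behavior, not DocuPipe's implementation: the function name `merge_results`, the `"main"` sheet key, and the input shape (a mapping from filename to extracted fields) are assumptions made for the example, as is checking for a "filename" clash in both top-level fields and array rows.

```python
from typing import Any

Row = dict[str, Any]


def _is_table(value: Any) -> bool:
    # Treat a list of dicts as an array field (a table).
    return isinstance(value, list) and all(isinstance(item, dict) for item in value)


def _field_names(fields: Row) -> set[str]:
    # Names of top-level fields plus the column names inside array fields.
    names = set(fields)
    for value in fields.values():
        if _is_table(value):
            for item in value:
                names.update(item)
    return names


def merge_results(docs: dict[str, Row]) -> dict[str, list[Row]]:
    """Hypothetical helper: merge per-document results into named sheets."""
    # Fall back to "docupipe_filename" when the schema already uses "filename".
    clash = any("filename" in _field_names(fields) for fields in docs.values())
    source_col = "docupipe_filename" if clash else "filename"

    # Collect scalar column names across all documents, in first-seen order.
    scalar_cols: list[str] = []
    for fields in docs.values():
        for name, value in fields.items():
            if not _is_table(value) and name not in scalar_cols:
                scalar_cols.append(name)

    sheets: dict[str, list[Row]] = {"main": []}
    for filename, fields in docs.items():
        # Scalar fields: one row per document; missing fields become empty cells.
        main_row: Row = {source_col: filename}
        for col in scalar_cols:
            main_row[col] = fields.get(col, "")
        sheets["main"].append(main_row)

        # Array fields: rows from every document are concatenated into one sheet.
        for name, value in fields.items():
            if _is_table(value):
                for item in value:
                    sheets.setdefault(name, []).append({source_col: filename, **item})
    return sheets
```

Calling `merge_results` with two invoices that each carry a `line_items` list returns a `"main"` list with two rows and a `"line_items"` list holding every line item, each tagged with its source filename.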

Example

Suppose you have 3 invoices, each with scalar fields (invoice_number, date, total) and an array field (line_items). The merged Excel will contain:

Main sheet:

filename           invoice_number    date          total
invoice_001.pdf    INV-100           2026-01-15    1,250.00
invoice_002.pdf    INV-101           2026-01-20    830.00
invoice_003.pdf    INV-102           2026-02-01    2,100.00

line_items sheet:

filename           description    quantity    unit_price    amount
invoice_001.pdf    Widget A       10          50.00         500.00
invoice_001.pdf    Widget B       5           150.00        750.00
invoice_002.pdf    Service Fee    1           830.00        830.00
invoice_003.pdf    Widget A       20          50.00         1,000.00
invoice_003.pdf    Widget C       10          110.00        1,100.00

You can merge up to 500 standardizations in a single download. The documents don't need to share the same schema; fields that don't exist in a particular document appear as empty cells.
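If you have more than 500 standardizations, one option is to split the IDs into batches and request one merged file per batch. The sketch below shows only the batching step; `batch_ids` and `MAX_MERGE_SIZE` are hypothetical names for this example, and each batch would produce its own separate workbook.

```python
from typing import Iterator, Sequence

MAX_MERGE_SIZE = 500  # per-download limit stated in this guide


def batch_ids(ids: Sequence[str], size: int = MAX_MERGE_SIZE) -> Iterator[list[str]]:
    """Hypothetical helper: yield consecutive chunks of at most `size` IDs."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(ids), size):
        yield list(ids[start:start + size])
```

For 1,200 IDs this yields three batches of 500, 500, and 200 IDs, in the original order.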

API usage

To request a merged Excel download via the API, send a POST request with the standardization IDs:

curl -X POST https://api.docupipe.ai/standardization/download/merged-excel \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"standardizationIds": ["std_id_1", "std_id_2", "std_id_3"]}'

The response contains a presigned URL that you can use to download the file. The URL expires after 24 hours.
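The same request can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl command above: the function names and the 60-second timeout are assumptions, and because this page does not name the response field that holds the presigned URL, the sketch returns the parsed JSON as-is rather than guessing a key.

```python
import json
import urllib.request

MERGED_EXCEL_URL = "https://api.docupipe.ai/standardization/download/merged-excel"


def build_merged_excel_request(
    api_key: str, standardization_ids: list[str]
) -> urllib.request.Request:
    """Build the POST request shown in the curl example (nothing is sent here)."""
    body = json.dumps({"standardizationIds": standardization_ids}).encode("utf-8")
    return urllib.request.Request(
        MERGED_EXCEL_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def request_merged_excel(
    api_key: str, standardization_ids: list[str], timeout: float = 60.0
) -> dict:
    """Send the request and return the parsed JSON response body."""
    request = build_merged_excel_request(api_key, standardization_ids)
    # The timeout value is an assumption; large merges may need longer.
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Inspect the returned dictionary, or check the API reference, to find the field containing the download URL.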

For very large merges, the download may take a few seconds to generate. The file is built server-side and uploaded to cloud storage before the download URL is returned.