Downloading Extraction Results
How to download your standardization results in different formats, including merging multiple documents into a single file.
After running Standardization on your documents, you can download the results in several formats. Select one or more standardizations from the dashboard and click Download to see the available options.
Download Formats
| Format | What you get |
|---|---|
| CSV | One row per document in a single CSV file |
| JSON | All selected documents combined into one JSON file |
| Excel | Individual Excel files bundled in a zip |
| Merged Excel | All selected documents combined into a single Excel file |
| XML | Individual XML files bundled in a zip |
All formats are available from both the dashboard and the API. See the API reference for the corresponding endpoints.
Merged Excel
The Merged Excel option combines extraction results from multiple documents into a single Excel workbook. This is useful when you have many similar documents (e.g., invoices, bank statements, insurance forms) and want all the data in one place.
How it works
- Scalar fields (single values like "invoice_number", "date", "vendor_name") become a table where each document is one row. Columns are the field names.
- Array fields (tables like "line_items", "transactions") are concatenated across all documents into a single sheet. Each row keeps its original data.
- A filename column is added to every sheet, showing which document each row came from. If your schema already has a field called "filename", the column is named "docupipe_filename" instead to avoid conflicts.
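The three rules above can be sketched in a few lines of Python. This is an illustration only, not DocuPipe's server-side implementation: the function names, the `"Main"` sheet key, the assumption that array items are dictionaries, and the choice to apply the `docupipe_filename` rename per sheet are all hypothetical.

```python
from collections import defaultdict


def provenance_column(rows):
    """Pick the source-document column name, avoiding a clash with the schema."""
    has_clash = any("filename" in row for row in rows)
    return "docupipe_filename" if has_clash else "filename"


def merge_results(docs):
    """Merge (source_filename, result_dict) pairs into sheets of row dicts.

    Assumption: list-valued fields are array fields whose items are dicts;
    every other field is treated as a scalar.
    """
    main_rows = []                   # one (source, scalar_fields) pair per document
    array_rows = defaultdict(list)   # sheet name -> [(source, item), ...]

    for source, result in docs:
        scalars = {}
        for field, value in result.items():
            if isinstance(value, list):
                # Array field: every item becomes a row on that field's sheet.
                for item in value:
                    array_rows[field].append((source, item))
            else:
                scalars[field] = value
        main_rows.append((source, scalars))

    def tag(pairs):
        # Prepend the provenance column to every row of one sheet.
        column = provenance_column([fields for _, fields in pairs])
        return [{column: source, **fields} for source, fields in pairs]

    sheets = {"Main": tag(main_rows)}
    for name, pairs in array_rows.items():
        sheets[name] = tag(pairs)
    return sheets
```

Calling `merge_results` with two invoices that each carry a `line_items` list yields a `"Main"` sheet with two rows and a `"line_items"` sheet with one row per line item, each tagged with its source file.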
Example
Suppose you have 3 invoices, each with scalar fields (invoice_number, date, total) and an array field (line_items). The merged Excel will contain:
Main sheet:
| filename | invoice_number | date | total |
|---|---|---|---|
| invoice_001.pdf | INV-100 | 2026-01-15 | 1,250.00 |
| invoice_002.pdf | INV-101 | 2026-01-20 | 830.00 |
| invoice_003.pdf | INV-102 | 2026-02-01 | 2,100.00 |
line_items sheet:
| filename | description | quantity | unit_price | amount |
|---|---|---|---|---|
| invoice_001.pdf | Widget A | 10 | 50.00 | 500.00 |
| invoice_001.pdf | Widget B | 5 | 150.00 | 750.00 |
| invoice_002.pdf | Service Fee | 1 | 830.00 | 830.00 |
| invoice_003.pdf | Widget A | 20 | 50.00 | 1,000.00 |
| invoice_003.pdf | Widget C | 10 | 110.00 | 1,100.00 |
You can merge up to 500 standardizations in a single download. The documents don't need to share the same schema; fields that don't exist in a particular document appear as empty cells.
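As a rough illustration of both points, the sketch below splits a long list of IDs into groups of at most 500 and lays out rows with different fields under one shared header. The helper names are hypothetical, and the column order (first appearance) and the empty string used for a blank cell are assumptions, not documented behavior.

```python
MAX_PER_DOWNLOAD = 500  # limit stated in this guide


def batches(ids, size=MAX_PER_DOWNLOAD):
    """Split a list of standardization IDs into download-sized groups."""
    return [ids[i:i + size] for i in range(0, len(ids), size)]


def to_table(rows):
    """Return (header, body); fields missing from a row become empty cells."""
    header = []
    for row in rows:
        for name in row:
            if name not in header:
                header.append(name)  # assumed order: first appearance
    body = [[row.get(name, "") for name in header] for row in rows]
    return header, body
```

With this layout, a bank statement that has no `total` field still gets a row in the merged sheet; its `total` cell is simply blank.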
API usage
To request a merged Excel download via the API, send a POST request with the IDs of the standardizations to merge:
curl -X POST https://api.docupipe.ai/standardization/download/merged-excel \
-H "Authorization: Bearer YOUR_API_KEY" \
-H "Content-Type: application/json" \
-d '{"standardizationIds": ["std_id_1", "std_id_2", "std_id_3"]}'

The response contains a presigned URL that you can use to download the file. The URL expires after 24 hours.
For very large merges, the download may take a few seconds to generate. The file is built server-side and uploaded to cloud storage before the download URL is returned.
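The same request can be made from Python using only the standard library. This is a minimal sketch that mirrors the curl command above: the function names are hypothetical, the 60-second timeout is an arbitrary allowance for large merges, and because this page does not name the response field that holds the presigned URL, the function returns the parsed JSON as-is rather than guessing a key.

```python
import json
import urllib.request

API_URL = "https://api.docupipe.ai/standardization/download/merged-excel"


def build_request(api_key, standardization_ids):
    """Build the POST request shown in the curl example, without sending it."""
    body = json.dumps({"standardizationIds": list(standardization_ids)})
    return urllib.request.Request(
        API_URL,
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def request_merged_excel(api_key, standardization_ids, timeout=60):
    """Send the request and return the decoded JSON response.

    The timeout is an assumption; the file is generated before the
    response is returned, so large merges may take a few seconds.
    """
    request = build_request(api_key, standardization_ids)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Since the presigned URL expires after 24 hours, it is safer to download the file soon after the call than to store the URL for later use.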