This tutorial shows how to model invoices in PdfParse using child tables and foreign keys so your output lands in normalized SQLite, not flat JSON.
Demo placeholder (swap in your own clip):
<video controls width="100%" poster="/blog/placeholders/invoice-demo-poster.png">
<source src="/blog/invoice-builder.webm" type="video/webm" />
<source src="/blog/invoice-builder.mp4" type="video/mp4" />
</video>
1) Schema: parent + children
invoice(parent):id,invoice_number,invoice_date,subtotal,tax,total,currency.customer(child):id,invoice_id(FK →invoice.id),name,email,company,address.invoice_items(child, repeating):id,invoice_id(FK →invoice.id),description,quantity,unit_price,line_total.
Copyable schema (tables)
invoice
| key | type | prompt |
|---|---|---|
| invoice_number | string | ”Invoice number shown on the document” |
| invoice_date | date | ”Issue date of the invoice” |
| subtotal | number | ”Subtotal before tax/fees” |
| tax | number | ”Total tax amount” |
| total | number | ”Final total after tax/fees” |
| currency | string | ”Currency code (e.g., USD, EUR) from the document” |
customer (child of invoice)
| key | type | prompt |
|---|---|---|
| invoice_id | number | ”Link to parent invoice id” |
| name | string | ”Customer full name or business name” |
| string | ”Customer email address if present” | |
| company | string | ”Company name if different from customer name” |
| address | string | ”Full mailing address for the customer” |
invoice_items (repeating child of invoice)
| key | type | prompt |
|---|---|---|
| invoice_id | number | ”Link to parent invoice id” |
| description | string | ”Line item description” |
| quantity | number | ”Quantity on the line item” |
| unit_price | number | ”Price per unit” |
| line_total | number | ”Quantity multiplied by unit price (if reliable)” |
Copyable schema (JSON)
{
"invoice": {
"fields": [
{ "key": "invoice_number", "type": "string", "prompt": "Invoice number shown on the document" },
{ "key": "invoice_date", "type": "date", "prompt": "Issue date of the invoice" },
{ "key": "subtotal", "type": "number", "prompt": "Subtotal before tax/fees" },
{ "key": "tax", "type": "number", "prompt": "Total tax amount" },
{ "key": "total", "type": "number", "prompt": "Final total after tax/fees" },
{ "key": "currency", "type": "string", "prompt": "Currency code (e.g., USD, EUR) from the document" }
]
},
"customer": {
"parent": "invoice",
"fields": [
{ "key": "invoice_id", "type": "number", "prompt": "Link to parent invoice id" },
{ "key": "name", "type": "string", "prompt": "Customer full name or business name" },
{ "key": "email", "type": "string", "prompt": "Customer email address if present" },
{ "key": "company", "type": "string", "prompt": "Company name if different from customer name" },
{ "key": "address", "type": "string", "prompt": "Full mailing address for the customer" }
]
},
"invoice_items": {
"parent": "invoice",
"repeating": true,
"fields": [
{ "key": "invoice_id", "type": "number", "prompt": "Link to parent invoice id" },
{ "key": "description", "type": "string", "prompt": "Line item description" },
{ "key": "quantity", "type": "number", "prompt": "Quantity on the line item" },
{ "key": "unit_price", "type": "number", "prompt": "Price per unit" },
{ "key": "line_total", "type": "number", "prompt": "Quantity multiplied by unit price (if reliable)" }
]
}
}
2) Why this shape
- Keeps customers normalized (no duplicated addresses per line).
- Items live in their own table so you can aggregate (
SUM(line_total)) and filter (quantity > 5). - Foreign keys (
invoice_id) keep BI tools and LLM agents traversing relationships cleanly.
3) Schema prompts and validation
- Mark
invoice_itemsas a repeating child table. - Add prompts per field, e.g.,
quantity→ “Integer quantity on the line item”;unit_price→ “Numeric price per item”. - Capture currency code on the parent invoice so totals are unambiguous.
- Consider computing
line_totalif the model is reliable; otherwise keepquantityandunit_priceonly.
4) Quick SQL unlocked
SELECT i.invoice_number, SUM(it.line_total) AS total_lines
FROM invoice i
JOIN invoice_items it ON it.invoice_id = i.id
GROUP BY i.invoice_number;
5) Field mapping checklist
- Parent: invoice number, issue date, currency, subtotal, tax, total.
- Customer child: name, email, company, address lines. Link via
invoice_id. - Items child: description, quantity, unit_price, line_total (optional).
6) Testing and validation
- Run a few samples; export as SQLite and confirm every
invoice_items.invoice_idmatches aninvoice.id. - Spot-check numeric types (quantity/price/totals) aren’t strings.
- Check totals reconcile; if not, tighten prompts for tax/discounts.
7) Download your SQLite database
Once you’ve processed your invoices and reviewed the extractions, you can download the complete SQLite database containing all your invoice data with proper foreign key relationships.
How to download:
- Navigate to Dashboard → Projects → [Your Invoice Project]
- Click on the Downloads tab in the navigation
- Click the “Download Database” button
- Save the
.sqlitefile to your local machine
Download demo placeholder (swap in your own clip):
<video controls width="100%" poster="/blog/placeholders/download-demo-poster.png"><source src="/blog/sqlite-download.webm" type="video/webm" /><source src="/blog/sqlite-download.mp4" type="video/mp4" /></video>
Breadcrumb navigation: Dashboard → Projects → [Your Project] → Downloads → Download Database
The downloaded SQLite file contains all three tables (invoice, customer, invoice_items) with foreign keys intact. You can immediately:
- Open it with any SQLite client (DB Browser, DataGrip, etc.)
- Query it with Python, Node.js, or any programming language
- Import into Excel, Power BI, or other analytics tools
- Feed it to AI agents for natural language querying
See our detailed guide on integrating SQLite databases into your workflow for Python examples, API patterns, and BI tool connections.
Related tutorials
Continue learning about structured document extraction:
- Build a Bank Statement Extractor with JSON Arrays - See how to model transaction data using JSON arrays for flexibility
- Build a Resume Database with SQLite - Learn how to extract and query hundreds of resumes with SQLite downloads
- Introducing PdfParse - Product overview and core features walkthrough
Need help with your invoice schema? Contact us at support@pdfparse.net for:
- Custom schema design for complex invoices
- Handling edge cases (discounts, taxes per line, partial payments, multi-currency)
- Bulk processing guidance
- Integration support for accounting systems (QuickBooks, Xero, NetSuite)