taxon
EU-sovereign by design — every byte stays in the EU

Turn documents into structured data, in the EU.

Taxon is an OCR + AI extraction API for teams that can't send their documents to US-hosted models. Describe the fields you want with JSON Schema; get matching JSON back, with confidences and a tier-trace.

No credit card. 100 extractions/month free. Self-hosted option available.

JSON Schema in. JSON out.

No vertical "invoice extractor" lock-in — describe the fields you actually want and Taxon's tier router picks the cheapest model that can answer. Switch from Mindee, Reducto, Extend, or LlamaParse in an afternoon.

1. Describe what you want
schema.json JSON Schema
{
  "type": "object",
  "properties": {
    "invoice_number": { "type": "string" },
    "issue_date":     { "type": "string", "format": "date" },
    "total":          { "type": "number" },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": { "type": "string" },
          "quantity":    { "type": "number" },
          "price":       { "type": "number" }
        }
      }
    }
  }
}
2. POST a file, get JSON
extract.sh cURL
# Upload a PDF
FILE_ID=$(curl -sS -X POST https://app.taxon.kfs.hr/v1/files \
  -H "Authorization: Bearer $TAXON_KEY" \
  -F file=@invoice.pdf \
  | jq -r .id)

# Extract against the schema
# (expansions are double-quoted so the JSON body stays a single argument)
curl -sS -X POST https://app.taxon.kfs.hr/v1/extractions \
  -H "Authorization: Bearer $TAXON_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "file_id": "'"$FILE_ID"'",
    "json_schema": '"$(cat schema.json)"'
  }'
3. Get back structured data
response.json 200 OK
{
  "id": "ext_8a3f…",
  "status": "completed",
  "tier": 1,
  "confidence": 0.94,
  "data": {
    "invoice_number": "INV-2026-0481",
    "issue_date": "2026-04-12",
    "total": 1247.50,
    "line_items": [/* … */]
  }
}
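
One way to consume a response like this is to gate on `status` and `confidence` before trusting `data`. The sketch below is Python and relies only on the fields shown in the sample response; the 0.90 threshold, the `needs_review` helper, and the `ext_example` id are illustrative assumptions, not part of the Taxon API.

```python
import json

# Illustrative threshold (an assumption); tune it for your own documents.
REVIEW_THRESHOLD = 0.90


def needs_review(response: dict, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True when an extraction should go to a human reviewer."""
    # Anything that did not complete is never auto-accepted.
    if response.get("status") != "completed":
        return True
    # A missing confidence is treated as zero (an assumption of this sketch).
    return response.get("confidence", 0.0) < threshold


# Hypothetical response, shaped like the sample above.
sample = json.loads("""
{
  "id": "ext_example",
  "status": "completed",
  "tier": 1,
  "confidence": 0.94,
  "data": {"invoice_number": "INV-2026-0481", "total": 1247.50}
}
""")

print(needs_review(sample))        # False
print(needs_review(sample, 0.95))  # True
```

Keeping the threshold as a parameter lets stricter workflows, such as payments, use a higher bar than bulk archiving.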

Built for teams that take EU residency seriously.

Six things closed-API competitors structurally cannot offer.

EU-sovereign data path

Inference on Mistral La Plateforme, Nebius, Scaleway, OVHcloud, or your own self-hosted vLLM. Storage in Hetzner Object Storage. Zero US subprocessors on the document path. Schrems II clean.

Workspace-scoped corrections gold

Every correction your team makes lands in this workspace's audit and accuracy ledger, and is only ever used for you. The accumulated value is yours, schema versions stay deterministic, and your historical extractions are never silently re-interpreted.

Horizontal — JSON Schema driven

No "invoice extractor" / "receipt extractor" / "ID parser" SKU maze. One API, one schema language. Works for invoices, contracts, ID cards, receipts, lab reports — anything you can describe.

DSR endpoints from v1

Access export, erasure, rectification, portability — first-class API features, not bolt-ons. PII redaction toggle as a pre-LLM step. Immutable audit log. DPA on file with every subprocessor.

Self-host the whole thing

Helm chart for k3s, Postgres-backed job queue, vLLM-friendly provider abstraction. Run it on-prem when your auditors require it; switch back to the SaaS without code changes.

Tier-routed pricing

Tier 0 → text-only LLM (cents). Tier 1 → vision LLM (when needed). Tier 2 → docTR + LLM fallback. Pay only for the tier the router actually uses; trace it in every response.
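
Because every response carries its tier, you can audit what the router actually did for your workload. A minimal Python sketch, assuming only the `status` and `tier` fields from the sample response; the `tier_usage` helper and the sample list are hypothetical.

```python
from collections import Counter


def tier_usage(responses: list[dict]) -> Counter:
    """Count completed extractions per tier, read from each response's `tier` field."""
    return Counter(
        r["tier"]
        for r in responses
        if r.get("status") == "completed" and "tier" in r
    )


# Hypothetical batch of responses; only the relevant fields are shown.
responses = [
    {"status": "completed", "tier": 0},
    {"status": "completed", "tier": 0},
    {"status": "completed", "tier": 1},
    {"status": "completed", "tier": 2},
]

usage = tier_usage(responses)
for tier in sorted(usage):
    print(f"tier {tier}: {usage[tier]} extraction(s)")
```

If most of your documents land on Tier 0, forcing a higher tier is unlikely to be worth the cost.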

Transparent, EU-VAT inclusive.

Paddle handles EU VAT for you (the One Stop Shop scheme, formerly MOSS). Cancel any time, export your data any time, no minimum commit on the self-serve tiers.

Free
€0
/month, forever
  • 1,000 pages / month
  • Auto-tier routing
  • No credit card
  • Community support
Start free
Starter
Beta
€69
/month
  • 3,000 pages included
  • €0.025 / page above
  • up to 30,000 pages / month
  • 5 seats
  • force_tier Fast / Auto
Start Starter
Growth
Beta
€299
/month
  • 18,000 pages included
  • €0.020 / page above
  • up to 150,000 pages / month
  • 10 seats
  • force_tier Auto / Fast / Accurate
Start Growth
Scale
Soon
/month
  • Custom-LoRA fine-tuning
  • Higher concurrency
  • Custom retention windows
  • Priority support
  • Lands when LoRA + multi-region failover ship.
Notify me
Enterprise
Custom
contact us
  • Bring-your-own-model
  • On-prem / single-tenant
  • Signed DPA, SSO, audit hooks
  • 99.9 % SLA, custom support
  • EU-residency guarantee
Talk to us

Beta tiers are full-featured paid plans that don't yet carry a formal uptime SLA. We aim for high reliability but we're transparent about what we don't promise — Enterprise SLA is available on request once we exit Beta.

Pricing is per page, not per token — so you can predict the bill from your document volume without knowing what LLM ran behind the scenes. Each tier includes a soft monthly cap; we'll never silently bill past it.
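
To make the per-page arithmetic concrete, here is a small Python estimator built from the Starter and Growth numbers listed above. It is a sketch under two assumptions: that the "up to N pages / month" figure is the soft cap, and that going past it needs a plan change rather than more overage. The `Plan` and `monthly_bill` names are illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Plan:
    name: str
    base_eur: float       # monthly fee
    included_pages: int   # pages covered by the fee
    overage_eur: float    # price per page above the included volume
    cap_pages: int        # "up to" figure, treated here as the soft cap


STARTER = Plan("Starter", 69.0, 3_000, 0.025, 30_000)
GROWTH = Plan("Growth", 299.0, 18_000, 0.020, 150_000)


def monthly_bill(plan: Plan, pages: int) -> float:
    """Estimate one month's bill in euros for a given page volume."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    if pages > plan.cap_pages:
        # Assumption of this sketch: volume past the cap is not billed as overage.
        raise ValueError(f"{pages} pages exceeds the {plan.name} cap")
    extra = max(0, pages - plan.included_pages)
    return round(plan.base_eur + extra * plan.overage_eur, 2)


print(monthly_bill(STARTER, 10_000))  # 244.0
print(monthly_bill(GROWTH, 10_000))   # 299.0
print(monthly_bill(STARTER, 12_200))  # 299.0
```

With these numbers, Starter and Growth cost the same at 12,200 pages a month; above that volume Growth is the cheaper plan.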

Drop in a PDF. See structured JSON.

No setup, no install. The free tier covers most evaluations end-to-end.

Open the dashboard