# Mildport — every customer file, safely into port

> The embeddable import wizard and self-hosted engine that turns messy CSVs, spreadsheets, PDFs — even scans — into exactly the records your product expects. Deterministic. Explainable. Yours.

Canonical: https://mildport.com/ · Agent index: https://mildport.com/llms.txt
Source of truth: the landing sections in `apps/landing/src/app/sections/`, which this file mirrors. Last updated: 2026-06-11.

**Guarantees:** Runs in your VPC · Offline signed licenses · No black-box decisions · WCAG 2.2 AA review grid · Tenant #1 is our own CRM

Built by **Capitality** — and run in production by Capitality as tenant #1: the same widget, the same public API, the same license gates. We deleted our own importer to get here (the cutover commit: 142 files, +343 −12,236).

## Everything an importer owes you

- **Reads almost anything.** CSV, TSV, XLSX, ODS, JSON, XML — plus PDF table extraction, ZUGFeRD-style document facts, and OCR for scans and photos. Async jobs handle the heavy files.
- **Deterministic first, AI when it earns it.** Headers and value shapes scored by an explainable engine — same input, same answer. An optional, evidence-gated AI judge re-ranks only the uncertain tail: bring your own model, rationale shown in the grid, off by default.
- **Remembers every mapping.** Confirmed mappings persist per template and fingerprint; learned header aliases improve future matches. Your customer maps a file once.
- **Multi-file imports.** Contacts in one file, companies in another? Drop both, join on a key — left or inner — and import a single coherent dataset.
- **A review grid that fixes.** Inline editing, per-cell validation, error-row export, your own cleaning hooks and step gates — and a WCAG 2.2 AA accessible grid mode.
- **References that resolve.** Rows can point at records that already exist in your product. Mildport resolves them against your datasets — chips, pickers, graph apply.
- **Your brand, not ours.** Light-DOM theming over `--mildport-*` tokens: any design system, dark mode included.
- **Delivery you can audit.** HMAC-signed apply webhooks with durable retries and a delivery log — or browser-mode callbacks. Usage metering and retention policies built in.
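
To make "deterministic first" concrete, here is a minimal sketch of header names and value shapes voting together on a column. It is an illustration only, not Mildport's engine: the `shapes` and `aliases` tables, the `scoreColumn` function, and the 50/50 weighting are all assumptions made for this example.

```typescript
// Hypothetical shape detectors; a real engine would cover more (IBANs, phones, ...).
const shapes: Record<string, RegExp> = {
  email: /^[^\s@]+@[^\s@]+\.[^\s@]+$/,
  date: /^\d{4}-\d{2}-\d{2}$/,
};

// Hypothetical header aliases per target field.
const aliases: Record<string, string[]> = {
  email: ['email', 'e-mail', 'mail'],
  date: ['date', 'created', 'signup date'],
};

interface Candidate {
  field: string;
  score: number;
  headerScore: number; // 1 if the header is a known alias, else 0
  valueScore: number;  // share of non-empty values matching the field's shape
}

function scoreColumn(header: string, values: string[]): Candidate[] {
  const normalized = header.trim().toLowerCase();
  const nonEmpty = values.filter((v) => v.trim() !== '');
  return Object.keys(shapes)
    .map((field) => {
      const headerScore = aliases[field].includes(normalized) ? 1 : 0;
      const hits = nonEmpty.filter((v) => shapes[field].test(v.trim())).length;
      const valueScore = nonEmpty.length === 0 ? 0 : hits / nonEmpty.length;
      // Fixed weights (an assumption) and no randomness: same input, same answer.
      const score = 0.5 * headerScore + 0.5 * valueScore;
      return { field, score, headerScore, valueScore };
    })
    // Ties break alphabetically so the ranking stays stable.
    .sort((a, b) => b.score - a.score || a.field.localeCompare(b.field));
}

// The header is unknown, but two of three values look like emails.
const ranked = scoreColumn('Kontakt', ['ada@example.com', 'grace@example.com', 'n/a']);
console.log(ranked[0]);
```

Because each candidate carries its `headerScore` and `valueScore`, a review grid could show why a mapping was proposed, and a low top score is a natural signal for the "uncertain tail" that an optional judge might re-rank.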

## How it works — one element, four moments

1. **Drop in the element.** An Angular 21 custom element that works in React, Vue, Svelte or plain HTML. Attributes in, DOM events out — no SDK lock-in, no iframe.
2. **Your customer uploads anything.** CSV, TSV, XLSX, ODS, JSON, XML, PDFs with tables, scans and photos via OCR. Multiple files at once — joined on a key into one dataset.
3. **Mildport matches, they confirm.** Deterministic header + value matching with visible confidence, validation on every cell, inline fixes in the review grid, and your own cleaning hooks — client-side, before anything is delivered.
4. **You receive clean rows.** An HMAC-signed apply webhook with retries and a delivery audit — or an `onResults` callback in the browser. Either way: rows shaped exactly like your schema, with the mapping that produced them.

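```ts
import { defineImportSuiteElement } from '@capitality-io/import-suite-ng';

await defineImportSuiteElement('mildport-import'); // framework-agnostic

// Markup in your host page or template (JSX-style binding shown for the key):
// <mildport-import
//   api-base-url="https://imports.your-infra.example"
//   license-key={SIGNED_TENANT_KEY}
// ></mildport-import>

// Grab the rendered element, then configure it from script.
// `yourTargetSchema` and `sync` are your own schema object and event handler.
const el = document.querySelector('mildport-import') as HTMLElement & { columnSchema: unknown };
el.columnSchema = yourTargetSchema;          // what a clean row looks like
el.addEventListener('import-applied', sync); // or a signed apply webhook
```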

Published as `@capitality-io/import-suite-ng` (React wrapper: `@capitality-io/import-suite-react`) on GitHub Packages — restricted access, included with a pilot/license.
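
On the receiving end of step 4, a host would typically verify the HMAC signature before trusting the rows. The sketch below shows the general technique with Node's `crypto` module. The signing scheme (hex-encoded HMAC-SHA256 over the raw request body), the `verifyWebhook` helper, and the example secret are assumptions for illustration; check your Mildport deployment for the actual header name and format.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Assumption: the signature is a hex-encoded HMAC-SHA256 of the raw request body.
function verifyWebhook(rawBody: string, signatureHex: string, secret: string): boolean {
  const expected = createHmac('sha256', secret).update(rawBody, 'utf8').digest();
  const received = Buffer.from(signatureHex, 'hex');
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return received.length === expected.length && timingSafeEqual(received, expected);
}

// Self-check with a locally computed signature (hypothetical secret and payload).
const secret = 'whsec_example';
const body = JSON.stringify({ rows: [{ email: 'ada@example.com' }] });
const goodSignature = createHmac('sha256', secret).update(body, 'utf8').digest('hex');

console.log(verifyWebhook(body, goodSignature, secret));       // true
console.log(verifyWebhook(body + ' ', goodSignature, secret)); // false
```

Verify against the raw bytes of the request, not a re-serialized JSON object, since re-serialization can change whitespace or key order and invalidate the signature.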

## Why self-hosted — the quadrant the cloud importers can't take

Flatfile, OneSchema and Dromo are excellent — and they're someone else's cloud. Mildport is the import engine for the deals where that's the dealbreaker: regulated data, security reviews, procurement that reads the architecture diagram.

- **Runs in your infrastructure.** A self-contained Docker Compose stack — service, Mongo, decode sidecars, optional MinIO. Or embed it next to your own services. Helm chart on the roadmap, preflight check included today.
- **Licenses verify offline.** Signed EdDSA license keys checked against a public key you deploy — no license server, no phone-home, air-gap friendly. Issue per-tenant keys with one CLI.
- **Explainable, replayable.** Deterministic matchers with visible scores, an event-sourced normalization pipeline you can replay, webhook deliveries with an audit trail. When compliance asks "why this mapping?" — there's an answer.
- **AI as an accelerant, never a black box.** An optional AI judge re-ranks only the low-confidence tail, must cite evidence, and starts in shadow — it goes live per tenant only after its suggestions measurably agree with what your people accept. Bring your own model (OpenAI-compatible or Anthropic), so even the AI stays in your VPC. Off by default, every decision logged.
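
Offline verification, as described above, needs only the deployed public key and the license string. A minimal sketch using Node's built-in Ed25519 support follows. The key layout (`base64url(claims JSON).base64url(signature)`), the `LicenseClaims` fields, and the `verifyLicense` helper are assumptions for illustration, not Mildport's actual license format.

```typescript
import { generateKeyPairSync, sign, verify, KeyObject } from 'node:crypto';

// Hypothetical claims carried inside a license key.
interface LicenseClaims {
  tenant: string;
  plan: string;
  expiresAt: string; // ISO 8601 timestamp
}

// Assumed layout: base64url(JSON claims) + '.' + base64url(Ed25519 signature).
function verifyLicense(key: string, publicKey: KeyObject, now: Date): LicenseClaims | null {
  const [payloadB64, signatureB64] = key.split('.');
  if (!payloadB64 || !signatureB64) return null;
  const payload = Buffer.from(payloadB64, 'base64url');
  const signature = Buffer.from(signatureB64, 'base64url');
  // Ed25519 takes no separate digest algorithm, hence `null`.
  if (!verify(null, payload, publicKey, signature)) return null;
  const claims = JSON.parse(payload.toString('utf8')) as LicenseClaims;
  return new Date(claims.expiresAt).getTime() > now.getTime() ? claims : null;
}

// Demo: issue a key with a throwaway key pair, then verify it with no network call.
const { publicKey, privateKey } = generateKeyPairSync('ed25519');
const claimsJson = JSON.stringify({ tenant: 'acme', plan: 'pilot', expiresAt: '2999-01-01T00:00:00Z' });
const payloadBytes = Buffer.from(claimsJson, 'utf8');
const signatureB64 = sign(null, payloadBytes, privateKey).toString('base64url');
const licenseKey = `${payloadBytes.toString('base64url')}.${signatureB64}`;

// A forged key reuses the signature with upgraded claims; verification must fail.
const forgedJson = JSON.stringify({ tenant: 'acme', plan: 'enterprise', expiresAt: '2999-01-01T00:00:00Z' });
const forgedKey = `${Buffer.from(forgedJson, 'utf8').toString('base64url')}.${signatureB64}`;

const claims = verifyLicense(licenseKey, publicKey, new Date());
const forged = verifyLicense(forgedKey, publicKey, new Date());
console.log(claims?.plan, forged);
```

The service only ever holds the public key, so a compromised deployment cannot mint new licenses, and nothing in the check requires outbound network access.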

## Shipped (from the repo's changelog, recent first)

- **ai 1.0 — AI mapping, deterministic-first, trust earned.** Evidence-gated judge re-ranks the low-confidence tail. Shadow-first per tenant, BYO model, decision log, AI picks tinted in the grid with rationale on hover.
- **Cutover — Capitality deleted its own importer.** Our CRM consumes the standalone Mildport service over REST as tenant #1.
- **v0.2.x — Multi-file imports.** Join contacts + companies on a key pair, left or inner, with join-health feedback.
- **v0.2.6 — Dark mode + theme polish.** Spartan-aligned `--mildport-*` tokens with light/dark presets.
- **v0.2.2 — OCR.** Image-only PDFs and receipt photos become reviewable rows with per-cell confidence and a mandatory extract-confirm step.
- **v0.2.x — PDF ingestion, both kinds.** Embedded tables extracted server-side; document-style PDFs (ZUGFeRD invoices) map grounded facts with a pdf.js preview.
- **v0.2.0 — Reference resolution.** Imported rows resolve against existing records — status chips, ambiguity pickers, graph apply, SSRF-guarded server datasets.
- **v0.1.0 — Value-based detection.** Value shapes (emails, IBANs, dates, phones) vote alongside header names.
- **Engine — self-host stack + signed everything.** Docker Compose with Mongo, decode sidecars, optional MinIO; HMAC-signed deliveries with audit; EdDSA offline licenses with revocation lists and per-plan entitlements.
- **Packages — Angular + React published.** `@capitality-io/import-suite-ng` and `@capitality-io/import-suite-react` on GitHub Packages, semver-pinned by hosts.
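
The difference between the left and inner joins mentioned in the multi-file entry can be sketched in a few lines. This is a generic illustration of the two join modes, not Mildport's implementation; `ImportRow`, `joinRows`, and the sample data are hypothetical.

```typescript
type ImportRow = Record<string, string>;

// Join `left` rows to `right` rows on a key pair.
// 'left' keeps left rows with no match; 'inner' drops them.
function joinRows(
  left: ImportRow[],
  right: ImportRow[],
  leftKey: string,
  rightKey: string,
  mode: 'left' | 'inner',
): ImportRow[] {
  // Index the right-hand file by its key for constant-time lookups.
  const index = new Map<string, ImportRow[]>();
  for (const row of right) {
    const key = row[rightKey];
    if (key === undefined || key === '') continue; // rows without a key never match
    const bucket = index.get(key) ?? [];
    bucket.push(row);
    index.set(key, bucket);
  }

  const out: ImportRow[] = [];
  for (const row of left) {
    const matches = index.get(row[leftKey] ?? '') ?? [];
    if (matches.length === 0) {
      if (mode === 'left') out.push({ ...row });
      continue;
    }
    // One output row per match; left columns win on name clashes.
    for (const match of matches) out.push({ ...match, ...row });
  }
  return out;
}

const contacts: ImportRow[] = [
  { name: 'Ada', company_id: 'c1' },
  { name: 'Grace', company_id: 'c9' }, // no matching company
];
const companies: ImportRow[] = [{ id: 'c1', company: 'Analytical Engines' }];

const leftJoined = joinRows(contacts, companies, 'company_id', 'id', 'left');
const innerJoined = joinRows(contacts, companies, 'company_id', 'id', 'inner');
console.log(leftJoined.length, innerJoined.length); // 2 1
```

The gap between the two counts is one simple form of join-health signal: it tells the user how many rows on the left side found no partner.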

## Roadmap (published openly — NOT yet generally available)

| Status    | Item                                                      |
| --------- | --------------------------------------------------------- |
| Up next   | Smart joins + server-side XLSX joins                      |
| Up next   | Per-tenant AI config for the hosted tier                  |
| In design | AI transforms as reviewable diffs — never silent rewrites |
| Planned   | Vue + vanilla wrappers, headless REST mode                |
| Planned   | Helm chart + air-gap guide                                |
| Planned   | Auditor-ready decision log for the whole import           |
| Planned   | Web-worker validation for huge sheets                     |
| Planned   | Self-serve licensing + sandbox tier                       |
| Horizon   | Pipelines: recurring imports (SFTP, S3, inbound email)    |
| Horizon   | MCP server — agent-drivable imports                       |

## Contact

- Pilots & licensing: licensing@mildport.com
- Security reports: security@mildport.com (see https://mildport.com/.well-known/security.txt)
- The live demo at https://mildport.com/ runs in-browser; uploaded files never leave your machine.

© 2026 Capitality. Proprietary software — all rights reserved.
