Structure from chaos.
Most documents arriving in your enterprise are unstructured — scanned forms, PDF invoices, handwritten field reports, multi-language regulatory filings, photographed permit sheets. TeamSync's Metadata Extraction capability turns them into structured records: typed metadata, extracted text, classified document type, all stored against the document in the Intelligent Repository.
What's in the capability.
| Component | Purpose |
|---|---|
| OCR (Optical Character Recognition) | 100+ languages; printed text from scans, PDFs, photos |
| ICR (Intelligent Character Recognition) | Handwriting via ML models; checkbox + signature detection |
| Document classification | Auto-tag document type (invoice / contract / claim / form / report) per the Intelligent Repository taxonomy |
| Field extraction | Type-specific field extraction (amount + due date for invoice; counterparty + effective date for contract; patient ID + diagnosis for clinical note) |
| Confidence scoring | Per-field confidence; below-threshold fields flagged for human review |
| Human-in-the-loop verification | Review interface for low-confidence extractions; corrections feed back into model training (with consent) |
| Audit ledger | Every extraction event anchored: source document, model version, output, human corrections |
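
To illustrate how per-field confidence scoring can feed human-in-the-loop verification, here is a minimal sketch. The `ExtractedField` class, the `split_for_review` function, and the 0.85 threshold are hypothetical names and values chosen for illustration; they are not TeamSync's API or defaults.

```python
from dataclasses import dataclass


@dataclass
class ExtractedField:
    # Hypothetical record shape, not TeamSync's actual schema.
    name: str
    value: str
    confidence: float  # assumed to be in the range 0.0 to 1.0


def split_for_review(fields, threshold=0.85):
    """Split fields into (accepted, needs_review) by comparing confidence to threshold.

    The 0.85 default is an illustrative assumption.
    """
    accepted = [f for f in fields if f.confidence >= threshold]
    needs_review = [f for f in fields if f.confidence < threshold]
    return accepted, needs_review


invoice_fields = [
    ExtractedField("amount", "1,250.00", 0.97),
    ExtractedField("due_date", "2024-07-01", 0.62),
]
accepted, needs_review = split_for_review(invoice_fields)
```

In this sketch the `amount` field is accepted automatically, while `due_date` falls below the threshold and would be queued for a reviewer.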
How TeamSync compares.
| Capability | TeamSync | ABBYY Vantage | Hyperscience | Rossum | Tungsten Automation |
|---|---|---|---|---|---|
| Native to the document platform (no integration) | ✅ | Standalone | Standalone | Standalone | Standalone |
| Multilingual OCR (100+ languages) | ✅ | ✅ Strong | Limited | Strong (EU focus) | ✅ |
| ICR (handwriting) | ✅ | ✅ | ✅ Strong | Limited | ✅ |
| Per-field confidence with human-in-the-loop | ✅ | ✅ | ✅ | ✅ | ✅ |
| Merkle-anchored audit ledger entry per extraction | ✅ | Standard log | Standard log | Standard log | Standard log |
| Per-cluster pricing (no per-page metering) | ✅ | Per-page | Per-page | Per-document | Per-page |
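
For readers unfamiliar with Merkle anchoring, the sketch below shows the general idea: each extraction event is hashed, and the hashes are combined pairwise into a single root, so altering any event changes the root. The event fields mirror the audit ledger description (source document, model version, output, human corrections), but the exact field names, the SHA-256 hashing, and the odd-node handling are assumptions for illustration, not a description of TeamSync's ledger format.

```python
import hashlib
import json


def leaf_hash(event: dict) -> bytes:
    """Hash one extraction event using a canonical JSON encoding."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":")).encode("utf-8")
    # The 0x00 prefix keeps leaf hashes distinct from interior-node hashes.
    return hashlib.sha256(b"\x00" + payload).digest()


def merkle_root(leaves: list) -> bytes:
    """Combine leaf hashes pairwise until a single root hash remains."""
    if not leaves:
        raise ValueError("at least one leaf is required")
    level = list(leaves)
    while len(level) > 1:
        if len(level) % 2 == 1:
            # Illustrative choice: duplicate the last node on odd-sized levels.
            level.append(level[-1])
        level = [
            hashlib.sha256(b"\x01" + level[i] + level[i + 1]).digest()
            for i in range(0, len(level), 2)
        ]
    return level[0]


# Hypothetical extraction events; the field names are assumptions.
events = [
    {"source_document": "doc-001", "model_version": "v1", "output": {"amount": "1,250.00"}, "human_corrections": []},
    {"source_document": "doc-002", "model_version": "v1", "output": {"amount": "310.40"}, "human_corrections": ["amount"]},
    {"source_document": "doc-003", "model_version": "v1", "output": {"amount": "88.00"}, "human_corrections": []},
]
root = merkle_root([leaf_hash(e) for e in events])
```

Recomputing the root from the stored events and comparing it to the anchored value is what makes tampering detectable, which a plain append-only log does not provide on its own.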
Frequently asked questions.
What about edge-case OCR accuracy — handwriting, low-quality scans?
For the 90% of enterprise documents that are printed text or standard forms, TeamSync meets the production accuracy bar. For edge cases (handwriting on legacy forms, very low-quality scans, handwriting in less common languages), TeamSync coexists with specialist IDP tools (ABBYY, Hyperscience), which can be invoked as workflow nodes.
Can I train a custom field extractor?
Yes. Customers train custom extractors for their document types via the human-in-the-loop verification interface. Training data is tenant-isolated.
Does it handle multi-page documents?
Yes, including page-segment classification: a single PDF containing multiple document types is split, and each section is classified separately.
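
As a rough sketch of what page-segment classification involves, the example below groups consecutive pages that share a classified type into segments. It assumes a per-page classifier has already produced one label per page; the `segment_pages` function and the labels are hypothetical and do not represent TeamSync's implementation.

```python
from itertools import groupby


def segment_pages(page_types):
    """Group consecutive pages of the same type into (doc_type, first_page, last_page).

    Page numbers are 1-based. Input is one classified label per page, in order.
    """
    segments = []
    page_number = 1
    for doc_type, run in groupby(page_types):
        length = len(list(run))
        segments.append((doc_type, page_number, page_number + length - 1))
        page_number += length
    return segments


# Hypothetical per-page labels for a six-page PDF.
segments = segment_pages(["invoice", "invoice", "contract", "contract", "contract", "form"])
# segments == [("invoice", 1, 2), ("contract", 3, 5), ("form", 6, 6)]
```

One limitation of this simple grouping is that two back-to-back documents of the same type would be merged into one segment; a production splitter would need an additional boundary signal, such as a detected first page.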
Related capabilities
- Intelligent Repository — extracted metadata lands here
- Business Process Automation — extraction as a workflow node
- Document Templates — extracted fields populate templates
- DocuTalk — AI grounds in extracted metadata
Related compliance overlays
- HIPAA — PHI extraction with tenant isolation
- FDA 21 CFR Part 11 — clinical-form extraction with audit