Split | Document Segmentation, Made Easy | Reducto

Document Segmentation, Made Easy

Describe your sections in plain English. Split returns the page ranges for each one.

Helping everyone from startups to Fortune 10 enterprises unlock their data.

  • Harvey
  • Scale AI
  • Newfront
  • Medallion
  • Vanta
  • Legora
  • Rogo
  • Levelpath
  • JLL
  • Vise
  • Laurel
  • Toast
  • Mercor
  • Zip
  • Anterior
  • Supio
Split

Find where each section starts and ends

Definition
Split classifies every page of a document against a list of sections you describe in natural language and returns the page numbers each section occupies. Add a partition_key and Split groups repeating sections by an identifier from the document itself.
Who it's for
Teams that need to route different parts of a document to different schemas or separate bundled sub-documents for individual processing.
The problem it solves
Long documents waste time and tokens when sent whole to an LLM. Split finds section boundaries first so downstream steps only process the pages they need.
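
As a rough illustration of the definition above, a Split request could be assembled like this. The `POST /split` endpoint and `partition_key` come from this page; the body field names (`input`, `split_description`, `name`, `description`) and the example URL are assumptions made for the sketch, not confirmed API fields.

```python
import json


def build_split_request(document_url, sections):
    """Assemble a hypothetical JSON body for POST /split.

    Field names other than partition_key are assumed for illustration.
    """
    body = {"input": document_url, "split_description": []}
    for section in sections:
        entry = {"name": section["name"], "description": section["description"]}
        # partition_key is optional: only sections that repeat need one.
        if section.get("partition_key"):
            entry["partition_key"] = section["partition_key"]
        body["split_description"].append(entry)
    return body


request_body = build_split_request(
    "https://example.com/combined-statements.pdf",  # placeholder URL
    [
        {
            "name": "cover_letter",
            "description": "Introductory letter addressed to the client",
        },
        {
            "name": "account_statement",
            "description": "Monthly holdings and activity for a single account",
            "partition_key": "account_number",
        },
    ],
)
print(json.dumps(request_body, indent=2))
```

Only the repeating section carries a `partition_key`; the cover letter is described in plain English and nothing more.
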
Split in the platform

How Split connects to the rest of the platform

/parse (Parse)
  Use when: Structured content from any document is needed for LLM or RAG use.
  Returns: Structured chunks with typed blocks, bounding boxes, and confidence scores.
  Compared with Split: Returns the content itself, not a section map.

/extract (Extract)
  Use when: The fields to pull are defined and typed JSON is needed.
  Returns: Schema-typed JSON with optional citations on every value.
  Compared with Split: Pulls specific fields. Pipe Split page ranges into Extract.

/split (Split)
  Use when: One file contains multiple logical documents or sections.
  Returns: Page ranges for each section, with confidence scores.

/classify (Classify)
  Use when: Files need to be routed by type before processing.
  Returns: Best-matching category with per-criterion confidence.
  Compared with Split: Identifies file type. Split maps sections within a file.

/edit (Edit)
  Use when: A PDF form needs filling or a DOCX needs updating.
  Returns: A downloadable edited file, plus a reusable form schema.
  Compared with Split: Produces a filled document, not a page map.
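
One way to picture piping Split output into Extract is a small planning step that pairs each section's pages with its own schema. The response shape, the schemas, and the job dictionary below are all hypothetical; only the `name`, `pages`, and `confidence` fields are taken from this page.

```python
# Hypothetical Split output: each section with the pages it occupies.
split_result = {
    "splits": [
        {"name": "financials", "pages": [12, 13, 14, 15], "confidence": "high"},
        {"name": "risk_factors", "pages": [16, 17, 18], "confidence": "high"},
        {"name": "exhibits", "pages": [19, 20], "confidence": "low"},
    ]
}

# Hypothetical extraction schemas, keyed by section name.
schemas_by_section = {
    "financials": {
        "type": "object",
        "properties": {"total_revenue": {"type": "number"}},
    },
    "risk_factors": {
        "type": "object",
        "properties": {"risks": {"type": "array"}},
    },
}


def plan_extract_jobs(split_result, schemas_by_section):
    """Pair each section's pages with its schema; skip sections without one."""
    jobs = []
    for split in split_result["splits"]:
        schema = schemas_by_section.get(split["name"])
        if schema is None or not split["pages"]:
            continue
        jobs.append(
            {
                "section": split["name"],
                "pages": sorted(split["pages"]),
                "schema": schema,
            }
        )
    return jobs


extract_jobs = plan_extract_jobs(split_result, schemas_by_section)
for job in extract_jobs:
    print(job["section"], job["pages"][0], job["pages"][-1])
```

Sections with no schema (here, `exhibits`) never reach Extract, which is where the time and token savings would come from.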

Try Split on your own documents. Open it in Studio.

Where AI teams ship Split

Get only the pages you need

When a downstream step only needs part of a long document, Split finds the right pages first.

Annual reports & 10-Ks

Separate the executive summary, financials, and risk factors so each section gets the right extraction schema.

Combined brokerage statements

One PDF, many accounts. Set partition_key: "account_number" and Split returns one partition per account, with no manual page-counting.

Patient charts & encounter histories

Group pages by patient visit using a partition key, then process each encounter with the right schema.

Mailroom & intake batches

Identify cover letters, policies, and supporting documents inside a single intake PDF so each team gets the right pages.

Long contracts & data rooms

Find the indemnity clause, fee schedule, or assignment language, then pass the page range to Extract.

Reuse one parse for many splits

Parse a 200-page packet once, then run Split with different section descriptions against the same jobid://. No re-uploading, no re-billing for the parse.
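
A minimal sketch of the "parse once, split many times" idea: several request bodies that all point at the same `jobid://` reference. The job ID, the `input` and `split_description` field names, and the section lists are placeholders assumed for illustration.

```python
def split_requests_for_job(job_id, section_lists):
    """Build one hypothetical Split request body per section list.

    Every body reuses the same jobid:// reference, so the document
    is not uploaded again.
    """
    source = f"jobid://{job_id}"
    return [
        {"input": source, "split_description": list(sections)}
        for sections in section_lists
    ]


# Two different ways of slicing the same packet (illustrative descriptions).
by_document_type = [
    {"name": "cover_letter", "description": "Letter introducing the submission"},
    {"name": "policy", "description": "Insurance policy terms and declarations"},
]
by_topic = [
    {"name": "fee_schedule", "description": "Tables listing fees and rates"},
]

requests_to_send = split_requests_for_job(
    "example-parse-job",  # placeholder, not a real job ID
    [by_document_type, by_topic],
)
for body in requests_to_send:
    print(body["input"], [s["name"] for s in body["split_description"]])
```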

Why Split

Why teams switch to Split

  1. Describe sections in plain English

    You write descriptions, not rules. Split classifies each page against them.

  2. Repeating sections, grouped automatically

    Set a partition_key and Split returns one partition per identifier (account, patient, claim) read straight off the page.

  3. Confidence on every section

    Each split returns high or low confidence. Route low-confidence segments to review and auto-process the rest.

  4. One parse, many splits

    Pass a jobid:// from a prior Parse and Split runs against the cached read, saving the Parse credits on each iteration.

  5. Tunable for table-heavy documents

    Set table_cutoff: "preserve" to send full tables when a partition key sits deep inside a table.

  6. Composable with the rest of the platform

    Output is a map of section names to page ranges. Pipe them into Parse, Classify, or Extract.
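
The confidence routing described in the list above could look something like the following. It assumes each split is a dictionary with a `confidence` value of `"high"` or `"low"`, as this page describes; the sample data is made up, and treating any other value as needing review is a design choice of this sketch.

```python
def route_by_confidence(splits):
    """Separate splits into auto-process and manual-review buckets.

    Anything not explicitly marked "high" goes to review, so a missing
    or unexpected confidence value is never auto-processed.
    """
    auto_process, needs_review = [], []
    for split in splits:
        if split.get("confidence") == "high":
            auto_process.append(split)
        else:
            needs_review.append(split)
    return auto_process, needs_review


# Illustrative splits, not real output.
sample_splits = [
    {"name": "executive_summary", "pages": [1, 2], "confidence": "high"},
    {"name": "financials", "pages": [3, 4, 5], "confidence": "low"},
    {"name": "risk_factors", "pages": [6, 7], "confidence": "high"},
]

auto_process, needs_review = route_by_confidence(sample_splits)
print("auto:", [s["name"] for s in auto_process])
print("review:", [s["name"] for s in needs_review])
```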

How Split works

How Split works in four steps

  1. Send file + section list

    Upload a file or point at a URL. Describe each section in natural language.

    POST /split

  2. Parse runs underneath

    OCR, layout detection, and table reconstruction produce structured content for the classifier.

    jobid:// available

  3. Classify pages by section

    Every page is scored against your descriptions. Partition keys group matched pages by an identifier from the document.

    descriptions → pages

  4. You get splits[]

    One entry per section with name, pages, and confidence. Feed page ranges into downstream steps.

    splits[].pages
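
For the last step above, downstream tools often want contiguous ranges rather than a flat list of page numbers. The helper below is a small sketch of that conversion; it assumes `splits[].pages` is a list of integer page numbers, which this page implies but does not spell out, and the sample entries are invented.

```python
def to_page_ranges(pages):
    """Collapse page numbers into sorted, inclusive (start, end) ranges."""
    ranges = []
    for page in sorted(set(pages)):
        if ranges and page == ranges[-1][1] + 1:
            # Page continues the current run: extend its end.
            ranges[-1] = (ranges[-1][0], page)
        else:
            # Gap found (or first page): start a new run.
            ranges.append((page, page))
    return ranges


# Illustrative splits[] entries, not real output.
splits = [
    {"name": "indemnity_clause", "pages": [14, 15, 16], "confidence": "high"},
    {"name": "fee_schedule", "pages": [22, 23, 40], "confidence": "low"},
]

ranges_by_section = {s["name"]: to_page_ranges(s["pages"]) for s in splits}
print(ranges_by_section)
```

A section whose pages are not adjacent, such as `fee_schedule` here, comes back as more than one range, so the caller can decide whether to process the pieces together or separately.
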
Built for production

Enterprise-ready from day one

  • SOC 2 Type II
  • HIPAA
  • Zero Data Retention
  • VPC · On-prem · Air-gapped
  • EU · AU regional endpoints
  • 99.9%+ uptime SLA
  • Enterprise support
Visit the Trust Center

Document work starts here

Try Split on your documents

No setup, no credit card.
