Permission-aware chunks for AI retrieval.
Send a document, page, transcript, or SOP. Get back clean chunks tagged with department, access level, source URL, and per-principal permissions — ready for embedding and retrieval without leaking restricted context.
Lineage preserved
Every chunk carries its source_id and source_url so you can audit what your AI saw.
Department-aware
Tag chunks by team — finance, legal, HR, ops — and filter at retrieval time.
Access boundaries
Four access levels (public, internal, restricted, confidential) plus per-principal ACLs on every chunk.
How access boundaries reach retrieval
Every chunk carries source-level access metadata — department, access_level, and source_url — plus per-principal ACL rows in chunk_permissions. Those tags ride with the chunk through your embedding pipeline, into your vector store, and back out at retrieval time.
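As a rough sketch of how those tags might ride along into a vector store, the Python below flattens a /api/v1/chunk response into records whose metadata carries the access tags. The field names come from the response example further down this page. The function name to_vector_records, the record shape, and the choice to copy ACL rows from the permissions you sent in the request are assumptions for illustration, not part of the AutoChunk API.

```python
from typing import Any


def to_vector_records(
    chunk_response: dict[str, Any],
    permissions: list[dict[str, str]],
) -> list[dict[str, Any]]:
    """Flatten a chunk response into records ready for a vector store.

    Assumption: `permissions` is the same list you sent as
    source.permissions in the request; it is copied onto every record
    so the ACL rows travel with the chunk text.
    """
    records = []
    for chunk in chunk_response["chunks"]:
        records.append(
            {
                "id": chunk["chunk_id"],
                "text": chunk["chunk_text"],
                "metadata": {
                    "source_id": chunk["source_id"],
                    "source_url": chunk["source_url"],
                    "department": chunk["department"],
                    "access_level": chunk["access_level"],
                    # Copy each row so later edits to the input list
                    # cannot silently change stored ACLs.
                    "permissions": [dict(row) for row in permissions],
                },
            }
        )
    return records
```

Each record can then be embedded and upserted with its metadata intact, so the same tags are available as filter fields at query time.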
Your retrieval code filters on those tags before passing context to the LLM. A finance-only query never returns an HR chunk. A user without read permission on a specific contract never sees it — even if it would have been the most semantically relevant match.
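A minimal sketch of that filter, in Python, might look like the following. It assumes candidates are dicts with a metadata field holding department, access_level, and a permissions list of ACL rows shaped like the request example on this page. The Principal class, the "user" principal type, and the policy that anything above "public" needs an explicit read row are illustrative assumptions; your own authorization rules may differ.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass(frozen=True)
class Principal:
    """Hypothetical caller identity: a user id plus the roles it holds."""
    user_id: str
    roles: frozenset = frozenset()


def can_read(principal: Principal, metadata: dict[str, Any]) -> bool:
    """Return True if any ACL row grants this principal read access."""
    for row in metadata.get("permissions", []):
        if row.get("permission") != "read":
            continue
        if row["principal_type"] == "role" and row["principal_id"] in principal.roles:
            return True
        # Assumption: a "user" principal type exists alongside "role".
        if row["principal_type"] == "user" and row["principal_id"] == principal.user_id:
            return True
    return False


def filter_for_principal(
    candidates: list[dict[str, Any]],
    principal: Principal,
    department: Optional[str] = None,
) -> list[dict[str, Any]]:
    """Drop candidates the principal may not see, preserving rank order."""
    allowed = []
    for candidate in candidates:
        meta = candidate["metadata"]
        if department is not None and meta["department"] != department:
            continue
        # Assumed policy: public chunks are open to everyone; every
        # other access level requires an explicit read grant.
        if meta["access_level"] != "public" and not can_read(principal, meta):
            continue
        allowed.append(candidate)
    return allowed
```

Run the filter on the ranked candidates before building the prompt, so a highly relevant but unauthorized chunk is dropped rather than passed to the LLM.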
AutoChunk is the data layer; authorization stays in your code. We make sure every chunk knows where it came from and who can see it. Read the security model →
Two calls: extract → chunk
POST a binary file to /api/v1/extract for clean text, then pipe the result into /api/v1/chunk to get back permission-tagged chunks. If your text is already extracted, skip step 1 and call /api/v1/chunk directly.
# 1. Extract text from a PDF (also: HTML, DOCX, plain text)
curl -X POST https://autochunk.ai/api/v1/extract \
  -H "x-api-key: rh_live_XXXXXXXX" \
  -F "file=@contract.pdf"
# → { "text": "Payment terms are net 30...", "format": "pdf",
#     "metadata": { "pages": 7, "characters": 12453, "words": 2189, ... } }

# 2. Chunk the extracted text with permission tags
curl -X POST https://autochunk.ai/api/v1/chunk \
  -H "x-api-key: rh_live_XXXXXXXX" \
  -H "content-type: application/json" \
  -d '{
    "source": {
      "type": "pdf",
      "url": "https://example.com/contract-892.pdf",
      "title": "MSA — Acme Corp",
      "department": "finance",
      "access_level": "restricted",
      "permissions": [
        { "principal_type": "role", "principal_id": "finance-lead", "permission": "read" }
      ]
    },
    "content": "Payment terms are net 30. Late payments incur a 2% fee per month..."
  }'

Response
{
  "source_id": "src_4f2a...",
  "chunk_count": 7,
  "total_tokens": 3421,
  "chunks": [
    {
      "chunk_id": "chk_9b1c...",
      "source_id": "src_4f2a...",
      "chunk_text": "Payment terms are net 30...",
      "summary": null,
      "department": "finance",
      "access_level": "restricted",
      "source_url": "https://example.com/contract-892.pdf",
      "token_count": 178,
      "embedding_ready": true
    }
  ]
}

Try it, then get a key
The playground lets you paste any text and see the exact output without signing up. When you're ready to integrate, request an API key. Keys are invite-only, and we'll reply within 48 hours.