The Chunkr API is a document processing service that converts documents into RAG/LLM-ready data through layout analysis and chunking.

Base URL

All API requests should be made to:
https://api.chunkr.ai

API Versioning

The current API version is v1. All endpoints are prefixed with /api/v1. Example endpoint:
https://api.chunkr.ai/api/v1/task
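
As a minimal sketch of how the base URL and version prefix combine, the helper below builds full endpoint URLs. The function name `endpoint` is hypothetical and is not part of any Chunkr client library.

```python
BASE_URL = "https://api.chunkr.ai"
API_PREFIX = "/api/v1"


def endpoint(path: str) -> str:
    """Return the full URL for a versioned API path such as 'task'."""
    # Strip any leading slash so callers can pass "task" or "/task".
    return f"{BASE_URL}{API_PREFIX}/{path.lstrip('/')}"


print(endpoint("task"))  # https://api.chunkr.ai/api/v1/task
```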

Request Format

The API accepts requests in two formats:
  • JSON (Recommended): Content-Type: application/json
  • Multipart Form Data: Content-Type: multipart/form-data (deprecated)
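
The sketch below shows one way to prepare a JSON request with the recommended Content-Type, using only the Python standard library. It builds the request without sending it. The POST method, the `Authorization` header format, and the `file` payload field are assumptions for illustration; consult the Authentication and Task Management pages for the actual contract.

```python
import json
import urllib.request

TASK_URL = "https://api.chunkr.ai/api/v1/task"


def build_json_request(payload: dict, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a JSON request for the example endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        TASK_URL,
        data=body,
        headers={
            "Content-Type": "application/json",
            # Assumption: header name and value format are placeholders.
            "Authorization": api_key,
        },
        method="POST",  # Assumption: the HTTP method is not stated above.
    )


# Hypothetical payload: the field name "file" is illustrative only.
request = build_json_request(
    {"file": "https://example.com/document.pdf"}, "YOUR_API_KEY"
)
print(request.get_method(), request.full_url)
```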

Response Format

Responses are returned as JSON with standard HTTP status codes:
  • 200 OK - Request successful
  • 400 Bad Request - Invalid request parameters
  • 401 Unauthorized - Authentication failed
  • 404 Not Found - Resource not found
  • 429 Too Many Requests - Rate limit exceeded
  • 500 Internal Server Error - Server error
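
A client might map these codes to actions along the following lines. This is a sketch: the status meanings come from the list above, while treating 429 and 500 as retryable is a design assumption rather than documented API behavior.

```python
STATUS_MEANINGS = {
    200: "Request successful",
    400: "Invalid request parameters",
    401: "Authentication failed",
    404: "Resource not found",
    429: "Rate limit exceeded",
    500: "Server error",
}

# Assumption: only rate-limit and server errors are worth retrying;
# 400, 401, and 404 need a corrected request instead.
RETRYABLE_STATUSES = {429, 500}


def describe(status: int) -> str:
    """Return the documented meaning of a status code."""
    return STATUS_MEANINGS.get(status, "Unexpected status")


def should_retry(status: int) -> bool:
    """Return True if retrying the same request may succeed."""
    return status in RETRYABLE_STATUSES
```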

Rate Limits

The API rate-limits requests to ensure fair usage and stable performance.
Chunkr uses a token bucket algorithm, applied separately to each service type:

Service Rate Limits

Service          | Default Rate Limit | Configurable
General OCR      | 5 requests/second  | Yes
Segmentation     | 5 requests/second  | Yes
LLM Processing   | Varies by model    | Yes
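
For readers unfamiliar with token buckets, here is a generic sketch of the technique, not Chunkr's server implementation. Tokens refill at a fixed rate up to a capacity, and each request spends one token. The burst capacity of 5 is an assumption; only the 5 requests/second rate comes from the table above.

```python
import time


class TokenBucket:
    """Generic token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.clock = clock
        self.last = clock()

    def try_acquire(self, cost: float = 1.0) -> bool:
        """Spend `cost` tokens if available; return False when rate-limited."""
        now = self.clock()
        # Add tokens earned since the last call, capped at capacity.
        elapsed = now - self.last
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# 5 requests/second as in the table; capacity is an assumed burst size.
ocr_bucket = TokenBucket(rate=5, capacity=5)
```

A client can use the same structure to throttle its own outgoing requests and avoid hitting the server-side limit in the first place.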

Batch Sizes

To optimize throughput, the API processes requests in batches:
  • General OCR: 30 pages per batch
  • Segmentation: 3 pages per batch
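
To estimate how many batches a document produces, divide the page count by the batch size and round up. The sketch below assumes pages are split into consecutive batches of the sizes listed above; the dictionary keys are hypothetical names.

```python
# Batch sizes from the list above; the key names are illustrative.
BATCH_SIZES = {"general_ocr": 30, "segmentation": 3}


def batch_count(pages: int, service: str) -> int:
    """Return the number of batches needed for `pages` pages (ceiling division)."""
    size = BATCH_SIZES[service]
    return (pages + size - 1) // size


print(batch_count(100, "general_ocr"))   # 4
print(batch_count(100, "segmentation"))  # 34
```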

Rate Limit Headers

Rate limit information is currently managed server-side. If you exceed a limit, the API returns a 429 Too Many Requests response with the message “Usage limit exceeded”. Implement exponential backoff in your retry logic.
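
A minimal sketch of exponential backoff with jitter follows. The `send` callable is a hypothetical stand-in for whatever function performs your HTTP request and returns a `(status, body)` pair; the retry count and delay values are assumptions to tune for your workload.

```python
import random
import time
from typing import Callable, Tuple


def call_with_backoff(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,
    base_delay: float = 1.0,
    max_delay: float = 30.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call `send`, retrying on 429 with exponentially growing delays."""
    for attempt in range(max_retries + 1):
        status, body = send()
        if status != 429:
            return status, body
        if attempt == max_retries:
            break  # out of retries; return the last 429 response
        # Double the delay on each attempt, capped at max_delay.
        delay = min(max_delay, base_delay * 2 ** attempt)
        # Add random jitter so concurrent clients do not retry in lockstep.
        sleep(delay + random.uniform(0, delay / 2))
    return status, body
```

Passing `sleep` as a parameter keeps the function easy to test without real waiting.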

Timeouts

Different operations have different timeout configurations:
  • General OCR: Configurable (no default timeout)
  • Segmentation: Configurable (no default timeout)
  • LLM Processing: 150 seconds default
  • API Request: 600 seconds (10 minutes)

File Size Limits

The API accepts files up to 1 GB by default. Both total request size and in-memory limits are enforced.
  • Max Total Limit: 1 GB (configurable via MAX_TOTAL_LIMIT)
  • Max Memory Limit: 1 GB (configurable via MAX_MEMORY_LIMIT)
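
Checking the file size on the client before uploading avoids sending a request that the server will reject. The sketch below assumes the default 1 GB limit and interprets "1 GB" as 2**30 bytes, which is an assumption; a self-hosted deployment with a different MAX_TOTAL_LIMIT would need a different value.

```python
import os

# Assumption: "1 GB" is interpreted as 2**30 bytes.
MAX_TOTAL_LIMIT_BYTES = 1024 ** 3


def fits_upload_limit(path: str, limit: int = MAX_TOTAL_LIMIT_BYTES) -> bool:
    """Return True if the file at `path` is no larger than `limit` bytes."""
    return os.path.getsize(path) <= limit
```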

Supported File Types

The API detects file types automatically. Commonly supported formats include:
  • PDF documents
  • Images (JPEG, PNG)
  • Other document formats
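
Although detection happens server-side, a client can do a quick pre-check with Python's standard `mimetypes` module. The set below contains only the formats named explicitly in the list above; other formats may also be accepted, so a negative result here does not mean the API will reject the file.

```python
import mimetypes

# Only the formats named explicitly above; not an exhaustive list.
EXPLICITLY_LISTED_TYPES = {"application/pdf", "image/jpeg", "image/png"}


def is_explicitly_listed(filename: str) -> bool:
    """Guess the MIME type from the file extension and check it against the list."""
    mime_type, _encoding = mimetypes.guess_type(filename)
    return mime_type in EXPLICITLY_LISTED_TYPES
```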

Health Check

Check the API health and version:
curl https://api.chunkr.ai/health
Response:
OK - Version {version}
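
The same check can be scripted. This sketch parses the response shown above and extracts the version string. It assumes the body is plain text in exactly that form and that the version contains no whitespace; `check_health` performs a real network call and is not invoked here.

```python
import re
import urllib.request
from typing import Optional

HEALTH_URL = "https://api.chunkr.ai/health"


def parse_health(body: str) -> Optional[str]:
    """Return the version from an 'OK - Version X' body, or None if unhealthy."""
    match = re.fullmatch(r"OK - Version (\S+)", body.strip())
    return match.group(1) if match else None


def check_health(timeout: float = 10.0) -> Optional[str]:
    """Fetch the health endpoint and return the reported version, if any."""
    with urllib.request.urlopen(HEALTH_URL, timeout=timeout) as response:
        return parse_health(response.read().decode("utf-8"))
```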

API Documentation

Interactive API documentation is available at:
  • Swagger UI: https://api.chunkr.ai/swagger-ui/
  • ReDoc: https://api.chunkr.ai/redoc
  • OpenAPI Spec: https://api.chunkr.ai/openapi.json

Next Steps

  • Authentication: Learn how to authenticate your API requests
  • Error Handling: Understand error responses and status codes
  • Task Management: Create and manage document processing tasks