API Overview

The Chunkr API is a document processing service that converts documents into RAG/LLM-ready data through layout analysis and chunking.

Base URL

All API requests should be made to:

https://api.chunkr.ai

API Versioning

The current API version is v1. All endpoints are prefixed with /api/v1. Example endpoint:

https://api.chunkr.ai/api/v1/task
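
As a small illustration of how the base URL and the version prefix combine, the following Python sketch builds endpoint URLs. The helper name endpoint is hypothetical and not part of any Chunkr SDK; only the base URL, the /api/v1 prefix, and the task path come from the text above.

```python
BASE_URL = "https://api.chunkr.ai"
API_PREFIX = "/api/v1"


def endpoint(path: str) -> str:
    """Join the base URL, the version prefix, and a resource path."""
    return f"{BASE_URL}{API_PREFIX}/{path.lstrip('/')}"


print(endpoint("task"))  # https://api.chunkr.ai/api/v1/task
```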

Request Format

The API accepts requests in two formats:

JSON (Recommended): Content-Type: application/json
Multipart Form Data: Content-Type: multipart/form-data (deprecated)
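
To make the recommended JSON format concrete, here is a minimal Python sketch that prepares (but does not send) a JSON request using only the standard library. The POST method and the payload field are assumptions for illustration, not documented request parameters, and authentication headers are omitted because they are covered on the Authentication page.

```python
import json
import urllib.request


def build_json_request(url: str, payload: dict) -> urllib.request.Request:
    """Prepare a request whose body is JSON-encoded."""
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",  # assumption: the endpoint accepts POST
    )


# "example_field" is a placeholder, not a real API parameter.
req = build_json_request(
    "https://api.chunkr.ai/api/v1/task",
    {"example_field": "example value"},
)
```

The request object can then be passed to urllib.request.urlopen once the required authentication header has been added.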

Response Format

Successful responses return JSON. The API uses the following HTTP status codes:

200 OK - Request successful
400 Bad Request - Invalid request parameters
401 Unauthorized - Authentication failed
404 Not Found - Resource not found
429 Too Many Requests - Rate limit exceeded
500 Internal Server Error - Server error
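
A client will usually branch on these codes. The Python sketch below maps each listed code to its meaning and marks which ones might be worth retrying. Treating 429 and 500 as retryable is an assumption about sensible client behavior, not something the API specifies, and the function names are hypothetical.

```python
# Meanings copied from the status code list above.
STATUS_MEANINGS = {
    200: "Request successful",
    400: "Invalid request parameters",
    401: "Authentication failed",
    404: "Resource not found",
    429: "Rate limit exceeded",
    500: "Server error",
}

# Assumption: only rate-limit and server errors are transient.
RETRYABLE = {429, 500}


def describe(status: int) -> str:
    """Return the documented meaning of a status code."""
    return STATUS_MEANINGS.get(status, "Undocumented status code")


def should_retry(status: int) -> bool:
    """Return True if repeating the same request might succeed later."""
    return status in RETRYABLE
```

Codes such as 400, 401, and 404 are left out of the retry set because repeating an unchanged request would be expected to fail the same way.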

Rate Limits

The API rate-limits requests to ensure fair usage and consistent performance.

Chunkr uses a token bucket algorithm, with a separate limit for each service type:

Service Rate Limits

Service	Default Rate Limit	Configurable
General OCR	5 requests/second	Yes
Segmentation	5 requests/second	Yes
LLM Processing	Varies by model	Yes
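
For readers unfamiliar with the token bucket algorithm, the following Python sketch shows the general technique: tokens refill at a fixed rate up to a capacity, and each request spends one token. This is a generic illustration, not Chunkr's server implementation. The class name, the injectable clock, and the capacity of 5 are assumptions; only the rate of 5 requests/second comes from the table.

```python
import time


class TokenBucket:
    """Generic token bucket: refills `rate` tokens per second up to `capacity`."""

    def __init__(self, rate: float, capacity: float, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start with a full bucket
        self.clock = clock
        self.updated = clock()

    def _refill(self) -> None:
        # Add tokens for the time elapsed since the last update.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now

    def try_acquire(self) -> bool:
        """Spend one token if available; return False when the bucket is empty."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Assumed capacity (burst size) of 5; the real server-side value is not documented.
bucket = TokenBucket(rate=5, capacity=5)
```

A client could call bucket.try_acquire() before each request and wait briefly when it returns False, which keeps its own traffic under the default limit.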

Batch Sizes

To optimize throughput, the API processes requests in batches:

General OCR: 30 pages per batch
Segmentation: 3 pages per batch
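
As a worked example of the batch sizes above, the Python sketch below estimates how many batches a document needs per service. It assumes pages are grouped into full batches with one final partial batch, which the documentation does not state explicitly; the function and dictionary names are hypothetical.

```python
import math

# Batch sizes from the list above.
BATCH_SIZES = {"general_ocr": 30, "segmentation": 3}


def batch_count(pages: int, service: str) -> int:
    """Number of batches needed for `pages`, rounding the last batch up."""
    return math.ceil(pages / BATCH_SIZES[service])


print(batch_count(100, "general_ocr"))   # 4
print(batch_count(100, "segmentation"))  # 34
```

Under this assumption, a 100-page document needs 4 OCR batches (3 full, 1 of 10 pages) and 34 segmentation batches (33 full, 1 of a single page).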

Rate Limit Headers

Currently, rate limit state is tracked server-side. If you exceed the limit, the API returns a 429 Too Many Requests response with the message “Usage limit exceeded”. Implement exponential backoff in your retry logic.
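
The backoff advice above can be sketched in Python as follows. The helper retries only on 429 and waits a randomized, exponentially growing delay between attempts. The function name, the send callable returning a (status, body) pair, and the defaults (5 attempts, 1 second base, 60 second cap, full jitter) are all assumptions chosen for illustration.

```python
import random
import time


def call_with_backoff(send, max_attempts=5, base=1.0, cap=60.0, sleep=time.sleep):
    """Call send() until it returns a non-429 status or attempts run out.

    send() must return a (status, body) tuple.
    """
    status, body = None, None
    for attempt in range(max_attempts):
        status, body = send()
        if status != 429:
            return status, body
        if attempt < max_attempts - 1:
            # Delay ceiling doubles each attempt: base, 2*base, 4*base, ...
            ceiling = min(cap, base * 2 ** attempt)
            # Full jitter: pick a random delay up to the ceiling.
            sleep(random.uniform(0, ceiling))
    return status, body
```

The jitter spreads out retries from many clients so they do not all hit the API again at the same moment. If every attempt is rate limited, the last 429 response is returned to the caller.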

Timeouts

Different operations have different timeout configurations:

General OCR: Configurable (no default timeout)
Segmentation: Configurable (no default timeout)
LLM Processing: 150 seconds default
API Request: 600 seconds (10 minutes)
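
On the client side, it can help to set an HTTP timeout slightly longer than the 600-second API request timeout so the server, not the client, decides when a request has run too long. The Python sketch below shows one way to do that with the standard library; the 30-second margin and the helper name are assumptions.

```python
import urllib.request

API_REQUEST_TIMEOUT_S = 600  # from the list above
CLIENT_MARGIN_S = 30         # assumption: extra slack for network latency


def open_with_timeout(request, opener=urllib.request.urlopen):
    """Open a request with a client timeout just above the server limit."""
    return opener(request, timeout=API_REQUEST_TIMEOUT_S + CLIENT_MARGIN_S)
```

The opener parameter exists only so the helper can be exercised without network access; in normal use the default urllib.request.urlopen applies.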

File Size Limits

The API accepts files up to 1 GB by default. Both total request size and in-memory limits are enforced.

Max Total Limit: 1 GB (configurable via MAX_TOTAL_LIMIT)
Max Memory Limit: 1 GB (configurable via MAX_MEMORY_LIMIT)
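
Checking a file's size before uploading avoids sending a request that will be rejected. The Python sketch below assumes the default limit and interprets 1 GB as 1024^3 bytes; the documentation does not say whether a decimal or binary gigabyte is meant, and a self-hosted deployment may have changed MAX_TOTAL_LIMIT. The function names are hypothetical.

```python
import os

# Assumption: 1 GB is interpreted as 2**30 bytes.
DEFAULT_LIMIT_BYTES = 1024 ** 3


def fits_upload_limit(size_bytes: int, limit: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Return True if a payload of this size is within the limit."""
    return size_bytes <= limit


def file_fits(path: str, limit: int = DEFAULT_LIMIT_BYTES) -> bool:
    """Check a file on disk against the limit."""
    return fits_upload_limit(os.path.getsize(path), limit)
```

Because the limit applies to the total request size, a file very close to the limit may still be rejected once encoding and other request fields are added.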

Supported File Types

The API automatically detects file types and supports a range of document formats. Commonly used formats include:

PDF documents
Images (JPEG, PNG)
Other document formats
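
Although the API detects file types itself, a client may want a quick local check before uploading. The Python sketch below guesses a MIME type from the file name and compares it with the formats named above. The set contains only those three formats, so a False result means "not in this short list", not "unsupported by the API"; the function name is hypothetical.

```python
import mimetypes

# Only the formats explicitly named above; the API accepts others as well.
NAMED_FORMATS = {"application/pdf", "image/jpeg", "image/png"}


def is_named_format(filename: str) -> bool:
    """Guess the MIME type from the extension and check it against the list."""
    mime, _encoding = mimetypes.guess_type(filename)
    return mime in NAMED_FORMATS
```

The guess is based on the file extension alone, so it can be wrong for misnamed files; the server-side detection remains authoritative.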

Health Check

Check the API health and version:

curl https://api.chunkr.ai/health

Response:

OK - Version {version}
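
A monitoring script might extract the version from this response. The Python sketch below parses the body with a regular expression, assuming the response is plain text in exactly the format shown and that the version contains no spaces. The function name and the sample version string are made up for illustration.

```python
import re

# Matches the documented "OK - Version {version}" body.
HEALTH_PATTERN = re.compile(r"^OK - Version (?P<version>\S+)$")


def parse_health(body: str):
    """Return the version string if the body reports a healthy API, else None."""
    match = HEALTH_PATTERN.match(body.strip())
    return match.group("version") if match else None


print(parse_health("OK - Version 1.2.3"))  # 1.2.3 (sample value, not a real version)
```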

API Documentation

Interactive API documentation is available at:

Swagger UI: https://api.chunkr.ai/swagger-ui/
ReDoc: https://api.chunkr.ai/redoc
OpenAPI Spec: https://api.chunkr.ai/openapi.json

Next Steps

Authentication

Learn how to authenticate your API requests

Error Handling

Understand error responses and status codes

Task Management

Create and manage document processing tasks
