Six production-ready AI services. One API key. No infrastructure to manage, no pipelines to build. Start querying in under a minute.
Upload your documents once. Query forever. We handle chunking, embedding, vector storage, re-ranking, and generation — all in a single API call. Bring your own LLM or use ours.
Charged per query. Embedding ingestion billed separately at $0.004 / 1k tokens.
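As a rough illustration of the ingestion pricing above, the sketch below estimates the embedding cost of a corpus from its token count. The $0.004 / 1k tokens rate comes from the pricing line; the function name is hypothetical, and the assumption that billing is strictly linear (no minimum charge, no rounding up to whole 1k-token blocks) is ours and may not match actual invoicing.

```python
from decimal import Decimal

# Rate quoted in the pricing line: $0.004 per 1k tokens of embedding ingestion.
INGESTION_RATE_PER_1K = Decimal("0.004")


def estimate_ingestion_cost(total_tokens: int) -> Decimal:
    """Estimate ingestion cost in USD.

    Assumption (not confirmed by the pricing text): cost scales linearly
    with token count, with no minimum charge and no block rounding.
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return INGESTION_RATE_PER_1K * Decimal(total_tokens) / Decimal(1000)


if __name__ == "__main__":
    # Example: a corpus of 2.5 million tokens.
    print(f"${estimate_ingestion_cost(2_500_000):.2f}")
```

Under these assumptions a 2.5 million token corpus costs about $10 to ingest; per-query charges come on top of that.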
POST https://api.drut.ai/v1/rag/query
Authorization: Bearer drut_sk_...
Content-Type: application/json
{
"collection_id": "col_acme_docs_v2",
"query": "What is our refund policy for enterprise plans?",
"top_k": 5,
"rerank": true,
"model": "drut-rag-1",
"stream": false
}
Send us a natural-language question and your schema. Get back validated, optimised SQL, tested against PostgreSQL, MySQL, BigQuery, and Snowflake. Supports CTEs, window functions, and multi-table joins.
Charged per generation call. Schema tokens are included in the per-call price.
POST https://api.drut.ai/v1/sql/generate
Authorization: Bearer drut_sk_...
Content-Type: application/json
{
"dialect": "postgresql",
"schema": {
"orders": ["id", "user_id", "amount", "created_at", "status"],
"users": ["id", "email", "plan", "region"]
},
"question": "Monthly revenue by region for paid users in 2024",
"validate": true
}
Give your agents a long-term memory that persists across sessions. Store facts, events, user preferences, and agent observations. Retrieve by semantic similarity or exact filters. Memory points expire or persist based on your policy.
Charged per memory point stored per day. Retrieval queries are free up to 50k/mo.
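Because storage is billed per memory point per day, the `ttl_days` value on each point puts a ceiling on what that point can cost. The sketch below adds up that ceiling in point-days for a batch of points. It assumes every point carries a `ttl_days` value and is removed when the TTL runs out; the helper name is hypothetical, and no price per point-day is applied because none is stated here.

```python
from typing import Iterable, Mapping


def max_point_days(points: Iterable[Mapping[str, object]]) -> int:
    """Upper bound on billable point-days for points that all carry ttl_days.

    Assumption: a point is stored for at most ttl_days days and then expires.
    """
    total = 0
    for point in points:
        ttl = point.get("ttl_days")
        if not isinstance(ttl, int) or ttl <= 0:
            raise ValueError("every point needs a positive integer ttl_days")
        total += ttl
    return total


# The two points used in this section's request example.
points = [
    {"content": "Prefers concise answers, dislikes bullet lists.",
     "type": "preference", "ttl_days": 90},
    {"content": "Reviewed Q3 report on 2025-11-03, flagged APAC anomaly.",
     "type": "event", "ttl_days": 30},
]
print(max_point_days(points))  # 120
```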
POST https://api.drut.ai/v1/memory/write
Authorization: Bearer drut_sk_...
Content-Type: application/json
{
"namespace": "user_9821",
"points": [
{
"content": "Prefers concise answers, dislikes bullet lists.",
"type": "preference",
"ttl_days": 90
},
{
"content": "Reviewed Q3 report on 2025-11-03, flagged APAC anomaly.",
"type": "event",
"ttl_days": 30
}
]
}
Send any document (PDF, scanned image, invoice, contract) and receive structured JSON with entities, tables, key-value pairs, and layout annotations. Powered by our Mistral-based fine-tuned extractor.
Charged per page extracted. Tables and forms count as one page each.
POST https://api.drut.ai/v1/extract
Authorization: Bearer drut_sk_...
Content-Type: application/json
{
"document_url": "https://storage.co/invoices/inv_0042.pdf",
"extract": ["entities", "tables", "key_values"],
"language": "en",
"output_format": "json"
}
Generate dense vector embeddings for text, code, or structured data. Choose from our model family: 256d for speed, 1536d for precision. Batch up to 2,048 inputs per request.
Charged per 1k tokens embedded. Batching is strongly encouraged (up to 2,048 inputs per request).
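To stay within the 2,048-input limit, a large list of texts can be split into request-sized batches on the client side. The following is a minimal sketch of that splitting. The helper names are hypothetical, and the payload fields simply mirror this section's request example rather than a documented schema.

```python
from typing import Dict, Iterator, List, Sequence

MAX_INPUTS_PER_REQUEST = 2048  # per-request limit stated above


def batch_inputs(
    inputs: Sequence[str], batch_size: int = MAX_INPUTS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `inputs`, each at most `batch_size` long."""
    if not 1 <= batch_size <= MAX_INPUTS_PER_REQUEST:
        raise ValueError(f"batch_size must be in 1..{MAX_INPUTS_PER_REQUEST}")
    for start in range(0, len(inputs), batch_size):
        yield list(inputs[start:start + batch_size])


def build_embed_payloads(
    inputs: Sequence[str], model: str = "drut-embed-base"
) -> List[Dict[str, object]]:
    """Build one request body per batch, mirroring the example request fields."""
    return [
        {"model": model, "inputs": batch, "truncate": True}
        for batch in batch_inputs(inputs)
    ]


texts = [f"document {i}" for i in range(5000)]
payloads = build_embed_payloads(texts)
print([len(p["inputs"]) for p in payloads])  # [2048, 2048, 904]
```

Each payload can then be sent as its own POST to the embed endpoint; the order of batches matches the order of the original list.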
POST https://api.drut.ai/v1/embed
Authorization: Bearer drut_sk_...
Content-Type: application/json
{
"model": "drut-embed-base",
"inputs": [
"Agent memory systems for production LLMs",
"Long-term episodic storage in transformer agents"
],
"truncate": true
}
Pass a query and a list of candidate documents. Get back a precision-scored, sorted list using our cross-encoder model. Works with any vector retrieval system or BM25.
Charged per reranking call, regardless of candidate count (up to 100 docs).
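Since each call is capped at 100 candidate documents, it can help to check the candidate list locally before spending a call. The sketch below builds a request body shaped like this section's example and rejects oversized lists. The helper name is hypothetical, and the rule that `top_n` must not exceed the number of documents is our assumption; the API may clamp the value instead.

```python
import json
from typing import Dict, List

MAX_DOCS_PER_CALL = 100  # per-call cap stated in the pricing line


def build_rerank_payload(
    query: str, documents: List[str], top_n: int
) -> Dict[str, object]:
    """Validate the candidate list and return a rerank request body."""
    if not documents:
        raise ValueError("documents must not be empty")
    if len(documents) > MAX_DOCS_PER_CALL:
        raise ValueError(
            f"{len(documents)} documents exceeds the {MAX_DOCS_PER_CALL}-document cap"
        )
    # Assumption: top_n larger than the candidate count is treated as an error.
    if not 1 <= top_n <= len(documents):
        raise ValueError("top_n must be between 1 and len(documents)")
    return {
        "query": query,
        "documents": documents,
        "top_n": top_n,
        "return_scores": True,
    }


payload = build_rerank_payload(
    "How to reduce inference cost for large models?",
    [
        "Quantisation techniques reduce model size by 4x...",
        "The history of transformer architectures...",
        "Batching strategies for GPU inference workloads...",
    ],
    top_n=2,
)
print(json.dumps(payload, indent=2))
```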
POST https://api.drut.ai/v1/rerank
Authorization: Bearer drut_sk_...
Content-Type: application/json
{
"query": "How to reduce inference cost for large models?",
"documents": [
"Quantisation techniques reduce model size by 4x...",
"The history of transformer architectures...",
"Batching strategies for GPU inference workloads..."
],
"top_n": 2,
"return_scores": true
}
Get an API key and start building in minutes. All services share the same authentication and base URL. SDKs for Python and TypeScript available.
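Because every service uses the same bearer-token header and base URL, a single small helper can cover all six endpoints. The sketch below uses only the Python standard library and is not the official SDK. The function names and the `DRUT_API_KEY` environment variable are hypothetical, and the response is returned as decoded JSON without assuming any particular schema, since response formats are not shown here.

```python
import json
import os
import urllib.request
from typing import Any, Dict

BASE_URL = "https://api.drut.ai/v1"  # base URL seen in the request examples


def build_request(
    path: str, payload: Dict[str, Any], api_key: str
) -> urllib.request.Request:
    """Build (but do not send) an authenticated POST request for any service."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{path.lstrip('/')}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def call(path: str, payload: Dict[str, Any], api_key: str, timeout: float = 30.0) -> Any:
    """Send the request and decode the JSON response body."""
    request = build_request(path, payload, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Hypothetical environment variable; falls back to the placeholder key.
    api_key = os.environ.get("DRUT_API_KEY", "drut_sk_...")
    sql_payload = {
        "dialect": "postgresql",
        "schema": {
            "orders": ["id", "user_id", "amount", "created_at", "status"],
            "users": ["id", "email", "plan", "region"],
        },
        "question": "Monthly revenue by region for paid users in 2024",
        "validate": True,
    }
    request = build_request("sql/generate", sql_payload, api_key)
    # Inspect the request without sending it; use call(...) to actually send.
    print(request.get_method(), request.full_url)
```

Swapping the path for `rag/query`, `memory/write`, `extract`, `embed`, or `rerank` with the matching body reuses the same helper unchanged.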