API Solutions — Drut AI

AI capabilities,
delivered as APIs.

Six production-ready AI services. One API key. No infrastructure to manage, no pipelines to build. Start querying in under a minute.

6 API services · < 1 min integration · 99.9% uptime SLA · No infra to manage

API endpoints: /v1/rag/query · /v1/sql/generate · /v1/memory · /v1/extract · /v1/embed · /v1/rerank

API Offerings · 6 services

RAG-as-a-Service

Retrieval-augmented generation. No infra, no indexing pipelines.

retrieval

POST /v1/rag/query

Upload your documents once. Query forever. We handle chunking, embedding, vector storage, re-ranking, and generation — all in a single API call. Bring your own LLM or use ours.

Use cases

Enterprise knowledge bases · Support copilots · Research assistants · Doc QA

Latency: < 400ms p95

SLA: 99.9% uptime

Pricing

Charged per query. Embedding ingestion billed separately at $0.004 / 1k tokens.

Starter: $0.008 / query (up to 10k queries/mo)
Growth: $0.005 / query (10k – 500k queries/mo)
Scale: $0.003 / query (500k+ queries/mo)
Enterprise: custom / query (SLA + dedicated cluster)
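
As a rough worked example of the RAG pricing above, the sketch below estimates a monthly bill. It assumes the tier is chosen by total monthly volume and that one flat rate then applies to every query (not graduated pricing), and that tier upper bounds are inclusive. The page states none of this, and the function name is hypothetical.

```python
def rag_monthly_cost(queries: int, ingested_tokens: int = 0) -> float:
    """Estimate a monthly RAG bill in USD from the published list prices.

    Assumption: a single flat per-query rate, picked by monthly volume.
    """
    if queries <= 10_000:        # Starter
        rate = 0.008
    elif queries <= 500_000:     # Growth
        rate = 0.005
    else:                        # Scale
        rate = 0.003
    # Embedding ingestion is billed separately at $0.004 per 1k tokens.
    ingestion = ingested_tokens / 1_000 * 0.004
    return queries * rate + ingestion


# 50,000 queries fall in the Growth tier; 2M tokens are ingested that month.
estimate = rag_monthly_cost(50_000, ingested_tokens=2_000_000)
print(f"${estimate:.2f}")
```

Under these assumptions, 50,000 queries cost $250 and the ingestion adds $8. Enterprise pricing is custom and is not modelled.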

POST https://api.drut.ai/v1/rag/query
Authorization: Bearer drut_sk_...
Content-Type: application/json

{
  "collection_id": "col_acme_docs_v2",
  "query": "What is our refund policy for enterprise plans?",
  "top_k": 5,
  "rerank": true,
  "model": "drut-rag-1",
  "stream": false
}
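
The same request can be sketched in Python using only the standard library. The helper name `build_rag_request` is hypothetical, and the request is built but not sent, because the response schema is not shown on this page.

```python
import json
import urllib.request

API_URL = "https://api.drut.ai/v1/rag/query"  # endpoint from the example above


def build_rag_request(api_key: str, collection_id: str, query: str,
                      top_k: int = 5, rerank: bool = True) -> urllib.request.Request:
    """Build (but do not send) a RAG query request."""
    payload = {
        "collection_id": collection_id,
        "query": query,
        "top_k": top_k,
        "rerank": rerank,
        "model": "drut-rag-1",
        "stream": False,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_rag_request(
    "drut_sk_...",  # placeholder key, as in the example above
    "col_acme_docs_v2",
    "What is our refund policy for enterprise plans?",
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req, timeout=10).
```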

NL-to-SQL-as-a-Service

Natural language to production SQL. Dialect-aware, schema-grounded.

generation

POST /v1/sql/generate

Send us a natural language question and your schema. Get back validated, optimised SQL — tested against PostgreSQL, MySQL, BigQuery, and Snowflake. Supports CTEs, window functions, and multi-table joins.

Use cases

BI self-service · Data chatbots · Report generators · Analyst copilots

Latency: < 250ms p95

SLA: 99.9% uptime

Pricing

Charged per generation call. Schema tokens are included in the per-call price.

Starter: $0.012 / call (up to 5k calls/mo)
Growth: $0.009 / call (5k – 100k calls/mo)
Scale: $0.006 / call (100k+ calls/mo)
Enterprise: custom / call (private deployment)

POST https://api.drut.ai/v1/sql/generate
Authorization: Bearer drut_sk_...
Content-Type: application/json

{
  "dialect": "postgresql",
  "schema": {
    "orders": ["id", "user_id", "amount", "created_at", "status"],
    "users":  ["id", "email", "plan", "region"]
  },
  "question": "Monthly revenue by region for paid users in 2024",
  "validate": true
}
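
A minimal sketch of assembling this payload in Python, with a few client-side checks before any call is made. Only the `"postgresql"` dialect identifier appears in the example above; the other three identifiers are guesses based on the listed databases, and `build_sql_payload` is a hypothetical helper.

```python
import json

# Assumption: only "postgresql" is confirmed by the example request;
# the remaining identifiers are guessed from the supported-database list.
SUPPORTED_DIALECTS = {"postgresql", "mysql", "bigquery", "snowflake"}


def build_sql_payload(dialect: str, schema: dict[str, list[str]],
                      question: str, validate: bool = True) -> str:
    """Return the JSON body for POST /v1/sql/generate."""
    if dialect not in SUPPORTED_DIALECTS:
        raise ValueError(f"unsupported dialect: {dialect!r}")
    if not schema or any(not columns for columns in schema.values()):
        raise ValueError("schema must list at least one column per table")
    return json.dumps({
        "dialect": dialect,
        "schema": schema,
        "question": question,
        "validate": validate,
    })


body = build_sql_payload(
    "postgresql",
    {
        "orders": ["id", "user_id", "amount", "created_at", "status"],
        "users": ["id", "email", "plan", "region"],
    },
    "Monthly revenue by region for paid users in 2024",
)
```

Rejecting an empty schema locally avoids paying for a call that cannot be grounded in any table.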

Agent Memory Layer

Persistent, queryable memory for AI agents. Priced by what you store.

memory

POST /v1/memory

Give your agents a long-term memory that persists across sessions. Store facts, events, user preferences, and agent observations. Retrieve by semantic similarity or exact filters. Memory points expire or persist based on your policy.

Use cases

Personalised agents · Customer history · Research agents · Multi-session bots

Latency: < 60ms p95

SLA: 99.95% uptime

Pricing

Charged per memory point stored per day. Retrieval queries are free up to 50k/mo.

Starter: $0.0002 / point / day (up to 500k points)
Growth: $0.00015 / point / day (500k – 10M points)
Scale: $0.00009 / point / day (10M+ points)
Enterprise: custom / point / day (isolated namespace)
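
Because memory is billed per point per day, cost scales with both how much you store and how long you keep it. The sketch below works through that arithmetic, assuming a constant number of stored points, a flat rate selected by point count, and inclusive tier upper bounds. None of these details are stated on this page, and the function name is hypothetical.

```python
def memory_storage_cost(points: int, days: int) -> float:
    """Estimate storage cost in USD for `points` held for `days` days.

    Assumption: a single flat rate chosen by the number of stored points.
    """
    if points <= 500_000:         # Starter
        rate = 0.0002
    elif points <= 10_000_000:    # Growth
        rate = 0.00015
    else:                         # Scale
        rate = 0.00009
    return points * rate * days


# 100,000 points kept for a 30-day month at the Starter rate.
print(f"${memory_storage_cost(100_000, 30):.2f}")
```

At the Starter rate, 100,000 points cost about $20 per day, or about $600 over 30 days, so a short `ttl_days` on low-value points has a direct effect on the bill. Retrieval queries beyond the free 50k/mo are not modelled because their price is not listed here.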

POST https://api.drut.ai/v1/memory/write
Authorization: Bearer drut_sk_...
Content-Type: application/json

{
  "namespace": "user_9821",
  "points": [
    {
      "content": "Prefers concise answers, dislikes bullet lists.",
      "type": "preference",
      "ttl_days": 90
    },
    {
      "content": "Reviewed Q3 report on 2025-11-03, flagged APAC anomaly.",
      "type": "event",
      "ttl_days": 30
    }
  ]
}

Document Extraction API

PDFs, tables, forms — structured JSON out. Every time.

vision

POST /v1/extract

Send any document — PDF, scanned image, invoice, contract — and receive structured JSON with entities, tables, key-value pairs, and layout annotations. Powered by our Mistral-based fine-tuned extractor.

Use cases

Invoice processing · Contract analysis · Form digitisation · Medical records

Latency: < 800ms p95

SLA: 99.9% uptime

Pricing

Charged per page extracted. Tables and forms count as one page each.

Starter: $0.018 / page (up to 2k pages/mo)
Growth: $0.012 / page (2k – 50k pages/mo)
Scale: $0.007 / page (50k+ pages/mo)
Enterprise: custom / page (on-prem available)

POST https://api.drut.ai/v1/extract
Authorization: Bearer drut_sk_...
Content-Type: application/json

{
  "document_url": "https://storage.co/invoices/inv_0042.pdf",
  "extract": ["entities", "tables", "key_values"],
  "language": "en",
  "output_format": "json"
}

Embeddings API

High-fidelity embeddings for search, clustering, and classification.

retrieval

POST /v1/embed

Generate dense vector embeddings for text, code, or structured data. Choose from our model family — 256d for speed, 1536d for precision. Batch up to 2,048 inputs per request.

Use cases

Semantic search · Duplicate detection · Clustering · Recommendation

Latency: < 180ms p95

SLA: 99.9% uptime

Pricing

Charged per 1k tokens embedded. Batching is strongly encouraged — up to 2,048 inputs.

Drut-Embed-Fast: $0.00008 / 1k tokens (256d, 120ms p95)
Drut-Embed-Base: $0.00018 / 1k tokens (768d, 180ms p95)
Drut-Embed-Large: $0.00035 / 1k tokens (1536d, 280ms p95)
Enterprise: custom / 1k tokens (fine-tuned domain model)

POST https://api.drut.ai/v1/embed
Authorization: Bearer drut_sk_...
Content-Type: application/json

{
  "model": "drut-embed-base",
  "inputs": [
    "Agent memory systems for production LLMs",
    "Long-term episodic storage in transformer agents"
  ],
  "truncate": true
}
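
Since each request accepts up to 2,048 inputs, a larger corpus has to be split into batches. The sketch below shows one way to do that client-side. The generator name `batch_payloads` is hypothetical, and each payload would still need to be POSTed to /v1/embed with the headers shown above.

```python
from typing import Iterator

MAX_BATCH = 2_048  # per-request input limit stated on this page


def batch_payloads(inputs: list[str],
                   model: str = "drut-embed-base") -> Iterator[dict]:
    """Yield one /v1/embed request body per batch of at most MAX_BATCH inputs."""
    for start in range(0, len(inputs), MAX_BATCH):
        yield {
            "model": model,
            "inputs": inputs[start:start + MAX_BATCH],
            "truncate": True,
        }


corpus = [f"document {i}" for i in range(5_000)]
payloads = list(batch_payloads(corpus))
print([len(p["inputs"]) for p in payloads])  # [2048, 2048, 904]
```

Order is preserved across batches, so the caller can concatenate the returned vectors and still line them up with the original inputs.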

Reranking API

Cross-encoder reranking. Lift your retrieval precision from 70% to 94%.

retrieval

POST /v1/rerank

Pass a query and a list of candidate documents. Get back a precision-scored, sorted list using our cross-encoder model. Works with any vector retrieval system or BM25.

Use cases

RAG precision boost · Search re-ordering · Candidate shortlisting · FAQ matching

Latency: < 100ms p95

SLA: 99.9% uptime

Pricing

Charged per reranking call, regardless of candidate count (up to 100 docs).

Starter: $0.006 / call (up to 20k calls/mo)
Growth: $0.004 / call (20k – 500k calls/mo)
Scale: $0.0025 / call (500k+ calls/mo)
Enterprise: custom / call (latency SLA)

POST https://api.drut.ai/v1/rerank
Authorization: Bearer drut_sk_...
Content-Type: application/json

{
  "query": "How to reduce inference cost for large models?",
  "documents": [
    "Quantisation techniques reduce model size by 4x...",
    "The history of transformer architectures...",
    "Batching strategies for GPU inference workloads..."
  ],
  "top_n": 2,
  "return_scores": true
}
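
A small sketch of building this payload while checking the candidate count locally. It treats the 100-document figure from the pricing note as a hard per-call limit, which is an assumption, and `build_rerank_payload` is a hypothetical helper.

```python
MAX_CANDIDATES = 100  # assumption: the "up to 100 docs" pricing note is a hard cap


def build_rerank_payload(query: str, documents: list[str], top_n: int = 2) -> dict:
    """Return the JSON-serialisable body for POST /v1/rerank."""
    if not documents:
        raise ValueError("at least one candidate document is required")
    if len(documents) > MAX_CANDIDATES:
        raise ValueError(f"got {len(documents)} candidates; limit is {MAX_CANDIDATES}")
    return {
        "query": query,
        "documents": list(documents),
        # Asking for more results than there are candidates makes no sense.
        "top_n": min(top_n, len(documents)),
        "return_scores": True,
    }


payload = build_rerank_payload(
    "How to reduce inference cost for large models?",
    [
        "Quantisation techniques reduce model size by 4x...",
        "The history of transformer architectures...",
        "Batching strategies for GPU inference workloads...",
    ],
)
```

Because the price is per call rather than per candidate, sending the full shortlist from your retriever in one request costs the same as sending a handful.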

Ready to integrate?

Get an API key and start building in minutes. All services share the same authentication and base URL. SDKs for Python and TypeScript available.
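
Because every service shares one key and one base URL, a single thin wrapper can cover all six endpoints. The class below is a hypothetical sketch using only the Python standard library, not the official SDK. The `DRUT_API_KEY` environment variable name is an assumption, and `post` assumes responses are JSON.

```python
import json
import os
import urllib.request

BASE_URL = "https://api.drut.ai"  # base URL used by every example on this page


class DrutClient:
    """Hypothetical minimal client; not the official Python SDK."""

    def __init__(self, api_key: str) -> None:
        self._headers = {
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        }

    def build(self, path: str, payload: dict) -> urllib.request.Request:
        """Build a POST request for any endpoint without sending it."""
        return urllib.request.Request(
            BASE_URL + path,
            data=json.dumps(payload).encode("utf-8"),
            headers=self._headers,
            method="POST",
        )

    def post(self, path: str, payload: dict, timeout: float = 10.0) -> dict:
        """Send the request; assumes the service replies with a JSON body."""
        with urllib.request.urlopen(self.build(path, payload), timeout=timeout) as resp:
            return json.load(resp)


# DRUT_API_KEY is an assumed variable name; the fallback is a placeholder.
client = DrutClient(os.environ.get("DRUT_API_KEY", "drut_sk_..."))
rerank_req = client.build("/v1/rerank", {"query": "q", "documents": ["a", "b"], "top_n": 1})
embed_req = client.build("/v1/embed", {"model": "drut-embed-base", "inputs": ["a"]})
```

Keeping `build` separate from `post` makes requests easy to inspect or log before any billable call is made.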

