DRUT AI | Modern Agentic Ai Company

The Challenge

The Due Diligence Bottleneck

M&A due diligence at a major AmLaw 100 firm was consuming junior associate capacity at a rate that limited deal throughput. A mid-sized acquisition with 3,000 contracts in the data room required 6–8 junior associates reading for 3–4 weeks: extracting risk clauses, flagging unusual terms, and checking compliance with the firm's playbook requirements. At billing rates of $350–$450 per hour, the associate cost alone for a typical deal ran $200,000–$280,000. Deal complexity was increasing but headcount was not, which capped the number of deals partners could accept.

Accuracy Requirements in a Zero-Tolerance Domain

The domain has near-zero tolerance for missed risk signals. A missed change-of-control clause, an overlooked indemnification carve-out, or an undetected non-compete conflict can expose a client to eight-figure liability. The firm's existing document review process had a 98.1% clause recall rate. That is impressive, but in a 3,000-contract data room where each contract carries a single material clause, a 1.9% miss rate means 57 potentially material clauses not surfaced for partner review. Any AI-assisted system needed to demonstrate at minimum parity with that baseline, with the aspiration of exceeding it. This was a non-negotiable requirement before any production deployment would be authorized.
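The arithmetic behind the 57-clause figure can be checked directly. The short Python sketch below assumes one material clause per contract, a simplification implied by the figure rather than stated by the firm; `expected_misses` is a hypothetical helper name.

```python
def expected_misses(material_clauses: int, recall: float) -> int:
    """Expected number of material clauses left unsurfaced at a given recall."""
    if not 0.0 <= recall <= 1.0:
        raise ValueError("recall must be between 0 and 1")
    return round(material_clauses * (1.0 - recall))


# Assumption: 3,000 contracts with one material clause each.
manual_baseline = expected_misses(3000, 0.981)
print(manual_baseline)  # 57
```

A data room with several material clauses per contract would scale the count proportionally, so 57 is better read as a floor than as a precise estimate.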

"In a 3,000-contract data room, a 1.9% miss rate means 57 potentially material clauses not surfaced. We needed to beat that — not just match it."

Our Solution

Data Room Ingestion and Contract Parsing

The system ingests full data rooms via secure, authenticated connections to the firm's virtual data room providers (Intralinks, Datasite). Contracts are extracted, normalized across formats (PDF, Word, scanned documents via OCR), and chunked at the section level with structural metadata preserved. A document classifier identifies contract type (employment, supplier, customer, IP assignment, lease, etc.) and applies the appropriate extraction schema. Each contract type has a defined set of material clauses the system looks for — drawn from the firm's playbooks and expanded through analysis of 50,000 historical contract reviews.
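The classify-then-extract flow described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the production pipeline: the keyword classifier stands in for whatever trained classifier the system uses, and the schema contents, the numbered-heading pattern, and all names (`EXTRACTION_SCHEMAS`, `classify_contract`, `chunk_by_section`) are hypothetical.

```python
import re
from dataclasses import dataclass

# Hypothetical schemas: contract type -> material clause types to look for.
EXTRACTION_SCHEMAS = {
    "employment": ["non_compete", "change_of_control", "termination"],
    "supplier": ["indemnification", "limitation_of_liability", "termination"],
    "lease": ["assignment", "change_of_control", "renewal"],
}

# Toy keyword lists standing in for a trained document classifier.
TYPE_KEYWORDS = {
    "employment": ("employee", "employer"),
    "lease": ("landlord", "tenant"),
    "supplier": ("supplier", "purchase order"),
}

# Assumes sections start with a numbered heading such as "2. Non-Compete".
SECTION_RE = re.compile(r"^(\d+)\.[ \t]+(.+)$", re.MULTILINE)


@dataclass
class Chunk:
    contract_id: str
    section: str   # section number, kept as structural metadata
    heading: str
    text: str


def classify_contract(text: str) -> str:
    """Return the first contract type whose keywords appear, else 'unknown'."""
    lowered = text.lower()
    for contract_type, keywords in TYPE_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return contract_type
    return "unknown"


def chunk_by_section(contract_id: str, text: str) -> list[Chunk]:
    """Split a contract at numbered headings, one chunk per section."""
    matches = list(SECTION_RE.finditer(text))
    chunks = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        body = text[match.end():end].strip()
        chunks.append(Chunk(contract_id, match.group(1), match.group(2).strip(), body))
    return chunks


SAMPLE = (
    "1. Definitions\n"
    '"Employee" means the individual named in Schedule 1.\n'
    "2. Non-Compete\n"
    "The Employee shall not compete with the Company for 12 months.\n"
)

contract_type = classify_contract(SAMPLE)
schema = EXTRACTION_SCHEMAS.get(contract_type, [])
chunks = chunk_by_section("emp-001", SAMPLE)
```

Keeping the section number and heading on each chunk is what lets a later flag point back to a specific location in the source contract.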

Risk Register Generation

The system generates a structured risk register per data room: every flagged clause with its source contract, page reference, relevant playbook standard, and a severity classification (material risk, notable deviation, standard term, favorable term). Partners review the risk register rather than reading source documents — they click through to underlying contracts only when a flagged item warrants deeper review. Clause extraction recall reached 99.2% on the evaluation set — exceeding the firm's manual baseline. The false positive rate (flagging standard terms as notable deviations) was tuned conservatively: better to surface more flags and trust partner judgment than to suppress flags and miss material items.
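One way to picture a risk register entry and the recall metric is the sketch below. The field names, the `Severity` ordering, and the helper functions are assumptions made for illustration; only the four severity classes and the listed fields come from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum


class Severity(IntEnum):
    # Lower value sorts first, so material risks lead the register.
    MATERIAL_RISK = 0
    NOTABLE_DEVIATION = 1
    STANDARD_TERM = 2
    FAVORABLE_TERM = 3


@dataclass(frozen=True)
class RiskEntry:
    contract: str
    page: int
    clause_type: str
    playbook_standard: str
    severity: Severity


def build_register(entries: list[RiskEntry]) -> list[RiskEntry]:
    """Order entries by severity, then by contract and page."""
    return sorted(entries, key=lambda e: (e.severity, e.contract, e.page))


def recall(flagged: set[str], gold: set[str]) -> float:
    """Share of known material clauses (gold) that the system flagged."""
    if not gold:
        return 1.0
    return len(flagged & gold) / len(gold)


entries = [
    RiskEntry("lease_07.pdf", 12, "renewal", "Playbook 4.2", Severity.STANDARD_TERM),
    RiskEntry("supply_03.pdf", 5, "change_of_control", "Playbook 1.1", Severity.MATERIAL_RISK),
]
register = build_register(entries)
```

Recall only counts misses, which matches the conservative tuning described above: extra flags cost partner time but do not lower this metric.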

Cross-Document Consistency Analysis

A distinctive capability beyond standard clause extraction: the system identifies inconsistencies across contracts in the same data room. Examples include a definition in one agreement that conflicts with how the same term is defined in a related agreement, or an indemnification obligation in a supplier contract that contradicts a limitation of liability in a customer contract. These cross-document conflicts are among the highest-value signals in due diligence and the most difficult for manual review to catch systematically.
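The definition-conflict case can be illustrated with a deliberately simple sketch. It assumes definitions follow the pattern `"Term" means ... .` and treats any textual difference as a conflict; a production system would presumably compare meaning rather than strings. The regex and function names are hypothetical.

```python
import re
from collections import defaultdict

# Assumes definitions are written as: "Term" means <definition>.
DEFINITION_RE = re.compile(r'"([^"]+)"\s+means\s+([^.]+)\.')


def extract_definitions(text: str) -> dict[str, str]:
    """Map each defined term to its whitespace-normalized, lowercased definition."""
    return {
        m.group(1): " ".join(m.group(2).split()).lower()
        for m in DEFINITION_RE.finditer(text)
    }


def find_conflicts(contracts: dict[str, str]) -> dict[str, dict[str, str]]:
    """Return terms that are defined differently in at least two contracts."""
    by_term: dict[str, dict[str, str]] = defaultdict(dict)
    for contract_id, text in contracts.items():
        for term, definition in extract_definitions(text).items():
            by_term[term][contract_id] = definition
    return {
        term: defs
        for term, defs in by_term.items()
        if len(set(defs.values())) > 1
    }


contracts = {
    "supply_agreement.pdf": (
        '"Affiliate" means any entity controlling the Supplier. '
        '"Term" means five years.'
    ),
    "customer_agreement.pdf": (
        '"Affiliate" means any entity under common control with the Customer. '
        '"Term" means five years.'
    ),
}
conflicts = find_conflicts(contracts)
```

Here only "Affiliate" is reported, because "Term" is defined identically in both agreements.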

"Cross-document conflict detection — finding the definition in one agreement that contradicts the related agreement — is the capability that most surprised partner-level reviewers."

Results

Time and Cost Impact

Due diligence cycle time compressed from 3 weeks to 3 days on a representative 3,000-contract data room. Associate hours dropped from 1,200+ to approximately 180, with the remaining time spent on partner-level review of surfaced flags, client communication, and issues requiring legal judgment. At the firm's billing rates, this represented roughly $180,000 in associate cost reduction per deal. The firm chose to pass approximately 40% of this saving to clients as competitive pricing and retain 60% as margin improvement. Deal throughput increased because the capacity constraint had been removed.
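As a quick check of the figures above, the 40/60 split and the reduction in associate hours work out as follows. The helper name `split_saving` is hypothetical, and because the starting figure is stated as "1,200+", the hours reduction should be read as approximate.

```python
def split_saving(saving_usd: int, client_share_pct: int) -> tuple[int, int]:
    """Split a per-deal saving into (client share, firm share) in whole dollars."""
    client = saving_usd * client_share_pct // 100
    return client, saving_usd - client


client_usd, firm_usd = split_saving(180_000, 40)

# Reduction from roughly 1,200 associate hours to roughly 180.
hours_cut_pct = (1200 - 180) * 100 / 1200

print(client_usd, firm_usd, hours_cut_pct)  # 72000 108000 85.0
```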

The Zero Hallucination Audit

The firm conducted a 6-month audit of AI-extracted clauses against source contracts. Auditors reviewed a random sample of 2,400 extracted clauses across 18 deals. Zero fabricated clauses were found — every extracted clause mapped to an exact source passage with accurate page and section citation. This outcome was not accidental. The system is explicitly designed to extract and cite, never to infer or summarize without grounding. When a clause pattern is detected, the system returns the verbatim passage and its location. Synthesis of meaning is left to the reviewing attorney.
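The extract-and-cite principle lends itself to an automated check: an extracted clause passes only if its text appears verbatim on the cited page. The sketch below is an assumption about how such a check could look, not the firm's audit procedure; the `ExtractedClause` type, the page lookup keyed by (contract, page), and the whitespace normalization are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ExtractedClause:
    contract: str
    page: int
    text: str


def normalize(s: str) -> str:
    """Collapse whitespace so line breaks in the source do not cause false alarms."""
    return " ".join(s.split())


def is_grounded(clause: ExtractedClause, pages: dict[tuple[str, int], str]) -> bool:
    """True if the clause text occurs verbatim on the page it cites."""
    source = pages.get((clause.contract, clause.page))
    return source is not None and normalize(clause.text) in normalize(source)


def audit(clauses: list[ExtractedClause], pages: dict[tuple[str, int], str]) -> list[ExtractedClause]:
    """Return the clauses that could not be matched to their cited page."""
    return [c for c in clauses if not is_grounded(c, pages)]


pages = {
    ("supply_03.pdf", 5): "Upon a Change of Control,\nthe Customer may terminate this Agreement.",
}
clauses = [
    ExtractedClause("supply_03.pdf", 5, "the Customer may terminate this Agreement"),
    ExtractedClause("supply_03.pdf", 5, "the Supplier may terminate at will"),
]
ungrounded = audit(clauses, pages)
```

In this example the second clause is returned as ungrounded because its text does not occur on the cited page.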

Implementation

Process & Timeline

Playbook Digitization

Converted 12 practice-area playbooks into machine-readable extraction schemas. Defined clause taxonomy with senior associates across M&A, PE, and corporate practices.

Ingestion & OCR Pipeline

Built secure data room connectors, multi-format normalization, and OCR pipeline for scanned documents. Established data handling protocols meeting firm security requirements.

Extraction Engine

Built contract type classifier, clause extraction models, and cross-document consistency analysis. Extensive evaluation against historical deal data.

Risk Register & UI

Built partner-facing risk register interface with clause drill-down, severity filtering, and export to Word. Three rounds of partner feedback on information hierarchy.

Validation & Launch

Ran in shadow mode on 3 live deals alongside traditional review. Validated recall before production authorization. Phased launch by practice group.

Technology Stack

GPT-4o, LlamaIndex, Pinecone, Python, FastAPI, Tesseract OCR, PostgreSQL, AWS S3, Intralinks API, Datasite API, WeasyPrint, Celery

Contract Analysis System for M&A Due Diligence