The litigation team faced four million documents and a six-month discovery deadline. Traditional linear review—attorneys examining each document individually—would require 200 reviewers working full-time for the entire period. The economics were impossible: at prevailing contract review rates, document review alone would cost £12 million before any substantive legal work began.
Using Technology-Assisted Review, the team achieved 95% recall of relevant documents by reviewing just 8% of the collection. Total cost: under £1.5 million. More importantly, the TAR-assisted review proved more accurate than manual review would have been—consistently identifying relevant documents that human reviewers, suffering from fatigue and inconsistency, would have missed.
TAR represents perhaps the most significant methodological advance in litigation practice in decades. Understanding how it works—and how to implement it effectively—has become essential knowledge for any legal professional involved in discovery or document-intensive matters.
How Technology-Assisted Review Actually Works
The Fundamental Concept
At its core, TAR uses machine learning to predict document relevance based on examples provided by human reviewers. The process involves several interconnected steps:
Training: Experienced attorneys review a sample of documents, coding each as relevant, not relevant, or privileged. These coded documents become training data that teaches the system what "relevant" means for this particular matter.
Learning: The TAR system analyses the training documents, identifying patterns that distinguish relevant from non-relevant documents. These patterns extend far beyond keywords to include word usage patterns, phrase constructions, document structure, metadata characteristics, and conceptual relationships.
Prediction: The system applies what it learned to predict relevance for unreviewed documents, generating relevance scores that indicate the likelihood each document is relevant to the matter.
Refinement: As attorneys review additional documents, confirming or correcting the system's predictions, the system continues learning and improving its predictions.
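The four-step loop above can be sketched with a deliberately tiny, self-contained relevance scorer. Everything here is illustrative (the documents, the labels, and the word-weighting scheme); real TAR systems use far richer features, but the train-score-refine cycle is the same:

```python
from collections import Counter

def train(labelled):
    """Learn per-word relevance weights from attorney-coded examples."""
    rel, nonrel = Counter(), Counter()
    for text, is_relevant in labelled:
        (rel if is_relevant else nonrel).update(text.lower().split())
    # Weight = smoothed fraction of a word's occurrences in relevant documents.
    vocab = set(rel) | set(nonrel)
    return {w: (rel[w] + 1) / (rel[w] + nonrel[w] + 2) for w in vocab}

def score(model, text):
    """Average the weights of a document's known words (0.5 = uninformative)."""
    words = [w for w in text.lower().split() if w in model]
    return sum(model[w] for w in words) / len(words) if words else 0.5

# Training: attorneys code an initial sample.
labelled = [
    ("employee dismissal meeting notes", True),
    ("quarterly catering invoice", False),
]
model = train(labelled)

# Prediction: score the unreviewed documents and rank by predicted relevance.
unreviewed = ["notes on the dismissal decision", "catering order for march"]
ranked = sorted(unreviewed, key=lambda d: score(model, d), reverse=True)

# Refinement: a newly coded document extends the training data,
# and retraining sharpens subsequent predictions.
labelled.append((ranked[0], True))
model = train(labelled)
```

The key property the sketch preserves is that every attorney decision feeds back into the model, so prediction quality improves as review proceeds.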
What Makes TAR Different from Keyword Searching
Keyword searches identify documents containing specific terms—a fundamentally limited approach for several reasons:
Synonym blindness: A search for "terminated" won't find documents about "fired," "let go," "dismissed," or "discharged" unless all variants are anticipated and included.
Context ignorance: A search for "breach" returns documents about data breaches, contract breaches, breaches of duty, and security breaches—without understanding which type is relevant to the matter.
Conceptual limitation: Keywords cannot identify documents that discuss relevant concepts using entirely different vocabulary—a document about "performance improvement plans" may be highly relevant to a termination dispute without containing any of the expected keywords.
TAR systems, by contrast, learn to recognise relevant documents based on the totality of their characteristics. A document can be identified as relevant even if it uses completely different language from the training examples, as long as it shares conceptual similarity.
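The synonym-blindness problem is easy to demonstrate with a toy keyword search (the three documents are invented for illustration):

```python
# Three documents about the same underlying event, using different vocabulary.
docs = [
    "The employee was terminated in March.",
    "She was dismissed after the review.",
    "He was let go following the merger.",
]

# A keyword search for "terminated" matches only the first document;
# the synonym variants are silently missed unless every one is anticipated.
hits = [d for d in docs if "terminated" in d.lower()]
```

A trained TAR model, having seen coded examples of all three phrasings, would score them similarly despite the absence of any shared keyword.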
TAR Methodologies: From TAR 1.0 to Continuous Active Learning
TAR 1.0: Simple Active Learning (SAL)
The original TAR methodology, sometimes called "predictive coding" or "simple active learning," follows a batch-based approach:
- Seed Set Development: Subject matter experts review an initial set of documents (typically 1,000-2,500) to create training data
- Initial Training: The system trains on the seed set, learning initial patterns
- Scoring: The system scores the entire document population, generating relevance predictions
- Review: Attorneys review high-scored documents, coding them as relevant or not
- Iteration: The new coded documents are added to training data, and the process repeats
- Stabilisation: When additional training stops significantly changing the model's predictions, review is considered sufficiently complete
Characteristics of TAR 1.0:
- Clear separation between training and review phases
- Seed set quality critically important
- Multiple training iterations required
- Well-documented and court-accepted methodology
- Can be computationally intensive for very large populations
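The stabilisation step in the TAR 1.0 workflow needs a concrete stopping rule. One simple, illustrative heuristic is to compare the model's relevance scores across consecutive training rounds and stop when the average shift falls below a tolerance (the scores and the 0.02 threshold here are invented for the example):

```python
def mean_shift(prev_scores, new_scores):
    """Average absolute change in per-document relevance scores
    between two training rounds."""
    deltas = [abs(a - b) for a, b in zip(prev_scores, new_scores)]
    return sum(deltas) / len(deltas)

# Hypothetical scores for the same four documents after rounds 1 and 2.
round1 = [0.90, 0.10, 0.60, 0.30]
round2 = [0.92, 0.09, 0.58, 0.31]

# Training is considered stable when further coding barely moves predictions.
stable = mean_shift(round1, round2) < 0.02
```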
TAR 2.0: Continuous Active Learning (CAL)
The evolved approach, developed by researchers including Professor Maura Grossman and Gordon Cormack, eliminates the separate seed set phase:
- Initial Selection: The system presents documents predicted to be most useful for training (often based on diversity or uncertainty measures)
- Immediate Learning: Each attorney coding decision immediately updates the model
- Dynamic Presentation: The next batch of documents reflects the updated model's predictions
- Continuous Refinement: The process continues without distinct training/review phases
- Completion: Review continues until statistical validation confirms acceptable recall
Characteristics of TAR 2.0/CAL:
- No separate seed set—every reviewed document contributes to training
- More efficient than TAR 1.0 in many scenarios
- System optimises which documents to present for review
- Adapts dynamically as relevance criteria emerge or evolve
- Generally reaches recall targets with fewer reviewed documents
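The document-selection step that distinguishes CAL can be sketched as two tiny ranking functions. Both strategies appear in practice: relevance feedback presents the highest-scored unreviewed document, while uncertainty sampling presents the one the model is least sure about. The scores below are illustrative:

```python
def by_relevance(scores):
    """Relevance feedback: present the highest-scored unreviewed document."""
    return max(scores, key=scores.get)

def by_uncertainty(scores):
    """Uncertainty sampling: present the document whose score is
    closest to 0.5, i.e. the one the model is least sure about."""
    return min(scores, key=lambda d: abs(scores[d] - 0.5))

# Hypothetical model scores for three unreviewed documents.
scores = {"doc_a": 0.95, "doc_b": 0.52, "doc_c": 0.08}
```

After each attorney coding decision, the model updates and the ranking is recomputed, so the "next batch" always reflects the current state of the model.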
TAR 3.0 and Beyond: Integrated AI
Emerging approaches extend TAR with advanced capabilities:
Multi-Model Ensembles: Combining multiple machine learning algorithms to improve accuracy and reduce model-specific biases.
Concept-Based Analysis: Understanding documents at a semantic level, identifying topics and themes rather than just statistical patterns.
Cross-Matter Learning: Leveraging learning from prior matters to accelerate training on new, similar matters.
Integrated Issue Coding: Simultaneously predicting relevance and categorising documents by legal issue, privilege status, and other dimensions.
Statistical Validation: Proving TAR Works
The Core Metrics
TAR validation centres on two key metrics:
Recall: What percentage of all relevant documents did the review identify?
Formula: Relevant documents found ÷ Total relevant documents in the collection
Target: Typically 75-90% depending on matter requirements and proportionality considerations
Precision: What percentage of documents identified as relevant actually were relevant?
Formula: True relevant documents ÷ All documents coded relevant
Higher precision means fewer non-relevant documents reviewed, improving efficiency
There is an inherent trade-off: increasing recall (finding more relevant documents) typically decreases precision (includes more false positives). The optimal balance depends on matter requirements.
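The two formulas translate directly into code. The review counts below are hypothetical, chosen only to show the calculation:

```python
def recall(found_relevant, total_relevant):
    """Fraction of all relevant documents that the review identified."""
    return found_relevant / total_relevant

def precision(true_relevant, coded_relevant):
    """Fraction of documents coded relevant that actually were relevant."""
    return true_relevant / coded_relevant

# Hypothetical outcome: 9,000 of 10,000 relevant documents were found,
# and 12,000 documents in total were coded relevant.
r = recall(9_000, 10_000)     # 0.90
p = precision(9_000, 12_000)  # 0.75
```

Note how the trade-off shows up numerically: coding more documents as relevant tends to raise the recall numerator while inflating the precision denominator.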
Validation Methods
Elusion Sampling
The gold standard for TAR validation:
- Take a random statistical sample from documents the TAR system classified as non-relevant
- Review every document in the sample manually (by senior attorneys, blind to TAR classification)
- Count relevant documents found in the sample—these are documents TAR "missed"
- Extrapolate statistically to estimate total missed relevant documents in the non-relevant set
- Calculate recall confidence interval based on sample results
If the elusion sample contains few relevant documents, TAR achieved high recall. If the sample reveals significant missed documents, additional training or review may be needed.
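The extrapolation step reduces to simple arithmetic: the elusion rate observed in the sample is scaled up to the whole discard pile, and recall is recomputed with those estimated misses included. All figures below are hypothetical:

```python
def estimate_recall(found_relevant, discard_size, sample_size, sample_relevant):
    """Point estimate of recall from an elusion sample of the
    documents TAR classified as non-relevant."""
    elusion_rate = sample_relevant / sample_size
    estimated_missed = elusion_rate * discard_size
    return found_relevant / (found_relevant + estimated_missed)

# Hypothetical: 9,000 relevant documents found; 500,000 documents in the
# discard pile; an elusion sample of 1,000 contains 2 relevant documents.
r = estimate_recall(9_000, 500_000, 1_000, 2)  # ≈ 0.90
```

A defensible protocol would report this point estimate together with a confidence interval derived from the sample, not the point estimate alone.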
Control Set Comparison
An alternative validation approach:
- At the outset, set aside a random sample as a "control set"
- Review the control set manually using traditional methods
- Do not use control set documents for TAR training
- Compare TAR predictions against manual control set coding
- Calculate recall and precision based on comparison
Control sets provide ongoing validation throughout the review process, not just at completion.
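The comparison itself reduces to counting agreement between TAR predictions and the manual control-set coding. The six-document example is invented purely to show the mechanics:

```python
def control_set_metrics(predictions, labels):
    """Recall and precision of TAR predictions measured against
    manually coded control-set labels."""
    tp = sum(p and l for p, l in zip(predictions, labels))        # true positives
    fp = sum(p and not l for p, l in zip(predictions, labels))    # false positives
    fn = sum(not p and l for p, l in zip(predictions, labels))    # false negatives
    return tp / (tp + fn), tp / (tp + fp)

# Hypothetical: TAR predictions vs. manual coding for six control documents.
preds  = [True, True, False, True, False, False]
labels = [True, False, False, True, True, False]
r, p = control_set_metrics(preds, labels)
```

Because the control set is never used for training, these numbers can be recomputed at any point during the review to track model quality over time.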
Sample Sizes and Confidence Intervals
Statistical validity requires adequate sample sizes:
| Confidence Level | Margin of Error | Required Sample Size |
|---|---|---|
| 95% | ±5% | 385 documents minimum |
| 95% | ±3% | 1,067 documents minimum |
| 99% | ±3% | 1,843 documents minimum |
For large populations, the required sample size is essentially independent of population size—a counterintuitive but statistically sound principle.
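These sample sizes come from the standard normal-approximation formula for estimating a proportion, n = z² · p(1 − p) / e², using the conservative worst case p = 0.5. A minimal sketch (taking the ceiling; published tables sometimes round instead, which explains occasional off-by-one differences from the figures above):

```python
import math

def sample_size(z, margin, p=0.5):
    """Normal-approximation sample size for estimating a proportion.
    z: critical value for the confidence level (1.96 for 95%, 2.576 for 99%)
    margin: desired margin of error as a fraction (0.05 for ±5%)
    p: assumed proportion; 0.5 is the conservative worst case."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)

n_95_5 = sample_size(1.96, 0.05)    # 385
n_95_3 = sample_size(1.96, 0.03)    # 1068
n_99_3 = sample_size(2.576, 0.03)   # 1844
```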
Implementing TAR: A Practical Protocol
Phase 1: Collection Analysis
Before TAR begins, thoroughly understand the document collection:
Document Population Assessment:
- Total volume after processing and de-duplication
- Document types present (emails, attachments, native files, images)
- Date ranges covered
- Custodians represented
- Language distribution
Richness Estimation:
Richness refers to the percentage of the collection that's relevant. A random sample review provides initial estimates:
| Richness Level | Percentage | TAR Implications |
|---|---|---|
| Very Low | < 5% | TAR highly efficient; finding needles in a haystack |
| Low | 5-15% | TAR typically optimal approach |
| Moderate | 15-40% | TAR beneficial; efficiency gains moderate |
| High | > 40% | Linear review may be competitive; TAR still adds consistency |
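Estimating richness from a random sample, and mapping the estimate to the bands in the table above, is straightforward. The sample counts here are hypothetical:

```python
def estimate_richness(sample_size, sample_relevant):
    """Point estimate of collection richness from a random sample."""
    return sample_relevant / sample_size

def richness_band(richness):
    """Map a richness estimate to the bands used in the table above."""
    if richness < 0.05:
        return "Very Low"
    if richness < 0.15:
        return "Low"
    if richness < 0.40:
        return "Moderate"
    return "High"

# Hypothetical: a 400-document random sample contains 32 relevant documents.
rich = estimate_richness(400, 32)   # 0.08
band = richness_band(rich)          # "Low" → TAR typically optimal
```

As with recall validation, a real protocol would attach a confidence interval to the richness estimate rather than rely on the point value alone.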
Phase 2: Seed Set Development (TAR 1.0) or Initial Training (TAR 2.0)
For TAR 1.0:
Seed set composition matters significantly:
- Include representative examples of all relevance categories
- Include clearly non-relevant examples
- Use a mix of judgmental selection (known important documents) and random selection (ensuring coverage)
- Typical size: 1,000-2,500 documents
- Review by subject matter experts, not junior reviewers
For TAR 2.0/CAL:
No separate seed set needed, but initial training should involve:
- Subject matter expert reviewers for initial batches
- Clear understanding of relevance criteria before beginning
- Calibration exercises if multiple reviewers involved
- Careful attention to edge cases in early review
Phase 3: Iterative Training and Review
Throughout the TAR process:
Maintain Review Consistency:
- Detailed review guidelines documented and shared
- Regular calibration exercises among reviewers
- Escalation protocols for unclear documents
- Quality control sampling of reviewer decisions
Monitor Progress:
- Track recall estimates at regular intervals
- Monitor precision trends (decreasing precision may indicate model degradation)
- Watch for stabilisation signals indicating training sufficiency
- Verify all document types and custodians are being captured
Phase 4: Validation and Completion
Before production:
- Conduct elusion sampling of non-relevant set
- Calculate recall confidence interval
- Document methodology comprehensively
- Preserve all training data, model parameters, and decision records
- Prepare defensibility documentation for potential challenges
When to Use TAR—And When Not To
TAR Is Appropriate When:
Large document volumes: TAR's efficiency advantages increase with scale. Below 10,000 documents, linear review may be comparable in cost.
Low to moderate richness: TAR excels at finding needles in haystacks. When 80% of documents are relevant, finding them isn't the challenge.
Definable relevance criteria: TAR learns what "relevant" means from examples. If relevance criteria are too vague or contested, training will be inconsistent.
Timeline permits validation: TAR requires statistical validation. Rush reviews with no time for proper protocol may not be TAR-suitable.
TAR May Not Be Ideal When:
Very small collections: Setup and validation overhead may exceed linear review cost for small populations.
Extremely high richness: When most documents are relevant, identifying them is straightforward; TAR's advantages diminish.
Highly technical content: Domain-specific vocabulary in technical or scientific documents may require specialised training or domain expertise that general TAR systems lack.
Rapidly evolving issues: If relevance criteria change substantially during review (e.g., new claims added), TAR training may become problematic.
TAR and Court Acceptance
The Landmark Cases
TAR has achieved broad judicial acceptance through key decisions:
Da Silva Moore v. Publicis Groupe (S.D.N.Y. 2012): The first published decision approving TAR, holding that computer-assisted review is an acceptable way to search for relevant ESI in appropriate cases, and citing research showing it can outperform exhaustive manual review. The court approved a detailed TAR protocol including seed set disclosure to opposing counsel.
Rio Tinto PLC v. Vale S.A. (S.D.N.Y. 2015): The court went further, declaring it "black letter law" that where the producing party wants to use TAR for document review, courts will permit it—while stopping short of compelling unwilling parties to adopt it.
Pyrrho Investments Ltd v. MWB Property Ltd (UK High Court 2016): The first English case approving TAR, finding it was "at least as accurate as, if not more accurate than" the manual review originally proposed.
Defensibility Requirements
Courts generally expect:
Transparency: Document your methodology and share it with opposing parties. "Black box" approaches that can't be explained invite challenge.
Validation: Provide statistical evidence that TAR achieved acceptable recall. Elusion sampling results with confidence intervals are the standard.
Reasonableness: The approach should be proportionate to the case. TAR for a ten-document dispute invites questions; TAR for millions of documents is expected.
Cooperation: Courts favour good faith engagement with opposing parties on TAR protocols. Unilateral imposition of contested methodologies is disfavoured.
RUNO's SENTINEL Advanced Review Platform
RUNO's SENTINEL module implements continuous active learning with integrated quality controls designed for defensible large-scale review:
Adaptive Learning Engine: The system learns from every attorney decision, continuously refining its understanding of relevance for your specific matter. Unlike batch-based approaches, learning happens in real-time as review progresses.
Intelligent Document Presentation: SENTINEL's Predictive Coding Panel presents documents optimised for training efficiency, prioritising documents that will most improve the model's performance.
Real-Time Validation: Built-in validation tools provide real-time recall estimates and confidence intervals, enabling teams to track progress toward defensible review completion without waiting for separate validation exercises.
Integrated Privilege Detection: Privilege analysis runs alongside relevance coding, with specialised models trained on privilege indicators. The system flags potential privilege issues for attorney verification before any production risk materialises.
Complete Audit Trail: Every training decision, model update, and review action is logged, creating the documentation foundation for defensibility. Export-ready reports satisfy court requirements for methodology transparency.
Cross-Matter Learning: SENTINEL enables organisations to build institutional knowledge across matters. Privilege models learn from prior reviews; relevance patterns from similar matters provide training acceleration.
Conclusion: The Present and Future of Document Review
Technology-Assisted Review represents the maturation of document review from manual process to data science discipline. The principles—machine learning from human decisions, statistical validation, continuous improvement—have transformed what's possible at scale.
The four-million-document matter that would have required 200 reviewers and £12 million becomes manageable. More importantly, TAR achieves accuracy that manual review cannot match—machines don't get tired, don't lose focus, and apply criteria consistently across millions of documents.
Courts recognise this. TAR isn't just accepted; in appropriate cases, failure to use it may itself be unreasonable. The question isn't whether to use TAR, but how to implement it effectively.
The answer lies in rigorous methodology: proper training, continuous validation, complete documentation. TAR done well is defensible, efficient, and accurate. TAR done poorly creates risk and wastes the technology's potential.
For legal teams facing large-scale document review, TAR isn't the future. It's the present. The only question is whether you're using it effectively.
Explore RUNO's SENTINEL Advanced Review Platform or request a demonstration to see how TAR transforms document review.