eDiscovery AI: How Semantic Search Changes Document Review

Document review is where litigation budgets go to die. A mid-size commercial dispute can produce tens of thousands of documents, and someone has to look at every one to decide: relevant, privileged, or both.

For decades, legal teams have relied on keyword search to narrow the pile. But keyword search has a fundamental problem that no amount of Boolean logic can fix.

Keywords find words. Legal evidence is organized around concepts.

The Keyword Problem in eDiscovery

Traditional eDiscovery platforms match your search terms against the text of every document in the collection. Type “breach of contract” and you get every document containing that exact phrase. Sounds reasonable until you consider what it misses.

A document describing “failure to perform under the agreement” covers the same concept but uses none of your search terms. An email where a project manager writes “they never delivered what they promised” is potentially your most damning piece of evidence, and keyword search won’t find it unless you guessed every possible way a witness might describe the issue.

The standard workaround: run dozens or hundreds of keyword variations, combine them with Boolean operators, and iterate. It’s time-consuming, expensive, and still incomplete.

Studies have consistently shown that keyword search in eDiscovery achieves recall rates (the percentage of relevant documents actually found) between 20% and 80%, depending on case complexity and the skill of the person writing queries.

Best case, keyword search misses 20% of relevant documents. Worst case, it misses 80%.

How Semantic Search Works in eDiscovery AI

Semantic search takes a fundamentally different approach. It matches meaning, not character strings.

When you upload documents to an eDiscovery AI platform, each document gets processed into a mathematical representation of its content (what AI researchers call an embedding). This embedding captures the concepts, relationships, and context within the document, not just the individual words.

When you search, your query is converted into the same kind of embedding. The system then finds documents whose meaning is closest to your query’s meaning, regardless of the specific words used.

Here’s what that looks like in practice:

Search for “breach of contract” and semantic search also returns documents about “failure to perform,” “non-compliance with agreement terms,” and “they didn’t hold up their end of the deal.”
Search for “communications about the merger” and the system finds emails, memos, and notes discussing the acquisition, even if the word “merger” never appears.
Search for “evidence of price fixing” and the system identifies documents discussing coordinated pricing, market allocation, and bid rigging. The concepts that constitute price fixing, not just the label.

Why This Matters for Litigation Teams

eDiscovery AI semantic search changes 3 things at once for litigation teams:

Better recall means fewer missed documents

Semantic search catches the documents that keyword search misses: different terminology, informal language, industry jargon that doesn’t match your search terms. In privilege review, that means catching privileged communications that don’t contain the word “attorney” or “privileged.” In relevance review, it means finding the evidence your case depends on.

Fewer false positives mean less time wasted

Keyword search returns every document containing your terms, whether or not those terms appear in a relevant context. “Mercury” shows up in documents about the planet, the element, the car brand, and the Roman god. Semantic search understands context, so it returns documents where “mercury” means what you actually mean.

Simpler queries mean faster workflows

You describe what you’re looking for in plain language. No complex Boolean queries, no proximity operators, no wildcards. That makes search accessible to attorneys who aren’t query specialists, and it kills the iterative query-refinement cycle that eats hours in traditional eDiscovery workflows.

Semantic Search and Defensibility

A common concern with any new review methodology is defensibility. Courts have long accepted keyword search as a reasonable approach to document review, largely because the methodology is transparent and reproducible. Can semantic search meet the same standard?

The answer is yes, provided the platform supports it. Dodonai’s eDiscovery AI shows exactly which passages in each document matched your query and why, creating a transparent audit trail. You can document your search methodology, demonstrate that your queries were designed to capture all relevant concepts, and show that the results were verified against the source documents.

This matters because courts increasingly recognize that keyword search alone may be insufficient. Judges in cases like Da Silva Moore v. Publicis Groupe and subsequent decisions have endorsed technology-assisted review methodologies, including approaches that go beyond simple keyword matching.

Where Semantic Search Fits in the Review Workflow

Semantic search doesn’t replace the entire document review process — it transforms the first and most expensive phase. Here’s how it fits into a typical workflow:

Early case assessment: Run broad semantic queries to estimate the volume and nature of relevant documents before committing to a full review.
First-pass relevance review: Use semantic search to identify the most likely relevant documents, prioritizing reviewer attention where it matters most.
Privilege review: Search for communications involving legal advice, even when they don’t use standard privilege markers.
Issue coding: Tag documents by legal issue using concept-based searches that capture all relevant terminology variations.
Quality control: Run semantic searches against the “not relevant” set to catch documents that may have been miscoded during manual review.

For teams already using Dodonai for deposition summaries or medical record chronologies, adding eDiscovery search creates a unified workflow where document identification, analysis, and summarization happen in a single platform.

The Cost Equation

Beyond accuracy, semantic search changes the economics of document review. Traditional eDiscovery platforms charge per-gigabyte hosting fees plus per-user licensing that can run tens of thousands of dollars per year. Add the attorney hours spent building and refining keyword queries, and the cost of reviewing false positives, and document review often consumes the largest share of any litigation budget.

Dodonai’s eDiscovery AI starts at $30 per month — a fraction of legacy platform costs. But the bigger savings come from the reduction in review time: fewer false positives to sift through, fewer missed documents requiring supplemental productions, and simpler queries that don’t require eDiscovery specialists to build.

For small and mid-size firms that previously couldn’t afford dedicated eDiscovery platforms, this changes what’s possible. Enterprise-grade search capabilities are no longer gated behind enterprise budgets. For a detailed look at how Dodonai compares to legacy platforms, see our eDiscovery software comparison.

Getting Started

If your firm is still relying on keyword search for document review, semantic search represents the single biggest efficiency gain available in modern eDiscovery. The technology is mature, the results are defensible, and the cost barrier is gone.

Try Dodonai’s eDiscovery AI on your documents — upload a document set and run your first semantic search in minutes, not weeks.