How eDiscovery Technology Can Mitigate Challenges of Investigations

Some of the challenges of searching through data for an investigation resemble those faced during the discovery stage of litigation in the United States: the need to defensibly preserve information and to search through it effectively for relevant facts, for instance. However, there are key differences between the information needs of discovery and those of a digital investigation.

In discovery, the goal is to find every relevant piece of information that can reasonably be found. The point is to be exhaustive: finding the evidence is the purpose of the exercise. To achieve that objective, legal teams have an indeterminate, and generally lengthy, timeline. Discovery can and often does span years, during which the costs keep piling up.

The goal of an investigation, on the other hand, is either to find evidence in support of an allegation or to conclude that there’s nothing there to find and that the allegation was false or mistaken. To that end, investigation teams should conduct a reasonable and diligent search for evidence of suspected malfeasance or noncompliant behavior, but they need not turn over every conceivable stone. The less exhaustive nature of digital investigations means the timelines are significantly shorter: an investigation often seeks to reach at least a preliminary conclusion in a matter of weeks.

Despite their differences, though, investigations have enough in common with discovery processes that they can benefit from the technologies developed for eDiscovery. The goals may differ, but the tools need not. After all, both eDiscovery and investigations demand the ability to rapidly and accurately sift through reams of data, discarding unhelpful or irrelevant data sources, surfacing important facts, and identifying hidden patterns. The tools developed for eDiscovery are therefore highly applicable to investigations.

Download the report to discover what 184 legal practitioners from law firms, corporations, and governments have to say about how AI is deployed across legal discovery.

eDiscovery tools fall into four broad categories: 

  1. Automation tools such as deNISTing, deduplication, data processing, email threading, and optical character recognition. These techniques can quickly eliminate extraneous data files, organize messages into related threads, and allow investigation teams to focus on the data that might help their search rather than getting bogged down in duplicate files or missing clues in image files that haven’t been converted to searchable text. 
  2. Context tools such as entity search, basic entity extraction, foreign language extraction, language translation, and dark language detection. These basic analytics capabilities can quickly detect concepts such as persons, places, and things and can then group like concepts together. For global organizations where employees use multiple languages, language extraction and translation tools are crucial for conducting efficient and rapid investigations that incorporate data sources written in other languages. These capabilities are founded in part on natural language processing (NLP), a branch of artificial intelligence (AI). Dark language detection—which can unearth code words—is particularly useful in the context of investigations. 
  3. Proactive intelligence tools such as technology-assisted review (TAR), topic modeling, concept clustering, and document classification. These approaches use AI—both the older predictive coding models and the newer continuous active learning approaches—to group documents into related sets and determine which are most likely relevant or helpful. 
  4. Emerging intelligence approaches such as relationship analysis, auto-detection and auto-redaction of sensitive entities, network analysis, sentiment analysis, and anomaly detection. These advanced analytics tools represent the next frontier for legal technology, and they’re particularly helpful for investigation teams, as they can quickly uncover relationships between parties—even when those relationships are carefully concealed—based on communication patterns. Auto-detection and auto-redaction of sensitive entities also help organizations protect private data when an investigation involves cross-border data transfers and the accompanying data privacy considerations. 
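As a rough illustration of the automation category, deNISTing and deduplication can both be approximated by comparing file hashes. The sketch below is a minimal example under stated assumptions, not how any particular eDiscovery product works: the `files` mapping, the `known_system_hashes` set, and the function names are hypothetical. In practice, a deNIST list is derived from reference data published by NIST rather than built by hand.

```python
import hashlib


def sha256_of(content: bytes) -> str:
    """Return the SHA-256 hex digest of a file's raw bytes."""
    return hashlib.sha256(content).hexdigest()


def cull(files: dict[str, bytes], known_system_hashes: set[str]) -> list[str]:
    """Return the names of files that survive deNISTing and deduplication.

    files: hypothetical mapping of file name -> raw bytes.
    known_system_hashes: hypothetical set of digests for known system files.
    """
    seen: set[str] = set()
    kept: list[str] = []
    for name, content in files.items():
        digest = sha256_of(content)
        if digest in known_system_hashes:
            continue  # deNIST: drop known system or application files
        if digest in seen:
            continue  # deduplicate: an identical file was already kept
        seen.add(digest)
        kept.append(name)
    return kept
```

Because exact hashing only catches byte-for-byte duplicates, near-duplicates (for example, the same memo saved in two formats) would need a different technique.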

According to our 2021 State of AI and Technology Adoption in Legal Discovery report, investigation teams have embraced many, but not all, of these tools. For example, a large majority of investigation teams are using TAR to quickly review datasets. About 75 to 80% are using entity search to identify concepts such as persons or locations without knowing specific names or keywords for those concepts.

But other tools that would be enormously helpful in the context of investigations are underutilized. Not even half of investigation teams are using anomaly detection to recognize unusual patterns of behavior; fewer than a third are using dark language detection to highlight potential code words or suspect behavior. 

To better understand the applicability of these technologies to investigations, let’s look at a few specific use cases. 

Use Case Examples and Practical Tips 

Early Case Assessments (ECA) 

In a proactive ECA investigation, an organization seeks to quickly establish the key facts of a potential litigation matter and predict how it will ultimately play out. How long will the underlying matter take to work its way through the court system? What is its outcome likely to be? How much will it cost to reach that probable conclusion? With these answers, legal teams can make a well-informed decision about how to proceed. Often, settling a case early is the most cost-effective, and least damaging, way to manage it.

But if an organization’s ECA process isn’t designed to handle the rising volume of business data, the costs of ECA can spiral out of control, negating its potential cost savings. Fortunately, eDiscovery techniques using AI can quickly cut through excessive data to highlight the key facts of a matter and pinpoint helpful custodians, which helps the legal team decide on a reasonable strategy for handling the matter.

To make the most of ECA, leverage tools such as: 

  • Relationship and network analysis to examine the timing, frequency, and intensity of communications and activities between different individuals, and therefore to discern the relationships among them. These analytics tools can uncover additional custodians or parties of interest and rapidly bring a fuzzy allegation into clear focus.
  • Concept clustering to identify related concepts within or across different documents and data points and to uncover new areas for investigation and analysis.
  • TAR to quickly focus on highly relevant documents and gain a sense of the overall strength or weakness of a case—and to just as quickly disregard documents that are not helpful.
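To make the relationship-analysis idea concrete, the following is a minimal sketch that looks only at communication frequency: it counts how often each pair of people exchanges messages and ranks the busiest pairs. It assumes the message metadata has already been reduced to hypothetical `(sender, recipient)` tuples. Real tools also weigh timing and intensity, which this example ignores.

```python
from collections import Counter


def pair_counts(messages):
    """Count messages exchanged between each unordered pair of people.

    messages: hypothetical list of (sender, recipient) tuples.
    """
    counts = Counter()
    for sender, recipient in messages:
        if sender == recipient:
            continue  # ignore notes-to-self
        # Sort the pair so A->B and B->A count toward the same relationship.
        counts[tuple(sorted((sender, recipient)))] += 1
    return counts


def top_pairs(messages, n=3):
    """Return the n most frequently communicating pairs with their counts."""
    return pair_counts(messages).most_common(n)
```

A pair that ranks high here but does not appear on the initial custodian list is one possible signal of an additional party of interest.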

Internal Investigations 

When an organization learns of potential employee misconduct or internal noncompliance—whether that concern is reported by an employee, pointed out by a regulatory agency, or detected during a standard audit—it must act quickly. The investigation team’s first priority is to determine whether that complaint is founded so that it can take prompt remedial action or whether it is a false alarm that it can just as swiftly close the book on. In those cases where an allegation is founded, the organization must also prepare for the possibility of legal action or a regulatory agency’s intervention. 

Internal investigations run the gamut from accounting fraud to sexual harassment. Helpful technologies for internal investigations include these: 

  • Entity search and basic entity extraction to identify key concepts—be they people, activities, places, or events—and group those concepts together. These tools can give an investigation team a quick sense of who the major players are and what happened.
  • Dark language detection to identify code words or phrases that may indicate malfeasance. These tools can quickly and accurately pinpoint potential areas of concern so the investigation team doesn’t waste time looking for suspicious conversations and can instead focus on the most likely suspects first.
  • Anomaly detection to notice those moments when a pattern—be it a pattern of communication, financial transfers, or work hours—has changed, potentially indicating misconduct. 
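One simple way to picture anomaly detection is a rolling z-score: compare each day's activity with the average of the days just before it and flag large deviations. The sketch below is only an illustration under assumptions. The `daily_counts` input, the five-day baseline, and the threshold of three standard deviations are all hypothetical choices, and commercial tools are likely to use more sophisticated models.

```python
from statistics import mean, stdev


def flag_anomalies(daily_counts, baseline_days=5, threshold=3.0):
    """Return indexes of days that deviate sharply from the preceding window.

    daily_counts: hypothetical list of per-day activity counts
    (messages sent, transfers made, hours logged, and so on).
    """
    flagged = []
    for i in range(baseline_days, len(daily_counts)):
        window = daily_counts[i - baseline_days:i]
        mu = mean(window)
        sigma = stdev(window)
        if sigma == 0:
            # A perfectly flat baseline: any change at all is a deviation.
            if daily_counts[i] != mu:
                flagged.append(i)
            continue
        if abs(daily_counts[i] - mu) / sigma > threshold:
            flagged.append(i)
    return flagged
```

A flagged day is a prompt for human review rather than proof of misconduct, since holidays, deadlines, and other routine events also change patterns.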

Government Investigations 

Regulatory agency investigations can take several forms: investigations prompted by a whistleblower complaint, regulatory inquiries, subpoenas, second requests in proposed mergers and acquisitions, and more general government oversight of corporate compliance. For all of these investigations, time is of the essence, and cooperation is critical. 

Helpful technologies for government investigations include the following: 

  • Data processing tools like deduplication and hashing to eliminate duplicate files and irrelevant, unhelpful data sets even in the face of limited information about what exactly the agency is investigating.
  • TAR to gain, within the tight timeline of a regulatory investigation, a broad sense of what the government’s search parameters may reveal about its investigation.
  • Foreign language extraction and translation to ensure that nothing is missed—even when it’s written in a different language. 

Unlock the Power of eDiscovery Technology for Efficient, Effective Investigations 

Chances are your organization already has access to many of these tools and technologies as part of your eDiscovery toolkit. If you’re not leveraging those tools in your investigations, or if the solutions you’re using are too unwieldy or difficult to use at the rapid-fire pace of an investigation, now is the time to unlock the power of intuitive, accessible eDiscovery technology.

Investigations of all kinds are likely to rise as employees return to the office. Will you be ready?