eDiscovery is defined as “the identification, collection, preservation, processing, review, analysis and production of electronically stored information (ESI) to meet the mandates imposed by common-law requirements for discovery.” Historically, we’d do this manually: a lot of lawyers reading a lot of documents, linearly. In the big data era, that’s not much of an option, so most organizations use eDiscovery tools.
Contemporary eDiscovery scenarios frequently involve both structured information (e.g., databases) and unstructured information (e.g., emails). These collections may run to dozens, hundreds, or thousands of gigabytes, all of which must be analyzed even though only a small volume (perhaps just a couple of megabytes) is ultimately produced. It’s a lot like looking for a needle in a stack of needles. Data mining these sources with text and content analytics is therefore key to success, which makes machine learning and predictive analytics software crucial tools in the eDiscovery toolbox.
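To make the "text analytics" idea a little more concrete, here is a minimal sketch of relevance ranking with TF-IDF, a classic term-weighting scheme. It is a stand-in chosen for illustration, not a description of how any particular eDiscovery product works. The function names, the three-document collection, and the query are all hypothetical.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_documents(documents, query):
    """Return (doc_id, score) pairs sorted from most to least relevant.

    documents: dict mapping a document id to its raw text.
    query: free-text search terms supplied by a reviewer.
    """
    tokenized = {doc_id: tokenize(text) for doc_id, text in documents.items()}
    n_docs = len(tokenized)

    # Document frequency: in how many documents does each term appear?
    doc_freq = Counter()
    for tokens in tokenized.values():
        doc_freq.update(set(tokens))

    query_terms = set(tokenize(query))
    scores = {}
    for doc_id, tokens in tokenized.items():
        counts = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if counts[term] == 0:
                continue
            # Term frequency: share of this document's tokens that match.
            tf = counts[term] / len(tokens)
            # Smoothed inverse document frequency: rarer terms weigh more.
            idf = math.log((1 + n_docs) / (1 + doc_freq[term])) + 1.0
            score += tf * idf
        scores[doc_id] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical collection mixing emails and a database row (illustration only).
collection = {
    "email-001": "Lunch on Friday? The new place downtown has great reviews.",
    "email-002": "Please shred the contract drafts before the audit begins.",
    "db-row-17": "invoice 4471 paid in full, no outstanding balance",
}

ranking = rank_documents(collection, "contract audit")
top_id, top_score = ranking[0]
```

Here the only document that mentions either query term lands at the top of the ranking, and the rest score zero. Real collections are far messier, which is where the machine learning and predictive analytics mentioned above come in: reviewers label a sample, and a model learns to prioritize the rest.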