Although eDiscovery is said to have been ‘invented’ in 1999, it is now an integral part of the legal process. Even a slip-up over a detail could cost you your case or weaken your position when you’re negotiating a settlement. This is why you can’t cut corners when it comes to eDiscovery.
For many years, lawyers argued that the only way not to cut corners in the discovery process was to carry out a manual review of the documents. It was expensive, sure, but not as costly as messing up. But as gigabytes became terabytes became petabytes, and data sets of millions of emails were no longer remarkable, the costs of eDiscovery rose so steeply that they were deterring businesses from pursuing their interests in court.
The impregnability of manual review collapsed under the weight of its cost and of research showing that machines were at least as skillful as humans at searching electronic documents. In 2012, a New York District Court judge came to the aid of CFOs worldwide when he brought ‘computer-assisted review’ into U.S. jurisprudence. The judge ruled – and we paraphrase – that computerized searches do not cut corners, and that their results are defensible in a court of law.
Technology Assisted Review, as we shall call it here, is known by many different names, which is not helpful. Both IT and Legal are unclear whether the ever-expanding terminology reflects a substantive difference in approach. Vendors use ‘algorithm’ the way magicians use ‘abracadabra’. Throw in buzzwords such as Deep Learning and Artificial Intelligence, and the confusion is complete. Businesses want to be better at eDiscovery and bring costs under control, but if they are not sure what they’re getting (or what they should be getting), the experience becomes frustrating.
The ROI of Technology Assisted Review (or TAR) has never been in doubt, and as the technology advances, businesses are getting more bang for their buck. The state of the art at the time of Judge Andrew Peck’s landmark ruling is now known as TAR 1.0. It saved time and money, but it was clunky: if you added a new batch of documents to the data set, you had to restart the search from scratch. A technique called Continuous Active Learning (or TAR 2.0) addresses this issue and delivers results that are more stable. Most eDiscovery solutions are based on TAR 2.0 technologies.
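To make the difference concrete, here is a toy sketch of a continuous review loop in Python. It is not any vendor's implementation: the class `ContinuousReview`, its word-count scoring, and the sample documents are all hypothetical, chosen only to show that the model keeps what it has learned when new documents arrive, so nothing has to be restarted.

```python
from collections import Counter
from typing import Callable, Dict, List


class ContinuousReview:
    """Toy continuous-learning review loop (illustrative assumption only)."""

    def __init__(self, seed_terms: List[str]) -> None:
        # Hypothetical model: one integer weight per word, seeded by the legal team.
        self.weights: Counter = Counter({term: 1 for term in seed_terms})
        self.unreviewed: Dict[str, str] = {}
        self.responsive: List[str] = []

    def add_documents(self, docs: Dict[str, str]) -> None:
        # New documents simply join the pool; learned weights are kept.
        self.unreviewed.update(docs)

    def _score(self, text: str) -> int:
        return sum(self.weights[word] for word in set(text.lower().split()))

    def review_batch(self, reviewer: Callable[[str], bool], batch_size: int = 2) -> int:
        # Rank by current score (ties broken by document id), review the top batch,
        # and feed every human decision straight back into the weights.
        ranked = sorted(
            self.unreviewed,
            key=lambda doc_id: (-self._score(self.unreviewed[doc_id]), doc_id),
        )
        found = 0
        for doc_id in ranked[:batch_size]:
            text = self.unreviewed.pop(doc_id)
            is_responsive = reviewer(text)
            for word in set(text.lower().split()):
                self.weights[word] += 1 if is_responsive else -1
            if is_responsive:
                self.responsive.append(doc_id)
                found += 1
        return found


# Stand-in for the human reviewer's judgment (hypothetical rule).
def reviewer(text: str) -> bool:
    return "merger" in text or "pricing" in text


review = ContinuousReview(seed_terms=["merger"])
review.add_documents({"d1": "merger pricing agreement", "d2": "lunch menu friday"})
review.review_batch(reviewer)

# A new batch arrives mid-review: no restart, the learned weights carry over.
review.add_documents({"d3": "pricing agreement draft", "d4": "friday lunch plans"})
review.review_batch(reviewer, batch_size=1)

print(review.responsive)        # ['d1', 'd3']
print(list(review.unreviewed))  # ['d4']
```

In the second round the loop surfaces d3 first because the words it shares with d1 were already weighted up by the earlier review, while d4 sinks on the strength of the earlier non-responsive decision. A real system would use a proper classifier and a defensible stopping rule; the point here is only the carry-over of learning.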
One of the disadvantages of TAR 2.0 is that it involves a lot of unnecessary manual review of responsive documents. Unnecessary, because above a certain score threshold the document is responsive in almost all cases, so the reviewer is wasting his or her time confirming it. A new technique called Sampled Labeling identifies these sufficiently responsive documents and adds them to the training set; the reviewer does not have to rubber-stamp them. Sampled Labeling is very effective: it reduces manual labeling by anything between 10% and 95% compared with TAR 2.0.
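The threshold idea can be sketched in a few lines of Python. This is a simplified illustration, not the actual Sampled Labeling algorithm: it assumes the model assigns each document a responsiveness score between 0 and 1, and the function name `split_by_threshold`, the scores, and the 0.9 cut-off are all made up for the example.

```python
from typing import Dict, List, Tuple


def split_by_threshold(
    scores: Dict[str, float], threshold: float
) -> Tuple[List[str], List[str]]:
    """Separate documents the model is confident about from those needing a human.

    Assumption: `scores` maps a document id to a model responsiveness score in [0, 1].
    """
    auto_labeled = sorted(doc for doc, s in scores.items() if s >= threshold)
    needs_review = sorted(doc for doc, s in scores.items() if s < threshold)
    return auto_labeled, needs_review


# Hypothetical scores for four documents.
scores = {"doc-a": 0.98, "doc-b": 0.91, "doc-c": 0.55, "doc-d": 0.12}
auto, manual = split_by_threshold(scores, threshold=0.9)

# Share of documents the reviewer no longer has to rubber-stamp.
saved = len(auto) / len(scores)

print(auto)    # ['doc-a', 'doc-b']
print(manual)  # ['doc-c', 'doc-d']
print(saved)   # 0.5
```

In this made-up case half the documents skip manual review. How much is saved in practice depends on where the threshold can safely be set, which is consistent with the wide 10% to 95% range quoted above.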
Remember, you pay for eDiscovery by the gigabyte just as you pay for apples by the pound, so any technique that shrinks the pile of documents for manual review improves the ROI of your eDiscovery solution. Not only that: the faster you can make an assessment of the relevant data, the easier it is to enforce the kind of settlement that is to your advantage.