Monday, October 15, 2012

Everson On eDiscovery: Understanding Predictive Coding

Author: Eric Everson, MBA, MSIT-SE


(About the Author: Eric Everson is a software engineer turned law scholar.  He is currently a 3L at Florida A&M University College of Law where he serves as the President of the Electronic Discovery Law Student Association.)

The eDiscovery specialty is rapidly becoming one loaded with acronyms and unique terminology. Practitioners toss around terms like TAR (technology-assisted review), CAR (computer-assisted review), and intelligent review, all of which describe a process also known as predictive coding. Don’t let this technical-sounding term scare you; understanding at least the basics of predictive coding can have a major impact on your litigation costs.

First, what is predictive coding? Broadly defined, predictive coding is a machine-learning process in which software takes in a sample of documents to build a training set, then models that data to establish patterns. Those patterns are used to assign each document a prediction score, and comparing the scores against known results lets us gauge the computer’s statistical accuracy. Here is a great way to think of it:

In the midst of discovery between two corporations, 800,000 documents are produced for review. You know you don’t need all 800,000 of them; in fact, there are probably only about 80 files that will truly be valuable as evidence in your matter. Predictive coding is the process that lets you put a computer to the task of accurately narrowing these documents to a more manageable volume. You do not load all 800,000 documents at the outset. Instead, you start by training the software with what are called training sets. These training sets are very important because they are where we get the accuracy percentage that we will use to narrow the volume of documents. Once your training sets are established, you decide which percentage (also known as the prediction score) will get you closer to the 80 documents you want from the mountain of 800,000. A good starting point might be a prediction score of 75%; in this example, that might narrow your 800,000 documents to 8,000, and from there you can retrain the software to narrow even further as you work your way toward the 80 documents you need.
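For readers who like to see the moving parts, here is a minimal sketch of the train, score, and narrow steps described above. It is only an illustration: it assumes a simple Naive Bayes word-count model, which is a stand-in and not necessarily what any commercial predictive coding product uses. The function names (`train`, `prediction_score`, `narrow`) and the sample documents are hypothetical; only the 75% cutoff comes from the example above.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace; real tools use far richer features."""
    return text.lower().split()


def train(training_set):
    """Build word counts from (text, is_relevant) pairs coded by a human reviewer."""
    word_counts = {True: Counter(), False: Counter()}
    doc_counts = Counter()
    for text, is_relevant in training_set:
        doc_counts[is_relevant] += 1
        word_counts[is_relevant].update(tokenize(text))
    if not doc_counts[True] or not doc_counts[False]:
        raise ValueError("training set needs both relevant and non-relevant examples")
    vocabulary = set(word_counts[True]) | set(word_counts[False])
    return word_counts, doc_counts, vocabulary


def prediction_score(model, text):
    """Estimate the probability (0.0 to 1.0) that a document is relevant."""
    word_counts, doc_counts, vocabulary = model
    total_docs = sum(doc_counts.values())
    log_prob = {}
    for label in (True, False):
        total_words = sum(word_counts[label].values())
        log_p = math.log(doc_counts[label] / total_docs)
        for word in tokenize(text):
            if word in vocabulary:  # ignore words never seen in training
                # Add-one smoothing keeps unseen word/label pairs from zeroing the score.
                log_p += math.log((word_counts[label][word] + 1)
                                  / (total_words + len(vocabulary)))
        log_prob[label] = log_p
    # Convert the two log-probabilities into a normalized score.
    peak = max(log_prob.values())
    relevant = math.exp(log_prob[True] - peak)
    not_relevant = math.exp(log_prob[False] - peak)
    return relevant / (relevant + not_relevant)


def narrow(documents, model, threshold=0.75):
    """Keep only documents whose prediction score meets the threshold."""
    return [doc for doc in documents if prediction_score(model, doc) >= threshold]


# Hypothetical training set: a reviewer has coded four sample documents.
training_set = [
    ("merger pricing agreement draft", True),
    ("pricing terms for the merger", True),
    ("lunch menu for friday", False),
    ("office party friday parking", False),
]
model = train(training_set)

# Hypothetical unreviewed documents standing in for the 800,000.
unreviewed = [
    "revised merger pricing",
    "friday lunch parking",
    "merger agreement terms",
]
shortlist = narrow(unreviewed, model, threshold=0.75)
print(shortlist)
```

In this toy run, the two merger-related documents clear the 75% cutoff and the lunch document does not. In the example above, each round of narrowing keeps roughly 1% of what went in (800,000 to 8,000, then 8,000 to 80); in practice the fraction kept depends on the data and on the cutoff you choose.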

Next, why do you care about predictive coding? The bottom line is drawn from that age-old adage, “time is money.” Why would you spend time (or money) on having lawyers or paralegals work their way through a mountain of 800,000 documents? With predictive coding software, we divert these resources from this part of the discovery process and put the computer to the task. A single successful lawsuit that utilizes predictive coding software will pay for the software many times over. It is an investment that law firms cannot afford to miss in an environment where the volume of digital data is predicted (pun intended) to multiply 50 times over the next ten years.

Finally, when do you need predictive coding software? Predictive coding software is a tool that can be extremely valuable under the right circumstances, but it is not always necessary in a lawsuit. There may not be a huge number of files subject to the litigation, or a keyword search may be sufficient given the context of discovery. What matters is knowing that the tool is available when the right circumstances arise.

In conclusion, predictive coding offers a technology-based approach to reducing the costs associated with complex litigation. We are at a time in history when terms like cloud computing and big data are increasingly used. Additionally, given new forms of content generation (email, text, images, video, tweets, likes, +1’s, shares, etc.), the volume of data is exploding. Learning to use tools like predictive coding software today will prepare you to litigate in the face of big data tomorrow.

#eDiscovery

Are you on Twitter?  Follow me @IntleDiscovery

---------------------------------------------------------------------------------

About the Author: Eric Everson is a 3L law student at Florida A&M University – College of Law, where he will graduate in May 2013. Prior to law school he earned an MBA and a master’s degree in software engineering while working for ten years in executive leadership within the U.S. telecommunications industry. The views and opinions presented in this blog are his own and are not to be construed as legal advice. Eric Everson currently serves on the Board of Governors for The Florida Bar Young Lawyers Division Law Student Division and is the President of the Electronic Discovery Law Student Association at Florida A&M University – College of Law. Follow @IntleDiscovery
