Intro to Predictive Coding: Overview & Interpretation of Terminology June 2014

An evaluation following predictive coding can be conducted in several ways.

a. After the system has categorized the documents, review can continue on newly generated random samples of documents. That is, the same expert continues to evaluate random samples until the sample reaches a size the parties agree is adequate. The system’s efficacy on this sample is taken as a measure of its performance.

b. A separate random sample of the documents that the predictive coding system designated as non-responsive can be evaluated to compute the Elusion measure. Elusion is the proportion of documents classified as putatively non-responsive that should have been classified as responsive. Ideally, only a small proportion of the documents in the putatively non-responsive set will be found to be responsive. In practice, the proportion of responsive documents in the putatively non-responsive set should be only a small fraction of the prevalence of responsive documents in the collection as a whole. Elusion therefore needs to be compared to the original estimate of responsive document prevalence. The size of this sample will depend on the required confidence level and confidence interval.
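The Elusion calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sample size uses the common normal-approximation formula for a proportion (which the text does not prescribe), the document population is treated as effectively unlimited, and the counts in the usage lines are made-up numbers, not results from any review.

```python
import math
from statistics import NormalDist


def elusion(responsive_found: int, sample_size: int) -> float:
    """Proportion of a random sample drawn from the putatively
    non-responsive set that reviewers judged to be responsive."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    if not 0 <= responsive_found <= sample_size:
        raise ValueError("responsive_found must be between 0 and sample_size")
    return responsive_found / sample_size


def elusion_to_prevalence_ratio(elusion_rate: float, prevalence: float) -> float:
    """Elusion expressed as a fraction of the original prevalence estimate.
    A value well below 1 means few responsive documents were left behind."""
    if prevalence <= 0:
        raise ValueError("prevalence must be positive")
    return elusion_rate / prevalence


def sample_size_for_proportion(confidence_level: float,
                               margin_of_error: float,
                               expected_proportion: float = 0.5) -> int:
    """Assumption: normal-approximation sample size for estimating a
    proportion, n = z^2 * p * (1 - p) / e^2, with no finite-population
    correction. p = 0.5 is the most conservative choice."""
    if not 0 < confidence_level < 1:
        raise ValueError("confidence_level must be between 0 and 1")
    if margin_of_error <= 0:
        raise ValueError("margin_of_error must be positive")
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    p = expected_proportion
    return math.ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


# Illustrative numbers only (hypothetical review):
n = sample_size_for_proportion(confidence_level=0.95, margin_of_error=0.05)
rate = elusion(responsive_found=5, sample_size=500)
ratio = elusion_to_prevalence_ratio(rate, prevalence=0.10)
print(f"sample size: {n}, elusion: {rate:.3f}, elusion/prevalence: {ratio:.2f}")
```

The ratio is reported alongside the raw Elusion rate because, as noted above, Elusion is only meaningful relative to the original prevalence estimate: the same 1% Elusion would be reassuring at 10% prevalence and uninformative at 1% prevalence.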
