DECODING THE REVIEW DATABASE

Austin Buell
3 min read · Mar 3, 2021

Review databases can be intimidating. Between custom review workflows, an array of deployed bells and whistles, and sometimes just sheer size, it’s easy to lose track of what a review database is at its essence. For the readers who built a “review database” in the Processing reading, I imagine it’s almost impossible to fathom that a review database built in Excel and one built in Relativity are, at their core, the same concept. By stripping back the layers of complexity, we can expose the fundamentals of what is happening and build the context necessary to understand everything layered on top.

The purpose of a review database is to host a document, that document’s information, and the reviewer’s input on that document. To organize this information in a useful manner, databases are composed of fields. These fields can be separated into two buckets: 1) fields containing document data, and 2) fields containing reviewer input. Document data fields provide insight into the who, what, where, and when of the data in the matter. Reviewer input fields capture the why and the how: whether a document is relevant (or not) to the matter, and why. This is the essence of a review database. Everything else you experience is built on top of that foundation.
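To make the two buckets concrete, here is a minimal sketch of a single review database record in Python. Every field name and value below is hypothetical, chosen only to illustrate the split between document data and reviewer input:

```python
# A minimal sketch of a review database as plain records.
# All field names and values are hypothetical, chosen only to
# illustrate the two buckets: document data vs. reviewer input.

documents = [
    {
        # Bucket 1: document data (the who, what, where, and when)
        "doc_id": "DOC-0001",
        "custodian": "J. Smith",            # who
        "file_path": "/mail/inbox/q3.msg",  # where
        "date_sent": "2020-07-14",          # when
        "extracted_text": "Q3 forecast attached for review...",  # what
        # Bucket 2: reviewer input (the why and how it matters)
        "relevance": "Responsive",
        "privilege": "Not Privileged",
        "reviewer_notes": "Discusses the forecast at issue.",
    },
]

# Everything a review platform does starts from records like these.
for doc in documents:
    print(doc["doc_id"], "->", doc["relevance"])
```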

Technological advancements have optimized what reviewers can do and, in the process, added a layer of complexity that makes the database harder to navigate for the untrained eye. However, this hasn’t shifted the essence of the database. All of these complexities start with the software interacting with the two buckets of fields: document data and reviewer input. This applies to simple and complex tools alike.

How folders were populated in Relativity was a mystery to me until a colleague pointed out that there was a field containing the File Path. That was document data. The review database wasn’t performing magic; it was leveraging document data to organize the documents in an optimized fashion.
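As a rough sketch of that folder logic (the documents below are invented, and this is not Relativity’s actual implementation), you can group documents by the folder portion of a File Path field:

```python
from collections import defaultdict
from pathlib import PurePosixPath

# Hypothetical documents; only the file_path field matters here.
documents = [
    {"doc_id": "DOC-0001", "file_path": "/mail/inbox/q3.msg"},
    {"doc_id": "DOC-0002", "file_path": "/mail/inbox/q4.msg"},
    {"doc_id": "DOC-0003", "file_path": "/mail/sent/reply.msg"},
]

# Group documents by the folder portion of their File Path field:
# no magic, just reading document data and organizing by it.
folders = defaultdict(list)
for doc in documents:
    folder = str(PurePosixPath(doc["file_path"]).parent)
    folders[folder].append(doc["doc_id"])

for folder, doc_ids in sorted(folders.items()):
    print(folder, "->", doc_ids)
```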

Likewise, predictive coding felt like voodoo until I got an hour with an expert who explained how the program cross-referenced the reviewer input fields with the extracted text fields to replicate a reviewer’s inputs on documents with similar text patterns. Again, no magic; just leveraging the document data and reviewer input fields (although, in this case, the genius of the developers who create predictive coding tools is pretty magical).
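Here is a toy sketch of that idea, emphatically not any vendor’s actual algorithm: a generic text classifier (TF-IDF plus logistic regression from scikit-learn) trained on reviewer-coded extracted text, then used to score an uncoded document. All data below is invented for illustration:

```python
# A toy illustration of the predictive-coding concept, not any
# vendor's actual algorithm: learn from reviewer input fields paired
# with extracted text, then score uncoded documents with similar text.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Hypothetical training data: extracted text plus reviewer coding.
coded_text = [
    "Q3 forecast attached for your review",
    "forecast numbers look wrong, please revise",
    "lunch on friday?",
    "happy birthday to the whole team",
]
coded_labels = [1, 1, 0, 0]  # 1 = Responsive, 0 = Not Responsive

# Cross-reference the two buckets: document text in, reviewer input out.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(coded_text)
model = LogisticRegression().fit(X, coded_labels)

# Replicate the reviewer's input on an uncoded document.
uncoded = ["revised forecast for Q4 attached"]
score = model.predict_proba(vectorizer.transform(uncoded))[0, 1]
print(f"Probability responsive: {score:.2f}")
```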

Understanding that at the heart of a review database is simply an interaction between software and fields is not going to make the algorithm or script suddenly click, but it does simplify the experience. In its simplest form, a review database is nothing more than an Excel sheet with a set of columns to describe the documents and a set of columns to assign relevance. Everything else is built upon that very simple concept.

Through necessity and ingenuity, that concept has been and continues to be optimized by some pretty damn intelligent folks. But if you find yourself a bit lost as to what is taking place, start at this foundation and recognize that all of these tools are based on an interaction with document data and reviewer inputs.
