Load Files to Document Productions

December 08, 2021 by Will Newman

When a litigant produces documents in discovery, they often do not just produce the documents themselves. They usually produce a separate document, called a load file, that contains metadata about the documents, along with information that special software can use to load the document production.

Why should you continue to read this post about load files?

You read this blog’s previous posts on document productions and you’re now really running out of web pages to read.
You have a bunch of blog posts to read, but are unsure when one post ends and another begins, or when the posts were made, or whether any post is related to any other post.
You’re a software developer ready to get into the gazillion dollar legal software game but unaware of how current legal software implements legal strategy.

Litigants Do Not Like to Produce Normal Computer Files

There are two major reasons why litigants do not produce normal computer files in their document productions.

The first is that normal computer files contain metadata that the producing party may not want to share. For example, most computer files contain information about when they were created, who created them, and when they were last modified. Many word processing files may also contain “track changes” notations about how the document was edited and comments from people who have reviewed the document. By sharing the document in its normal form, a lawyer runs the risk of sharing more information than what appears on the face of the document.

The second is that normal computer files do not contain individual Bates numbers on each of their pages. Bates numbers allow litigants to quickly refer to specific pages of their document productions in depositions, communications between parties, and in motions to the court.

Document Productions Often Produce Snapshots of Files Instead

Instead of producing copies of original files, attorneys often produce TIFF images of individual pages of files. TIFF is traditionally a format for pictures, so it may seem strange to use it for documents that consist largely of text. Page images also take up much more disk space than the original files and are harder to browse.

Lawyers do this because TIFF images do not contain the same metadata as the original file. Previous drafts and hidden information like the date a file was created are not captured in a TIFF image. Plus, because each page is its own image, it is easier for a computer to stamp a Bates number on every page.

Load Files Make Working with TIFF Images Easier

To help lawyers work with hundreds of thousands of TIFF images, various companies produce eDiscovery software that ingests the images and allows lawyers to browse them. Some of these programs are stand-alone software, but most work through websites.

To function, the software takes in the TIFF images, and also a load file, which is just a text file that tells the software what to do with the TIFF images. I touch on this subject in a previous post.

First, it tells the software which images together form a single document (for example, Bates numbers 1-25 are a single document, then numbers 26-28 are the next document, etc.). It also tells the software which documents are “family members” of other documents, by saying, for example, that Bates numbers 26-28 are an attachment to an email found at Bates numbers 1-25.
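For the software developers in the audience, here is a minimal sketch of that grouping step in Python. The file layout is an assumption I made up for illustration (one row per page with a Bates number, an image path, a "Y" on the first page of each document, and the parent's first Bates number for attachments). Real load files vary by eDiscovery product, and the names `SAMPLE_IMAGE_LOAD_FILE` and `group_pages` are hypothetical.

```python
import csv
import io

# Hypothetical, simplified load file: one row per page image.
# Assumed columns: Bates number, image path, "Y" if the page starts a new
# document, and (on that first page only) the parent's first Bates number.
SAMPLE_IMAGE_LOAD_FILE = """\
ABC000001,IMAGES/ABC000001.tif,Y,
ABC000002,IMAGES/ABC000002.tif,,
ABC000003,IMAGES/ABC000003.tif,,
ABC000004,IMAGES/ABC000004.tif,Y,ABC000001
ABC000005,IMAGES/ABC000005.tif,,
"""


def group_pages(load_file_text):
    """Group page rows into documents and record each document's parent."""
    documents = []
    for row in csv.reader(io.StringIO(load_file_text)):
        if not row:
            continue  # ignore blank lines
        bates, image_path, doc_break, parent = row
        if doc_break == "Y" or not documents:
            # A document break starts a new document at this page.
            documents.append(
                {"begin": bates, "end": bates, "parent": parent or None, "pages": []}
            )
        documents[-1]["pages"].append(image_path)
        documents[-1]["end"] = bates  # the last page seen so far ends the range
    return documents


documents = group_pages(SAMPLE_IMAGE_LOAD_FILE)
for doc in documents:
    relation = f"attachment to {doc['parent']}" if doc["parent"] else "parent document"
    print(f"{doc['begin']}-{doc['end']}: {len(doc['pages'])} pages, {relation}")
```

In this sample, the first three pages form an email and the next two form its attachment, which is the same family relationship described above on a smaller scale.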

Next, the load file may provide specific metadata that the document producer agrees to share. So the file may say that the document at Bates numbers 1-25 was created on January 2, 2019, and its author was Jane Perez. This allows the reader to learn about the document quickly and gives the producer greater control over what information she wants to share. Load files come in different formats, depending on the eDiscovery software they are meant for, but since they are small and easy to make, it is common for document producers to produce three or four different load files to accommodate other litigants’ needs.
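To make the metadata idea concrete, here is a rough Python sketch that reads a small metadata load file. The pipe-delimited layout and the field names (`BegBates`, `EndBates`, `DateCreated`, `Author`) are assumptions for illustration only, not the format of any particular eDiscovery product, and `read_metadata` is a hypothetical helper.

```python
import csv
import io
from datetime import datetime

# Hypothetical metadata load file: a header row, then one row per document.
# The delimiter and field names are assumptions made for this sketch.
SAMPLE_METADATA_LOAD_FILE = """\
BegBates|EndBates|DateCreated|Author
ABC000001|ABC000025|2019-01-02|Jane Perez
ABC000026|ABC000028||
"""


def read_metadata(load_file_text):
    """Return a dict keyed by each document's first Bates number."""
    records = {}
    reader = csv.DictReader(io.StringIO(load_file_text), delimiter="|")
    for row in reader:
        created = row["DateCreated"]
        records[row["BegBates"]] = {
            "end_bates": row["EndBates"],
            # Blank fields mean the producer chose not to share that value.
            "created": datetime.strptime(created, "%Y-%m-%d").date() if created else None,
            "author": row["Author"] or None,
        }
    return records


metadata = read_metadata(SAMPLE_METADATA_LOAD_FILE)
print(metadata["ABC000001"])
```

The second row leaves the date and author blank, which is how a producer in this sketch would withhold metadata for a document while still identifying its Bates range.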

Also, the load file may direct eDiscovery software to a “native” file that could not be converted to TIFF images. In that case, the load file may say that the document whose Bates number is 35 can be found in another directory with the filename Spreadsheet.xlsx.
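Here is a small sketch of how software might act on that pointer. The `NativePath` field, the `IMAGES` and `NATIVES` directory names, and the `locate_file_to_open` helper are all hypothetical. The idea is simply that a record with a native path sends the reviewer to the original file, while every other record falls back to its TIFF image.

```python
from pathlib import PurePosixPath

# Hypothetical load file records; the field names are assumptions.
records = [
    {"BegBates": "ABC000034", "NativePath": ""},
    {"BegBates": "ABC000035", "NativePath": "NATIVES/Spreadsheet.xlsx"},
]


def locate_file_to_open(record, production_root):
    """Prefer the native file when the load file points to one."""
    root = PurePosixPath(production_root)
    if record["NativePath"]:
        return root / record["NativePath"]
    # Otherwise assume the first page image is named after the Bates number.
    return root / "IMAGES" / f"{record['BegBates']}.tif"


for record in records:
    print(record["BegBates"], "->", locate_file_to_open(record, "/productions/VOL001"))
```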

Document Productions Also May Come with OCR Text Files

In addition to a load file, a document production may come with a set of text files that contain searchable text. This can be helpful because TIFF images are not text searchable on their own; each one is just a picture of a page, not the text itself. Document producers may accompany the images with text files so that the documents can be searched, and the load file can help eDiscovery software connect each text file with its images.
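As a rough illustration, the sketch below searches a production by way of its text files. It assumes the load file has already been read into a mapping from each document's first Bates number to the path of its text file. That mapping, the `TEXT` directory, and the `search_production` helper are hypothetical, and the sample text is invented so the example can run on its own.

```python
import tempfile
from pathlib import Path


def search_production(root, text_paths, term):
    """Return the Bates numbers of documents whose text contains the term."""
    hits = []
    for beg_bates, relative_path in text_paths.items():
        text = (root / relative_path).read_text(encoding="utf-8")
        if term.lower() in text.lower():  # simple case-insensitive match
            hits.append(beg_bates)
    return hits


# Build a tiny stand-in production in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "TEXT").mkdir()
    (root / "TEXT" / "ABC000001.txt").write_text(
        "Please see the attached lease agreement.", encoding="utf-8"
    )
    (root / "TEXT" / "ABC000026.txt").write_text(
        "Quarterly budget summary.", encoding="utf-8"
    )
    # Mapping the load file would supply: first Bates number -> text file.
    text_paths = {
        "ABC000001": "TEXT/ABC000001.txt",
        "ABC000026": "TEXT/ABC000026.txt",
    }
    hits = search_production(root, text_paths, "lease")
    print(hits)
```

Real review software builds a search index rather than rereading every file for each query, but the link between Bates number and text file works the same way.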

But even if a production does not come with OCR text files, most eDiscovery software can run optical character recognition (OCR) on the TIFF images and generate its own searchable text.
