Bulk Downloads

Our bulk data files contain the same information that is available via our API, but are much faster to download if you want to interact with a large number of cases. Each file contains all of the cases from a single jurisdiction or reporter.

Requesting Access

Bulk data files for our whitelisted jurisdictions (currently Illinois and Arkansas) are available to everyone without a login.

Bulk data files for the remaining jurisdictions are available to research scholars who sign a research agreement. You can request a research agreement by creating an account and then visiting your account page.

See our About page for details on our data access restrictions.

Downloading

You can download bulk data manually from our website, or use the API if you are fetching many files at once.

To download all cases via the API, pass the body_format and filter_type parameters to the /bulk/ endpoint to select the files for all cases, sorted by jurisdiction, in your desired body format.
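As a rough sketch of how such a request could be assembled, the snippet below builds a /bulk/ query URL from the two parameters named above. The API root shown is a hypothetical placeholder, the helper name bulk_listing_url is invented for illustration, and the filter_type value "jurisdiction" is an assumption based on the jurisdiction and reporter groupings described on this page. The format of the response is not covered here, so the sketch stops at building the URL.

```python
from urllib.parse import urlencode

# Hypothetical placeholder: substitute the real API root you are using.
API_ROOT = "https://api.example.org/v1"


def bulk_listing_url(body_format, filter_type, api_root=API_ROOT):
    """Build a /bulk/ query URL for one body format and one grouping.

    The accepted filter_type values are an assumption; only the parameter
    names come from the documentation.
    """
    if body_format not in ("text", "xml"):
        raise ValueError("body_format must be 'text' or 'xml'")
    query = urlencode({"body_format": body_format, "filter_type": filter_type})
    return f"{api_root}/bulk/?{query}"


print(bulk_listing_url("text", "jurisdiction"))
```

Keeping the API root as a parameter makes it easy to point the same helper at a different host once the real address is known.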

API Equivalence

Each file that we offer for download is equivalent to a particular query to our API. For example, the file "Illinois-20180829-text.zip" contains all cases that would be returned by an API query with full_case=true&jurisdiction=ill&body_format=text. We offer files for each possible jurisdiction value and each possible reporter value, combined with body_format=text and body_format=xml.
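If you are handling many downloaded files, it can help to recover the query-related parts from each file name. The sketch below assumes every file follows the same pattern as the single example above (a display name, an eight-digit date, and a body format), which this page does not explicitly guarantee. The helper name parse_bulk_filename is invented for illustration, and reading the eight digits as a date is also an assumption.

```python
import re
from datetime import date

# Pattern inferred from the one example "Illinois-20180829-text.zip";
# treating it as the general naming scheme is an assumption.
_BULK_NAME = re.compile(
    r"^(?P<name>.+)-(?P<date>\d{8})-(?P<body_format>text|xml)\.zip$"
)


def parse_bulk_filename(filename):
    """Split a bulk file name into display name, date, and body format."""
    match = _BULK_NAME.match(filename)
    if match is None:
        raise ValueError(f"unrecognized bulk file name: {filename!r}")
    raw = match["date"]
    return {
        "name": match["name"],
        "date": date(int(raw[:4]), int(raw[4:6]), int(raw[6:])),
        "body_format": match["body_format"],
    }


print(parse_bulk_filename("Illinois-20180829-text.zip"))
```

Note that the display name in the file name ("Illinois") is not the same string as the jurisdiction value used in the API query ("ill"), so this sketch does not attempt to map one to the other.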

The JSON objects returned by the API and in bulk files differ only in that bulk JSON objects do not include "url" fields, which can be reconstructed from object IDs.

Data Format

Bulk data files are provided as zipped directories. Each directory is in BagIt format, with a layout like this:

  • Illinois-20180829-text/
    • bag-info.txt
    • bagit.txt
    • manifest-sha512.txt
    • data/
      • data.jsonl.xz

Because the zip file provides no additional compression, we recommend uncompressing it for convenience and keeping the uncompressed directory on disk.
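Once the directory is unzipped, the manifest-sha512.txt file can be used to check that the download is intact. The following is a minimal sketch, assuming the manifest uses the standard BagIt line format of a hex digest, whitespace, and a path relative to the bag directory. The function name verify_bag is invented for illustration, and dedicated BagIt tools perform more thorough validation than this.

```python
import hashlib
from pathlib import Path


def verify_bag(bag_dir, manifest_name="manifest-sha512.txt"):
    """Return the payload paths whose SHA-512 digest differs from the manifest."""
    bag_dir = Path(bag_dir)
    mismatches = []
    with open(bag_dir / manifest_name, encoding="utf-8") as manifest:
        for line in manifest:
            if not line.strip():
                continue  # tolerate blank lines
            # Each entry is "<hex digest> <relative path>".
            expected, rel_path = line.strip().split(None, 1)
            digest = hashlib.sha512()
            with open(bag_dir / rel_path, "rb") as payload:
                # Hash in 1 MiB chunks so large files never sit fully in memory.
                for chunk in iter(lambda: payload.read(1 << 20), b""):
                    digest.update(chunk)
            if digest.hexdigest() != expected.lower():
                mismatches.append(rel_path)
    return mismatches
```

Calling verify_bag("Illinois-20180829-text") on an intact directory should return an empty list; any path it does return points at a file worth downloading again.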

Caselaw data is stored within the data/data.jsonl.xz file. The .jsonl.xz suffix indicates that the file is a JSON Lines text file, in which each line is one JSON object, compressed with xz.

Using Bulk Data

The data.jsonl.xz file can be decompressed using third-party GUI programs like The Unarchiver (Mac) or 7-Zip (Windows), or from the command line with a command like unxz -k data/data.jsonl.xz.

However, this increases the disk space needed by about 500%, and in most cases is unnecessary. Instead, we recommend working directly with the compressed file.

To read the file from the command line, run:

xzcat data/data.jsonl.xz | less

If you install jq, you can get nicely formatted output:

xzcat data/data.jsonl.xz | jq . | less

You can also run more sophisticated queries. For example, to extract the name of each case:

xzcat data/data.jsonl.xz | jq .name | less

You can also interact directly with the compressed files from code. The following example prints the name of each case using Python:

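import json
import lzma

# Stream the compressed file line by line without decompressing it on disk.
with lzma.open("data/data.jsonl.xz", "rt", encoding="utf-8") as in_file:
    for line in in_file:
        case = json.loads(line)  # each line is one JSON-encoded case
        print(case["name"])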
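The same streaming approach can be wrapped in a generator so that you can stop early or chain further processing without reading the whole file. This is a sketch rather than an official helper: the function names iter_cases and first_names are invented for illustration, and "name" is the only case field it relies on.

```python
import json
import lzma
from itertools import islice


def iter_cases(path):
    """Yield one case dict per line, decompressing the file as a stream."""
    with lzma.open(path, "rt", encoding="utf-8") as in_file:
        for line in in_file:
            if line.strip():  # skip any blank lines
                yield json.loads(line)


def first_names(path, limit=5):
    """Return the names of the first `limit` cases without reading the rest."""
    return [case["name"] for case in islice(iter_cases(path), limit)]
```

For example, first_names("data/data.jsonl.xz", 10) decompresses only as much of the file as is needed to produce ten names, which is useful for a quick look at a large jurisdiction.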