Courts cite the web, and the web is always changing. Between the moment a web page is cited and the day the opinion is published, the information on that page can disappear or change, subtly or substantially.

This instability dilutes the value of citations to web resources in court documents. Over time, unarchived resources are exposed ever more to the vagaries of the web, such as site reorganizations and lapsed domain names, and the citations become increasingly suspect or useless.

SCOTUS has begun to address this matter, but at present offers only PDFs made from printouts of webpages, and does so only weeks or months after the citing opinion was published.

A Stopgap Measure

Our system determines in near real-time when the Court publishes an opinion and promptly takes snapshots of the web pages it cites.

The web page may already have changed between citation and publication of the opinion, but this process captures the resource at the earliest moment available to us.

A Long-Term Solution

As legal publishing evolves, we expect judges, clerks and others to use services like Harvard's to archive any web resource they cite and to provide a link to the archived copy as well.

How does this work?

  1. our application watches the Supreme Court website for new opinions
  2. when it sees one, it downloads it and converts it from a PDF to text
  3. it looks for web citations in the text, then passes any it finds to a human on our team for evaluation
  4. if the resource is findable online when the human validates the URL, the application archives it through an archiving service's API
  5. users can browse opinions, justices or the master list of citations found in the opinions
  6. notifications go out to subscribers whenever citations are harvested from new opinions
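
As an illustration of step 3 above, here is a minimal sketch of how web citations might be pulled out of an opinion's text. It is not the site's actual code: the function name find_web_citations, the URL pattern, and the trailing-punctuation rule are all assumptions made for this example, and it does not handle URLs that the PDF-to-text conversion has split across lines.

```python
import re

# Assumed pattern (hypothetical, not the site's own): http(s) URLs and bare
# "www." hosts, running up to the next whitespace, quote, or angle bracket.
URL_PATTERN = re.compile(r"(?:https?://|www\.)[^\s<>\"']+", re.IGNORECASE)

# Punctuation that often follows a URL in running prose.
TRAILING = ".,;:!?)]}'\""


def find_web_citations(text):
    """Return the unique URLs found in text, in order of first appearance."""
    seen = set()
    found = []
    for match in URL_PATTERN.finditer(text):
        # Strip sentence punctuation that the pattern swallowed.
        url = match.group(0).rstrip(TRAILING)
        if url and url not in seen:
            seen.add(url)
            found.append(url)
    return found


sample = (
    "See http://www.example.com/report.pdf (last visited June 1, 2015); "
    "see also www.example.org/data."
)
citations = find_web_citations(sample)
for url in citations:
    print(url)
```

Stripping trailing punctuation is a blunt rule: it will also trim a closing parenthesis that genuinely belongs to a URL. That is one reason a human reviews each candidate before anything is archived.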

This site was developed by Phil Ardery and is hosted by UC Berkeley Law Library.