Field

The caching of "Good Tea" - Field v. Google

Prickly Poet

Blake Field is a lawyer and poet in Nevada. Being a poet, he was given to writing poems, which he published on his website at www.blakeswritings.com. Being a lawyer, he was given to suing, which he did when Google cached his poems in its search engine. Specifically, in 2004, Field filed a claim for copyright infringement against Google, asserting that the caching of his poem "Good Tea" constituted unauthorized copying and distribution of his copyrighted work.

The Google cache is basically a repository of crawled web pages. In its unending quest to index the billions of web pages available on the Internet, the GoogleBot locates, crawls, analyzes, and catalogs the web pages in its searchable index. Consequently, when you type in a search term on Google, rather than crawling the entire web searching for pages matching your criteria and returning a result set a month later, Google accesses its index using optimized algorithms and returns a result set in a matter of milliseconds.
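The difference between crawling at query time and consulting a prebuilt index can be illustrated with a toy inverted index. This is only a minimal sketch of the general idea, not Google's actual implementation; the function names (`build_index`, `search`), the example.com URLs, and the page text are all hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, query):
    """Return the URLs that contain every word in the query."""
    words = query.lower().split()
    if not words:
        return set()
    # Start with the pages matching the first word, then narrow the set.
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results


# Hypothetical crawled pages (URL -> extracted text).
pages = {
    "http://example.com/tea": "good tea on a cold morning",
    "http://example.com/coffee": "good coffee on a cold morning",
}
index = build_index(pages)
print(search(index, "good tea"))
```

The crawl happens once, ahead of time; each query is then answered with a handful of set lookups rather than a fresh pass over every page.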

The result set, as shown below, usually contains the following six items: Title, Snippet, URL, Size, Similar Pages, and Cached (the last three are optional, depending on the site information in the index):

As you can see from the illustration, the Title is the prominent part of the result set, and it also contains the hyperlink to the underlying website. The Snippet consists of one or two sentences from the Google index that have previously been extracted from the website. The URL identifies the location of the website, and also contains a hyperlink to the underlying website. If Google has indexed the website, it will show the Size of the site in kilobytes. The Similar Pages link is a hyperlink to another result set, which is constrained to web pages that fall into the same general category as the selected result. Most pertinent to our discussion here is the Cached link, which is a hyperlink to the representation of the website in Google’s cache. The cache is refreshed approximately every fourteen to twenty days.

The Google Cache

Why a cache? Given the two hyperlinks to the underlying website, it may seem not particularly useful for the result set to contain a hyperlink to the potentially stale version in the Google cache. However, there may be times when the underlying website is offline, has had pages deleted, or links broken. In these situations, having a cache backup can be a lifesaver. When a user clicks on the Cached hyperlink, a disclaimer is posted on top of the cached page, as shown in this image of the cache for this page:

Webmasters can prevent Google from caching their site by using the ‘no-archive’ metatag (written in a page's HTML as a robots meta tag whose content is ‘noarchive’). Webmasters can also prevent Google from indexing their site at all by using the ‘no-index’ metatag (content ‘noindex’) or by putting the two-line directive ‘User-agent: *’ followed by ‘Disallow: /’ in the website’s robots.txt file.
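To make the opt-out mechanisms concrete, here is a minimal sketch of how a well-behaved crawler might honor them, using only Python's standard library. It is an illustration under stated assumptions, not a description of how the GoogleBot is actually built: the helper names (`crawl_allowed`, `robots_directives`), the example.com URL, and the sample page are hypothetical, and the meta tag uses the standard `noarchive` spelling.

```python
from html.parser import HTMLParser
from urllib.robotparser import RobotFileParser

# The robots.txt directive described above: every crawler, whole site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def crawl_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


class RobotsMetaParser(HTMLParser):
    """Collect the directives listed in <meta name="robots" content="..."> tags."""

    def __init__(self):
        super().__init__()
        self.directives = set()

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() == "robots":
            content = attr.get("content") or ""
            self.directives.update(
                d.strip().lower() for d in content.split(",") if d.strip()
            )


def robots_directives(html):
    """Return the set of robots meta directives found in an HTML document."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.directives


# Hypothetical page that opts out of caching but not of indexing.
PAGE = '<html><head><meta name="robots" content="noarchive"></head><body>A poem</body></html>'

print(crawl_allowed(ROBOTS_TXT, "Googlebot", "http://www.example.com/poem.html"))
print("noarchive" in robots_directives(PAGE))
```

A site with neither the robots.txt rule nor the meta tag gives a crawler no signal to stay away, which is the situation Field's site was in.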

Field of Dreams

Getting back to Field, for reasons not entirely clear (although certain gold-digging innuendos have been made), Field registered his poems with the Copyright Office, posted them to his website without including exclusionary metatags or robots.txt file commands, found them in the Google cache, and filed the lawsuit for copyright infringement. The theory put forth by Field is that when a user clicks on the ‘Cached’ link for his website, Google is creating and distributing copies of his works.

This is somewhat tortured logic since the copy has already been made and is residing in Google’s cache. When a user clicks on the ‘Cached’ link, the user is causing a copy of the work to be downloaded to the user’s browser, but there is no allegation from Field that users accessing the cache are infringing, nor were there allegations that Google was liable for contributory infringement or vicarious liability (which certainly would have been a stronger argument), or even that the original insertion of the work into the Google cache constituted direct copyright infringement.

Google responded that a user's click on the ‘Cached’ link does not constitute direct copyright infringement by Google, and that even if it did, Google would still not be liable under the following theories: 1) Implied License; 2) Estoppel; 3) Fair Use; and 4) Safe Harbor under the Digital Millennium Copyright Act.

Direct Copyright Infringement

The court had no problem finding that users clicking on the ‘Cached’ link did not constitute direct copyright infringement because there was no volitional act on the part of Google at that point in the process. The court then went into quite gratuitous depth about how even if it had ruled otherwise on this issue, each of the defenses proffered by Google would have been quite sufficient to avoid liability in any case.

Implied License

The ruling on Implied License is important because it lays the groundwork of a common law foundation for a theory that has been floating around in primarily academic and policy circles. In academic circles, the theory merely reflects common sense in acknowledging that the Internet doesn’t work unless users are deemed to have an implied license to access web pages posted on the Internet. Since web pages are technically copied to a user’s computer for display when that user accesses a web page, anyone who accessed a web page without authorization would otherwise be guilty of direct copyright infringement. In policy circles, the Report of the Working Group for the National Information Infrastructure (the “White Paper” which laid the groundwork for the Digital Millennium Copyright Act) made occasional reference to the concept of implied license: “[A] defendant may successfully assert that the activity is non-infringing due to the existence of a license—statutory, negotiated or implied. All of these defenses are available in the NII environment. For instance, one or more of these defenses, such as fair use or the existence of an implied license, may be successful where a copyright owner’s posting to an automatic electronic email distribution list is reproduced and distributed to the subscribers of the same listserv in connection with a response to or comment on the posting.”

There are two dangerous aspects to this particular ruling. The general theory of implied access is that the act of posting a web page on the Internet constitutes the granting of the implied license. In this case, Field knew of the opt-out options, and chose not to avail himself of those options. It was this knowledge that formed the basis for the implied license.

Estoppel

This is another defense that was mentioned in the White Paper: “The Supreme Court has stated that ‘[a] successful defense of a copyright infringement action may further the policies of the Copyright Act every bit as much as a successful prosecution of an infringement claim by the holder of the copyright.’ There are a number of legal and equitable defenses available to defendants in copyright infringement actions. Fair use is the most common of the defenses. Others include misuse of copyright by the copyright owner, abandonment of copyright, estoppel, collateral estoppel, laches, res judicata, acquiescence, and unclean hands.”

In this case, the court reviewed the applicability of estoppel, and mapped the four part test for estoppel to the facts:

  • 1. Field knew of Google’s conduct;
  • 2. Field intended that Google rely upon his conduct;
  • 3. Google was ignorant of the true facts; and
  • 4. Google detrimentally relied on Field’s conduct.

Under this fact pattern, the court found that Field was aware of Google’s caching practices, and that by not putting in appropriate metatags, Field in fact intended Google to cache his web page. Further, since Field did not use the metatags, nor did he avail himself of Google’s removal tool, Google had no reason to know that Field objected to the caching of his web page, and in reliance on this behavior, Google cached the web page. Thus, all four estoppel factors were present and met.

Fair Use

The Fair Use analysis in Field was a good synthesis of the analyses that had preceded it. The business rules were extracted from that analysis and used to construct the Fair Use Visualizer.

The analysis and the implications of the heuristics articulated there are discussed further in Towards Mathematical Certainty in Fair Use Analyses.

The court began by addressing the four factors outlined in section 107 of the Copyright Act:

  • Factor 1 - Purpose of Use
  • Factor 2 - Nature of Copyrighted Work
  • Factor 3 - Relative Amount
  • Factor 4 - Market Effect

In looking at the first factor, the court analogized to the facts in Kelly v. Arriba, in which a visual search engine's use of copyrighted photographs to improve access to information on the internet was held to be a transformative fair use because it served a different function from the original works, which were artistic in nature. Since Field claimed that he created his poems to serve an artistic function, and Google's cache serves different functions, Google's use added something new and did not merely supersede the original work.

As for what's added, the first item is that Google's cache enables users to access content when the original page is inaccessible. Second, it allows users to detect changes that have been made to a particular page over time. Third, Google's highlighting feature allows users to understand why a result was responsive to their query. Fourth, the size of the 'Cached' link and header on the cached page serve notice to the user that this use is a complement, and not a substitute for, the original page. Fifth, websites have the ability to prevent Google from caching their pages, and the fact that billions of websites choose to have their pages cached indicates that they do not view the cache copies as substitutes for their pages.

The Court noted that when a use is found to be transformative, the "commercial" nature of the use is of less importance in analyzing the first fair use factor. Since Google's use is transformative, the court gave little weight to this subfactor.

For the second factor, the Court again invoked transformation and noted that when a use is found to be transformative, the second factor is not terribly significant in the overall fair use balancing. Here the court noted that, assuming the poems in question were creative, this factor weighed only slightly in Field's favor.

A pattern emerges in looking at the third factor, where yet again the Court invokes transformation and notes that when a use is found to be transformative, and the work can otherwise be viewed free of charge, (like a TV show or a website), then this factor will be neutral even if all of the work is copied.

Field made his content available for free on his website. Because Google used no more than is necessary (even though that is all of it), this factor is neutral.

Finally, in looking at factor 4, the court found that there was no evidence that Google’s ‘Cached’ links had any impact on the potential market for Field's work, and consequently found that this factor weighed in favor of fair use.

In addition to the four part test, the Court threw in a ‘Good Faith’ fifth factor. The Court noted that the Copyright Act authorizes courts to consider other factors than the four non-exclusive factors typically referenced. In this case, the Court considered whether Google operated its cache in good faith, and finding that it did, awarded Google bonus fair use points.

Safe Harbor

Section 512(a)-(d) provides for four different types of safe harbor for Online Service Providers. In this case, Google was claiming immunity under section 512(b), the safe harbor for system caching. This safe harbor provides that “[a] service provider shall not be liable for monetary relief…for infringement of copyright by reason of the intermediate and temporary storage of material on a system or network controlled or operated by or for the service provider…”

Field argued that Google does not make “intermediate and temporary storage of that material.” The court disagreed. In Ellison v. Robertson, AOL was storing Usenet postings for 14 days, which is clearly analogous to Google's caching of 14-20 days, and such caching was deemed temporary. Additionally, the work must be transmitted from the creator to a person other than himself, at the direction of the other person. In this case, Field transmitted the work to the GoogleBot at Google’s request, thus satisfying the requirements of 512(b)(1)(B).

Finally, Google’s storage of web pages is carried out through “an automat[ed] technical process” and “for the purpose of making the material available to users…who…request access to the material from [the originating site].” This is a two step process in which a user who clicks one of the hyperlinks and finds the site unavailable can return to the result set and access the website through the cached link.

Good Tea

Given the amount of blood, sweat and tears that have been spilled over this, Good Tea must be one damn good poem. Unfortunately, it's not so easy to tell. After this case was resolved, www.blakeswritings.com went mysteriously dark. Even though a Google search for Blake Field shows a reference to his page on the 11th page of results (see the first illustration above), the link is dead. Furthermore, there is no 'Cached' link for the listing, since Google removed it from the cache as soon as they became aware of Field's desire that it not be cached. Another illustration of the value of the Google cache.

However, there is another failsafe for these situations: the Internet Archive Wayback Machine. As you can see below, blakeswritings.com was archived, and future generations are safe from being subjected to the dry legal principles articulated in Field v. Google without knowing what all the fuss was about. Click on the image below to treat yourself to a paragraph or two of "Good Tea".