Ontologies are one of those neither-fish-nor-fowl creatures that really bring
the greyness of copyright as applied to software into blurred focus.
Do ontologies constitute copyrightable subject matter?
Scope of Protection: Is an ontology data or software?
Do specialized ontologies constitute derivative works of upper ontologies?
What effect do GNU GPL restrictions have on ontologies?
Copyright protects expression of an idea that is an original work fixed in
tangible form. Copyright does not protect any idea, procedure, process, system,
method of operation, concept, principle or discovery, or any expression that
is merged with the preceding. Copyright also does not protect scènes à faire.
Ontology is the branch of metaphysics concerned with the nature and relations
of being, and an ontology is a particular theory about the nature of being or
the kinds of existents. From a software perspective, an ontology is an
expression of a theory about the nature and relations of existents codified in
first-order predicate logic. The expression of those ideas consists of a
collection of definitions and axioms that collectively constitute an original
work that is fixed in an ASCII text file.
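
To make "a collection of definitions and axioms" concrete, here is a minimal sketch of a toy ontology. Every term name is invented for illustration, and the single axiom shown (that "is a kind of" is reflexive and transitive) is an assumption chosen for simplicity, not a description of any real ontology.

```python
# Hypothetical toy ontology; all term names are invented for illustration.
# Definitions: each term is mapped to the term it is a kind of.
DEFINITIONS = {
    "Entity": None,               # root term, a kind of nothing else
    "PhysicalObject": "Entity",
    "Organism": "PhysicalObject",
    "Dentist": "Organism",
}


def is_a(term, ancestor, definitions=DEFINITIONS):
    """Axiom (assumed): 'is a kind of' is reflexive and transitive."""
    while term is not None:
        if term == ancestor:
            return True
        term = definitions[term]  # step up to the parent term
    return False


print(is_a("Dentist", "Entity"))   # follows the chain of definitions upward
print(is_a("Entity", "Dentist"))   # the relation does not run downward
```

The definitions read like data, while the axiom behaves like a rule that operates on them, which is the mix that makes an ontology hard to classify.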
As far as non-copyrightable subject matter is concerned, discoveries and
scènes à faire are not relevant to our discussion. However, there is an issue
with the merger doctrine with regard to ideas and processes. Fortunately, there
is precedent with ontology’s lowly cousin, the taxonomy, that gives us guidance
in this regard.
A taxonomy is an orderly classification of a subject according to its
relationships. The Seventh Circuit specifically addressed the copyrightability
of a taxonomy: the ADA published a taxonomy of dental procedures, and Delta
Dental Plans Association subsequently published a derivative work of the ADA’s
taxonomy that included most of the numbers and short descriptions of the ADA’s
taxonomy. Delta Dental did not dispute that a substantial amount of its
taxonomy was copied from ADA’s taxonomy. The issue before the court was whether
a taxonomy is copyrightable subject matter.
Delta challenged copyrightability on originality and systems arguments. The
threshold of originality for a literary work is very low, and it is met by the
numbering system employed and the descriptions given. The more substantive
challenge involves the argument that taxonomies are systems. The court held
that a taxonomy is not a system. A taxonomy may be used as part of a system,
e.g., a system of recording dental procedures in a dental office, but this
does not preclude protection for the taxonomy. The ADA cannot preclude a
dentist from using its taxonomy to record dental procedures, as to do so would
provide protection for a system, but the ADA can prevent a party from copying
the taxonomy. Delta did not use the taxonomy as part of a "system"; it copied
the taxonomy and made a derivative work of the taxonomy.
Scope of Protection: Is an ontology data or software?
It is necessary to make a determination of where along the data-software
spectrum an ontology lies as that will determine the scope of protection
afforded, which will indicate the appropriate infringement analysis.
A computer program, or software, is defined as "a set of statements or
instructions to be used directly or indirectly in a computer in order to bring
about a certain result."
"The line between ‘software’ (or computer program) and the ‘data’ to be
manipulated by the software is sometimes hard to discern. For example,
‘knowledge bases’, the databases used in artificial intelligence programs, are
not mere lists or records of facts. The knowledge base itself includes
structured rules and relationships needed for making decisions."
The fog gets thicker as object-oriented programming and component-based
architectures further blur the distinction between program and data. These
architectures invest more of the intelligence of a system into the organization
of the data, while at the same time making the data more operational within the
system. So in our spectrum, data contains data, databases contain data and
loosely coupled methods (stored procedures, triggers, etc.), taxonomies contain
data and structure, ontologies contain data, structure and rules, knowledge
bases contain data, structure and rules, objects contain data and tightly
coupled methods, components likewise contain data and tightly coupled methods,
and applications generally contain objects and components. Thus, as you move
along the spectrum, it becomes more difficult to distinguish the data from the
For purposes of determining the scope of protection for any given software
code, one could conclude that the appropriate method for determining the scope
of protection is to simply slot the purported code on the data-application
spectrum below, the position of which indicates the relative thinness-thickness
of protection.
If the protection is thin, then there is no need to perform any analysis
pertaining to a derivative work. Since you would only be looking at literal
infringement, you would either determine that there was copying or not. If the
protection is thick, then you would look at literal infringement as well as
non-literal infringement, in which case, by virtue of the
Abstraction-Filtration-Comparison test, you would necessarily need to perform a
derivative work analysis.
On the data end of the spectrum, data in a list format (such as a phonebook)
does not meet the threshold of copyrightable subject matter. A database can be
copyrightable subject matter, but any allegedly infringing code would only be
subject to a literal infringement analysis. The protection for a taxonomy is
thin. In the Delta Dental case, Delta admitted that a substantial amount of
its taxonomy was copied from the ADA taxonomy. The scope of protection for an
ontology would be thicker than a taxonomy, but thinner than a knowledge base.
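
The ordering described above can be sketched as a simple ordered enumeration. Only the relative order is taken from the text; the integer values are arbitrary placeholders, and the names `Artifact` and `has_thicker_protection` are hypothetical. Nothing here implies a numeric measure of protection or a threshold at which non-literal analysis begins.

```python
from enum import IntEnum


class Artifact(IntEnum):
    """Positions on the data-application spectrum, data end first.

    The integer values are arbitrary; only their order is meaningful.
    """
    DATA = 1
    DATABASE = 2
    TAXONOMY = 3
    ONTOLOGY = 4
    KNOWLEDGE_BASE = 5
    OBJECT = 6
    COMPONENT = 7
    APPLICATION = 8


def has_thicker_protection(a, b):
    """True if `a` sits further toward the application end than `b`."""
    return a > b


# The ontology is slotted between the taxonomy and the knowledge base.
print(has_thicker_protection(Artifact.ONTOLOGY, Artifact.TAXONOMY))
print(has_thicker_protection(Artifact.ONTOLOGY, Artifact.KNOWLEDGE_BASE))
```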
Do specialized ontologies constitute derivative works of upper ontologies?
Literal copying of a significant portion of source code is not always
sufficient to establish that a second work is a derivative work of an original
program. Conversely, a second work can be a derivative work of an original
program even though no copying of the literal source code of the original
program has been made. This is the case because copyright protection
does not always extend to all portions of a program’s code, while at the same
time, it can extend beyond the literal code of a program to its non-literal
aspects, such as its architecture, structure, sequence, organization, operation
modules, and user interface.
The Copyright Act is of little, if any, help in determining the definition of
a derivative work of software. However, the applicable provisions do provide
some, albeit cursory, guidance. Section 101 of the Copyright Act sets forth the
following definitions:
A ‘computer program’ is a set of statements or instructions to be used directly
or indirectly in a computer in order to bring about a certain result.
A ‘derivative work’ is a work based upon one or more preexisting works, such as
a translation, musical arrangement, dramatization, fictionalization, motion
picture version, sound recording, art reproduction, abridgement, condensation,
or any other form in which a work may be recast, transformed, or adapted. A
work consisting of editorial revisions, annotations, elaborations, or other
modifications which, as a whole, represent an original work of authorship, is a
‘derivative work’.
On the data-application spectrum, ontologies lean towards data, and
consequently are afforded thinner protection than more method-rich
implementations. This speaks to literal copying as being the test for
infringement on the basis of derivation. Furthermore, there may be a case for
only considering the literally copied portion as infringing. This makes a
strong case for the position that specialized ontologies do not constitute
derivative works of upper level ontologies, but rather stand on their own with
reference to an API-like layer of upper facts.
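
The difference between referencing an upper ontology and literally copying from it can be illustrated with a short sketch. The axioms below use an invented KIF-like notation and invented term names, and the helper functions are hypothetical. This shows only textual overlap; it is not a legal test for infringement.

```python
# Hypothetical axioms in a KIF-like notation; none come from a real ontology.
UPPER = [
    "(subclass PhysicalObject Entity)",
    "(subclass Organism PhysicalObject)",
]
LOWER = [
    "(subclass Dentist Organism)",       # refers to the upper term "Organism"
    "(subclass Orthodontist Dentist)",
]


def normalize(axiom):
    """Collapse runs of whitespace so formatting differences are ignored."""
    return " ".join(axiom.split())


def literally_copied(upper, lower):
    """Return the axioms in `lower` that also appear verbatim in `upper`."""
    upper_set = {normalize(a) for a in upper}
    return [a for a in lower if normalize(a) in upper_set]


def terms(axioms):
    """Collect the argument terms of flat axioms of the form (relation A B)."""
    return {tok for a in axioms for tok in a.strip("()").split()[1:]}


print(literally_copied(UPPER, LOWER))   # axioms duplicated from the upper ontology
print(terms(UPPER) & terms(LOWER))      # upper terms the lower ontology refers to
```

In this sketch the specialized ontology names an upper term without reproducing any upper axiom, which is the API-like relationship described above.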
What effect do GNU GPL restrictions have on ontologies?
Section 0 (it's a Unix thing) of the GPL defines "a work based on the
Program", the phrase used in "when you distribute the same sections as part of
a whole which is a work based on the Program", as follows: "a ‘work based on
the Program’ means either the Program or any derivative work under copyright
law".
The GPL does appear to rely on the definition of derivative work under copyright
law for drawing the line between those works that are covered and those that
are not.
The meaning of derivative work is subsumed in the definition of an exact copy
of the original work, at least partially, because, in some cases, distributing
an exact copy of the original work would not implicate the copyright law if
that original work contained no copyrightable subject matter. As such, no
compliance with the license is necessary for such re-distribution of such an
Distribution of an unmodified original work may be seen by some as a different
analysis from distribution of a derivative work. However, at the end of the
day, courts will inevitably follow the filtration test and reduce the original
work to its copyright protectable elements. This has led some observers to note
that what one is actually distributing as "an exact copy of the [purported]
original work," is in fact a derivative work of the protectable portion of the
original work. This results in a derivative work analysis because the court
will not be comparing what is re-distributed as a whole work (unless the entire
work constituted copyrightable subject matter), but rather the portion of the work
that constitutes copyrightable subject matter.
Consequently, using the previous pegging of ontologies on our data-application
spectrum, and the definition of derivative work under the GPL, it seems that
lower level ontologies that reference upper level ontologies are not derivative
works and therefore lower level ontologies will not be constrained by the
requirements and restrictions of the GPL.