Ontology (computer science)
In computer science, an ontology is the product of an attempt to formulate an exhaustive and rigorous conceptual schema about a domain. An ontology is typically a hierarchical data structure containing all the relevant entities, their relationships, and the rules that apply within that domain (e.g. a domain ontology). The computer science usage of the term ontology is derived from the much older usage of the term in philosophy.
An ontology which is not tied to a particular problem domain but attempts to describe general entities is known as a foundation ontology or upper ontology. Typically, more specialized domain-specific schemata must be created to make the data useful for real-world decisions.
Types
A domain ontology is an ontology tied to a specific domain. A foundation ontology is a form of ontology that tries to be less specific, and in that way more generally applicable. It contains a core glossary in whose terms everything else in a broad domain can and must be described. An example is the 2000 words of English required by Longman's dictionary, used to define the 4000 most common English idioms. A foundation ontology in computer science would serve as core ontology for both computer programs and users, influencing their view of data and events.
A computer science example is that, by default, all computer programs have a foundation ontology consisting of a processor instruction set, the standard library of a programming language, files in accessible file systems, or some other list of 'what exists'. Because these may be poor representations for certain problem domains, more specialized schemata must be created to make the data useful in making real-world decisions. Hence the need for standards which take 'core' ontologies (e.g. the Dublin Core in SGML) and solidify them into 'foundations'.
Tom R. Gruber and R. Studer have described an ontology in this sense as "an explicit and formal specification of a conceptualization".
Semantic web
The term 'ontology' has been used very loosely to label almost any conceptual classification scheme. Among practising computational ontologists, however, a true ontology should describe entities not only by the subsumption relation (also called is a, is subtype of, is subclass of, or has hypernym) but also by other 'semantic relations' that specify how one concept is related to another.
The most common of the semantic relations other than subsumption is the part-of relation. In one formal notation, one might see a relation such as (isPartOf Spine Vertebrate), meaning that a 'Spine' (in that specific sense) is part of a Vertebrate. Ontologies are organized by concepts, not words, so the concept 'spine' referring to the spine of a book would have to be labeled by a different term, such as 'BookSpine'.
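The notation above can be sketched in a few lines of Python. This is only an illustration: the triple layout, the relation name isA, the concepts Animal and Book, and the helper functions are assumptions made for the example and do not come from any published ontology.

```python
from collections import defaultdict

# Each fact is a (relation, subject, object) triple, mirroring the
# (isPartOf Spine Vertebrate) notation. All identifiers are hypothetical.
FACTS = [
    ("isA", "Vertebrate", "Animal"),
    ("isPartOf", "Spine", "Vertebrate"),
    # A different concept that shares the English word "spine"
    # gets its own label.
    ("isPartOf", "BookSpine", "Book"),
]


def build_index(facts):
    """Group the objects of each fact by (relation, subject)."""
    index = defaultdict(set)
    for relation, subject, obj in facts:
        index[(relation, subject)].add(obj)
    return index


INDEX = build_index(FACTS)


def wholes_of(concept):
    """Return the concepts that `concept` is declared to be part of."""
    return set(INDEX.get(("isPartOf", concept), set()))
```

Because the two senses of 'spine' carry distinct labels, a query about Spine never returns Book, and a query about BookSpine never returns Vertebrate.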
Read more: Semantic Web
Link to ontology in philosophy
The computer science sense of the word is different from, but related to, the philosophical meaning of ontology: the study of existence. The purpose of a computational ontology is not to specify what does or does not 'exist', but to create a database containing concepts referring to entities of interest to the ontologist, and which will be useful in performing certain types of computations. Because both disciplines face the same problem of categorizing entities consistently, the reasoning used by philosophical ontologists can be helpful in recognizing and avoiding potential logical ambiguities.
Where alternative representations can equally well serve the purpose of the computational ontologist, time constraints usually dictate that one choice is made and the others are ignored. For certain purposes, it can be better to ignore many of the details of the objects of interest. As a result, computational ontologies developed independently for different purposes will often differ greatly from each other. Different ontologies of the same domain of reasoning can also arise from different perceptions of the domain due to cultural background, education, ideology, or other reasons (see also the section "Practical lessons from philosophy" below).
Uses for ontologies
Ontologies are commonly used in artificial intelligence and knowledge representation. Computer programs can use an ontology for a variety of purposes including inductive reasoning, classification, a variety of problem solving techniques, as well as to facilitate communication and sharing of information between different systems.
Foundation ontologies or upper ontologies are commercially valuable, creating competition to define them. Peter Murray-Rust has claimed that this leads to "semantic and ontological warfare due to competing standards", and accordingly any standard foundation ontology is likely to be contested among commercial or political parties, each with their own idea of 'what exists' (in the philosophical sense).
No single upper ontology has yet gained widespread acceptance as a de facto standard. Different organizations are attempting to define standards for specific domains. The 'Process Specification Language' (PSL) created by the National Institute of Standards and Technology (NIST) is one example.
Available ontologies
A well-known and quite comprehensive ontology available today is Cyc, a proprietary system under development since 1985, consisting of a foundation ontology and several domain-specific ontologies (called microtheories). A subset of that ontology has been released for free under the name OpenCyc, with a larger subset available for non-commercial use under the name ResearchCyc (see external links).
WordNet, a freely available database originally designed as a semantic network based on psycholinguistic principles, was expanded by addition of definitions and is now also viewed as a dictionary. It qualifies as an upper ontology by including the most general concepts as well as more specialized concepts, related to each other not only by the subsumption relations, but by other semantic relations as well, such as part-of and cause. However, unlike Cyc, it has not been formally axiomatized so as to make the logical relations between the concepts precise. It has been widely used in Natural Language Processing research.
The Suggested Upper Merged Ontology (SUMO) is another comprehensive ontology project. It includes an upper ontology, created by the IEEE working group P1600.1 (predominantly by Ian Niles and Adam Pease). It is extended with many domain ontologies and a complete set of links to WordNet. It is freely available.
Adopting SUMO as a standard would reserve certain terms and their meanings for all 'P1600.1 standard' systems. Some would take this to entail that a general ontology (in the philosophical sense) defines 'what exists'. Some also feel that use of the adjective 'upper', in particular, implies a hierarchy one must accept rather than a foundation one can choose, and seems to suggest a cultural impact. Upper ontology creators, however, believe that an upper ontology simply defines a set of terms that people or software systems may choose to hold in common. The potential for cultural bias in SUMO has been tested by its translation into multiple non-English languages such as Chinese and Hindi, and its use within cultures that speak those languages, resulting in the conclusion by its creators and users that no significant linguistic or cultural biases are evident.
Additional examples of domain ontologies can be found at the Open Biomedical Ontologies site, which acts as an umbrella organisation for many ontologies specific to biological topics, such as cellular organelles (see external links).
Practical lessons from philosophy: the purpose of an upper-ontology
The following are arguments both for and against the viability of any "absolute" upper ontology. The arguments disregard that under normal social conditions (such as the existence of academic and political freedoms) many ontologies will simultaneously exist and compete for adherents. Permanently adopting any single rigid system is unlikely, and probably undesirable in the public interest. That being said, encouraging private efforts to create a highly successful upper ontology that achieves adherents by virtue of its utility is likely to have a socially beneficial result - better communication.
Why an upper ontology is not feasible
Any effort to encode a useful upper or lower ontology will be characterized by ontological constraints that philosophers have found historically inescapable. Above all, these constraints cast serious doubt on attempts to build a general-purpose upper ontology.
- There is no self-evident way of dividing the world up into concepts
- There is no neutral ground that can serve as a means of translating between specialized (lower) ontologies
- Human language itself is already an arbitrary approximation of just one among many possible conceptual maps. To draw any necessary correlation between English words and any number of intellectual concepts we might like to represent in our ontologies is just asking for trouble. (WordNet is successful and useful precisely because it does not pretend to be a general-purpose upper ontology; rather, it is a tool for semantic / syntactic / linguistic disambiguation, which also happens to be richly embedded in the particulars and peculiarities of the English language.)
- Any hierarchical or topological representation of concepts must begin from some ontological, epistemological, linguistic, cultural, and — above all — pragmatic perspective.
Because any ontology is, among other things, a social / cultural artifact, there is no purely objective perspective from which to observe the whole terrain of concepts. Instead of asking, “what hierarchical representation of concepts best captures the universal relationships among general ideas,” it is more productive to ask “what specific purpose do we have in mind for this conceptual map of entities and what practical difference will this ontology make?” This pragmatic philosophical position surrenders all hope of devising the encoded ontology version of “everything that is the case,” (Wittgenstein, Tractatus Logico-Philosophicus).
According to Barry Smith in The Blackwell Guide to the Philosophy of Computing and Information (2004), "the project of building one single ontology, even one single top-level ontology, which would be at the same time non-trivial and also readily adopted by a broad population of different information systems communities, has largely been abandoned." (p. 159)
How ontologies will be employed in Artificial Intelligence is an open question, but much of what is known about concept acquisition and the social / linguistic interactions of human beings makes it unlikely that a general-purpose ontology is the essential foundation for learning or for achieving an intellect certifiable by the Turing test.
Why an upper ontology is feasible
While there is no single agreed metaphysics, the very existence of long standing arguments in the field shows that there are a number of models that do not have fatal flaws. Proponents of an upper ontology argue that while there is not one single true upper ontology, there are in fact several good ones that may be created. The benefits of standardization for communication and sharing suggest that practical system implementors should consider adopting a common upper ontology.
Several common arguments against upper ontology can be examined more clearly by separating issues of concept definition (ontology), language (lexicons), and facts (knowledge). The most common conflict among informal ontologies is a difference in language. This difference may be evident even when communities speak the same human language. People have different terms and phrases for the same concept. However, that does not necessarily mean that those people are referring to different concepts. They may simply be using different language. It is essential to separate the language used to refer to concepts from the concepts themselves. Formal ontologies typically use linguistic labels to refer to concepts, but the terms mean no more and no less than what their axioms say they mean. Labels are similar to variable names in software. They should be evocative, but should not be confused with the actual meaning of the name in the context of a formal system.
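The separation of concepts from the words used for them can be sketched as follows. The concept identifier C001, the two lexicons, and the helper same_concept are all hypothetical, and the plain-text gloss merely stands in for the axioms a formal ontology would supply.

```python
# Ontology layer: concepts keyed by an opaque identifier.
# The gloss is a stand-in for formal axioms (an assumption of this sketch).
CONCEPTS = {"C001": "a carbonated sweet drink"}

# Language layer: each community's lexicon maps its own words to concepts.
LEXICON_A = {"soda": "C001"}
LEXICON_B = {"pop": "C001"}


def same_concept(term_a, lexicon_a, term_b, lexicon_b):
    """Two terms agree if both resolve to the same concept identifier."""
    id_a = lexicon_a.get(term_a)
    id_b = lexicon_b.get(term_b)
    return id_a is not None and id_a == id_b
```

Here the two communities disagree only in the language layer: "soda" and "pop" resolve to the same identifier, so the concept itself is shared.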
A second argument is that people believe different things, and therefore can't have the same ontology. However, many differences in belief are simply differences in the truth value of a particular assertion, not in the terms themselves that make up a particular logical assertion. Even arguments about the existence of a thing require a certain sharing of a concept, even though its existence in the real world may be disputed. Separating belief from naming and definition also helps to clarify this issue, and show how concepts can be held in common, even in the face of differing belief.
In summary, most disagreement about the viability of an upper ontology can be traced to the conflation of ontology, language and knowledge. Some additional concerns can be traced simply to the lack of common knowledge about specialized areas of knowledge. This is inescapable. Lack of knowledge does not however entail the impossibility of common ontology but rather points to the fact that many people, or agents or groups will have areas of their respective internal ontologies that do not overlap. The pragmatic issue is that sharing as much as possible is beneficial, and that a vast amount of ontology can be shared.
The several groups building upper ontologies and many users of those ontologies would no doubt be surprised to hear that such efforts have been abandoned.
Anatomy of an ontology
Contemporary ontologies share many structural similarities, regardless of which ontology language (and therefore syntax) is used. This section explains in simple, succinct terms many common features of ontologies. It is important to remember that ontologies don't 'do' anything. The functionality of a computational system that utilises an ontology is dependent not only on the structure of and data within the ontology, but also on the software implementation. For example, an ontology can support the creation of concept partitions. This functionality may not however be easily accessible to the user if their software does not provide an interface to it.
Concepts
Put simply, a concept is anything about which something can be said. It may be a real entity or a fictitious one. It may be concrete or abstract, a task, a reasoning process, a function, etc. In some languages, concepts are also known as classes, objects or categories.
- Partition: A group of related concepts about which rules can be created and applied. For example...
                Car
                 |
      -----------------------
      |                     |
2-Wheel Drive         4-Wheel Drive
The two sub-classes of 'Car' could be partitioned and a rule applied to ensure that an instance of 'Car' cannot be an instance of both '2-Wheel Drive' and '4-Wheel Drive'. This is called a Disjoint Partition.
Another common use of a partition is to state that an instance of a concept must be an instance of one of the partitioned sub-concepts. This is an Exhaustive Partition.
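The two partition rules can be expressed as simple checks. The following is a minimal sketch rather than a reasoner: the set-based representation of an instance's types and every name in it are assumptions made for illustration.

```python
# The partitioned sub-concepts of 'Car' (hypothetical identifiers).
PARTITION = {"TwoWheelDrive", "FourWheelDrive"}


def is_disjoint(instance_types, partition=PARTITION):
    """Disjoint partition: at most one partitioned sub-concept applies."""
    return len(set(instance_types) & partition) <= 1


def is_exhaustive(instance_types, partition=PARTITION):
    """Exhaustive partition: at least one partitioned sub-concept applies."""
    return len(set(instance_types) & partition) >= 1


# Three example instances, each given as the set of concepts it belongs to.
my_car = {"Car", "FourWheelDrive"}
odd_car = {"Car", "TwoWheelDrive", "FourWheelDrive"}
bare_car = {"Car"}
```

In this sketch my_car satisfies both rules, odd_car violates the disjoint rule, and bare_car violates the exhaustive rule.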
Attributes
Every concept within an ontology can be described by assigning attributes. Attributes allow more complex relations to be modelled using the ontology. Consider an ontology that does not define attributes for its concepts; this would simply be a taxonomy (if concept relationships are described) or a Controlled Vocabulary. These are useful, but might not be considered true ontologies.
- Instance Attributes -
- Class Attributes -
- Local Attributes -
- Global Attributes -
Taxonomies & the 'directed acyclic graph'
Once conceptual entities have been described, the result is more accurately described as a 'controlled vocabulary' or even a 'glossary'. It is obviously necessary to convey the relationships between entities if we are to accurately 'conceptualise' our domain of interest.
The two most common relationships ('is a' & 'part of') are described below.
The addition of 'is a' relationships creates a 'linear' taxonomy; a tree-like structure that clearly depicts how entities relate to one another. In such a structure, each entity is the 'child' of only one 'parent class'.
                          Class A
                             |
          -------------------------------------------
          |                                         |
     Sub-Class B                               Sub-Class C
          |                                         |
 ----------------------                  -----------------------
 |                    |                  |                     |
Sub-Sub-Class D  Sub-Sub-Class E   Sub-Sub-Class F     Sub-Sub-Class G
If we introduce 'part of' relationships to our ontology, we find that this simple and elegant tree structure quickly becomes complex and significantly more difficult to interpret manually.
It is not difficult to understand why; an entity that is described as 'part of' another entity might also be 'part of' a third entity. Consequently, entities may have more than one parent. The structure that emerges is known as a Directed Acyclic Graph.
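The difference between the two structures can be checked mechanically. In this sketch each concept maps to the set of its direct parents, whether reached through 'is a' or 'part of'; the concept names and both helper functions are hypothetical.

```python
# A taxonomy built only from 'is a' links: every concept has one parent.
TREE = {
    "ClassA": set(),
    "SubClassB": {"ClassA"},
    "SubClassC": {"ClassA"},
}

# Adding 'part of' links gives 'Wheel' two parents.
DAG = {
    "Vehicle": set(),
    "Car": {"Vehicle"},
    "Bicycle": {"Vehicle"},
    "Wheel": {"Car", "Bicycle"},
}


def is_single_parent(parents):
    """True if no concept has more than one direct parent."""
    return all(len(p) <= 1 for p in parents.values())


def is_acyclic(parents):
    """Depth-first walk toward ancestors; meeting a node that is
    already on the current path means the graph contains a cycle."""
    done, on_path = set(), set()

    def visit(node):
        if node in done:
            return True
        if node in on_path:
            return False
        on_path.add(node)
        ok = all(visit(p) for p in parents.get(node, ()))
        on_path.discard(node)
        if ok:
            done.add(node)
        return ok

    return all(visit(node) for node in parents)
```

TREE passes both checks, while DAG fails the single-parent check but remains acyclic, which is what distinguishes a directed acyclic graph from an arbitrary graph.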
Ontology languages
To be useful, ontologies must be expressed in a concrete notation. An ontology language is a formal language by which an ontology is built. There have been a number of data languages for ontologies, both proprietary and standards-based:
- The Cyc project has its own ontology language based on first-order predicate calculus, with some higher-order extensions, called CycL.
- KIF was created to serve as a syntax for first-order logic that is easy for computers to process.
- OWL is a language for making ontological statements, developed as a follow-on from RDF and RDFS, as well as earlier ontology language projects including OIL, DAML and DAML+OIL.
Examples
See also
- Semantic Web
- Core ontology
- Ontology (in the philosophical sense)
- Topic Map
- Frame language
- Frame problem
- Application Programming Interface
- Expert System
- Knowledge base
- First Order Logic
- Second Order Logic
- Heterogeneous Database System
- Taxonomy
- Classification
- Info about rules in ontologies: theorems and regulations
External links
- What is ontology? (http://ontologyworks.com/what_is_ontology.php)
- What are the differences between a vocabulary, a taxonomy, a thesaurus, an ontology, and a meta-model? (http://www.metamodel.com/article.php?story=20030115211223271)
Available ontologies
- OpenCyc (http://www.opencyc.org/)
- ResearchCyc (http://research.cyc.com/)
- Open Biomedical Ontologies (http://obo.sourceforge.net/)
- Suggested Upper Merged Ontology (http://www.ontologyportal.org/)
- DOLCE (Descriptive Ontology for Linguistic and Cognitive Engineering) (http://www.loa-cnr.it/DOLCE.html)