Chapter 6 Appendixes
6.2 Appendix 2: typical search terms per research question
Research question Related search terms
1. What does “property ranking” exactly entail (for an entity, such as an ontology class), and which methods and techniques are described in scientific literature with respect to ranking of properties?
“(entity | node | subject) property (ordering | ranking | salience | prominence | summarization | summarizing)”, …
Additional search terms are listed below to drive specific search focus for locating:
Semantic Web research:
Specific extra keywords: “Semantic Web”, “rdfs”, “rdf”, “ontology”, “owl”, “class(es)”, “taxonomy”.
Model Driven Architectures research:
Specific extra keywords: “mda”, “model”, “model-driven”, “model-based”.
XML schema processing research:
Specific extra keywords: “XML schema”, “XMLS”, “complex type”.
Database processing research:
Specific extra keywords: “Database”, “RDBMS”, “relational data”.
In addition to the above key search terms, various searches were performed to find relevant papers that deal with Semantic Web browsing and visualization. Key search terms here were “Semantic Web (browsing | browser | visualization | editor |
2. Which potential ranking techniques may prove useful to further research?
Mostly overlapping with the terms mentioned above and below (although there is obviously a different context of interpretation).
3. Can we improve on existing ranking algorithms, or alternatively, specify effective new ranking algorithms?
“ranking”, “ordering”, “prioritization”, “salience”, “prominence”, “terminology”, “terminological”, “terminological analysis”, “natural language processing”, “NLP”, “WordNet”, “framenet”, “corpus”, “hyperonymy”, “hyponymy”, “hypernym”, “hypernymy”, “semantic relation”, “meronymy” (and so forth); we also combined the above terms with keywords such as “entity”, “class”, “table” and related terms.
4. What is the validity of the ranking information extracted from Infobox templates for obtaining a target truth set to evaluate ranking algorithms in generic user interface applications?
“infobox”, “infobox template”, “ranking”, “ordering”, “truth set”, “baseline”, “ground truth”, “standard”, “golden standard”, “target”, “evaluation”, “data extraction”, “ranking extraction”, …
5. How do Infobox template language variants influence the evaluation of ranking algorithms?
“infobox”, “DBpedia”, “statistical (ordering | order | ranking | rank)”, “(ordering | ranking | etc.) metrics”, “(ordering | ranking | etc.) measures”, “cross language”, “cross cultural”, “language”, “cross lingual”, “variations”, …
6. What metrics are useful to assess an ontology property ranking algorithm in generic user interface scenarios?
“correlation”, “rank”, “information retrieval”, “rank metric”, “ranking metric”, “order metrics”, “rank measure”, “order measure”, “measuring ranks”, “rank kpi”, “rank performance”, “ordering metric”, “ordering measure”, “ordering kpi”
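The search terms above use a compact notation in which a parenthesised group such as (ordering | ranking) stands for a choice between alternatives. As an illustration only, the following Python sketch expands such a template into the individual query strings it represents. The function name expand_query and the shortened template are assumptions made for this example; they are not part of the original search procedure.

```python
import itertools
import re


def expand_query(template):
    """Expand every '(a | b | c)' group in a template into concrete queries."""
    # Split the template into literal text and parenthesised groups.
    parts = re.split(r"(\([^()]*\))", template)
    choices = []
    for part in parts:
        if part.startswith("(") and part.endswith(")"):
            # An alternation group: keep each non-empty alternative.
            options = [opt.strip() for opt in part[1:-1].split("|") if opt.strip()]
            choices.append(options)
        elif part.strip():
            # Literal text between groups is a single fixed choice.
            choices.append([part.strip()])
    # The Cartesian product of all choices yields every concrete query.
    return [" ".join(combo) for combo in itertools.product(*choices)]


# Shortened version of the template listed for research question 1.
queries = expand_query("(entity | node | subject) property (ordering | ranking)")
for query in queries:
    print(query)
```

With three alternatives in the first group and two in the second, the template stands for six distinct queries.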
Glossary
Alchemy API
A web-based (http://www.alchemyapi.com) multi-lingual NLP terminology processing engine with services accessible via an Application Programming Interface (API).
Attribute (object)
A named characteristic or property of an object.
Class
A class is a general specification or description for a set of objects and defines aspects such as common data structures and relationships.
DBpedia
A project that extracts structured content from Wikipedia and makes that data available as Linked Data. DBpedia is one of the largest Linked Data hubs on the Web with millions of characterized entities.
Holonym
A linguistic term for the name of the whole of which the meronym is a part. Formally, Y is a holonym of X if X is a part of Y. As an example, apple tree is a holonym of apple. See also meronym.
Hypernym
A linguistic term for a word whose meaning includes the meanings of other words. Hypernyms (also called superordinates) are thus general words. In formal notation, Y is a hypernym of X if X is a (kind of) Y. For example, dog is a hypernym of Collie and Chihuahua, which are more specific.
Hyponym
A linguistic term for subdivisions of more general words. Hyponyms designate members of a class. Formally speaking, X is a hyponym of Y if X is a (kind of) Y. As an example, a footstep is a kind of step, or, in more technical terms, footstep is a hyponym, or subtype, of step, and step is a hypernym, or supertype, of footstep.
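The hypernym and hyponym relations can be illustrated with a small Python sketch. The taxonomy below is a hypothetical toy example built from the words used in this glossary (the entry "animal" is an added assumption), and the function names are illustrative only; a real application would typically consult a lexical database such as WordNet.

```python
# Hypothetical toy taxonomy: each word maps to its direct hypernym.
# "animal" is an assumption added for illustration.
DIRECT_HYPERNYM = {
    "collie": "dog",
    "chihuahua": "dog",
    "dog": "animal",
    "footstep": "step",
}


def hypernyms(word):
    """Return all hypernyms of a word, nearest first, by walking up the chain."""
    chain = []
    while word in DIRECT_HYPERNYM:
        word = DIRECT_HYPERNYM[word]
        chain.append(word)
    return chain


def is_hyponym_of(x, y):
    """X is a hyponym of Y if X is a (kind of) Y, directly or transitively."""
    return y in hypernyms(x)


print(hypernyms("collie"))
print(is_hyponym_of("footstep", "step"))
```

The sketch treats the relation as transitive, so collie counts as a hyponym of both dog and animal.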
Individual
See instance.
Inferencing
Inferencing (or inference) is the act or process of deriving logical conclusions from premises known or assumed to be true. The logic within and between statements in an ontology is the basis for inferring new conclusions from it. An inference engine is also known as a reasoner. Inference commonly proceeds by forward chaining or backward chaining. See also reasoner.
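Forward chaining can be sketched in a few lines of Python. The example is a deliberately minimal illustration, not a description of any particular reasoner: facts are plain strings, each rule is a hypothetical (premises, conclusion) pair, and the facts about "Rex" are invented for the example.

```python
def forward_chain(facts, rules):
    """Derive all conclusions reachable from the given facts.

    rules is a list of (premises, conclusion) pairs, where premises is a
    set of facts that must all be known before the conclusion is added.
    """
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for premises, conclusion in rules:
            # Fire the rule if all premises hold and it adds something new.
            if premises <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


# Hypothetical rules and facts, for illustration only.
rules = [
    ({"Rex is a collie"}, "Rex is a dog"),
    ({"Rex is a dog"}, "Rex is an animal"),
]
derived = forward_chain({"Rex is a collie"}, rules)
print(sorted(derived))
```

The loop repeats until no rule adds a new fact, which is the defining behaviour of forward chaining; backward chaining would instead start from a goal and work back towards the known facts.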
Inheritance
Inheritance is a relationship between classes where one class is a parent of another and implements "is-a" relationships between classes.
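As a small illustration of the "is-a" relationship in an object-oriented language, the Python sketch below defines a parent class and a child class. The class names Dog and Collie are hypothetical and chosen only to match the examples used elsewhere in this glossary.

```python
class Dog:
    """Parent class: defines behaviour shared by all dogs."""

    def describe(self):
        return "a dog"


class Collie(Dog):
    """Child class: a Collie is-a Dog and inherits describe()."""


rex = Collie()
print(isinstance(rex, Dog))  # the instance is also a member of the parent class
print(rex.describe())        # method inherited from Dog
```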
Instance
Instances are basic “ground level” components of ontologies. An instance is an individual member of a class. Synonymous terms are entity, individual or member. The instances in ontologies may include real world objects such as people and planets as well as abstract instances such as numbers and words.
Linked data
Linked data is a set of best practices for publishing and deploying instance and class data using the RDF data model, and uses Uniform Resource Identifiers to name the data objects. The approach exposes the data for access via the HTTP protocol. There is great emphasis on data interconnections, interrelationships and context.
Member
See instance.
Meronym
A linguistic term for a word that denotes a constituent part or a member of something. Formally, X is a meronym of Y if X is a part of Y. For example, apple is a meronym of apple tree.
Natural language processing (NLP)
NLP is the process of a computer extracting meaningful information from natural language input and/or producing natural language output. NLP is one method for assigning structured data characterizations to text content for use in semantic technologies. (Hand assignment is another method.) Some of the specific NLP techniques and applications relevant to semantic technologies include automatic summarization, co-reference resolution, machine translation, named entity recognition, relationship extraction, topic segmentation and recognition, word segmentation, and word sense disambiguation, among others.
Object
An object is an abstraction or simulation of physical things such as people and machines or intangible things such as events and processes that captures their characteristics and behavior.
Ontology
A data model that defines the types, properties and interrelationships of the entities that really or fundamentally exist for a particular domain, in a manner that closely resembles that domain.
OWL
The Web Ontology Language (OWL) is a family of dialects designed for defining and instantiating formal Web ontologies. An OWL-based ontology may include descriptions of classes along with their related properties and instances.
Property (triples)
Properties are the ways in which classes and instances can be related to one another. They are also known as predicates. Properties are used to define attribute relations for instances. Properties are fundamentally relationships.
Property (object)
See attribute (object).
Resource Description Framework (RDF)
The Resource Description Framework (RDF) is a family of World Wide Web Consortium (W3C) specifications. RDF was originally designed as a metadata model, but it has come to be used as a general method of modeling information through a variety of syntax formats. The RDF metadata model is based upon the idea of making statements about resources in the form of subject-predicate-object expressions, called triples in RDF terminology. The subject denotes the resource, and the predicate denotes traits or aspects of the resource and expresses a relationship between the subject and the object.
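The subject-predicate-object structure can be illustrated with a plain Python sketch that does not rely on any RDF library. All URIs under http://example.org/ and the helper function match are hypothetical and serve only to show the shape of the data; they are not real vocabulary terms.

```python
from collections import namedtuple

# A triple is a (subject, predicate, object) statement about a resource.
Triple = namedtuple("Triple", ["subject", "predicate", "object"])

# Hypothetical namespace and data, for illustration only.
EX = "http://example.org/"
triples = [
    Triple(EX + "Rex", EX + "type", EX + "Collie"),
    Triple(EX + "Rex", EX + "name", "Rex"),
    Triple(EX + "Collie", EX + "subClassOf", EX + "Dog"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return the triples that fit a pattern; None acts as a wildcard."""
    return [
        t for t in triples
        if (subject is None or t.subject == subject)
        and (predicate is None or t.predicate == predicate)
        and (obj is None or t.object == obj)
    ]


# All statements made about the resource Rex.
about_rex = match(triples, subject=EX + "Rex")
for t in about_rex:
    print(t.predicate, t.object)
```

Listing the predicates attached to one subject in this way yields exactly the set of properties that a property ranking method would have to order.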
RDFS / RDF Schema
RDFS or RDF Schema is an extensible knowledge representation language providing basic elements for the description of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources.
Reasoner
A semantic reasoner, also known as a reasoning engine, rules engine or simply a reasoner, is a piece of software able to infer logical consequences from a set of asserted facts or axioms. Many reasoners use first-order predicate logic to perform reasoning. See also inferencing.
Semantic Web
The Semantic Web is a collaborative movement led by the World Wide Web Consortium (W3C) that promotes common formats for data on the World Wide Web. By encouraging the inclusion of semantic content in web pages, the Semantic Web aims at converting the current web of unstructured documents into a “web of data”. It builds on the W3C’s Resource Description Framework (RDF).
Semantic Web browser
A browser used for navigating the Semantic Web (data). The Semantic Web architecture does not involve HTML. Semantic Web browsers specialize in processing RDF data from Web servers. A Semantic Web browser renders information that it can find on the Semantic Web about a specific resource. The views may contain hyperlinks for users to navigate between the found resources. Semantic Web browsers are also known as hyperdata browsers.
Statement
A statement consists of a subject, a predicate and an object. Statements are also known as S-P-O assertions. Statements are by definition the “facts” (or axioms) within ontologies.
Subject
A subject is a reference (or definition) to a particular object, thing or topic, or groups of such items. Subjects are also often referred to as concepts or topics.
Superordinate
See hypernym.
Synset
A WordNet synonym set: a set of words that are interchangeable in some context without changing the truth value of the proposition in which they are embedded.
Terminology
The system of terms belonging or peculiar to a science, art, or specialized subject; nomenclature; the science of terms. Example usage: “the terminology of botany”.
Triple
A basic statement in the RDF language that comprises a subject, a property and an object, with the subject and property (and object optionally) referenced through a Uniform Resource Identifier (URI). See also statement.
WordNet
WordNet is a lexical database for the English language. It groups English words into sets of synonyms called “synsets”, provides short, general definitions, and records the various semantic relations between these synonym sets. The purpose is twofold: to produce a combination of dictionary and thesaurus that is more intuitively usable, and to support automatic text analysis and artificial intelligence applications. The database and software tools can be downloaded and used freely. Multiple language versions exist, and WordNet is a frequent reference structure for semantic applications.