TERMINOLOGY ¶
Before diving into how to use the Lenticular Lens, we define a number of terms we believe to be important for a better comprehension of what is offered by the tool. For some of the concepts defined here, we opt for a broader scope which fits best problems encounter in every day life .
RDF¶
As described by the W3C, RDF is a graph-based data model for interchanging data on the Web and it stands for Resource Description Framework. In other words, Lexico, the Oxford supported dictionary, emphasises on the semantic aspect by defining it as a model for encoding semantic relationships between items of data so that these relationships can be interpreted computationally. Expressing a relationship between a pair of RDF resources is done as a sequence of three terms .
The sequence of terms is called a triple where each of its terms is separated by whitespace and the sequence is terminated by ’.’. The example below illustrates two simple triple syntax. The triples, not only do they provid Spiderman’s mane and assert that Spiderman and Peter Parker are the same, but they also provid identification for
Spiderman,
Peter Parker and the vocabulary terms
name and
sameAs .
In RDF, any triple is an RDF STATEMENT. There exist various syntaxes for representing triple statements. For further reading, see
N-Triples, Notation3, N-Quads, Turtle, RDF XML, RDFa, JSON-LD…
Example 1: Triples in N-TRIPLE syntax
<http://example.org/hero#Spiderman> <http://xmlns.com/foaf/0.1/name> "Peter-Parker" .
<http://example.org/hero#Spiderman> <http://www.w3.org/2002/07/owl#sameAs> <http://example.org/person#Peter-Parker> .
Example 2: Triples in Turtle syntax
@prefix ex: <http://example.org/#> .
@prefix rel: <http://www.perceive.net/schemas/relationship/> .
ex:green-goblin rel:enemyOf ex:Spiderman .
ex:spiderman foaf:name ex:Peter-Parker.
Resource¶
“Anything can be a resource, including physical things, documents, abstract concepts, numbers and strings.” RDF 1.1
Referents¶
IRI - URI - URL - URN constitute ways in which things (resources) in the real world can be referred to in the digital world. “The resource denoted by an IRI is called its referent, and the resource denoted by a literal is called its literal value.” RDF 1.1. In the IRI example provided below, we illustrate six IRIs referents of resources of various types (hero, villain and vocabulary).
Example 3: IRIs
A resource's REFERENT is an IRI.
A resource's LITERAL VALUE is a LITERAL.
http://example.org/villain#Green-Goblin
http://example.org/villain#Thanos
http://example.org/hero#Spiderman
http://example.org/person#Peter-Parker
http://www.perceive.net/schemas/relationship/enemyOf
http://xmlns.com/foaf/0.1/name
Fusion provides nice wording of all these terms as provided below. In short, URL, URN and URI are all specific types of an IRI.
URI¶
“A Uniform Resource Identifier is a compact sequence of characters that identifies an abstract or physical resource. The set of characters is limited to US-ASCII excluding some reserved characters ( ; / ? : @ = & ). Characters outside the set of allowed characters can be represented using Percent-Encoding. A URI can be used as a locator, a name, or both. If a URI is a locator, it describes a resource’s primary access mechanism. If a URI is a name, it identifies a resource by giving it a unique name. The exact specifications of syntax and semantics of a URI depend on the used Scheme that is defined by the characters before the first colon. [RFC3986]”
URN¶
“A Uniform Resource Name is a URI in the scheme urn intended to serve as persistent, location-independent, resource identifier. Historically, the term also referred to any URI. [RFC3986] A URN consists of a Namespace Identifier (NID) and a Namespace Specific String (NSS): urn:
URL¶
“A Uniform Resource Locator is a URI that, in addition to identifying a resource, provides a means of locating the resource by describing its primary access mechanism [RFC3986]. As there is no exact definition of URL by means of a set of Schemes, URL is a useful but informal concept, usually referring to a subset of URIs that do not contain URNs [RFC3305].”
IRI¶
“An Internationalized Resource Identifier is defined similarly to a URI, but the character set is extended to the Universal Coded Character Set. Therefore, it can contain any Latin and non Latin characters except the reserved characters. Instead of extending the definition of URI, the term IRI was introduced to allow for a clear distinction and avoid incompatibilities. IRIs are meant to replace URIs in identifying resources in situations where the Universal Coded Character Set is supported. By definition, every URI is an IRI. Furthermore, there is a defined surjective mapping of IRIs to URIs: Every IRI can be mapped to exactly one URI, but different IRIs might map to the same URI. Therefore, the conversion back from a URI to an IRI may not produce the original IRI. [RFC3987]”
RDF Link¶
It is also known as a correspondent triple [Euzenat2013] or simply link in RDF and [VoID]. Links are triples in the form of and are meant to exist only between two datasets. However, beside the incentive for providers to link their data to existing datasets, we do not see a good reason for a link not to exist within the same dataset. Thereby, in this document, an RDF link is a relation between two resources (digital representations of entities presented as IRIs) regardless of the datasource they stem from. This means that a link can involve a minimum of one dataset and a maximum of two.
Identity Link¶
An equality relation between two resources (digital representations of the same real world entity presented as URIs) regardless of the datasource they stem from. In this document we often use the term link to denote identity link unless otherwise stated. Mainly described using the predicate owl:sameAs, such encoded semantic relationship entails full equality between two resources. This semantic interpretation applies independently of context even though in real life problems, the equality between resources may depend not only on their intrinsic properties but also on the purpose or task for which they are used. In this regard, we present in the Export menu section the Lenticular Lens approach on the use and documentation of identity links in practice.
Identity Link Network¶
A set of co-referent entities regardless of the data they stem from. This can also be viewed as an Identity Cluster or Identity Network of co-referent entities.
Link Validation¶
The process of accepting or rejecting a link. This process is documented with the validation status (accepted or rejected) of the link and the supporting reasons.
Named-graph¶
According to Wikipedia, named graphs are a key concept of Semantic Web architecture in which a set of Resource Description Framework statements (a graph) are identified using a IRI, allowing descriptions to be made of that set of statements such as context, provenance information or other such metadata.
Example 4: Named-graph
The example below illustrates a turtle named graph syntax of a linkset of three links using the owl:sameAs linktype. In the example, http://example.org/#linkset-1 is the IRI of the named graph. For more detail on Turtle RDF syntax, See https://www.w3.org/TR/turtle/#simple-triples. In this page, EXAMPLE 9 illustrates different ways of writing triples in Turtle.
@base <http://example.org/> .
@prefix ex: <http://example.org/#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix sim: <http://purl.org/ontology/similarity/> .
ex:linkset-1
{
ex:Chiara owl:sameAs ex:Latronico .
ex:Al owl:sameAs ex:Al_Idrissou .
ex:Al owl:sameAs ex:Al_Koudous .
}
Default Graph¶
In a triple-store jargon, any triple without a specific named-graph ends up in the default graph. In the example below, the triple at line 4 is located in the default graph while triples at line 8 and 9 are in an explicitly stated graph, in the named-graph: ex:linkset-1.
Example 5: Default Named-graph
@prefix ex: <http://example.org/#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
ex:Chiara owl:sameAs ex:Latronico .
ex:graph-1
{
ex:Al owl:sameAs ex:Al_Idrissou .
ex:Al owl:sameAs ex:Al_Koudous .
}
Linkset¶
In [VoID], a linkset is a collection of RDF links between two datasets where all subjects stem from one dataset and all objects from the other dataset. In here, we alleviate such strict restriction on the number of linked datasets to be one, two or more. Therefore, here a linkset is a collection of RDF links between dataset(s), using the same link predicate (regardless of the link being an equality predicate or not). This predicate is called the linktype of the linkset.
Lens¶
Content-wise, a lens is a type of RDF linkset (in the sense that it is a collection of links sharing the same linktype) also involving one, two or more datasets. However, context-wise, it differs from a linkset as it is generated using different process (set-like link manipulation operators such as Union, Intersection, Difference or Transitivity) as compared to how a linkset comes about (matching algorithms).
Linkset / Lens Specification¶
In the Lenticular Lens, a Linkset / Lens specification can be viewed as a reproducibility metadata. It provides information on how set of links came to be. In other worlds, it explicitly defines the context in which a link hold and supports decisions making during the validation process (accepting or rejecting a link).
Matching Algorithm¶
A set of rules followed by a computer for finding pairs of matching resources. An algorithm can be further categorised as automated or semi-automated depending on whether it requires user assistance or not. Most matching algorithms employed in the Lenticular Lens are semi-automated.
Fuzzy Logic¶
From the family of Many-Valued Logics, Fuzzy Logic is a computing approach based on degrees of Truth stemmed form the unit [0, 1] truth space. Modelling reasoning with vague propositions is often addressed with Fuzzy Logic. A proposition, statement or event is said to be vague (the tomato is ripe) when it is hard or sometimes impossible to establish whether the proposition is completely true or false due to the involvement of vague concepts (ripe), thereby begging for the use of degrees of truth as oppose to binary truth values {0, 1}. “For example, we cannot exactly say whether a tomato is ripe or not, but rather can only say that the tomato is ripe to some degree” [Lukasiewicz2008].
Degrees of Truth is not to be confused with the Degrees of Certainty / Confidence as the latter is not an assignment of truth value but rather an evaluation of a weighted proposition (proposition with an assigned truth value) regardless of its truth value space ({0, 1} or [0, 1]).
Turtle Prefixes¶
Example 6: Prefixes used in this manual.
Below are the prefixes used in this document
@prefix A: <http://example.org/A#> .
@prefix B: <http://example.org/B#> .
@prefix C: <http://example.org/C#> .
@prefix dataset: <http://example.org/dataset#> .
@prefix dcterms: <http://purl.org/dc/terms/> .
@prefix ex: <http://example.org/#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix format: <http://www.w3.org/ns/formats/> .
@prefix grid: <http://www.grid.ac/ontology/> .
@prefix gridC: <http://vivoweb.org/ontology/core#> .
@prefix hero: <http://example.org/hero#> .
@prefix institutes: <http://www.grid.ac/institutes/> .
@prefix law: <http://www.opendatacommons.org/>
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix person: <http://example.org/person#> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#>
@prefix rel: <http://www.perceive.net/schemas/relationship/> .
@prefix sim: <http://purl.org/ontology/similarity/> .
@prefix singleton: <http://lenticularlens.org/singleton#>
@prefix void: <http://rdfs.org/ns/void#> .
@prefix voidPlus: <http://lenticularlens.org/voidPlus#> .
@prefix validation: <http://vocabulary/validation#> .
@prefix wiki: <http://en.wikipedia.org/wiki/>
@prefix wikidata: <http://www.wikidata.org/entity/> .