LINK EXPORT ¶
Work in progress
At this point, linksets and/or lenses have already been created and possibly validated. New questions now come to mind:
- Could these links be exported for external usage?
- In what formats are they available for export?
- Can metadata be exported in combination with the links?
- Can the linktype be modified prior to an export?
All these questions will be answered here.
1. Link Metadata Structure¶
By design, the Lenticular Lens requires of itself, and encourages its users, to make settings as explicit as possible, so as to document why (clarity) and how (reproducibility) a set of links has been generated. Link metadata in the tool is organised into Generic and Specific metadata.
Indeed, all links residing in the Lenticular Lens can be exported. However, prior to doing that, one has to choose whether the metadata is to be included or not. If it is to be included, further directives are needed as to whether it should include Generic and/or Specific metadata.
- Specific metadata is the annotation that applies to a single link.
- Generic metadata is the annotation that applies to a collection of links as a whole.
1.1 Specific Metadata¶
Example-1 presents, in Turtle format, a standard identity set of nine links in which equivalent resources are linked with the well-known linktype owl:sameAs. As shown in this example, the links are meant to reside in the default graph of a triplestore, as no named graph is explicitly associated with them.
Example 1: Unnamed identity-set.
hero:BlackWidow owl:sameAs person:Nat .
hero:BlackWidow owl:sameAs person:Romanoff .
hero:BlackWidow owl:sameAs person:Natalia-Romanova .
hero:BlackWidow owl:sameAs person:Natasha .
hero:Spiderman owl:sameAs person:Peter-parker .
hero:Spiderman owl:sameAs person:Tom-Holland .
hero:Superman owl:sameAs person:Clark-Kent .
hero:Superman owl:sameAs person:Joseph .
hero:Superman owl:sameAs person:Kal-El .
In Example-2, the triples of Example-1 are presented with specific annotations. With this type of annotation, one can, for example, specify the confidence strength of each of the nine triples. In this new example, eight of the nine links are annotated with a validation statement. The triple hero:Spiderman owl:sameAs person:Tom-Holland is the only one annotated as not valid, and its annotation is followed by the rationale supporting the rejection. Besides making it possible to state new facts about a link, the annotation can also be used to filter links: one can select, for example, the 8 validated triples or the 7 triples validated as “true”.
Example 2: Annotation of individual links.
<<hero:BlackWidow owl:sameAs person:Nat>> validation:status "True" .
<<hero:BlackWidow owl:sameAs person:Romanoff>> validation:status "True" .
<<hero:BlackWidow owl:sameAs person:Natalia-Romanova>>
validation:status "True" .
<<hero:BlackWidow owl:sameAs person:Natasha>> validation:status "True" .
<<hero:Spiderman owl:sameAs person:Peter-parker>> validation:status "True" .
<<hero:Spiderman owl:sameAs person:Tom-Holland>>
validation:status "False" ;
validation:Rational "Tom-Holland does not have Spiderman properties which Peter-parker (fictitious) a.k.a Spiderman has." .
<<hero:Superman owl:sameAs person:Clark-Kent>> validation:status "True" .
<<hero:Superman owl:sameAs person:Joseph>> validation:status "True" .
hero:Superman owl:sameAs person:Kal-El .
1.2 Generic Metadata¶
The set of links shown in Example-1 as triples populating the default graph can be referred to as ex:heroes-Identity in Example-3, as they are now grouped in this named graph. As illustrated in Example-4, the named-graph IRI can then be used to document any generic information deemed relevant, such as the source and target datasets, the aligning method…
In contrast to Example-1 and Example-2, Example-3 highlights that generic metadata requires a referent: the links must be grouped under an identifier that the annotation can point to. Conversely, there is no need to export the links in a named graph (.trig extension) when the user does not require metadata. A named graph nevertheless has a great advantage when all links need to be retrieved at once.
Example 3: Named identity-set.
ex:heroes-Identity
{
hero:BlackWidow owl:sameAs person:Nat .
hero:BlackWidow owl:sameAs person:Romanoff .
hero:BlackWidow owl:sameAs person:Natalia-Romanova .
hero:BlackWidow owl:sameAs person:Natasha .
hero:Spiderman owl:sameAs person:Peter-parker .
hero:Spiderman owl:sameAs person:Tom-Holland .
hero:Superman owl:sameAs person:Clark-Kent .
hero:Superman owl:sameAs person:Joseph .
hero:Superman owl:sameAs person:Kal-El .
}
Example 4: Annotation of a set of links in the Lenticular Lens.
- Generic Metadata. The linkset ex:heroes-Identity is presented here with its generic metadata. The generic metadata in the default graph (triples without a specific named graph) conveys information about the identity set ex:heroes-Identity. This information includes, among others, the type, subjects, license, description, format, number of triples (9), number of distinct entities (12), number of identity clusters (3)… As stated in the generic metadata of ex:heroes-Identity, the linkset contains 9 triples. Of these triples, the metadata informs us that 7 are validated as accepted, one is validated as rejected, and one remains to be validated.
ex:heroes-Identity
a void:Linkset ;
dcterms:description "Identifying Marvel's superheroes" ;
dcterms:license law:odc-public-domain-dedication-and-licence ;
dcterms:subject <http://example.org/resource/Person> ;
dcterms:subject <http://example.org/resource/Hero> ;
void:subjectsTarget dataset:Fictive-Persons ;
void:objectsTarget dataset:Superheroes ;
void:feature format:Turtle ;
void:linkPredicate owl:sameAs ;
void:triples 9 ;
void:entities 12 ;
void:distinctSubjects 3 ;
void:distinctObjects 9 ;
voidPlus:clusters 3 ;
validation:count 8 ;
validation:accepted 7 ;
validation:rejected 1 ;
validation:remains 1 .
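The counts reported in the generic metadata can be derived from the links themselves. The following is a minimal Python sketch, not the Lenticular Lens implementation: the function names, the tuple layout, and the use of a union-find structure to count identity clusters are all assumptions made for illustration. It also assumes that every link, including the rejected one, contributes to the entity and cluster counts, which is what reproduces the figures shown in Example 4.

```python
from collections import Counter

# Links of Example-1 with the validation status of Example-2.
# Layout (subject, object, status) is hypothetical; None means "not validated".
LINKS = [
    ("hero:BlackWidow", "person:Nat", "True"),
    ("hero:BlackWidow", "person:Romanoff", "True"),
    ("hero:BlackWidow", "person:Natalia-Romanova", "True"),
    ("hero:BlackWidow", "person:Natasha", "True"),
    ("hero:Spiderman", "person:Peter-parker", "True"),
    ("hero:Spiderman", "person:Tom-Holland", "False"),
    ("hero:Superman", "person:Clark-Kent", "True"),
    ("hero:Superman", "person:Joseph", "True"),
    ("hero:Superman", "person:Kal-El", None),
]


def find(parent, node):
    """Return the representative of the cluster containing node."""
    while parent[node] != node:
        parent[node] = parent[parent[node]]  # path halving
        node = parent[node]
    return node


def linkset_statistics(links):
    """Compute the counts used in the generic metadata of a linkset."""
    subjects = {s for s, _, _ in links}
    objects = {o for _, o, _ in links}
    entities = subjects | objects

    # Union-find: every link merges the clusters of its two resources.
    parent = {entity: entity for entity in entities}
    for s, o, _ in links:
        parent[find(parent, s)] = find(parent, o)
    clusters = {find(parent, entity) for entity in entities}

    statuses = Counter(status for _, _, status in links)
    accepted, rejected = statuses["True"], statuses["False"]
    return {
        "void:triples": len(links),
        "void:entities": len(entities),
        "void:distinctSubjects": len(subjects),
        "void:distinctObjects": len(objects),
        "voidPlus:clusters": len(clusters),
        "validation:count": accepted + rejected,
        "validation:accepted": accepted,
        "validation:rejected": rejected,
        "validation:remains": statuses[None],
    }


if __name__ == "__main__":
    for term, value in linkset_statistics(LINKS).items():
        print(term, value)
```

Applied to the nine links of the running example, the sketch yields 9 triples, 12 entities, 3 distinct subjects, 9 distinct objects, 3 clusters, and a validation breakdown of 8 validated (7 accepted, 1 rejected) with 1 remaining.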
1.3 Complete Metadata¶
Example-5 illustrates the complete structure of linksets and lenses in the Lenticular Lens. Here, by default, a linkset/lens is annotated with both generic and specific metadata.
Example 5: Complete metadata in RDF* Turtle syntax.
ex:heroes-Identity
a void:Linkset ;
dcterms:description "Identifying Marvel's superheroes" ;
dcterms:license law:odc-public-domain-dedication-and-licence ;
dcterms:subject <http://example.org/resource/Person> ;
dcterms:subject <http://example.org/resource/Hero> ;
void:subjectsTarget dataset:Fictive-Persons ;
void:objectsTarget dataset:Superheroes ;
void:feature format:Turtle ;
void:linkPredicate owl:sameAs ;
void:triples 9 ;
void:entities 12 ;
void:distinctSubjects 3 ;
void:distinctObjects 9 ;
voidPlus:clusters 3 ;
validation:count 8 ;
validation:accepted 7 ;
validation:rejected 1 ;
validation:remains 1 .
ex:heroes-Identity
{
<<hero:BlackWidow owl:sameAs person:Nat>> validation:status "True" .
<<hero:BlackWidow owl:sameAs person:Romanoff>> validation:status "True" .
<<hero:BlackWidow owl:sameAs person:Natalia-Romanova>>
validation:status "True" .
<<hero:BlackWidow owl:sameAs person:Natasha>> validation:status "True" .
<<hero:Spiderman owl:sameAs person:Peter-parker>> validation:status "True" .
<<hero:Spiderman owl:sameAs person:Tom-Holland>>
validation:status "False" ;
validation:Rational "Tom-Holland does not have Spiderman properties which Peter-parker (fictitious) a.k.a Spiderman has." .
<<hero:Superman owl:sameAs person:Clark-Kent>> validation:status "True" .
<<hero:Superman owl:sameAs person:Joseph>> validation:status "True" .
hero:Superman owl:sameAs person:Kal-El .
}
Separation of Concerns. The linkset and lens presentation enables a separation of concerns, in the sense that it provides the user with a number of options:
- Flat Representation: No reified triples, optionally within a named graph.
- Partial Representation: Choice of including exclusively either the generic or the specific metadata.
- Full Representation: As intended in the Lenticular Lens, the set of links comes along with its generic and specific metadata, if applicable.
2. RDF Link Reifications¶
The verb reify is defined in Lexico, the Oxford supported dictionary, as a way to make (something abstract) more concrete or real. In RDF, a data model that allows for the description of a resource (the subject position of a triple) in the form subject, predicate, object, reification offers a means to treat a triple as a resource in its own right, so that the triple itself can be described.
Several reification approaches exist: N-ary relations, RDF reification, RDFstar and Singleton properties. The next subsections briefly describe them.
2.1 N-ary relations¶
The [N-ary relations] syntax provides a means to model a non-binary relation, that is, a relation that holds among more than two objects, as an object itself.
Example 6: The Purchase relation as an N-ary relation.
:Purchase_1
a :Purchase ;
:has_buyer :John ;
:has_object :Lenny_The_Lion ;
:has_purpose :Birthday_Gift ;
:has_amount 15 ;
:has_seller :books.example.com .
2.2 Standard reification¶
RDF reification is the standard reification proposed by the W3C. As depicted in Example-7, it introduces a new resource of type rdf:Statement and three predicates to identify the rdf:subject, rdf:predicate, and rdf:object of the reified triple.
Example 7: Standard RDF reification.
### BlackWidow Triple.
hero:BlackWidow owl:sameAs person:Nat .
### Standard RDF reification of BlackWidow Triple
ex:reification-1 rdf:type rdf:Statement ;
rdf:subject hero:BlackWidow ;
rdf:predicate owl:sameAs ;
rdf:object person:Nat ;
validation:status "True" .
2.3 Singleton¶
The [Singleton properties] approach offers the latitude to make the predicate of the triple to be reified uniquely identifiable: the new property is declared an rdf:singletonPropertyOf a more generic property.
Example 8: Singleton.
### BlackWidow Triple.
hero:BlackWidow owl:sameAs person:Nat.
### Modification of BlackWidow Triple.
hero:BlackWidow singleton:sameAs-1 person:Nat.
### Singleton reification of BlackWidow Triple.
singleton:sameAs-1
rdf:type rdf:SingletonProperty ;
rdf:singletonPropertyOf owl:sameAs ;
validation:status "True" .
2.4 RDFstar¶
RDF* is a simple in-line reification syntax. Using it requires a triplestore that supports RDF* and SPARQL*.
Example 9: RDFstar / RDF*.
### BlackWidow Triple.
hero:BlackWidow owl:sameAs person:Nat .
### RDFstar reification BlackWidow Triple.
<<hero:BlackWidow owl:sameAs person:Nat>> validation:status "True" .
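To compare the three triple-level approaches of Sections 2.2 to 2.4 side by side, the sketch below renders the same annotated link in each syntax. It is a plain string-building illustration in Python, not the exporter used by the Lenticular Lens; the function names and the identifiers ex:reification-1 and singleton:sameAs-1 are taken from the examples above or chosen for illustration, and the output is not validated against a Turtle parser.

```python
def predicate_object_list(pairs):
    """Join (predicate, object) pairs into a Turtle predicate-object list."""
    return " ;\n    ".join(f"{predicate} {obj}" for predicate, obj in pairs) + " ."


def standard_reification(node, s, p, o, annotations):
    """Describe the triple through a new rdf:Statement resource (Section 2.2)."""
    pairs = [
        ("rdf:type", "rdf:Statement"),
        ("rdf:subject", s),
        ("rdf:predicate", p),
        ("rdf:object", o),
        *annotations,
    ]
    return f"{node}\n    " + predicate_object_list(pairs)


def singleton_property(singleton, s, p, o, annotations):
    """Replace the predicate by a unique property and annotate it (Section 2.3)."""
    pairs = [
        ("rdf:type", "rdf:SingletonProperty"),
        ("rdf:singletonPropertyOf", p),
        *annotations,
    ]
    return f"{s} {singleton} {o} .\n{singleton}\n    " + predicate_object_list(pairs)


def rdf_star(s, p, o, annotations):
    """Annotate the quoted triple directly (Section 2.4)."""
    return f"<<{s} {p} {o}>>\n    " + predicate_object_list(annotations)


if __name__ == "__main__":
    link = ("hero:BlackWidow", "owl:sameAs", "person:Nat")
    annotations = [("validation:status", '"True"')]
    print(standard_reification("ex:reification-1", *link, annotations))
    print(singleton_property("singleton:sameAs-1", *link, annotations))
    print(rdf_star(*link, annotations))
```

The comparison makes the trade-off visible: standard reification needs four extra statements per link before any annotation is added, the singleton approach needs two plus a rewritten link, and RDF* needs none.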
3. Export Formats¶
The Lenticular Lens offers four file formats for exporting a collection of links: CSV (Example-10), JSON-LD (Example-11), RDF Turtle (Example-12), and RDF* Turtle (Example-5).
3.1 CSV file format¶
In our attempt to export a linkset/lens with its complete metadata in a CSV format, the Lenticular Lens generates two CSV tables. As shown in Example-10, the first table illustrates the generic metadata of a linkset, while the second table illustrates links with specific metadata (annotated triples).
Example 10: Linkset in a CSV format.
Keep in mind that the tables below are a visual representation of the CSV tables.
--------------------
-- METADATA TABLE --
--------------------
---------------------------------------------------------------------------------------------------------
| Vocabulary Value |
---------------------------------------------------------------------------------------------------------
http://www.w3.org/ns/sparql-service-description#namedGraph http://example.org/#heroesIdentity
http://www.w3.org/1999/02/22-rdf-syntax-ns#type http://rdfs.org/ns/void#Linkset
http://purl.org/dc/terms/description Identifying Marvel's superheroes
http://purl.org/dc/terms/license law:odc-public-domain-dedication-and-licence
http://purl.org/dc/terms/subject http://example.org/resource/Person
http://purl.org/dc/terms/subject http://example.org/resource/Hero
http://rdfs.org/ns/void#subjectsTarget dataset:Fictive-Persons
http://rdfs.org/ns/void#objectsTarget dataset:Superheroes
http://rdfs.org/ns/void#feature format:Turtle
http://rdfs.org/ns/void#linkPredicate owl:sameAs
http://rdfs.org/ns/void#triples 9
http://rdfs.org/ns/void#entities 12
http://rdfs.org/ns/void#distinctSubjects 3
http://rdfs.org/ns/void#distinctObjects 9
http://vocabulary/voidPlus#clusters 3
http://vocabulary/validation#count 8
http://vocabulary/validation#accepted 7
http://vocabulary/validation#rejected 1
http://vocabulary/validation#remains 1
------------------------
-- LINKSET/LENS TABLE --
------------------------
------------------------------------------------------------------------------------------------------------------------------------------------
| NamedGraph Source Target ValStatus ValRational |
------------------------------------------------------------------------------------------------------------------------------------------------
http://example.org/#heroesIdentity http://example.org/hero#BlackWidow http://example.org/person#Nat True
http://example.org/#heroesIdentity http://example.org/hero#BlackWidow http://example.org/person#Romanoff True
http://example.org/#heroesIdentity http://example.org/hero#BlackWidow http://example.org/person#Natalia-Romanova True
http://example.org/#heroesIdentity http://example.org/hero#BlackWidow http://example.org/person#Natasha True
http://example.org/#heroesIdentity http://example.org/hero#Spiderman http://example.org/person#Peter-parker True
http://example.org/#heroesIdentity http://example.org/hero#Spiderman http://example.org/person#Tom-Holland False Tom-Holland does not have Spiderman properties which Peter-parker (fictitious) a.k.a Spiderman has.
http://example.org/#heroesIdentity http://example.org/hero#Superman http://example.org/person#Clark-Kent True
http://example.org/#heroesIdentity http://example.org/hero#Superman http://example.org/person#Joseph True
http://example.org/#heroesIdentity http://example.org/hero#Superman http://example.org/person#Kal-El
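As a rough illustration of the two-table layout, the sketch below writes a metadata table and a links table with Python's standard csv module. The column names are those shown in the tables above; the comma delimiter, the function names, and the tuple layout of the input are assumptions made for this example and may differ from what the Lenticular Lens actually produces. Only two links and three metadata rows are included to keep the sketch short.

```python
import csv
import io

METADATA_HEADER = ["Vocabulary", "Value"]
LINKS_HEADER = ["NamedGraph", "Source", "Target", "ValStatus", "ValRational"]

GRAPH = "http://example.org/#heroesIdentity"

# A few rows of the generic metadata (first table).
METADATA = [
    ("http://www.w3.org/1999/02/22-rdf-syntax-ns#type", "http://rdfs.org/ns/void#Linkset"),
    ("http://purl.org/dc/terms/description", "Identifying Marvel's superheroes"),
    ("http://rdfs.org/ns/void#triples", 9),
]

# Two annotated links (second table): (source, target, status, rationale).
LINKS = [
    ("http://example.org/hero#BlackWidow", "http://example.org/person#Nat", "True", None),
    (
        "http://example.org/hero#Spiderman",
        "http://example.org/person#Tom-Holland",
        "False",
        "Tom-Holland does not have Spiderman properties which "
        "Peter-parker (fictitious) a.k.a Spiderman has.",
    ),
]


def metadata_table(metadata):
    """Serialise (vocabulary term, value) pairs as the metadata CSV table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(METADATA_HEADER)
    writer.writerows(metadata)
    return buffer.getvalue()


def links_table(named_graph, links):
    """Serialise annotated links as the linkset/lens CSV table."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(LINKS_HEADER)
    for source, target, status, rationale in links:
        # Missing annotations become empty cells.
        writer.writerow([named_graph, source, target, status or "", rationale or ""])
    return buffer.getvalue()


if __name__ == "__main__":
    print(metadata_table(METADATA))
    print(links_table(GRAPH, LINKS))
```

Repeating the named graph in every row of the links table keeps each row self-describing, so rows from several linksets can be concatenated without losing track of where a link came from.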
3.2 JSON-LD file format¶
JSON-LD OUTPUT. TODO....
Example 11: Linkset in a JSON-LD format.
{
"@context":
{
"name": "http://xmlns.com/foaf/0.1/name",
"homepage":
{
"@id": "http://xmlns.com/foaf/0.1/workplaceHomepage",
"@type": "@id"
},
"Person": "http://xmlns.com/foaf/0.1/Person"
},
"@id": "https://me.example.com",
"@type": "Person",
"name": "John Smith",
"homepage": "https://www.example.com/"
}
3.3 RDF file formats¶
The right RDF file format in which to export a collection of links depends on a number of parameters. Depending on whether the collection of links is unnamed, or named and/or annotated, the export output is in Turtle or TriG respectively. A TriG file implies a named graph, but it says little about the reification approach taken for annotating individual triples or the collection as a whole. When links are annotated, one can choose between standard RDF reification, singleton properties and RDFstar.
Among these options, our choice is to always have a named graph reified using the RDFstar syntax when RDF* and SPARQL* are in place. If not, our second option is singletons.
Example 12: Linkset in an RDF Turtle format using singletons.
Our preference goes to an RDF* Turtle output, as illustrated in Example-5. However, because RDF* is not deployed on all triple stores, we provide the singleton and standard RDF reification representations as alternatives. An example using singletons is shown here.
ex:heroes-Identity
a void:Linkset ;
dcterms:description "Identifying Marvel's heroes" ;
dcterms:license law:odc-public-domain-dedication-and-licence ;
dcterms:subject <http://example.org/resource/Person> ;
dcterms:subject <http://example.org/resource/Hero> ;
void:subjectsTarget dataset:Fictive-Persons ;
void:objectsTarget dataset:Superheroes ;
void:feature format:Turtle ;
void:linkPredicate owl:sameAs ;
void:triples 9 ;
void:entities 12 ;
void:distinctSubjects 3 ;
void:distinctObjects 9 ;
voidPlus:clusters 3 ;
validation:count 8 ;
validation:accepted 7 ;
validation:rejected 1 ;
validation:remains 1 .
ex:heroes-Identity
{
hero:BlackWidow singleton:sameAs-1 person:Nat .
hero:BlackWidow singleton:sameAs-2 person:Romanoff .
hero:BlackWidow singleton:sameAs-3 person:Natalia-Romanova .
hero:BlackWidow singleton:sameAs-4 person:Natasha .
hero:Spiderman singleton:sameAs-5 person:Peter-parker .
hero:Spiderman singleton:sameAs-6 person:Tom-Holland .
hero:Superman singleton:sameAs-7 person:Clark-Kent .
hero:Superman singleton:sameAs-8 person:Joseph .
hero:Superman singleton:sameAs-9 person:Kal-El .
}
ex:heroes-Identity-singletons
{
singleton:sameAs-1
rdfs:subPropertyOf owl:sameAs ;
validation:status "True" .
singleton:sameAs-2
rdfs:subPropertyOf owl:sameAs ;
validation:status "True" .
singleton:sameAs-3
rdfs:subPropertyOf owl:sameAs ;
validation:status "True" .
singleton:sameAs-4
rdfs:subPropertyOf owl:sameAs ;
validation:status "True" .
singleton:sameAs-5
rdfs:subPropertyOf owl:sameAs ;
validation:status "True" .
singleton:sameAs-6
rdfs:subPropertyOf owl:sameAs ;
validation:status "False" ;
validation:Rational "Tom-Holland does not have Spiderman properties which Peter-parker (fictitious) a.k.a Spiderman has." .
singleton:sameAs-7
rdfs:subPropertyOf owl:sameAs ;
validation:status "True" .
singleton:sameAs-8
rdfs:subPropertyOf owl:sameAs ;
validation:status "True" .
### Not validated: no validation:status annotation.
singleton:sameAs-9
rdfs:subPropertyOf owl:sameAs .
}
4. Link Restrictions¶
Whenever links are annotated, the Lenticular Lens offers a way to take advantage of the annotations. It provides the user with options to filter out links of no interest, which avoids the inconvenience of downloading all links when they are not all needed. Example-13 lists all possible link restriction options. For example, to make sure that only accepted links are exported, one chooses the option Accepted.
Example 13: Link Restrictions Options.
- All : Export all links
- Accepted : Export only accepted links.
- Rejected : Export only rejected links.
- Validated : Export accepted or rejected links.
- Not Validated : Export links with no accepted or rejected annotation.
- Threshold : Export links that pass a predefined threshold condition.
This feature applies to links annotated with confidence values.
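The restriction options listed above can be read as predicates over the validation annotation of each link. The sketch below is one possible Python reading of them, under several assumptions that the text does not settle: links are represented as dictionaries with a `status` key ("True", "False" or None, as in Example-2) and an optional `confidence` key, the function name `restrict` is hypothetical, and the threshold condition is taken to mean "confidence greater than or equal to the threshold".

```python
# Predicates for the annotation-based options; "True" is read as accepted
# and "False" as rejected, following the validation:status values of Example-2.
RESTRICTIONS = {
    "All": lambda link: True,
    "Accepted": lambda link: link["status"] == "True",
    "Rejected": lambda link: link["status"] == "False",
    "Validated": lambda link: link["status"] in ("True", "False"),
    "Not Validated": lambda link: link["status"] is None,
}


def restrict(links, option="All", threshold=None):
    """Return the links that satisfy the chosen restriction option."""
    if option == "Threshold":
        if threshold is None:
            raise ValueError("the Threshold option needs a threshold value")
        # Assumption: a link passes when its confidence reaches the threshold;
        # links without a confidence value are left out.
        return [
            link for link in links
            if link.get("confidence") is not None and link["confidence"] >= threshold
        ]
    if option not in RESTRICTIONS:
        raise ValueError(f"unknown restriction option: {option}")
    return [link for link in links if RESTRICTIONS[option](link)]


# The nine links of the running example, reduced to target and status.
STATUSES = [
    ("person:Nat", "True"), ("person:Romanoff", "True"),
    ("person:Natalia-Romanova", "True"), ("person:Natasha", "True"),
    ("person:Peter-parker", "True"), ("person:Tom-Holland", "False"),
    ("person:Clark-Kent", "True"), ("person:Joseph", "True"),
    ("person:Kal-El", None),
]
LINKS = [{"target": target, "status": status} for target, status in STATUSES]

if __name__ == "__main__":
    for option in RESTRICTIONS:
        print(option, len(restrict(LINKS, option)))
```

On the running example, Accepted keeps 7 links, Rejected 1, Validated 8, Not Validated 1, and All 9, which matches the validation counts in the generic metadata.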
5. Default Export Table¶
Example-14 summarises the various options for exporting a linkset or lens. In each selection category, a default option is preselected. This means that, by default, an exported linkset or lens comes along with all its links and its complete annotation in a CSV format.
Example 14: Export Table.
| METADATA STRUCTURE | RDF REIFICATION SYNTAX | LINK EXPORT FORMAT | LINK RESTRICTIONS |
|---|---|---|---|
| Partial:Generic | Standard RDF reification | RDF Turtle-trig | All |
| Partial:Specific | RDFstar | RDF Turtle | Accepted |
| | Singleton | JSON-LD | Rejected |
| | | CSV | Validated |
| | | | Not Validated |
| | | | Threshold |