Apache Jena to learn RDF and SPARQL
RDF is one of the semantic web technologies and the data model behind serializations such as Turtle, N-Triples, and JSON-LD. SPARQL is the query language for RDF. This post uses the Apache Jena tools to learn both.
For example, a web page is human readable because its end user is human. However, search engines also choose pages on behalf of the consumer, and a search engine is a machine that needs to read the web page's metadata. A web page should therefore carry well-structured data that search engines can understand through semantic parsing.
Some of the machine-readable metadata formats are:
- meta tag
- Microdata
- Microformats
- RDFa
- JSON-LD
It helps to know where the Resource Description Framework (RDF) [1] and SPARQL fit in the semantic web [2]. The semantic web is a Web of data. RDF provides the foundation for publishing and linking data, and OWL [3], SKOS, RDFS [4], and so on all build on it. If the semantic web is a global database, SPARQL [5] is its query language.
Introduction to RDF
For example, as explained in this article [6], I can write a very simple RDF document in Turtle (rdf_example1.ttl) as follows:
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix : <http://ojitha.github.io/blog/> .
:post dc:title "Learn SPARQL for RDF" .
To see the same data as RDF/XML, convert the Turtle file with the riot [7] tool available in Apache Jena [8]:
<JENA_HOME>/bin/riot --formatted=rdfxml rdf_example1.ttl
You get output similar to the following:
<rdf:RDF
xmlns="http://ojitha.github.io/blog/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://ojitha.github.io/blog/post">
<dc:title>Learn SPARQL for RDF</dc:title>
</rdf:Description>
</rdf:RDF>
Here is a simple SPARQL query (rdf_example1.rq) that finds the title of the blog post in the given graph:
SELECT ?title
WHERE
{
<http://ojitha.github.io/blog/post> <http://purl.org/dc/elements/1.1/title> ?title .
}
To execute this query on the RDF data, I used Apache Jena:
<JENA_HOME>/bin/sparql --data=rdf_example1.ttl --query=rdf_example1.rq
The output is:
--------------------------
| title |
==========================
| "Learn SPARQL for RDF" |
--------------------------
The RDF triple contains three components.
graph LR
s((Subject)) -- predicate --> o((Object))
As shown in the above diagram:
- Subject (s): RDF URI reference [9] or blank node
- Predicate (p): RDF URI reference
- Object (o): RDF URI reference, a literal or blank node
An RDF graph, like the one shown above, is a set of RDF triples.
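The idea of a graph as a set of triples can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the `Triple` type and the use of plain strings for IRIs and literals are hypothetical stand-ins, not part of Apache Jena or any RDF library.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """One RDF statement; plain strings stand in for IRIs and literals."""
    subject: str
    predicate: str
    obj: str  # named obj because "object" is a Python builtin


DC = "http://purl.org/dc/elements/1.1/"
BLOG = "http://ojitha.github.io/blog/"

title_statement = Triple(BLOG + "post", DC + "title", "Learn SPARQL for RDF")

# A graph is a set, so stating the same triple twice stores it only once.
graph = {
    title_statement,
    Triple(BLOG + "post", DC + "title", "Learn SPARQL for RDF"),
}

for triple in graph:
    print(triple.subject, triple.predicate, triple.obj)
```

Because the two statements are equal, the set holds a single triple, which mirrors how repeating a statement in a Turtle file does not change the graph.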
RDF is intended to represent metadata about web resources, such as the author and title of a web page. For example, Jekyll Front Matter [10] is meant to define such metadata for static web pages. However, a web resource is not only a web page, although I take one as the example here.
The RDF/XML above shows that:
- it is about http://ojitha.github.io/blog/post (say, post), which is a kind of description
- this post has a property called title whose value is a literal
We can say it in one statement:
📝 http://ojitha.github.io/blog/post has a title whose value is "Learn...."
To understand the RDF triple, execute the SPARQL query SELECT * WHERE { ?s ?p ?o } (saved in rdf_example1.rq in place of the earlier query) using the following command:
sparql --data=rdf_example1.ttl --query=rdf_example1.rq
The output shows the subject, predicate, and object clearly:
----------------------------------------------------------------------------------------------------------
| s | p | o |
==========================================================================================================
| <http://ojitha.github.io/blog/post> | <http://purl.org/dc/elements/1.1/title> | "Learn SPARQL for RDF" |
----------------------------------------------------------------------------------------------------------
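To get a feel for what a pattern such as `?s ?p ?o` does, here is a minimal Python sketch of triple pattern matching. It is not how Jena evaluates SPARQL; the `match` helper and the convention that `None` plays the role of a variable are assumptions made for illustration.

```python
# The single triple from rdf_example1.ttl, written out with full IRIs.
GRAPH = {
    (
        "http://ojitha.github.io/blog/post",
        "http://purl.org/dc/elements/1.1/title",
        "Learn SPARQL for RDF",
    ),
}


def match(graph, s=None, p=None, o=None):
    """Yield triples that fit the pattern; None behaves like a SPARQL variable."""
    for triple in graph:
        if all(want is None or want == got for want, got in zip((s, p, o), triple)):
            yield triple


# Like SELECT * WHERE { ?s ?p ?o }: every position is a variable.
all_rows = list(match(GRAPH))

# Like the title query: subject and predicate are fixed, the object is a variable.
titles = [
    o
    for _, _, o in match(
        GRAPH,
        s="http://ojitha.github.io/blog/post",
        p="http://purl.org/dc/elements/1.1/title",
    )
]

print(all_rows)
print(titles)
```

Fixing more positions narrows the result, and leaving all three open returns every triple in the graph.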
Now you can add another statement to the same data model:
📝 http://ojitha.github.io/blog/post has a creator whose value is "Ojitha"
For example:
<rdf:RDF
xmlns="http://ojitha.github.io/blog/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
<rdf:Description rdf:about="http://ojitha.github.io/blog/post">
<dc:creator>Ojitha</dc:creator>
<dc:title>Learn SPARQL for RDF</dc:title>
</rdf:Description>
</rdf:RDF>
Now I am using FOAF to define the owner of the ojitha.github.io web site.
@prefix dc: <http://purl.org/dc/elements/1.1/> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix : <http://ojitha.github.io/blog/> .
:post dc:title "Learn SPARQL for RDF" .
:post dc:creator :owner .
:owner foaf:given "Ojitha".
:owner foaf:family "Kumanayaka" .
When you run the riot --formatted=rdfxml rdf_example2.ttl command, the resulting RDF/XML is:
<rdf:RDF
xmlns="http://ojitha.github.io/blog/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:foaf="http://xmlns.com/foaf/0.1/">
<rdf:Description rdf:about="http://ojitha.github.io/blog/post">
<dc:creator>
<rdf:Description rdf:about="http://ojitha.github.io/blog/owner">
<foaf:family>Kumanayaka</foaf:family>
<foaf:given>Ojitha</foaf:given>
</rdf:Description>
</dc:creator>
<dc:title>Learn SPARQL for RDF</dc:title>
</rdf:Description>
</rdf:RDF>
You can depict this as:
Triples of the Data Model
Number | Subject | Predicate | Object |
---|---|---|---|
1 | http://ojitha.github.io/blog/post | http://purl.org/dc/elements/1.1/creator | http://ojitha.github.io/blog/owner |
2 | http://ojitha.github.io/blog/owner | http://xmlns.com/foaf/0.1/family | "Kumanayaka" |
3 | http://ojitha.github.io/blog/owner | http://xmlns.com/foaf/0.1/given | "Ojitha" |
4 | http://ojitha.github.io/blog/post | http://purl.org/dc/elements/1.1/title | "Learn SPARQL for RDF" |
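The table shows how the nested RDF/XML flattens into four triples, and how the post reaches the owner's name through two hops. A rough Python sketch of following that path is given below; the helper names `objects` and `creator_given_names` are hypothetical, and in SPARQL the same walk would be two triple patterns sharing a variable.

```python
DC = "http://purl.org/dc/elements/1.1/"
FOAF = "http://xmlns.com/foaf/0.1/"
BLOG = "http://ojitha.github.io/blog/"

# The four triples from the table above.
TRIPLES = [
    (BLOG + "post", DC + "creator", BLOG + "owner"),
    (BLOG + "owner", FOAF + "family", "Kumanayaka"),
    (BLOG + "owner", FOAF + "given", "Ojitha"),
    (BLOG + "post", DC + "title", "Learn SPARQL for RDF"),
]


def objects(subject, predicate):
    """Return every object stated for the given subject and predicate."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]


def creator_given_names(resource):
    """Follow resource -> dc:creator -> foaf:given, one hop at a time."""
    names = []
    for creator in objects(resource, DC + "creator"):
        names.extend(objects(creator, FOAF + "given"))
    return names


print(creator_given_names(BLOG + "post"))
```

The object of triple 1 is the subject of triples 2 and 3, which is what links the statements into a graph.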
RDF family
RDF has several serializations, as explained in the RDF Primer [11]:
- Turtle [12] (used above) and TriG
- RDF/XML (used above)
- JSON-LD
- RDFa
- N-Triples
The most fundamental concept is the Internationalized Resource Identifier (IRI), a global identifier that can be reused. An IRI can appear as the subject, predicate, or object of an RDF triple.
RDF consists of IRIs and literals (not only strings; there are other datatypes as well). The blank node is special: it can be used only as the subject or object of an RDF triple.
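These position rules can be written down as a small check. The sketch below is an assumption-laden illustration: the `IRI`, `Literal`, and `BlankNode` classes and the `is_valid_triple` function are hypothetical names, not an API from Jena or another library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    value: str


@dataclass(frozen=True)
class Literal:
    lexical: str
    # Plain string literals carry the xsd:string datatype in RDF 1.1.
    datatype: str = "http://www.w3.org/2001/XMLSchema#string"


@dataclass(frozen=True)
class BlankNode:
    label: str


def is_valid_triple(s, p, o) -> bool:
    """Check which kind of term sits in each position of a triple."""
    return (
        isinstance(s, (IRI, BlankNode))              # subject: IRI or blank node
        and isinstance(p, IRI)                        # predicate: IRI only
        and isinstance(o, (IRI, Literal, BlankNode))  # object: any term
    )


post = IRI("http://ojitha.github.io/blog/post")
title = IRI("http://purl.org/dc/elements/1.1/title")

print(is_valid_triple(post, title, Literal("Learn SPARQL for RDF")))
print(is_valid_triple(Literal("Learn SPARQL for RDF"), title, post))
```

The second call fails the check because a literal is not allowed in the subject position.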
RDF statements together create a multi-graph. To form a multi-graph, you use an RDF vocabulary, which is based on RDF Schema (RDF11-SCHEMA). For example, a Class is used to classify resources into categories. Another term is Property, which defines a relationship between resources of two classes. In addition, you can have sub-classes and sub-properties. For more information, see RDF Vocabularies [13]. There are a number of vocabularies, such as FOAF [14], Dublin Core [15], schema.org [16], and SKOS [17].
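To illustrate what sub-classes add, the following Python sketch derives the extra types of a resource from `rdfs:subClassOf` statements. It is a simplified stand-in for real RDFS reasoning. The statement that `:owner` is a `foaf:Person` is an assumption made for this example (FOAF itself declares `foaf:Person` as a sub-class of `foaf:Agent`), and the prefixed names are plain strings rather than expanded IRIs.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"

triples = {
    (":owner", RDF_TYPE, "foaf:Person"),          # assumed for this example
    ("foaf:Person", SUBCLASS_OF, "foaf:Agent"),   # declared by the FOAF vocabulary
}


def superclasses(cls, graph):
    """Collect every class reachable from cls through rdfs:subClassOf."""
    found = set()
    stack = [cls]
    while stack:
        current = stack.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS_OF and o not in found:
                found.add(o)
                stack.append(o)
    return found


def types_of(resource, graph):
    """Return the stated types of a resource plus the inherited ones."""
    direct = {o for s, p, o in graph if s == resource and p == RDF_TYPE}
    inferred = set(direct)
    for cls in direct:
        inferred |= superclasses(cls, graph)
    return inferred


print(sorted(types_of(":owner", triples)))
```

Only one type is stated for the resource, yet the vocabulary lets a consumer conclude that it is also an agent.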
Embedding Turtle in HTML
You can use the <script> tag to embed a Turtle document in an existing HTML page. For example, the contents of rdf_example3.ttl can be embedded as follows:
<script type="text/turtle">
@prefix : <http://ojitha.github.io/~ojitha/contact.rdf#> .
@prefix foaf: <http://xmlns.com/foaf/0.1/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
:oj a foaf:Person ;
foaf:givenname "Ojitha" ;
foaf:family_name "Kumanayaka" ;
foaf:homepage <http://ojitha.github.io/> ;
foaf:mbox <mailto:ojithak@gmail.com> .
</script>
Compare this with the complexity of the RDF/XML you get when you execute riot --output=rdfxml rdf_example3.ttl:
<rdf:RDF
xmlns="http://ojitha.github.io/~ojitha/contact.rdf#"
xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
xmlns:foaf="http://xmlns.com/foaf/0.1/" >
<rdf:Description rdf:about="http://ojitha.github.io/~ojitha/contact.rdf#oj">
<foaf:mbox rdf:resource="mailto:ojithak@gmail.com"/>
<foaf:homepage rdf:resource="http://ojitha.github.io/"/>
<foaf:family_name>Kumanayaka</foaf:family_name>
<foaf:givenname>Ojitha</foaf:givenname>
<rdf:type rdf:resource="http://xmlns.com/foaf/0.1/Person"/>
</rdf:Description>
</rdf:RDF>
Here is the JSON-LD you get when you execute the command riot --output=json-ld rdf_example3.ttl:
{
"@id" : "http://ojitha.github.io/~ojitha/contact.rdf#oj",
"@type" : "foaf:Person",
"family_name" : "Kumanayaka",
"givenname" : "Ojitha",
"homepage" : "http://ojitha.github.io/",
"mbox" : "mailto:ojithak@gmail.com",
"@context" : {
"mbox" : {
"@id" : "http://xmlns.com/foaf/0.1/mbox",
"@type" : "@id"
},
"homepage" : {
"@id" : "http://xmlns.com/foaf/0.1/homepage",
"@type" : "@id"
},
"family_name" : {
"@id" : "http://xmlns.com/foaf/0.1/family_name"
},
"givenname" : {
"@id" : "http://xmlns.com/foaf/0.1/givenname"
},
"@vocab" : "http://ojitha.github.io/~ojitha/contact.rdf#",
"rdf" : "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
"foaf" : "http://xmlns.com/foaf/0.1/"
}
}
You can use <script type="application/ld+json">...</script> to embed the above JSON-LD in an HTML web page.
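If the page is generated by a program, the wrapping can be automated. The sketch below uses only the Python standard library. The `jsonld_script` helper is a hypothetical name, and the small document with full IRIs as keys is an assumed, shortened version of the data above.

```python
import json


def jsonld_script(doc: dict) -> str:
    """Wrap a JSON-LD document in a script element for embedding in HTML."""
    # Escape "</" so a string value cannot close the script element early;
    # "\/" is a legal JSON escape, so the payload still parses to the same data.
    payload = json.dumps(doc, indent=2).replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


doc = {
    "@id": "http://ojitha.github.io/~ojitha/contact.rdf#oj",
    "@type": "http://xmlns.com/foaf/0.1/Person",
    "http://xmlns.com/foaf/0.1/givenname": "Ojitha",
}

snippet = jsonld_script(doc)
print(snippet)
```

The resulting string can be pasted into the head or body of the page.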
N-Triples is similar to this. You can use the [RDF Distiller][distiller] to generate it:
<http://ojitha.github.io/~ojitha/contact.rdf#oj> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .
<http://ojitha.github.io/~ojitha/contact.rdf#oj> <http://xmlns.com/foaf/0.1/family_name> "Kumanayaka" .
<http://ojitha.github.io/~ojitha/contact.rdf#oj> <http://xmlns.com/foaf/0.1/givenname> "Ojitha" .
<http://ojitha.github.io/~ojitha/contact.rdf#oj> <http://xmlns.com/foaf/0.1/mbox> <mailto:ojithak@gmail.com> .
<http://ojitha.github.io/~ojitha/contact.rdf#oj> <http://xmlns.com/foaf/0.1/homepage> <http://ojitha.github.io/> .
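The N-Triples format is simple enough that a rough serializer fits in a few lines of Python. This is a sketch under assumptions, not a complete implementation: the `iri`, `literal`, `nt_term`, and `to_ntriples` helpers are hypothetical, only plain string literals are handled, and blank nodes, language tags, and datatypes are left out.

```python
OJ = "http://ojitha.github.io/~ojitha/contact.rdf#oj"
FOAF = "http://xmlns.com/foaf/0.1/"
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def iri(value):
    return ("iri", value)


def literal(value):
    return ("literal", value)


def nt_term(term):
    """Render one term: IRIs in angle brackets, literals in double quotes."""
    kind, value = term
    if kind == "iri":
        return f"<{value}>"
    if kind == "literal":
        escaped = (
            value.replace("\\", "\\\\")
            .replace('"', '\\"')
            .replace("\n", "\\n")
            .replace("\r", "\\r")
        )
        return f'"{escaped}"'
    raise ValueError(f"unsupported term kind: {kind}")


def to_ntriples(triples):
    """One triple per line, terms separated by spaces, each line ended by ' .'."""
    return "\n".join(f"{nt_term(s)} {nt_term(p)} {nt_term(o)} ." for s, p, o in triples)


triples = [
    (iri(OJ), iri(RDF_TYPE), iri(FOAF + "Person")),
    (iri(OJ), iri(FOAF + "family_name"), literal("Kumanayaka")),
]

print(to_ntriples(triples))
```

Every line repeats the full subject IRI, which is why N-Triples is easy to parse line by line but more verbose than Turtle.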
You can create an RDFa HTML file as follows using the [RDF Distiller][distiller]:
<?xml version='1.0' encoding='utf-8' ?>
<!DOCTYPE html>
<html prefix='foaf: http://xmlns.com/foaf/0.1/ rdf: http://www.w3.org/1999/02/22-rdf-syntax-ns#'
xmlns='http://www.w3.org/1999/xhtml'>
<body>
<div resource='http://ojitha.github.io/~ojitha/contact.rdf#oj' typeof='foaf:Person'>
<span class='type'>foaf:Person</span>
<div class='property'>
<span class='label'>
foaf:family_name
</span>
<span property='foaf:family_name'>Kumanayaka</span>
</div>
<div class='property'>
<span class='label'>
foaf:givenname
</span>
<span property='foaf:givenname'>Ojitha</span>
</div>
<div class='property'>
<span class='label'>
foaf:homepage
</span>
<a href='http://ojitha.github.io/' property='foaf:homepage'>http://ojitha.github.io/</a>
</div>
<div class='property'>
<span class='label'>
foaf:mbox
</span>
<a href='mailto:ojithak@gmail.com' property='foaf:mbox'>mailto:ojithak@gmail.com</a>
</div>
</div>
</body>
</html>
As the listings above show, Turtle is the easiest of these formats to follow when writing RDF documents.
Reference:
[1] RDF, https://www.w3.org/RDF/
[2] Semantic Web, https://www.w3.org/standards/semanticweb/
[3] OWL, https://www.w3.org/OWL/
[9] RDF URI reference, https://www.w3.org/TR/2004/REC-rdf-concepts-20040210/#section-Graph-URIref
[10] Jekyll Front Matter, https://jekyllrb.com/docs/step-by-step/03-front-matter/
[11] RDF 1.1 Primer, http://www.w3.org/TR/rdf11-primer/
[12] Turtle, https://www.w3.org/TR/turtle/
[13] RDF Vocabularies, https://www.w3.org/TR/rdf11-primer/#section-vocabulary
[14] FOAF, http://xmlns.com/foaf/spec/
[15] Dublin Core, http://dublincore.org/documents/dcmi-terms/
[distiller]: http://rdf.greggkellogg.net/distiller?command=serialize