<div xmlns="http://www.w3.org/1999/xhtml" data-template="templates:surround" data-template-with="templates/page.html" data-template-at="content">
<div class="row-fluid">
<div class="span16">
<div class="page-header">
<h1 data-template="config:app-title">Generated page</h1>
<p>A generic webservice to extract RDF statements from XML resources.</p>
</div>
<div class="row-fluid">
<div class="span8">
<h2>Documentation</h2>
<h3 id="background">Background</h3>
<p>
The extraction of RDF statements from XML data can be a complicated process involving a lot of XSLT magic.
Transformations are always specific to the XML data in question and cannot easily be applied to other XML repositories.
It gets even more difficult if the markup of the XML data in question cannot be influenced.
</p>
<p>
However, the underlying principle of creating RDF statements out of XML data is simple.
</p>
<ul>
<li>If we take an XML resource or a value in it as the thing we want to talk about, this is our <strong>subject</strong>.</li>
<li>We want to join this subject to "something else" using a <strong>predicate</strong> from a controlled vocabulary.</li>
<li>This "something else" is our <strong>object</strong>, consisting either of the XML resource itself, some value in it or some other resource on the semantic web.</li>
</ul>
<p>
The XTriples webservice makes it possible to create such statements out of any XML data based on a simple XML configuration.
The configuration defines the XML resources to crawl, the vocabularies to use
and a number of <strong>XPATH/XQuery based statement patterns</strong> to apply to the data.
</p>
<p>
The webservice accepts POST or GET requests at <code>https://xtriples.lod.academy/extract.xql</code>.
</p>
<h3 id="Request">Requests</h3>
<h4>POST requests</h4>
<p>You can submit POST requests to <code>https://xtriples.lod.academy/extract.xql</code>
</p>
<p>The request body should contain your XTriples configuration. Additionally, you need to send the <code>Content-Type</code> HTTP header with a value of <code>application/xml</code> and the <code>format</code> HTTP header with one of the following values:</p>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>value</th>
<th>result</th>
</tr>
</thead>
<tbody>
<tr>
<td>rdf</td>
<td>returns extraction result as RDF</td>
</tr>
<tr>
<td>turtle</td>
<td>returns extraction result in Turtle notation</td>
</tr>
<tr>
<td>ntriples</td>
<td>returns extraction result as N-Triples</td>
</tr>
<tr>
<td>nquads</td>
<td>returns extraction result as N-Quads</td>
</tr>
<tr>
<td>trix</td>
<td>returns extraction result as TriX named graph</td>
</tr>
<tr>
<td>json</td>
<td>returns extraction result as JSON-LD</td>
</tr>
<tr>
<td>svg</td>
<td>returns extraction result as SVG Graph</td>
</tr>
<tr>
<td>xtriples</td>
<td>returns extraction result as XTriples XML for debugging purposes</td>
</tr>
</tbody>
</table>
<p>If you send no format header, the format defaults to rdf.</p>
<h4>GET requests</h4>
<p>The most compact way to use the webservice is via HTTP GET requests. This is the URL scheme:</p>
<p>
<code>https://xtriples.lod.academy/extract.xql?configuration=###YOUR_URI###&format=###FORMAT_KEYWORD###</code>
</p>
<p>The keywords for the format parameter are the same as for direct POST requests (see above).</p>
<h3 id="basic-configuration">Basic configuration</h3>
<p>
A configuration consists of a simple XML structure that tells the webservice which XML collections to crawl and which statements to extract from each resource of a collection. It has three relevant
areas.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        <vocabularies>
            [1]
        </vocabularies>
        <triples>
            <statement>[...]</statement>
            <statement>[...]</statement>
            <statement>[...]</statement>
            [2]
        </triples>
    </configuration>
    <collection>
        [3]
    </collection>
</xtriples></pre>
<p>
Below <code><collection></code> you configure the XML data that should be processed by the service. Below <code><vocabularies></code> you can configure the RDF vocabularies you would
like to use. Below <code><triples></code> you configure one or more <code><statement></code> patterns that contain your <em>subjects</em>, <em>predicates</em> and <em>objects</em>.
</p>
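<p>
As a rough sketch of how the three areas fit together, the following configuration declares three vocabularies, one statement pattern and one literal resource. The <code>example</code> prefix, the <code>http://example.org/gods/</code> namespace and the <code><god></code> markup are made up for illustration; they are assumptions, not part of the service.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        <vocabularies>
            <!-- [1] hypothetical namespace for the subject URIs -->
            <vocabulary prefix="example" uri="http://example.org/gods/"/>
            <vocabulary prefix="rdf" uri="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
            <vocabulary prefix="foaf" uri="http://xmlns.com/foaf/0.1/"/>
        </vocabularies>
        <triples>
            <!-- [2] one statement pattern, applied to every resource -->
            <statement>
                <subject prefix="example">//id</subject>
                <predicate prefix="rdf">type</predicate>
                <object prefix="foaf" type="uri">Person</object>
            </statement>
        </triples>
    </configuration>
    <collection>
        <!-- [3] one literal resource with hypothetical markup -->
        <resource>
            <god>
                <id>1</id>
                <name>Zeus</name>
            </god>
        </resource>
    </collection>
</xtriples></pre>
<p>
Given the prefix substitution described in the sections below, one would expect a single statement whose subject combines the <code>example</code> namespace with the id value, typed as a FOAF person.
</p>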
<h4 id="collection">The <collection> tag</h4>
<p>
A collection consists of XML data. It can be a single XML file
with repeating nodes that represent the "resources". It can also be an XML-based list of URLs pointing to single XML resources.
During extraction the webservice
crawls over all resources of a collection and applies the configured statement patterns to each XML resource.
It is possible to use several <code><collection></code> tags per configuration.
</p>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>attributes</th>
<th>required</th>
<th>values</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>uri</td>
<td>no</td>
<td>string</td>
<td>
The XML of the collection will be fetched from this URI if it is not submitted literally to the webservice.
</td>
</tr>
<tr>
<td>max</td>
<td>no</td>
<td>integer</td>
<td>
When set, the extraction stops after the number of resources given in this attribute has been processed.
</td>
</tr>
</tbody>
</table>
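<p>
A minimal sketch of both attributes in use, with the placeholder URLs used elsewhere in this documentation. It assumes that the list behind <code>uri</code> contains more than ten ids, in which case the extraction would stop after ten resources. The value <code>10</code> is arbitrary.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        [...]
    </configuration>
    <!-- fetch the collection XML from the uri, but process at most 10 resources -->
    <collection uri="http://xml.collection.somewhere/resources.xml" max="10">
        <resource uri="http://xml.collection.somewhere/resources/{//id}.xml" />
    </collection>
</xtriples></pre>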
<p>
There are three methods for crawling the resources of a collection: XPATH based, link based and literal.
</p>
<p class="alert alert-info">
<strong>Note:</strong> It is possible to mix the three methods within a <code><collection></code> tag.
</p>
<h5>XPATH based resource crawling</h5>
<p>
XPATH based resource crawling is an automatic way of extracting XML resources. It is very handy if you want to crawl a collection with an unknown number of resources.
The XPATH constructs the path to the XML resources of a collection. You specify it with curly braces in the <code>uri</code>
attribute of a child <code><resource></code> of your <code><collection></code> tag.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        [...]
    </configuration>
    <collection uri="http://xml.collection.somewhere/resources.xml">
        <resource uri="http://xml.collection.somewhere/resources/{//id}.xml" />
    </collection>
</xtriples></pre>
<p>
In this example, the <code>uri</code> attribute of the <code><collection></code> tag points to an XML file that contains
a list of IDs of XML resources to harvest. During extraction, this list provides the context for the expression
in the <code>uri</code> attribute of the child <code><resource></code> tag. The attribute of the <code><resource></code>
tag contains a URL. At any position within the URL string you can use XPATH expressions in curly braces that will be
executed on the XML of the <code><collection></code>. In the above example, the service will walk through all ids
found by the XPATH expression in the <code><collection></code> and substitute them for the curly braces, building the links to the XML resources.
</p>
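<p>
To make the substitution concrete, here is a sketch of what the list behind the collection <code>uri</code> might look like. The element names <code><resources></code>, <code><resource></code> and <code><id></code> are assumptions; any markup works as long as the XPATH in curly braces finds the values.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<!-- Hypothetical content of http://xml.collection.somewhere/resources.xml -->
<resources>
    <resource><id>1</id></resource>
    <resource><id>2</id></resource>
    <resource><id>3</id></resource>
</resources>

<!--
With the {//id} expression from the configuration above, the service
would be expected to build and crawl these three URLs:
    http://xml.collection.somewhere/resources/1.xml
    http://xml.collection.somewhere/resources/2.xml
    http://xml.collection.somewhere/resources/3.xml
--></pre>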
<h6>Example 1: XPATH based resource crawling with resources all in one single file</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/all.xml" data-template-class="btn btn-info btn-mini">XML</a>
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-01.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-01.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 2: XPATH based resource crawling with resources spread over multiple files</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/index.xml" data-template-class="btn btn-info btn-mini">XML</a>
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-02.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-02.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h5>Link based resource crawling</h5>
<p>
Link based resource crawling works by submitting <code><resource></code> tags below the <code><collection></code> tag with complete URLs to the XML files.
This approach is handy if you know all URLs of a collection in advance or if you want to crawl a fixed number of resources.
Just add any number of resource tags below the <code><collection></code> tag with the link to the corresponding resource in the
<code>uri</code> attribute. Here is an example:
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        [...]
    </configuration>
    <collection>
        <resource uri="http://xml.collection.somewhere/resources/1.xml" />
        <resource uri="http://xml.collection.somewhere/resources/2.xml" />
        <resource uri="http://xml.collection.somewhere/resources/3.xml" />
    </collection>
</xtriples></pre>
<p>
This will make the webservice crawl the three resources in question.
</p>
<h6>Example 3: Link based resource crawling with fixed resources in the configuration file</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-03.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-03.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h5>Literal resource crawling</h5>
<p>
Crawling of literal XML resources is the fastest way of extracting statements. In this approach the XML resources are submitted literally to the webservice.
Just submit any well-formed XML below a <code><resource></code> tag.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        [...]
    </configuration>
    <collection>
        <resource>
            <!-- Any well-formed XML -->
        </resource>
        <resource>
            <!-- Some more well-formed XML -->
        </resource>
    </collection>
</xtriples></pre>
<h6>Example 4: Literal resource crawling with XML resources</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-04.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-04.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
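<p>
The three methods can be combined in one collection, as the note at the beginning of this section states. A minimal sketch, reusing the placeholder URLs from the examples above; the file name <code>extra.xml</code> and the <code><god></code> markup are hypothetical.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        [...]
    </configuration>
    <collection uri="http://xml.collection.somewhere/resources.xml">
        <!-- XPATH based: one resource per id found in the collection XML -->
        <resource uri="http://xml.collection.somewhere/resources/{//id}.xml" />
        <!-- link based: one additional, fixed resource -->
        <resource uri="http://xml.collection.somewhere/resources/extra.xml" />
        <!-- literal: XML submitted directly with the configuration -->
        <resource>
            <god>
                <id>99</id>
                <name>Janus</name>
            </god>
        </resource>
    </collection>
</xtriples></pre>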
<h4 id="vocabularies">The <vocabularies> tag</h4>
<p>
Below the <code><vocabularies></code> tag you can configure any number of RDF vocabularies you would like to use for your statements.
It is comparable to the header of a SPARQL query where you declare your vocabularies. Each <code><vocabulary></code> tag can have the
following attributes.
</p>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>attributes</th>
<th>required</th>
<th>values</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>prefix</td>
<td>yes</td>
<td>string</td>
<td>
Sets the prefix for the vocabulary. This prefix can then be used in the <code>prefix</code> attribute
of a <code><subject></code>, <code><predicate></code> or <code><object></code> tag
(see below).
</td>
</tr>
<tr>
<td>uri</td>
<td>yes</td>
<td>xs:anyURI</td>
<td>
Sets the URI for the vocabulary.
</td>
</tr>
</tbody>
</table>
<h6>Code</h6>
<pre class="prettyprint linenums">
<xtriples>
    <configuration>
        <vocabularies>
            <vocabulary prefix="rdf" uri="http://www.w3.org/1999/02/22-rdf-syntax-ns#"/>
            <vocabulary prefix="dc" uri="http://purl.org/dc/elements/1.1/"/>
            <vocabulary prefix="owl" uri="http://www.w3.org/2002/07/owl#"/>
        </vocabularies>
        [...]
    </configuration>
    [...]
</xtriples></pre>
<h4 id="triples">The <triples> tag</h4>
<p>
Below the <code><triples></code> tag you can define any number of <code><statement></code> patterns for the extraction.
The statement patterns are the core of the XTriples webservice.
</p>
<pre class="prettyprint linenums">
<triples>
    <statement>[...]</statement>
    <statement>[...]</statement>
    <statement>[...]</statement>
</triples></pre>
<h4 id="statement">The <statement> tag</h4>
<p>
Each <code><statement></code> consists of exactly one <code><subject></code>, <code><predicate></code> and
<code><object></code> tag. An optional <code><condition></code> tag is allowed. Together they form a statement pattern that
will be applied to each XML resource that is crawled by the webservice.
</p>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>attributes</th>
<th>required</th>
<th>values</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>repeat</td>
<td>no</td>
<td>integer and/or {XPATH/XQuery}</td>
<td>
This will repeat the extraction of the statement pattern as many times as the number set in the attribute. The value of the current
iteration is written to the <code>$repeatIndex</code> variable. The attribute can contain a valid XPATH/XQuery expression in curly braces that
must result in an integer (no node sets allowed).
</td>
</tr>
</tbody>
</table>
<h6>Code</h6>
<pre class="prettyprint linenums">
<statement>
    <subject prefix="resource">//id</subject>
    <predicate prefix="rdf">type</predicate>
    <object prefix="skos" type="uri">/string("Concept")</object>
</statement></pre>
<p>
The code above results in one SKOS statement for each resource from the configured XML repository.
</p>
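<p>
The <code>repeat</code> attribute from the reference table can be sketched as follows. The example assumes resources that contain several <code><name></code> elements under one parent, a declared <code>example</code> vocabulary, and that <code>$repeatIndex</code> starts at 1; none of this is confirmed by the reference above.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<!-- Hypothetical resource:
     <god><id>1</id><name>Zeus</name><name>Jupiter</name></god> -->

<!-- repeat the pattern once per name element found in the current resource -->
<statement repeat="{count($currentResource//name)}">
    <subject prefix="example">//id</subject>
    <predicate prefix="rdfs">label</predicate>
    <!-- pick the name that belongs to the current iteration -->
    <object type="literal">//name[$repeatIndex]</object>
</statement></pre>
<p>
A fixed number such as <code>repeat="3"</code> should work the same way, since the attribute accepts an integer as well as an expression.
</p>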
<h6>Example 5: FOAF statement for one resource</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/1.xml" data-template-class="btn btn-info btn-mini">XML</a>
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-05.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-05.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h4 id="subject">The <subject> tag</h4>
<p>
The <code><subject></code> tag of a statement pattern can result in either a URI or a blank node. A blank node
must already have been created by an earlier <code><object></code> tag.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<statement>
    <subject prefix="###MY_VOCABULARY_PREFIX###">//id</subject>
    [...]
</statement></pre>
<p>
In the above example, the XPATH expression will fetch the id value from each crawled resource. It is used together with the <code>prefix</code> attribute
to construct the subject URIs.
</p>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>attributes</th>
<th>required</th>
<th>values</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>prefix</td>
<td>no</td>
<td>string</td>
<td>
The <code>prefix</code> attribute can contain a value
defined in one of the vocabularies' <code>prefix</code> attributes. This binds the <code><subject></code> to a <code><vocabulary></code>
namespace. During extraction the value of the <code>prefix</code> attribute is substituted with the URI that has been defined
in the vocabulary's <code>uri</code> attribute. This is similar to the prefix concept known from SPARQL.
</td>
</tr>
<tr>
<td>type</td>
<td>no</td>
<td>
<ul style="list-style-type: none; margin: 0; padding: 0;">
<li>bnode</li>
</ul>
</td>
<td>
Only needed for connecting blank nodes. When you set the type of a <code><subject></code> tag to <em>bnode</em>, then you should
apply the same identifier that has already been created by an earlier <code><object></code> tag expression. This will make the former
object (blank node) the subject of a new statement.
</td>
</tr>
<tr>
<td>resource</td>
<td>no</td>
<td>xs:anyURI</td>
<td>
If you set the <code>resource</code> attribute to a valid URL for an external XML document, this URL will be fetched during extraction and the <code><subject></code>
XPATH/XQuery will be executed on the external XML rather than the XML of the current resource.
</td>
</tr>
<tr>
<td>prepend</td>
<td>no</td>
<td>string + {XPATH/XQuery}</td>
<td>
The <code>prepend</code> attribute is handy if you need to prepend a value to the current pattern's XPATH/XQuery result. The attribute value should contain
a valid XPATH/XQuery expression in curly braces that results in a string (no node sets allowed). This expression will be executed <em>after</em> the <code><subject></code>
expression and its result will be <em>prepended</em> to the overall result.
</td>
</tr>
<tr>
<td>append</td>
<td>no</td>
<td>string + {XPATH/XQuery}</td>
<td>
The <code>append</code> attribute works the same as the <code>prepend</code> attribute, except that the result will be <em>appended</em> to the overall result.
</td>
</tr>
</tbody>
</table>
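<p>
The blank node linking described for the <code>type</code> attribute can be sketched with two statement patterns. The first creates a blank node as its object, the second reuses the same identifier expression as its subject. The <code>example</code> and <code>foaf</code> prefixes are assumed to be declared in <code><vocabularies></code>, and using <code>//id</code> as the blank node identifier is only one possible choice.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<!-- 1. the resource points to a blank node -->
<statement>
    <subject prefix="example">//id</subject>
    <predicate prefix="foaf">depiction</predicate>
    <object type="bnode">//id</object>
</statement>

<!-- 2. the same identifier turns the former object into a subject -->
<statement>
    <subject type="bnode">//id</subject>
    <predicate prefix="rdf">type</predicate>
    <object prefix="foaf" type="uri">Image</object>
</statement></pre>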
<h5>Examples</h5>
<h6>Example 6: Extracting a subject URI</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-06.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-06.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 7: Creating a subject blank node</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-07.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-07.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 8: Including a value from an external XML resource in the subject</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-08.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-08.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 9: Prepend and append values to the subject result</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-09.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-09.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h4 id="predicate">The <predicate> tag</h4>
<p>
The <code><predicate></code> tag of a statement pattern contains the property URIs of the vocabularies defined in the <code><vocabularies></code>
section of the configuration. The final result of a predicate expression must always be a URI.
The predicate tag mostly contains simple property strings. But it is also possible to execute an XPATH/XQuery in the tag or
retrieve a value from an external XML resource by using the <code>resource</code> attribute.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<statement>
    [...]
    <predicate prefix="###MY_VOCABULARY_PREFIX###">###MY_PROPERTY###</predicate>
    [...]
</statement></pre>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>attributes</th>
<th>required</th>
<th>values</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>prefix</td>
<td>yes</td>
<td>string</td>
<td>
The required <code>prefix</code> attribute of the <code><predicate></code> must contain a value defined in a vocabulary's <code>prefix</code> attribute.
This binds the <code><predicate></code> to a <code><vocabulary></code> namespace.
The value of the <code>prefix</code> attribute is substituted with the URI that has been defined in the vocabulary's <code>uri</code> attribute.
</td>
</tr>
<tr>
<td>resource</td>
<td>no</td>
<td>xs:anyURI</td>
<td>
If you set the <code>resource</code> attribute to a valid URL for an external XML document, this URL will be fetched during extraction and the <code><predicate></code>
XPATH/XQuery will be executed on the external XML rather than the XML of the current resource.
</td>
</tr>
<tr>
<td>prepend</td>
<td>no</td>
<td>string + {XPATH/XQuery}</td>
<td>
The <code>prepend</code> attribute is handy if you need to prepend a value to the current pattern's XPATH/XQuery result. The attribute value should contain
a valid XPATH/XQuery expression in curly braces that results in a string (no node sets allowed). This expression will be executed <em>after</em> the <code><predicate></code>
expression and its result will be <em>prepended</em> to the overall result.
</td>
</tr>
<tr>
<td>append</td>
<td>no</td>
<td>string + {XPATH/XQuery}</td>
<td>
The <code>append</code> attribute works the same as the <code>prepend</code> attribute, except that the result will be <em>appended</em> to the overall result.
</td>
</tr>
</tbody>
</table>
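<p>
The two ways of filling a predicate mentioned above, a plain property string and an XPATH expression, might look like this. The <code><relation></code> markup and the <code>example</code> prefix are hypothetical. The second pattern assumes each resource holds exactly one <code><relation></code> element whose <code>type</code> attribute names a property of the <code>example</code> vocabulary.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<!-- Hypothetical resource:
     <god><id>1</id><relation type="fatherOf" ref="2"/></god> -->

<!-- plain property string -->
<statement>
    <subject prefix="example">//id</subject>
    <predicate prefix="rdf">type</predicate>
    <object prefix="foaf" type="uri">Person</object>
</statement>

<!-- property name read from the resource itself -->
<statement>
    <subject prefix="example">//id</subject>
    <predicate prefix="example">//relation/@type</predicate>
    <object prefix="example" type="uri">//relation/@ref</object>
</statement></pre>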
<h4 id="object">The <object> tag</h4>
<p>
The <code><object></code> tag of a statement pattern can result in a URI, a literal value or a blank node. Literal values
can be typed or tagged with a language code. In case of a blank node, the object expression should lead to a
unique identifier for the node.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<statement>
    [...]
    <object prefix="###MY_VOCABULARY_PREFIX###" type="uri">###MY_VALUE###</object>
    [...]
    <object type="literal" lang="en">###MY_VALUE###</object>
    [...]
    <object type="literal" datatype="integer">###MY_VALUE###</object>
    [...]
    <object type="bnode">###MY_UNIQUE_ID###</object>
</statement></pre>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>attributes</th>
<th>required</th>
<th>values</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>prefix</td>
<td>no</td>
<td>string</td>
<td>
If the result of the <code><object></code> tag is a URI, the <code>prefix</code> attribute can be used to bind
the <code><object></code> to a defined <code><vocabulary></code> namespace. The value of the <code>prefix</code>
attribute is substituted with the URI that has been defined in the corresponding vocabulary's <code>uri</code> attribute. This is similar
to the prefix concept known from SPARQL.
</td>
</tr>
<tr>
<td>type</td>
<td>no</td>
<td>
<ul style="list-style-type: none; margin: 0; padding: 0;">
<li>uri</li>
<li>literal</li>
<li>bnode</li>
</ul>
</td>
<td>
If the value is set to <em>uri</em>, the final result of the object expression should be an xs:anyURI. The <code>prefix</code>
attribute can be used to bind a vocabulary namespace to the object value. Alternatively, you can construct or extract full
URIs out of values in the current resource's XML without using a prefix. If the value is set to <em>literal</em>, a plain string value
is expected as the result of the object expression. This value can then be typed using the <code>datatype</code> attribute
or tagged with a language code using the <code>lang</code> attribute. If the value is set to <em>bnode</em>, the result
of the object expression should be a unique identifier (see below for more details about blank nodes and some examples).
</td>
</tr>
<tr>
<td>resource</td>
<td>no</td>
<td>xs:anyURI</td>
<td>
If you set the <code>resource</code> attribute to a valid URL for an external XML document, this URL will be fetched during extraction and the <code><object></code>
XPATH/XQuery will be executed on the external XML rather than the XML of the current resource.
</td>
</tr>
<tr>
<td>lang</td>
<td>no</td>
<td>ISO code</td>
<td>
You can language tag your object literals by using this attribute. Set the attribute value to one of the ISO language codes.
</td>
</tr>
<tr>
<td>datatype</td>
<td>no</td>
<td>XML Schema datatype</td>
<td>
You can type your object literals with this attribute. The attribute value can contain any of the official data types from
XML Schema (such as integer, float, double, decimal, time or date).
</td>
</tr>
<tr>
<td>prepend</td>
<td>no</td>
<td>string + {XPATH/XQuery}</td>
<td>
The <code>prepend</code> attribute is handy if you need to prepend a value to the current pattern's XPATH/XQuery result. The attribute value should contain
a valid XPATH/XQuery expression in curly braces that results in a string (no node sets allowed). This expression will be executed <em>after</em> the <code><object></code>
expression and its result will be <em>prepended</em> to the overall result.
</td>
</tr>
<tr>
<td>append</td>
<td>no</td>
<td>string + {XPATH/XQuery}</td>
<td>
The <code>append</code> attribute works the same as the <code>prepend</code> attribute, except that the result will be <em>appended</em> to the overall result.
</td>
</tr>
</tbody>
</table>
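<p>
The attributes from the table can be sketched with concrete values as follows. The element names <code><name></code>, <code><age></code> and <code><title></code>, the <code>example</code> prefix and the image URL are assumptions for illustration. The last pattern also assumes that static text and an expression in curly braces can be mixed in <code>prepend</code>, as the "string + {XPATH/XQuery}" value type suggests.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<!-- language tagged literal -->
<statement>
    <subject prefix="example">//id</subject>
    <predicate prefix="rdfs">label</predicate>
    <object type="literal" lang="en">//name</object>
</statement>

<!-- typed literal -->
<statement>
    <subject prefix="example">//id</subject>
    <predicate prefix="foaf">age</predicate>
    <object type="literal" datatype="integer">//age</object>
</statement>

<!-- full URI built from a static prefix, the id value and a static suffix -->
<statement>
    <subject prefix="example">//id</subject>
    <predicate prefix="foaf">depiction</predicate>
    <object type="uri" prepend="http://example.org/images/" append=".jpg">//id</object>
</statement>

<!-- literal with a computed prefix taken from the current resource -->
<statement>
    <subject prefix="example">//id</subject>
    <predicate prefix="dc">title</predicate>
    <object type="literal" prepend="{string($currentResource//title)}: ">//name</object>
</statement></pre>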
<h6>Example 10: Creating an object URI</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-10.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-10.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 11: Creating a typed object literal</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-11.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-11.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 12: Creating a language tagged literal</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-12.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-12.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 13: Including a value from an external resource in the object</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-13.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-13.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h4 id="condition">The <condition> tag</h4>
<p>
Sometimes a statement pattern should only be applied to resources that meet a specific condition.
The <code><condition></code> tag lets you define an expression; the statement pattern is applied to the
resource only if that expression evaluates to TRUE.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<statement>
    <condition>/image</condition>
    <subject type="uri">/image/@url</subject>
    <predicate prefix="rdf">type</predicate>
    <object prefix="foaf" type="uri">Image</object>
</statement></pre>
<p>
The above statement pattern would only be applied to resources that have an image tag. If there is no image tag, the XPATH returns
an empty result and the statement pattern is not executed.
</p>
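<p>
A condition does not have to be a plain existence test. The following sketch assumes that the condition is evaluated like any other XPATH expression on the current resource, so that a comparison works as well. The <code><status></code> element and the <code>example</code> prefix are hypothetical.
</p>
<h6>Code</h6>
<pre class="prettyprint linenums">
<!-- only create the statement for resources marked as published -->
<statement>
    <condition>/status = "published"</condition>
    <subject prefix="example">//id</subject>
    <predicate prefix="rdf">type</predicate>
    <object prefix="foaf" type="uri">Document</object>
</statement></pre>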
<h6>Example 14: Applying a <condition> to a statement pattern</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-14.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-14.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h3 id="advanced-configuration">Advanced configuration</h3>
<h4 id="variables">Context variables</h4>
<p>
The context node for each subject/predicate/object expression is always the resource that is currently crawled.
You can reference it with "." in your expressions or by using the <code>$currentResource</code> variable. When crawling
single file repositories where resources consist of repeating nodes in one XML, it is also possible to walk
above the resource node by using "..".
</p>
<h6>Reference</h6>
<table class="table table-bordered">
<thead>
<tr>
<th>variable</th>
<th>description</th>
</tr>
</thead>
<tbody>
<tr>
<td>$currentResource</td>
<td>
The <code>$currentResource</code> variable contains the full XML content of the current resource. It can be used in all XQuery
functions and XPATH expressions of the XTriples webservice (see example below).
</td>
</tr>
<tr>
<td>$externalResource</td>
<td>
Same as $currentResource but for contexts where you load an external XML resource instead of the current one using the <code>resource</code>
attribute in a subject, predicate or object tag.
</td>
</tr>
<tr>
<td>$resourceIndex</td>
<td>
The <code>$resourceIndex</code> variable contains the number of the XML resource currently crawled.
</td>
</tr>
<tr>
<td>$repeatIndex</td>
<td>
The <code>$repeatIndex</code> variable contains the number of the extraction iteration triggered by the <code>repeat</code> attribute
of a statement pattern.
</td>
</tr>
</tbody>
</table>
<h6>Example 15: Using the $currentResource variable in statement patterns</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-15.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-15.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h6>Example 16: Using the $repeatIndex variable in statement patterns</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-16.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-16.xml" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h4 id="oneton">1:n statements</h4>
<p>
It is possible to create 1:n statements with a single statement pattern. This happens automatically when an XPath/XQuery expression
in the <code>object</code> part of a statement pattern yields a node set. The subject and predicate are then repeated for each node of the result set, creating n statements in total, each with
the same subject and predicate but a different object value.
</p>
<h6>Example 17: Creating a 1:n statement with an object node set</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-17.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-17.xml" data-template-class="btn btn-info btn-mini">Result</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-17.xml&format=svg" data-template-class="btn btn-info btn-mini">SVG</a>
</p>
<h4 id="ntoone">n:1 statements</h4>
<p>
It is also possible to create n:1 statements with a single statement pattern. The technique is the same as for 1:n statements, except
that this time the XPath/XQuery expression in the <code>subject</code> part of the statement pattern yields a node set. The predicate and object are then repeated for each node of the
result set, each time with a different subject value.
</p>
<h6>Example 18: Creating an n:1 statement with a subject node set</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-18.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-18.xml" data-template-class="btn btn-info btn-mini">Result</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-18.xml&format=svg" data-template-class="btn btn-info btn-mini">SVG</a>
</p>
<h4 id="ntom">n:m statements</h4>
<p>
If n subjects should be bound to m objects with a single statement pattern, the same rules as described above apply. In an n:m case, both the <code>subject</code>
and the <code>object</code> expressions yield node sets. The webservice then iterates over all nodes of the subject result set and connects each of them to every node of the object
result set.
</p>
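<p>
The pairing described above amounts to a cross product of the two node sets. The following Python sketch only illustrates that logic; it is not the XTriples implementation. The helper name <code>expand_statement</code> and all sample values are hypothetical. With a single subject or a single object it reduces to the 1:n and n:1 cases.
</p>

```python
from itertools import product


def expand_statement(subjects, predicate, objects):
    """Pair every subject with every object under one predicate.

    Hypothetical helper for illustration only, not part of XTriples.
    """
    return [(s, predicate, o) for s, o in product(subjects, objects)]


# Hypothetical values standing in for the node sets that the
# subject and object expressions of a statement pattern return.
subjects = ["gods:zeus", "gods:hera"]
objects = ["gods:ares", "gods:hebe", "gods:hephaestus"]

triples = expand_statement(subjects, "rel:parentOf", objects)

for triple in triples:
    print(triple)
```

<p>
Two subjects and three objects produce six statements here, one for each subject/object combination.
</p>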
<h6>Example 19: Creating n:m statements in a single statement pattern</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-19.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-19.xml" data-template-class="btn btn-info btn-mini">Result</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-19.xml&format=svg" data-template-class="btn btn-info btn-mini">SVG</a>
</p>
<h3 id="errors">Error handling and debugging</h3>
<p>
When writing XPath/XQuery expressions against a given XML document, errors (such as typecast errors) and unexpected results (such as empty node sets) are bound to occur.
XTriples tries to handle such errors transparently. The strategy is not to terminate query execution on a fatal error but to catch it and then pass a
meaningful error description back to the user.
</p>
<p>The easiest way to debug unexpected outcomes is to use the internal XTriples result format.
This format is generated right before the transformation to RDF and contains all results of all statement expressions in a structure similar to the original configuration.
</p>
<p>
In this format it is easy to identify XPath expressions for subjects, predicates or objects that came out empty, and to see the error messages thrown by XQuery functions.
To get the internal format for debugging, append a <code>format=xtriples</code> parameter to the webservice call.
</p>
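<p>
As a minimal sketch, a debugging request URL could be assembled like this in Python. The <code>configuration</code> and <code>format</code> parameters are the ones used in the example links of this documentation; the host, the paths and the helper name <code>debug_url</code> are assumptions for illustration. Note that <code>urlencode</code> percent-encodes the configuration URL, whereas the example links pass it unencoded.
</p>

```python
from urllib.parse import urlencode


def debug_url(service_url, configuration_url):
    """Build a request URL that asks for the internal xtriples format.

    Hypothetical helper for illustration only.
    """
    query = urlencode({
        "configuration": configuration_url,
        "format": "xtriples",
    })
    return f"{service_url}?{query}"


# Assumed local instance; replace with the URLs of your own setup.
url = debug_url(
    "http://localhost:8080/exist/apps/xtriples/extract.xql",
    "http://localhost:8080/exist/apps/xtriples/examples/gods/conf-20.xml",
)
print(url)
```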
<h6>Example 20: Using the internal xtriples result format for debugging purposes</h6>
<p class="btn-group">
<a data-template="app:link" data-template-href="$baseUrl/examples/gods/conf-20.xml" data-template-class="btn btn-info btn-mini">Configuration</a>
<a data-template="app:link" data-template-href="extract.xql?configuration=$baseUrl/examples/gods/conf-20.xml&format=xtriples" data-template-class="btn btn-info btn-mini">Result</a>
</p>
<h3 id="components">Webservice components</h3>
<p>
XTriples is built on top of some great software packages. The foundation for XTriples is the <a href="http://exist-db.org/exist/apps/homepage/index.html">eXist XML database</a>.
The RDF transformations are done with the <a href="https://any23.apache.org/">Apache any23</a> webservice. For SVG graph rendering, the <a href="http://rhizomik.net/html/redefer/rdf2svg-form/">Redefer rdf2svg</a> webservice is used.
A nice and simple software package for visualizing RDF graphs with JavaScript is the <a href="http://graves.cl/visualRDF/?url=http://graves.cl/visualRDF/">visualRDF</a> webservice.
</p>
<p>
<a href="resources/images/process.png">
<img src="resources/images/process.png" alt="process chain of xtriples"/>
</a>
</p>
<h3 id="setup">
Set up your own instance
</h3>
<h4 id="exist">eXist XML database</h4>
<p>
First, install an instance of eXist on your local computer or on a server of your choice. You can find the eXist packages and the installation
instructions on the eXist webpage: <a href="http://exist-db.org/exist/apps/homepage/index.html">http://exist-db.org/exist/apps/homepage/index.html</a>. Once eXist is installed and
running, grab the latest XTriples XAR file from here: <a href="http://download.spatialhumanities.de/ibr/">http://download.spatialhumanities.de/ibr/</a>. Alternatively, you can
build the XAR file yourself from the latest sources on <a href="https://github.com/spatialhumanities/xtriples">GitHub</a>. Once you have the XAR, simply install it via the
eXist dashboard. Finally, set the value of the <code>$xtriplesWebserviceURL</code> variable in <code>extract.xql</code> to the URL of your instance of XTriples.
</p>
<p>
Depending on how independently you want to run your instance of XTriples, you can optionally install local instances of the <em>any23</em> and <em>redefer</em> webservices. If you do,
don't forget to set the <code>$any23WebserviceURL</code>, <code>$redeferWebserviceURL</code> and <code>$redeferWebserviceRulesURL</code> variables in <code>extract.xql</code> to the URLs at which
these webservices run.
</p>
<h4 id="any23">Any23 webservice</h4>
<p>
You can download the latest version of Apache Any23 right here: <a href="https://any23.apache.org/download.html">https://any23.apache.org/download.html</a>. Check out the
<a href="https://any23.apache.org/install.html">installation instructions</a>.
</p>
<h4 id="rhizomik">Redefer webservice</h4>
<p>
You can download and install the latest version of Redefer rdf2svg from here: <a href="https://github.com/rhizomik/redefer-rdf2svg">https://github.com/rhizomik/redefer-rdf2svg</a>.
</p>
</div>
<div class="span4">
<h3>Outline</h3>
<ol>
<li>
<a href="#background">Background</a>
</li>
<li>
<a href="#requests">Requests</a>
</li>
<li>
<a href="#basic-configuration">Basic configuration</a>
<ul>
<li>
<a href="#collection">&lt;collection&gt;</a>
</li>
<li>
<a href="#vocabularies">&lt;vocabularies&gt;</a>
</li>
<li>
<a href="#triples">&lt;triples&gt;</a>
</li>
<li>
<a href="#statement">&lt;statement&gt;</a>
</li>
<li>
<a href="#subject">&lt;subject&gt;</a>
</li>
<li>
<a href="#predicate">&lt;predicate&gt;</a>
</li>
<li>
<a href="#object">&lt;object&gt;</a>
</li>
<li>
<a href="#condition">&lt;condition&gt;</a>
</li>
</ul>
</li>
<li>
<a href="#advanced-configuration">Advanced configuration</a>
<ul>
<li>
<a href="#variables">Context variables</a>
</li>
<li>
<a href="#oneton">1:n statements</a>
</li>
<li>
<a href="#ntoone">n:1 statements</a>
</li>
<li>
<a href="#ntom">n:m statements</a>
</li>
</ul>
</li>
<li>
<a href="#errors">Error handling and debugging</a>
</li>
<li>
<a href="#components">Webservice components</a>
</li>
<li>
<a href="#setup">Set up your own instance</a>
<ul>
<li>
<a href="#exist">eXist XML database</a>
</li>
<li>
<a href="#any23">Any23 webservice</a>
</li>
<li>
<a href="#rhizomik">Redefer webservice</a>
</li>
</ul>
</li>
</ol>
</div>
<div class="span4">
<h3>Schema</h3>
<p>
Check out the RelaxNG <a href="xtriples.rng">schema</a> for a precise description of all configuration options.
</p>
</div>
</div>
</div>
</div>
</div>