Bridge to networked research data
Note: This software is developed and supported by the Huygens Institute in the Netherlands. We intend to support the software indefinitely, but 2021 is our current planning horizon. This notice will be updated before the end of 2021 with the new support duration.
Timbuctoo is aimed at historians doing interpretative research. Such a researcher collects facts from various documents, interprets and structures them, and creates tables of these facts. The researcher then uses this new dataset either as an object of analysis or as an index that lets them view their sources from a distance and quickly step back to the original source for the real interpretative work.
As such a historian, you often need to categorize your findings. For example, you keep track of birthplaces. You then need to decide how to write down the birthplace:
-
Do you use the name of the city, or the borough?
-
Do you use the current name or the name when the person was born?
-
If your dataset spans a long time you might have two different cities on the same geographical location. Are they one name or more?
These judgements are sometimes the core of your research and sometimes incidental to it. Timbuctoo aims to make it easier to publish your research dataset and to re-use other people’s published datasets. To continue the example: another researcher might create a dataset containing locations, their coördinates and names, and how these changed over time. You can then re-use that dataset and link to its entries instead of typing a string of characters that a human might or might not interpret correctly.
There are database-like systems, so storing your data somewhere is easy. However, there are not many tools that will:
-
allow you to upload any dataset without having to write code (for most databases, importing large datasets requires you to write some amount of SQL, SPARQL or batch processing code)
-
expose your dataset so that it can be retrieved by another researcher (an HTTP download and a REST interface)
-
allow the researcher to base their new dataset on that existing dataset, with a provenance trail, without having to agree on the data model, and without having to agree on all data contents
-
keep track of updates to the source dataset and allow the user to subscribe to these changes
This is the added value that Timbuctoo aims to bring.
The following prerequisites need to be installed on the machine before running the Timbuctoo program:
Once the prerequisites are installed, follow these steps to install Timbuctoo:
-
Clone the Timbuctoo GitHub repository into a local directory using the command:
git clone https://github.com/HuygensING/timbuctoo.git
-
In the Timbuctoo root directory, run the Maven build command:
mvn clean package
-
In the "/devtools/debugrun" directory within your Timbuctoo repository, run:
./debugrun.sh
-
You can run a curl command of the following format to upload data into Timbuctoo:
curl -v -F "file=@/<complete_path_to_file>/<filename>.ttl;type=<filetype>" -F "encoding=UTF-8" -H "Authorization: fake" "http://localhost:8080/v5/u33707283d426f900d4d33707283d426f900d4d0d/hpp6demo/upload/rdf?forceCreation=true"
`u33707283d426f900d4d33707283d426f900d4d0d` is the user id that is used when no security is configured.
-
You can use the provided bia_clusius.ttl data as an example dataset. The <filetype> for this file is "text/turtle". It is available in the following folder:
"<complete path to directory>/huygens/timbuctoo/timbuctoo-instancev4/src/test/resources/nl/knaw/huygens/timbuctoo/v5/bia_clusius.ttl"
-
Note that the above method forces creation of the dataset at upload time. To create the dataset before uploading instead, use the path:
"<host>/v5/dataSets/{userId}/{dataSetId}/create"
-
With Timbuctoo running, you can access the GraphiQL in-browser IDE by pointing your web browser to the following address:
http://localhost:8080/static/graphiql
-
Choose the appropriate dataset from the "select dataset" dropdown and the appropriate type from the "select accept media type" dropdown.
-
Use a query of the following basic format to query for data from the selected dataset:
{ field(arg: "value") { subField } }
-
Press "Ctrl + Enter" or the "play button" at the top of the IDE window to run your query. The result will be displayed in the right pane.
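If you would rather send a query from the command line than from the GraphiQL page, the sketch below shows one possible approach. It assumes the server accepts the common GraphQL-over-HTTP convention of a JSON body with a "query" field. The function names `graphql_body` and `run_query` and the variable `GRAPHQL_URL` are hypothetical; this README does not state the endpoint path, so you have to supply it yourself.

```shell
#!/usr/bin/env bash

# Wrap a single-line GraphQL query in the JSON body {"query":"..."}.
# Backslashes and double quotes are escaped; newlines are not handled.
graphql_body() {
  local escaped
  escaped="$(printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g')"
  printf '{"query":"%s"}\n' "$escaped"
}

# Hypothetical: POST the query to the endpoint named in GRAPHQL_URL.
# The endpoint path is an assumption and must be set by the caller.
run_query() {
  curl -s -H "Content-Type: application/json" \
       -d "$(graphql_body "$1")" \
       "${GRAPHQL_URL:?set GRAPHQL_URL to your GraphQL endpoint}"
}
```

With `GRAPHQL_URL` set, `run_query '{ field(arg: "value") { subField } }'` would send the basic query format shown above.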
I can’t access my data from GraphiQL, and I get the error "SyntaxError: JSON.parse: unexpected character at line 1 column 1 of the JSON data" in the right pane when I try to query for data.
It is likely that the filepath given while using the curl command to load the dataset was incorrect. Please note that the filepath to the dataset should be given in full (i.e. complete path from root) with a '@' symbol preceding it.
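One way to rule out this mistake is to check the file and resolve its absolute path before calling curl. The following is a minimal sketch; the helper names `resolve_dataset_path` and `upload_form_arg` are hypothetical and not part of Timbuctoo, and the default media type of "text/turtle" is an assumption that fits the example dataset.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the absolute path of a dataset file,
# or fail with an error message when the file does not exist.
resolve_dataset_path() {
  local file="$1"
  if [ ! -f "$file" ]; then
    echo "error: no such file: $file" >&2
    return 1
  fi
  # Combine the absolute directory with the file name.
  printf '%s/%s\n' "$(cd "$(dirname "$file")" && pwd)" "$(basename "$file")"
}

# Hypothetical helper: print the value for curl's -F option, with the
# '@' prefix and the media type (assumed default: text/turtle).
upload_form_arg() {
  local path
  path="$(resolve_dataset_path "$1")" || return 1
  printf 'file=@%s;type=%s\n' "$path" "${2:-text/turtle}"
}
```

You could then pass the result to the upload command, for example `curl -v -F "$(upload_form_arg ./bia_clusius.ttl)" -F "encoding=UTF-8" -H "Authorization: fake" "<upload URL>"`, where `<upload URL>` is the upload address described earlier.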
Timbuctoo is licensed under the GPL.
See the contribution guidelines
Read about compiling, installing/running and using/developing timbuctoo in the documentation folder. A nicely rendered version of this documentation can be found online.
Timbuctoo is funded by
-
The Huygens Institute (indefinite)
-
CLARIAH.nl (until …)
-
NDE (funding ends December 2016)
This repository is available online at https://github.com/HuygensING/timbuctoo