This repo converts two datasets into a format compatible with AnnotateIt's jQuery plugin in order to recreate Open Shakespeare. It takes Moby's XML-formatted complete works of Shakespeare and the Finals Club annotation data stored on AnnotateIt.org, loads both into a MongoDB database and, most importantly, makes them work together with version 1.2.6 of the AnnotateIt Plugin.
It can easily be customized to migrate this data to another URL, or to a site with a different DOM structure, using the Annotator Plugin. You can also use the xpath jQuery library to write your own custom mapping script.
It contains all the resources needed to convert the raw data so that it works cohesively with the AnnotateIt Plugin.
About the Data Sets
1. Convert Works
Convert the works of Shakespeare into the expected format
2. Add Works
Add the works of Shakespeare to your MongoDB database
3. Retrieve Annotations
Retrieve the old [Finals Club annotation data](http://annotateit.org/api/search_raw?q=_exists_:finalsclub_id&size=3100&from=0) from [AnnotateIt.org](http://annotateit.org)
4. Add Annotations
Add the old [Finals Club annotation data](http://annotateit.org/api/search_raw?q=_exists_:finalsclub_id&size=3100&from=0) from [AnnotateIt.org](http://annotateit.org) to your MongoDB database
5. Convert Annotation Schema
Convert the [Finals Club annotation data](http://annotateit.org/api/search_raw?q=_exists_:finalsclub_id&size=3100&from=0) into the schema expected by the AnnotateIt plugin
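
To give a rough idea of what the schema conversion in step 5 involves, here is a minimal sketch that flattens one raw search hit into a plain annotation object. It assumes each hit has an Elasticsearch-style shape with the annotation fields nested under `_source`. The helper name `flattenHit`, the list of fields it keeps, and the sample values are assumptions for illustration, not the repo's actual conversion script.

```javascript
// Hypothetical helper: flatten one raw search hit into a plain annotation.
// Assumes the hit looks like { _id: ..., _source: { uri, text, quote, ranges, ... } }.
function flattenHit(hit) {
  var source = hit._source || {};
  return {
    id: hit._id,
    uri: source.uri,
    text: source.text,
    quote: source.quote,
    ranges: source.ranges || [],
    user: source.user,
    finalsclub_id: source.finalsclub_id
  };
}

// Made-up sample hit, only to show the shape of the input and output.
var sampleHit = {
  _id: 'abc123',
  _source: {
    uri: 'http://example.org/work/hamlet',
    text: 'A note on this line.',
    quote: 'To be, or not to be',
    ranges: [{ start: '/p[3]', end: '/p[3]', startOffset: 0, endOffset: 19 }],
    user: 'someone',
    finalsclub_id: 42
  }
};

console.log(flattenHit(sampleHit));
```

Keeping the conversion in a small pure function like this makes it easy to check against a handful of records before running it over the whole collection.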
The quick and dirty way to get your database set up correctly
Follow in order:
- Add Works
- Add Annotations
- Edit Annotations URI/Ranges
- Convert Annotation Schema
See the directions below to convert the works into the expected HTML format and add them to your database.
```
npm install
```

```javascript
var mongoose = require('mongoose');
var fs = require('fs');
//edit the string to refer to your database location
mongoose.connect('mongodb://localhost/open_shakespeare');
//the script expects this schema
var PlaySchema = new mongoose.Schema({
  title: String,
  uriTitle: String,
  html: String
});
var Play = mongoose.model('Play', PlaySchema);
```
I like to copy the html file out of the material_cache/moby/html directory and into a separate folder with the script for organization purposes.

```
node importShakespeareHtml.js
```
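
As a rough illustration of the kind of record the import produces, the sketch below builds an object matching the `title` / `uriTitle` / `html` schema from a file name and its HTML contents. The helper name `toPlayDoc` and the assumption that files are named like `romeo_and_juliet.html` are hypothetical. The real `importShakespeareHtml.js` may derive titles differently.

```javascript
var path = require('path');

// Hypothetical helper: build a Play-shaped object from a file name and its HTML.
// Assumes underscore-separated file names such as "romeo_and_juliet.html".
function toPlayDoc(fileName, html) {
  // "romeo_and_juliet.html" -> "romeo_and_juliet"
  var uriTitle = path.basename(fileName, '.html');
  // "romeo_and_juliet" -> "Romeo And Juliet"
  var title = uriTitle.split('_').map(function (word) {
    return word.charAt(0).toUpperCase() + word.slice(1);
  }).join(' ');
  return { title: title, uriTitle: uriTitle, html: html };
}

console.log(toPlayDoc('romeo_and_juliet.html', '<p>Two households...</p>'));
```

An object of this shape could then be passed to the `Play` model and saved.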

```
mongoimport --db dbname --collection annotations --file annotations.json --jsonArray
```
Since the data set is very large, MongoDB will return an error unless you use `--jsonArray`. This imports all the JSON data as one big object, which then needs to be parsed out into individual database entries.
This must be done before any of the annotation reformatting scripts can run. Change the db location address to use your database by changing the first parameter of the `MongoClient.connect()` function:

```javascript
// Retrieve
var MongoClient = require('mongodb').MongoClient;
// Connect to db, edit to match your db address
MongoClient.connect("mongodb://localhost:27017/open_shakespeare", function(err, db) {
  if(!err) {
    console.log("connected successfully to mongodb://localhost:27017/open_shakespeare");
    //on successfully connecting, run updater function
    parseAnnotations(db);
  } else {
    console.error("Error connecting to mongodb://localhost:27017/open_shakespeare");
  }
});
```
Run parseJsonArray.js in the console:

```
node parseJsonArray.js
```
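
The core of this parsing step can be sketched as pulling the individual hits out of the single imported object. The snippet below assumes the imported document keeps an Elasticsearch-style `hits.hits` array. Both that shape and the helper name `extractHits` are assumptions for illustration, and the repo's `parseJsonArray.js` may work differently.

```javascript
// Hypothetical helper: return the list of individual hits from one imported blob.
// Assumes a shape like { hits: { hits: [ {...}, {...} ] } }.
function extractHits(blob) {
  if (blob && blob.hits && Array.isArray(blob.hits.hits)) {
    return blob.hits.hits;
  }
  // Nothing recognizable to split, so return an empty list rather than throwing.
  return [];
}

// Made-up sample blob with two hits.
var sampleBlob = { hits: { total: 2, hits: [{ _id: 'a' }, { _id: 'b' }] } };
console.log(extractHits(sampleBlob).length);
```

Each element of the returned array could then be inserted into the annotations collection as its own entry.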
```javascript
// Connect to db, edit to match your db address
MongoClient.connect("mongodb://localhost:27017/open_shakespeare", function(err, db) {
  if(!err) {
    console.log("connected successfully to mongodb://localhost:27017/open_shakespeare");
    //on successfully connecting, run updater function
    updateAnnotationsRangesUri(db);
  } else {
    console.error("Error connecting to mongodb://localhost:27017/open_shakespeare");
  }
});
```
<h5>Edit the URI/Ranges</h5>
The AnnotateIt plugin relies directly on the URI and XPath ranges to map the annotation data onto the works of Shakespeare. For more information on how this works, see the wiki page: About Annotation Plugin.<br>
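
To make the mapping concrete, here is a minimal sketch of the rewrite as a pure function: it turns an absolute URI into a relative one and prefixes each range's XPath with the path of the container element on your page. The helper name `remapAnnotation`, the sample URI, and the sample range are made up for illustration. Unlike a script that only touches the first range, this sketch remaps every range in the array.

```javascript
// Hypothetical helper: compute the new uri and ranges for one annotation.
// domPrefix is the XPath of the element that wraps the play text on your page.
function remapAnnotation(annotation, domPrefix) {
  var marker = '/work/';
  var idx = annotation.uri.indexOf(marker);
  // Keep everything after "/work/"; fall back to the whole URI if it is absent.
  var title = idx === -1 ? annotation.uri : annotation.uri.slice(idx + marker.length);
  var ranges = annotation.ranges.map(function (range) {
    return {
      start: domPrefix + range.start,
      end: domPrefix + range.end,
      startOffset: range.startOffset,
      endOffset: range.endOffset
    };
  });
  return { uri: '/#works/' + title, ranges: ranges };
}

// Made-up sample annotation.
var sample = {
  uri: 'http://example.org/work/hamlet',
  ranges: [{ start: '/p[3]', end: '/p[4]', startOffset: 5, endOffset: 12 }]
};

console.log(remapAnnotation(sample, '/div[2]/div[1]/div[2]/div[2]'));
```

Because the function has no database dependency, you can try different `domPrefix` values against a few sample annotations before updating the collection.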
<h6>updateUriRanges.js</h6>
```javascript
annotations.find().toArray(function(err, results) {
  if(!err) {
    results.forEach(function(annotation){
      //skip annotations that have no ranges to rewrite
      if(annotation.ranges && annotation.ranges.length) {
        //extract title from URI to make a relative pathname that matches with the Annotorious router
        var titleStart = (annotation._source.uri).search('/work') + 6,
            title = (annotation._source.uri).slice(titleStart),
            uri = '/#works/' + title;
        //edit here to create a filepath relative to your DOM structure
        var start = '/div[2]/div[1]/div[2]/div[2]' + annotation.ranges[0].start;
        var end = '/div[2]/div[1]/div[2]/div[2]' + annotation.ranges[0].end;
        //update the changes in the db
        annotations.update(
          {'_id': annotation._id},
          {
            $set: {
              'uri': uri,
              'ranges.0.start': start,
              'ranges.0.end': end
            }
          },
          {safe: true},
          function(err, result){
            if(!err) {
              console.log('Success!');
            } else {
              console.error('Error updating annotation ranges for %s', annotation._id, err);
            }
          }
        );
      }
    });
    //the updates run asynchronously, so this logs once they are queued, not once they finish
    console.log("All updates queued!");
  } else {
    console.error("Error querying annotations collection:", err );
  }
});
```
If you intend to edit the ranges or URI for this data, you must complete step 3 before step 4.
Change the db location address to use your database by changing the first parameter of the `MongoClient.connect()` function:

```javascript
// Retrieve
var MongoClient = require('mongodb').MongoClient;
// Connect to the db; edit this string to connect to your db: mongodb://localhost:27017/open_shakespeare
MongoClient.connect("mongodb://localhost:27017/open_shakespeare", function(err, db) {
  if(!err) {
    console.log("connected successfully to mongodb://localhost:27017/open_shakespeare");
    updateAnnotations(db);
  } else {
    console.error("Error connecting to mongodb://localhost:27017/open_shakespeare");
  }
});
```