Merge pull request #5 from lcdhoffman/master
RC 3
quoideneuf committed Nov 26, 2013
2 parents 9dc7469 + 07c0447 commit bab9d3a
Showing 17 changed files with 376 additions and 146 deletions.
11 changes: 0 additions & 11 deletions Gemfile
@@ -1,12 +1,7 @@
source 'https://rubygems.org'

gem 'arel'
gem 'atomic'
gem 'builder'
gem 'bundler'
gem 'coffee-script'
gem 'coffee-script-source'
gem 'erubis'
gem 'execjs'
gem 'jsmin'
gem 'json'
@@ -23,12 +18,6 @@ gem 'rack-test'
gem 'rdoc'
gem 'rspec'
gem 'rubyzip'
gem 'sass'
gem 'sdoc'
gem 'sinatra'
gem 'sinatra-assetpack'
gem 'thread_safe'
gem 'tilt'
gem 'tzinfo'
gem 'uglifier'
gem 'zip'
18 changes: 11 additions & 7 deletions README.md
@@ -66,16 +66,20 @@ To change, for example, the version of the ArchivesSpace target, add the following
line

Appdata.aspace_version 'v1.0.1'

# Notes

A typical migration can take several hours and will cause ArchivesSpace's
indexer to get backed up. Migrated records may not appear right away in browse or search results in ArchivesSpace. Consider running ArchivesSpace with the indexer
turned off to speed up the migration process.
If Archon response times become slow due to network latency or large datasets, it is
possible to speed up successive tests by turning on database caching. Note that you must manually delete
the database if you point the migration tool at a new Archon instance.

Appdata.use_dbcache true

A large migration may fail because of an expiration of the migration tool's session in ArchivesSpace. Avoid this by setting a 10 hour session expiration threshold in the ArchivesSpace configuration file:
*Note: this feature is not complete and should be left off by default.*

AppConfig[:session_expire_after_seconds] = 36000
# Notes

A typical migration can take several hours and could cause ArchivesSpace's
indexer to get backed up. Migrated records may not appear right away in browse or search results in ArchivesSpace. Consider running ArchivesSpace with the indexer
turned off to speed up the migration process, or upgrading to a later version of ArchivesSpace.

Do not run a migration process against an ArchivesSpace instance that already
contains data.
51 changes: 51 additions & 0 deletions TECHNICAL_OVERVIEW.md
@@ -0,0 +1,51 @@
Archon2ArchivesSpace TECHNICAL OVERVIEW
================
# Application

The file at app/main.rb invokes a web application built on the Sinatra framework (http://www.sinatrarb.com/).

The application root ('/') responds to HTTP GET requests with a simple form in
which a user enters credentials for an Archon instance and an ArchivesSpace instance
and clicks a button. The resulting POST request initiates an instance of the
MigrationJob class. While the job is running, its output is yielded to the client's
browser as a JSON stream.
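
For illustration only, a Sinatra application of this shape might look roughly like the sketch below. The route names, the MigrationJob constructor, and the block-yielding run method are assumptions, not the actual contents of app/main.rb.

```ruby
# Minimal sketch of a Sinatra app that streams migration output as JSON.
# MigrationJob#run and the '/migrate' route are assumed names.
require 'sinatra'
require 'json'

get '/' do
  erb :index   # form collecting Archon and ArchivesSpace credentials
end

post '/migrate' do
  content_type 'application/json'
  stream do |out|
    job = MigrationJob.new(params)    # hypothetical constructor
    job.run do |update|               # hypothetical block-yielding runner
      out << update.to_json << "\n"   # push each update to the browser
    end
  end
end
```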

# Clients

The application contains a client class for both Archon and ArchivesSpace. Clients
handle the basic HTTP requests that are needed to read data from Archon and post
it to ArchivesSpace.
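
As a rough sketch of what such a client looks like, the example below wraps Net::HTTP around a session-token login; the login path mirrors the /users/:user/login endpoint used elsewhere in this commit, but the class and method names are assumptions.

```ruby
# Sketch of a thin HTTP client with a session token; names are assumed.
require 'net/http'
require 'uri'
require 'json'

class ExampleClient
  def initialize(url, user, password)
    @url = url
    @session = login(user, password)
  end

  # GET a path and parse the JSON body, passing the session token along.
  def get_json(path, params = {})
    uri = URI("#{@url}/#{path}")
    uri.query = URI.encode_www_form(params.merge('session' => @session))
    JSON.parse(Net::HTTP.get(uri))
  end

  private

  def login(user, password)
    res = Net::HTTP.post_form(URI("#{@url}/users/#{user}/login"),
                              'password' => password)
    JSON.parse(res.body)['session']
  end
end
```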

The ArchivesSpace client relies on some libraries that are extracted from the
ArchivesSpace source code. See the README document for instructions for updating
these files to match the ArchivesSpace release being targeted.

# MigrationJob

This class is the controller for a single migration from point A (Archon) to point B
(ArchivesSpace). It moves through the various Archon record types, reading the
records provided by the Archon client, transforming them, and either sending them
directly to ArchivesSpace or pushing them into a record batch that the ArchivesSpace
client posts in a single request.
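
In outline, a controller of this kind could be sketched as follows; the list of record types, the reader and batch methods, and the class name are all assumptions made for the example.

```ruby
# Sketch of a migration controller loop; record types and method names are assumed.
class ExampleMigrationJob
  RECORD_TYPES = [:repository, :user, :subject, :creator, :collection]

  def initialize(archon_client, aspace_client)
    @archon = archon_client
    @aspace = aspace_client
    @batch  = []
  end

  def run
    RECORD_TYPES.each do |type|
      @archon.each_record(type) do |record|   # hypothetical paginated reader
        record.transform do |aspace_object|   # yields ArchivesSpace-shaped data
          @batch << aspace_object
        end
      end
    end
    @aspace.import_batch(@batch)              # assumed single batched POST
  end
end
```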

# Archon Models

Archon records are represented by model classes defined in app/models. Most model
classes implement a 'transform' method which initializes a new object representing
a corresponding ArchivesSpace data structure. The new object is then fleshed out
with data and yielded (in most cases) to the block passed to the transform method.
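
A transform method in this style might look roughly like the following; the Archon field name and the target record shape are illustrative assumptions.

```ruby
# Sketch of a model 'transform' method that yields an ArchivesSpace-shaped hash.
class ExampleArchonSubject
  def initialize(data)
    @data = data
  end

  def transform
    # 'jsonmodel_type' mirrors ArchivesSpace conventions; field names are assumed.
    obj = {
      'jsonmodel_type' => 'subject',
      'terms' => [{ 'term' => @data['SubjectTerm'], 'term_type' => 'topical' }]
    }
    yield obj
  end
end

ExampleArchonSubject.new('SubjectTerm' => 'Photographs').transform do |subject|
  puts subject.inspect
end
```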

Since not all Archon records have a one-to-one relationship to the ArchivesSpace data
model, there are several models that yield more than one object, or that function
in an idiosyncratic way.

The base class for Archon models is defined in the Archon client library. The base
class contains two types of caches to facilitate the reading of data via the
Archon API. One cache contains raw HTTP response body data from Archon. The other
cache contains instances of the ArchonRecord subclasses. A third, still experimental
cache saves Archon response data to an SQLite database, to facilitate repeated
tests against the same Archon instance.
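
The two in-memory caches can be pictured as simple memoized hashes, one keyed by request path and one by record id; the sketch below uses assumed names and a block in place of a real HTTP call.

```ruby
# Sketch of the two in-memory cache layers; class and method names are assumed.
class ExampleRecordCache
  def initialize
    @responses = {}   # raw response bodies, keyed by request path
    @records   = {}   # instantiated record objects, keyed by record id
  end

  # Memoize the raw body for a path; the block stands in for an HTTP GET.
  def response_for(path)
    @responses[path] ||= yield(path)
  end

  # Memoize a built record by id; the block stands in for parsing/instantiation.
  def record_for(id)
    @records[id] ||= yield(id)
  end
end
```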

The Archon API only provides a paginated listing of records, so ArchonRecord.find is
implemented by reading the entire set until the desired record is found. Hence the
necessity for the caching techniques described above.
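
A find built over such a paginated listing might look roughly like this; the endpoint path, the batch_start parameter, and the hash-of-records response shape are modeled loosely on the Archon API but should be read as assumptions.

```ruby
# Sketch of find-by-id over a paginated listing, memoizing each page as it is read.
class ExamplePaginatedFinder
  def initialize(client, endpoint)
    @client   = client      # e.g. the ExampleClient sketched above
    @endpoint = endpoint
    @pages    = {}          # batch_start => parsed page, so reruns reread nothing
  end

  def find(id)
    each_record do |record|
      return record if record['ID'].to_s == id.to_s
    end
    nil
  end

  def each_record
    start = 1
    loop do
      page = (@pages[start] ||= @client.get_json(@endpoint, 'batch_start' => start))
      break if page.nil? || page.empty?
      page.each_value { |record| yield record }   # Archon-style hash of records
      start += page.size
    end
  end
end
```
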
11 changes: 10 additions & 1 deletion app/css/main.css
@@ -8,7 +8,16 @@
background: #ddd;
}

p {
#status-console div.main {
font-size: 1.1em;
}

#status-console div.collapsed div.updates {
visibility: hidden;
height: 0px;
}

p.update, p.error, p.warn {
padding-left: 10px;
padding-right: 10px;
}
95 changes: 69 additions & 26 deletions app/js/main.js
@@ -34,7 +34,7 @@ $(document).ready(function(){
$("#nodourl").click(function(){
// If checked
if ($("#nodourl").is(":checked")) {
//show the hidden div
//show the hidden div
$("#do_baseurl").removeAttr('required');
} else {
//otherwise, hide it
@@ -46,57 +46,100 @@ $(document).ready(function(){


function updateStatus(update, emitter){
// console.log(update);
console.log(update);
if (update.type == 'error') {
emitter.show_error(update.body);
} else if (update.type == 'status') {
emitter.refresh_status(update.body, update.source);
emitter.add_status(update.body);
} else if (update.type == 'warning') {
emitter.show_warning(update.body);
emitter.show_error(update.body);
} else if (update.type == 'update') {
emitter.show_update(update.body, update.source);
} else if (update.type == 'flash') {
emitter.flash(update.body, update.source);
} else if (update.type == 'progress') {
emitter.show_progress(update.ticks, update.total);
} else if (update.type == 'update') {
} else if (update.type == 'progress_message') {
emitter.show_progress_message(update.body);
} else if (update.type == 'log') {
$('#download-log').attr('href', update.file);
} else {
// todo: toggle in progress bar
}
}


function StatusEmitter() {
var console = $('#status-console');

this.refresh_status = function(status, source){
if (source == 'aspace') {
$("#status-console div:last p.aspace").html(status);
} else {
$("#status-console div:last p.aspace").remove();
$("#status-console div:last span.progress-message").html(" - Done");
console.append("<div class=\"status " + source + "\"><p class=\"main\">"+status+"</p><p class=\"aspace\"></p></div>");
var statusBox = $('#status-console');

this.last_status = function() {
return statusBox.children('div.status:last');
}

this.add_status = function(status) {
last_status = this.last_status();
console.log(last_status);
if (last_status.length) {
last_status.addClass("collapsed");
last_status.children('div.updates').children('p:last').children('span.progress').remove();
last_status.children('div.updates').children('p.flash').remove();
}
statusBox.append("<div class=\"status\"><div class=\"main\">"+status+" <a href=\"#\" class=\"toggleUpdates\"> (+/-)</a></div><div class=\"updates\"></div></div>");

last_status = this.last_status();
toggler = last_status.children('div.main').children('a.toggleUpdates');

toggler.on('click', function(e) {
$(this).parent().parent().toggleClass('collapsed');
});
}

this.show_error = function(error){
console.addClass('error');
console.append("<p class='error'><b>"+error+"</b></p>");
this.show_error = function(body){
last_status = this.last_status();
if (!last_status.length) {
this.add_status('Migration Errors');
last_status = this.last_status();
}

html = "<p class='error'><b>"+body+"</b></p>";
last_status.children('div.updates').append(html);
}

this.show_warning = function(warning){
console.append("<p class='warn'>" + warning + "</p>");
this.show_update = function(body, source){
source = typeof source !== 'undefined' ? source : 'migration';
last_status = this.last_status();
last_status.children('div.updates').children('p:last').children('span.progress').remove();

last_status.children('div.updates').children('p.flash').remove();

html = "<p class='update "+source+"'>" + body + "</p>";
last_status.children('div.updates').append(html);
}

this.show_progress = function(ticks, total) {
var percent = Math.round((ticks / total) * 100);
$("#status-console div:last span.progress").remove();
$("#status-console div:last p:last").append("<span class='progress'> " + percent + "%</span>");
percent = Math.round((ticks / total) * 100);
last_status = this.last_status();

last_status.children('div.updates').children('p:last').children('span.progress').remove();
html = "<span class='progress'> " + percent + "%</span>";
last_status.children('div.updates').children('p:last').append(html);
}

this.show_progress_message = function(body) {
$("#status-console div:last p:first span.progress-message").remove();
$("#status-console div:last p:first").append("<span class='progress-message'> - " + body + "</span>");
$("#status-console div.status:last div.updates span.progress-message").remove();
$("#status-console div.status:last div.updates p.migration:last").append("<span class='progress-message'> - " + body + "</span>");
}

this.flash = function(body, source){
source = typeof source !== 'undefined' ? source : 'migration';
last_status = this.last_status();

last_status.children('div.updates').children('p:last').children('span.progress').remove();
last_status.children('div.updates').children('p.flash').remove();

html = "<p class='update flash "+source+"'>" + body + "</p>";
last_status.children('div.updates').append(html);
}


}


42 changes: 31 additions & 11 deletions app/lib/archivesspace_client.rb
@@ -36,8 +36,10 @@ def self.initialized?

module HTTP

def init_session
$log.debug("Logging into ArchivesSpace")
def init_session(triesleft = 20)
$log.debug("Attempt logging into ArchivesSpace")
Thread.current[:backend_session] = nil

url = URI("#{@url}/users/#{@user}/login")
raise URIException, "URI format error: #{@url}" unless URI::HTTP === url

@@ -46,14 +48,21 @@ def init_session

response = JSONModel::HTTP.do_http_request(url, req)
unless response.code == '200'
raise "Couldn't log into ArchivesSpace and start a session"
if triesleft > 0
$log.debug("Log in failed: try again in 1 second")
sleep(1)
init_session(triesleft - 1)
else
raise "Giving up: couldn't log into ArchivesSpace and start a session"
end
end

json = JSON::parse(response.body)
@session = json['session']

# for JSONModel
Thread.current[:backend_session] = @session
$log.debug("New backend session: #{@session}")
end


@@ -142,12 +151,16 @@ def import(y)

# if cache.empty? && seen_records.empty?
if cache.empty? && working_file.size == 0
$log.warn("Empty batch: aborting, not saving")
return {}
$log.warn("Empty batch: not saving")
return {}
end

# save the batch
$log.debug("Posting import batch")

init_session # log in before posting a batch
$log.debug("Using session: #{Thread.current[:backend_session]}")

cache.save! do |response|
if response.code.to_s == '200'

@@ -163,10 +176,12 @@ def import(y)
end
end
rescue JSON::ParserError => e
$log.debug("JSON parse error parsing chunk #{chunk}")
y << json_chunk({
:type => 'error',
:body => e.to_s
})
return false
end
end

@@ -175,6 +190,7 @@ def import(y)
:type => 'error',
:body => "ArchivesSpace server error: #{response.code}"
})
return false
end
end

@@ -195,7 +211,7 @@ def normalize_message(message)
end
elsif message['saved'] && message['saved'].is_a?(Hash)
r = {
:type => 'status',
:type => 'update',
:source => 'aspace',
:body => "Saved #{message['saved'].keys.count} records"
}
@@ -204,16 +220,16 @@ def normalize_message(message)
elsif message['status'].respond_to?(:length)
message['status'].each do |status|
if status['type'] == 'started'
r = {
:type => 'status',
r = {
:source => 'aspace',
:body => status['label'],
:id => status['id']
}
r[:type] = r[:body] =~ /^Saved/ ? :update : :flash
yield r
elsif status['type'] == 'refresh'
r = {
:type => 'update',
:type => 'flash',
:source => 'migration',
:body => status['label'],
}
@@ -246,9 +262,13 @@ def read(chunk)
# do nothing because we're treating the response as a stream
elsif chunk =~ /\A\n\]\Z/
# the last message doesn't have a comma, so it's a fragment
yield ASUtils.json_parse(@fragments.sub(/\n\Z/, ''))
s = @fragments.sub(/\n\Z/, '')
@fragments = ""
yield ASUtils.json_parse(s)
elsif chunk =~ /.*,\n\Z/
yield ASUtils.json_parse(@fragments + chunk.sub(/,\n\Z/, ''))
s = @fragments + chunk.sub(/,\n\Z/, '')
@fragments = ""
yield ASUtils.json_parse(s)
else
@fragments << chunk
end