Commit

Merge pull request #6 from lcdhoffman/master
Merge lcdhoffman repo into master repo
quoideneuf committed Jan 10, 2014
2 parents bab9d3a + ba0fc39 commit a5f90fc
Showing 12 changed files with 443 additions and 163 deletions.
5 changes: 2 additions & 3 deletions Gemfile
@@ -6,7 +6,7 @@ gem 'execjs'
gem 'jsmin'
gem 'json'
gem 'json-schema', '= 1.0.10'
gem 'lrucache'
gem 'rufus-lru'
gem 'mime-types'
gem 'minitest'
gem 'multi_json'
@@ -15,9 +15,8 @@ gem 'puma'
gem 'rack'
gem 'rack-protection'
gem 'rack-test'
gem 'rdoc'
gem 'rspec'
gem 'rubyzip'
gem 'sinatra'
gem 'sinatra-assetpack'
gem 'zip'
gem 'tilt', '= 1.4.1'
65 changes: 65 additions & 0 deletions Gemfile.lock
@@ -0,0 +1,65 @@
GEM
remote: https://rubygems.org/
specs:
atomic (1.1.14-java)
diff-lcs (1.2.5)
execjs (2.0.2)
jsmin (1.0.1)
json (1.8.1-java)
json-schema (1.0.10)
mime-types (2.0)
minitest (5.2.0)
multi_json (1.8.2)
net-http-persistent (2.9)
puma (2.7.1-java)
rack (>= 1.1, < 2.0)
rack (1.5.2)
rack-protection (1.5.1)
rack
rack-test (0.6.2)
rack (>= 1.0)
rspec (2.14.1)
rspec-core (~> 2.14.0)
rspec-expectations (~> 2.14.0)
rspec-mocks (~> 2.14.0)
rspec-core (2.14.7)
rspec-expectations (2.14.4)
diff-lcs (>= 1.1.3, < 2.0)
rspec-mocks (2.14.4)
rubyzip (1.1.0)
rufus-lru (1.0.5)
sinatra (1.4.4)
rack (~> 1.4)
rack-protection (~> 1.4)
tilt (~> 1.3, >= 1.3.4)
sinatra-assetpack (0.3.1)
jsmin
rack-test
sinatra
tilt (>= 1.3.0)
tilt (1.4.1)

PLATFORMS
java

DEPENDENCIES
atomic
bundler
execjs
jsmin
json
json-schema (= 1.0.10)
mime-types
minitest
multi_json
net-http-persistent
puma
rack
rack-protection
rack-test
rspec
rubyzip
rufus-lru
sinatra
sinatra-assetpack
tilt (= 1.4.1)
98 changes: 17 additions & 81 deletions README.md
@@ -2,96 +2,32 @@ Archon2ArchivesSpace README
================
# System Requirements

You will need to have Ruby 1.9.3 installed to run this service
You will need to have java installed to run this service. Example:

ruby --version
# example output: ruby 1.9.3p429 (2013-05-15 revision 40747)
java -version
-> Java(TM) SE Runtime Environment (build 1.7.0_17-b02)
-> Java HotSpot(TM) 64-Bit Server VM (build 23.7-b01, mixed mode)

If your system has a different version of Ruby installed, the simplest way to
leave your system intact and get 1.9.3 is to install RVM (https://rvm.io/).
# Running the service

# Installing the service
Download the .war file from the Releases page: https://github.com/lcdhoffman/archon-migration/releases

Download a release or just checkout the project from Github:
To run the service:

git clone https://github.com/lcdhoffman/archon-migration.git
cd archon-migration

Run a script to download the necessary ArchivesSpace libraries:

./scripts/import\_client\_libs.sh v1.0.0RC1

This will attempt to download the ArchivesSpace source code for ArchivesSpace v1.0.0RC1.
*Note: the service ships with libraries for ArchivesSpace 1.0.0, so you can skip this step
if you are targeting 1.0.0.

Install the application dependencies listed in the Gemfile:

gem install bundler
bundle install

Now run the application:

ruby app/main.rb

The service runs on port 4568 by default. To change this:

touch config/config_local.rb
echo "Appdata.port_number YOUR_FAVORITE_PORT_HERE" >> config/config_local.rb

# Daemonizing the Service

The service can be daemonized in several ways. One option is to install a native
ruby solution such as the Daemonize gem (http://daemons.rubyforge.org/). However,
since the service is intended to be short-lived, it may be easiest to simply
send the process to the background and disown it.

# Using the Service
java -jar archon-migration.war [--httpPort=XXXX]

The service is designed to be used in a browser window. Make sure you have a
running Archon instance and a running ArchiveSpace instance. You will also need
account credentials for each service. It is recommended that you create a
separate account called 'migration_user' and assign this user the required
permissions in each application.
This will start the application within an embedded webserver. The default port of the webserver is 8080.

Point your browser to, e.g., http://localhost:4568 and fill out the web form.
# Building the distribution

# Configuration Options
You can build a distribution by cloning the source code:

The best way to configure the application is to create a local config file:

touch config/config_local.rb

To change, for example, the version of the ArchivesSpace target, add the following
line

Appdata.aspace_version 'v1.0.1'

If Archon response times become slow due to network latency or large datasets, it is
possible to speed up successive tests by turning on database caching. Note that you must manually delete
the database if you point the migration tool at a new Archon instance.

Appdata.use_dbcache true

*Note: this feature is not complete and should be left off by default.

# Notes

A typical migration can take several hours and could cause ArchivesSpace's
indexer to get backed up. Migrated records may not appear right away in browse or search results in ArchivesSpace. Consider running ArchivesSpace with the indexer
turned off to speed up the migration process, or upgrading to a later version of ArchivesSpace.

Do not run a migration process against an ArchivesSpace instance that already
contains data.

Do not allow Archon users to create or edit data while the migration is running.

Do not allow ArchivesSpace users to create or edit data while the migration is
running.
git clone https://github.com/lcdhoffman/archon-migration.git
cd archon-migration

You can optimize the performance of the migration tool by adjusting the number of
pages of Archon data that are cached. For example, if your largest Archon collection contains 50,000 Content records, and you are running the migration tool in an environment that can afford around 300MB of memory, you might want to add this line to your config_local.rb file:
and using warbler to build a web application archive:

Appdata.archon_page_cache_size 500
gem install warbler
warble executable war

There's no (or little) advantage to setting the page cache size to a value larger than the number of Content records in the largest Collection, divided by 100. There is a significant disadvantage to keeping your page cache size smaller than the number of pages of items in your largest collection.
Now visit your application at http://localhost:8080
97 changes: 97 additions & 0 deletions README_MRI.md
@@ -0,0 +1,97 @@
Archon2ArchivesSpace README
================
# System Requirements

You will need to have Ruby 1.9.3 installed to run this service

ruby --version
# example output: ruby 1.9.3p429 (2013-05-15 revision 40747)

If your system has a different version of Ruby installed, the simplest way to
leave your system intact and get 1.9.3 is to install RVM (https://rvm.io/).

# Installing the service

Download a release or just checkout the project from Github:

git clone https://github.com/lcdhoffman/archon-migration.git
cd archon-migration

Run a script to download the necessary ArchivesSpace libraries:

./scripts/import\_client\_libs.sh v1.0.0RC1

This will attempt to download the ArchivesSpace source code for ArchivesSpace v1.0.0RC1.
*Note: the service ships with libraries for ArchivesSpace 1.0.0, so you can skip this step
if you are targeting 1.0.0.

Install the application dependencies listed in the Gemfile:

gem install bundler
bundle install

Now run the application:

ruby app/main.rb

The service runs on port 4568 by default. To change this:

touch config/config_local.rb
echo "Appdata.port_number YOUR_FAVORITE_PORT_HERE" >> config/config_local.rb

# Daemonizing the Service

The service can be daemonized in several ways. One option is to install a native
ruby solution such as the Daemonize gem (http://daemons.rubyforge.org/). However,
since the service is intended to be short-lived, it may be easiest to simply
send the process to the background and disown it.

# Using the Service

The service is designed to be used in a browser window. Make sure you have a
running Archon instance and a running ArchiveSpace instance. You will also need
account credentials for each service. It is recommended that you create a
separate account called 'migration_user' and assign this user the required
permissions in each application.

Point your browser to, e.g., http://localhost:4568 and fill out the web form.

# Configuration Options

The best way to configure the application is to create a local config file:

touch config/config_local.rb

To change, for example, the version of the ArchivesSpace target, add the following
line

Appdata.aspace_version 'v1.0.1'

If Archon response times become slow due to network latency or large datasets, it is
possible to speed up successive tests by turning on database caching. Note that you must manually delete
the database if you point the migration tool at a new Archon instance.

Appdata.use_dbcache true

*Note: this feature is not complete and should be left off by default.

# Notes

A typical migration can take several hours and could cause ArchivesSpace's
indexer to get backed up. Migrated records may not appear right away in browse or search results in ArchivesSpace. Consider running ArchivesSpace with the indexer
turned off to speed up the migration process, or upgrading to a later version of ArchivesSpace.

Do not run a migration process against an ArchivesSpace instance that already
contains data.

Do not allow Archon users to create or edit data while the migration is running.

Do not allow ArchivesSpace users to create or edit data while the migration is
running.

You can optimize the performance of the migration tool by adjusting the number of
pages of Archon data that are cached. For example, if your largest Archon collection contains 50,000 Content records, and you are running the migration tool in an environment that can afford around 300MB of memory, you might want to add this line to your config_local.rb file:

Appdata.archon_page_cache_size 500

There's no (or little) advantage to setting the page cache size to a value larger than the number of Content records in the largest Collection, divided by 100. There is a significant disadvantage to keeping your page cache size smaller than the number of pages of items in your largest collection.
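
The sizing rule above can be sketched as a small calculation. This is a hypothetical helper, not part of the migration tool; the page size of 100 records is an assumption inferred from the README's "divided by 100".

```ruby
# Hypothetical helper illustrating the README's sizing rule.
# Assumption: Archon serves Content records in pages of 100,
# as implied by the README's "divided by 100".
RECORDS_PER_PAGE = 100

# Returns the number of pages needed to hold every Content record
# of the largest collection, rounding up for a partial last page.
def suggested_page_cache_size(largest_collection_records, records_per_page = RECORDS_PER_PAGE)
  (largest_collection_records.to_f / records_per_page).ceil
end

puts suggested_page_cache_size(50_000) # the README's example collection
```

For the README's example of 50,000 Content records this gives the same value, 500, that the README suggests for `Appdata.archon_page_cache_size`.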
10 changes: 4 additions & 6 deletions app/lib/archon_client.rb
@@ -2,7 +2,7 @@
require_relative 'startup'
require 'net/http/persistent'
require 'json'
require 'lrucache'
require 'rufus-lru'

module Archon

@@ -102,7 +102,7 @@ def get_container_type(id)
class ArchonRecord
include RecordSetupHelpers
include EnumLookupHelpers
@@cache = LRUCache.new(:max_size => 300, :default => false, :ttl => 600)
@@cache = Rufus::Lru::Hash.new(300)

def self.each(instantiate=true)
raise NoArchonClientException unless Thread.current[:archon_client]
@@ -200,7 +200,7 @@ def self.unfound(id = '0')

def self.find(id)
import_id = import_id_for(id)
if @@cache[import_id].nil?
if @@cache.has_key?(import_id) && @@cache[import_id].nil?
return unfound(id)
end
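
The `has_key?` guard in this hunk matters because a hash-style lookup returns `nil` both for a key that was never stored and for a key whose stored value is `nil`. A minimal sketch, assuming a hypothetical `TinyLru` class built on a plain Ruby Hash (this is neither the rufus-lru implementation nor the project's code), shows the distinction:

```ruby
# Minimal LRU cache sketch. TinyLru is a hypothetical name used for
# illustration only; it is not the rufus-lru implementation.
class TinyLru
  def initialize(max_size)
    @max_size = max_size
    @store = {} # Ruby hashes preserve insertion order (1.9+)
  end

  def has_key?(key)
    @store.key?(key)
  end

  # Returns the stored value (which may itself be nil), or nil on a miss.
  def [](key)
    return nil unless @store.key?(key)
    value = @store.delete(key)
    @store[key] = value # re-insert to mark as most recently used
    value
  end

  def []=(key, value)
    @store.delete(key)
    @store[key] = value
    # Evict the least recently used entry once the limit is exceeded.
    @store.delete(@store.keys.first) if @store.size > @max_size
    value
  end
end

cache = TinyLru.new(2)
cache[:a] = nil       # a lookup remembered as "not found"
cache[:b] = 'record'  # a lookup remembered as found

# Both lookups return nil, so only has_key? tells them apart.
puts cache.has_key?(:a)       # true: nil was cached deliberately
puts cache.has_key?(:missing) # false: never looked up
```

With a guard of the form `has_key?(id) && cache[id].nil?`, only a deliberately cached `nil` short-circuits the lookup; a key that was never cached falls through to the normal fetch path.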

@@ -354,9 +354,7 @@ def http


def get_json(endpoint, usecache=true)
@http_cache ||= LRUCache.new(
:max_size => Appdata.archon_page_cache_size,
:default => false)
@http_cache ||= Rufus::Lru::Hash.new(Appdata.archon_page_cache_size)

# look at in-memory cache
if @http_cache[endpoint]
9 changes: 2 additions & 7 deletions app/lib/migrate.rb
@@ -2,7 +2,7 @@
require_relative 'archon_client'
require_relative 'archivesspace_client'
require_relative 'migration_helpers'
require 'zip/zip'
require 'zip'


class MigrationJob
@@ -24,11 +24,6 @@ def initialize(params = {})

Archon.record_type(:digitalfile).base_url = @args[:do_baseurl]

# 1 job per thread
raise "Job thread occupied." if Thread.current[:archon_migration_job]
Thread.current[:archon_migration_job] = self


@aspace = ArchivesSpace::Client.new(
:url => @args[:aspace_url],
:user => @args[:aspace_user],
@@ -503,7 +498,7 @@ def package_digital_files
directory = Dir.tmpdir + "/archon_bitstreams/"
zipfile_name = File.join(File.dirname(__FILE__), '../', 'public', 'bitstreams.zip')

Zip::ZipFile.open(zipfile_name, Zip::ZipFile::CREATE) do |zipfile|
Zip::File.open(zipfile_name, Zip::File::CREATE) do |zipfile|
Dir.glob("#{directory}*.*").each do |file|
zipfile.add(file.sub(directory, ''), file)
end