A lightweight Python application to generate a custom sort order from ArchivesSpace.
The application performs the following tasks on each row of a CSV report containing information about all archival objects within a given collection:
- Retrieves the JSON record for the object
- Extracts the ancestor array from the JSON and reverses it so that the collection-level record is first, etc.
- Retrieves JSON for each ancestor, excluding the collection-level record, as resource records do not have position values
- Extracts the position value from each ancestor and, if the position value is less than 5 digits in length, adds leading zeros. The largest collection in ArchivesSpace has ~88000 records linked to it, meaning that even if the collection were totally flat there would not be a position value greater than 5 digits in length
- Concatenates the position values into a single string, with a dot separating the positions. The sort order will end up looking something like this: 00001.00027.00005.00435
- Adds this sort order value as a new column alongside the data from the input CSV file
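
The ordering and padding steps above can be sketched roughly as follows. This is an illustration only, not the application's actual code: the function names are hypothetical, and the `ancestors` and `ref` key names are assumptions about the shape of the archival object JSON.

```python
def ancestor_refs_top_down(record):
    """Return ancestor URIs ordered from the top of the hierarchy down.

    Assumes the record holds an "ancestors" list of {"ref": ...} entries
    ordered from nearest parent to collection (an assumption, see above).
    """
    ancestors = record.get("ancestors", [])
    refs = [ancestor["ref"] for ancestor in reversed(ancestors)]
    # Skip the collection-level (resource) record: it has no position value.
    return [ref for ref in refs if "/resources/" not in ref]


def build_sort_order(positions, width=5):
    """Zero-pad each position to `width` digits and join them with dots."""
    return ".".join(str(position).zfill(width) for position in positions)


print(build_sort_order([1, 27, 5, 435]))
```

Because every segment has the same width, comparing these strings as plain text gives the same order as comparing the positions numerically, level by level.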
- Running the executable file requires a Mac. You do not need to install Python or any other dependencies. You can just double-click on the file to run the script.
- Running the Python script requires Python 3.8+ and the requests and rich third-party libraries
This application takes the custom All Archival Objects report (CSV format) as input. This report is generated within the ArchivesSpace staff interface. To run this report:
- Click on the gear icon next to the repository name in the staff interface
- Select Reports from the drop-down menu
- Click on the All Archival Objects report
- Enter the call number of the collection in the Call Number box
- Select CSV from the Format drop-down menu
- Click Start Job to start the report job
- When the report finishes, click the Refresh Page button
- Click the Download Report link to download the report
The application comes with a config.json file, which allows the user to specify ArchivesSpace login information and the path to the input CSV file.
Sample config.json formatting:
{
"input_csv": "full/path/to/input_csv.csv",
"aspace_api_url": "https://archivesspace.library.yale.edu/api",
"aspace_username": "yourusername",
"aspace_password": "yourpassword"
}
If the configuration file is not completed, the application will prompt the user for each of these inputs.
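
A minimal sketch of that fallback behavior, assuming "not completed" means a missing file, a missing key, or an empty value. The `load_config` function and its `prompt` parameter are hypothetical and are not taken from the application itself.

```python
import json

# Keys shown in the sample config.json above.
REQUIRED_KEYS = ("input_csv", "aspace_api_url", "aspace_username", "aspace_password")


def load_config(path="config.json", prompt=input):
    """Load the config file, prompting for any value that is missing or empty."""
    try:
        with open(path, encoding="utf-8") as config_file:
            config = json.load(config_file)
    except FileNotFoundError:
        # No config file at all: every value will be prompted for.
        config = {}
    for key in REQUIRED_KEYS:
        if not config.get(key):
            config[key] = prompt(f"Enter {key}: ")
    return config
```

Passing the prompt function in as a parameter keeps the sketch easy to test; a real script would likely use `getpass.getpass` for the password so it is not echoed to the terminal.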
Double-click on the executable file to run the script. If the configuration file is complete, the script will begin immediately. If not, the user will be prompted to enter the input file path and ArchivesSpace login data.
Depending on the number of archival objects associated with the collection, the script could take a while to run (test runs completed approximately 675 records per minute). A progress bar will appear which includes the number of records processed and the estimated time remaining.
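
As a rough guide, the run time can be estimated from the test-run rate quoted above. The helper below is a hypothetical illustration; actual throughput will depend on network conditions and server load.

```python
def estimated_minutes(record_count, records_per_minute=675):
    """Estimate run time in minutes from the approximate test-run rate."""
    return record_count / records_per_minute


# A collection of roughly 88000 records, the largest size mentioned above.
print(round(estimated_minutes(88000)))
```

At that rate, a collection of about 88000 records would take a little over two hours.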
The application outputs a CSV file and stores it in the same directory as the input file. The filename will be the same as the input file, with the addition of _output at the end of the filename, for example full/path/to/input_csv_output.csv
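
One way to derive that output path is sketched below using `pathlib`. The function name is hypothetical, and the application may build the path differently.

```python
from pathlib import Path


def output_path_for(input_csv):
    """Return the input path with _output inserted before the file extension."""
    path = Path(input_csv)
    return path.with_name(f"{path.stem}_output{path.suffix}")


print(output_path_for("full/path/to/input_csv.csv").as_posix())
```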
The output CSV file includes a new first column, sort_order, which stores the sort order that is generated during the process. The values in this column can be sorted within spreadsheet software, a script, or another application as needed.
NOTE: If sorting in spreadsheet software, it is important to specifically open the file as plain text. Many spreadsheet applications default to General format, which can cause leading zeros to be dropped from the sort order values. If this happens, the sort order will not work properly.
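
If sorting in a script instead, the values can be compared as plain strings, which sidesteps the leading-zero problem entirely. The sketch below uses Python's standard `csv` module. The function name and the `title` column in the sample data are hypothetical; only the sort_order column name comes from the description above.

```python
import csv
import io


def sort_rows_by_sort_order(csv_text):
    """Parse CSV text and return its rows sorted by the sort_order column."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Values are kept as strings, so leading zeros are preserved.
    return sorted(reader, key=lambda row: row["sort_order"])


sample = (
    "sort_order,title\n"
    "00001.00010,Series 1 / File 10\n"
    "00001,Series 1\n"
    "00001.00002.00005,Series 1 / File 2 / Item 5\n"
)
for row in sort_rows_by_sort_order(sample):
    print(row["sort_order"], row["title"])
```

Because each segment is zero-padded to the same width, a parent record sorts before its children, and siblings sort by position.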
If using Excel, follow these steps to ensure that the output file is opened in plain text:
- Open a blank workbook in Excel
- Select Data > Get External Data > Import Text File...
- In Step 1 of the Text Wizard, select the Delimited radio button and press Next >
- In Step 2, check the Comma box in the Delimiters menu
- In Step 3, click on the first column to highlight it
- Select the Text radio button in the Column data format menu
- Click Finish
- Select the Existing sheet radio button from the next menu and click OK
- The spreadsheet will populate with the formatted data. To sort, click the Data > Sort button.
- In the sort menu, check the My list has headers box and select sort_order from the Column drop-down. Click OK.
- You may receive a Sort Warning asking whether you should "Sort anything that looks like a number, as a number" or "Sort numbers and numbers stored as text separately". Select "Sort numbers and numbers stored as text separately" and click OK