The mappings file
MIK's CSV and CONTENTdm metadata parsers need to know which elements in the source metadata record for an object correspond to which MODS elements. MIK defines these mappings in a CSV file that is separate from the MIK .ini file, but is referenced in the .ini file in the [METADATA_PARSER]
mapping_csv_path
setting.
The mapping file contains two columns. The one on the left identifies the field names in the "source" metadata record, and the one on the right defines the "target" XML snippet to hold the value of the corresponding source field. Two important things about the snippets:
- They must be well-formed XML (that is, opening and closing tags must match, and attributes must follow XML attribute syntax). You can check the well-formedness of your snippets by running the
./mik --config foo.ini --checkconfig snippets
command. This command does not validate your snippets against a schema.
- They must include all XML from the first child of the root element down; that is, they are appended to the root element of the MODS, DC, etc. XML.
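For a quick check outside of MIK, the well-formedness rule can be sketched in Python. This is an independent illustration, not how MIK's --checkconfig option is implemented; the function name is_well_formed and the throwaway root wrapper element are assumptions made for the example.

```python
import xml.etree.ElementTree as ET


def is_well_formed(snippet: str) -> bool:
    """Return True if the snippet parses as XML once wrapped in a dummy root."""
    # A snippet may hold several sibling elements, so wrap it in a
    # throwaway root element (hypothetical name) before parsing.
    try:
        ET.fromstring(f"<root>{snippet}</root>")
        return True
    except ET.ParseError:
        return False


# A matched snippet passes; one with unclosed tags does not.
print(is_well_formed("<titleInfo><title>%value%</title></titleInfo>"))  # True
print(is_well_formed("<extent>%value%><extent>"))  # False
```

Like the MIK command, this only checks well-formedness; it does not validate against the MODS schema.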
You can use a text editor to create your mapping file, or a spreadsheet application like Excel or Google Sheets. The MIK Metadata Mappings Helper is a simple Google Sheets application that provides a list of commonly used target MODS snippets, and a way to check snippets for syntax errors. Note that some applications (Excel and Sheets included) will escape parts of a CSV file by adding an extra set of double quotation marks. This is fine; MIK can handle any valid CSV conventions. These double quotation marks do have one undesirable side effect: they will cause your snippets to fail well-formedness tests. The best way to work around this problem is to check your snippets for well-formedness before saving your spreadsheet as CSV.
The first row of your mapping file should not contain any column headings.
Snippets can (and usually do) contain the special %value%
placeholder. MIK replaces this string with the value of the source metadata field. For example, if your incoming Title is "Photograph of a dog" and Title is mapped to the MODS snippet
<titleInfo><title>%value%</title></titleInfo>
the resulting MODS markup will look like
<titleInfo><title>Photograph of a dog</title></titleInfo>
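The substitution can be pictured as a plain string replacement. The sketch below is illustrative only: the function name fill_snippet is hypothetical, and escaping the value for XML is an assumption about sensible behavior rather than a description of what MIK does internally.

```python
from xml.sax.saxutils import escape


def fill_snippet(snippet: str, value: str) -> str:
    """Replace every %value% placeholder with the XML-escaped field value."""
    # Escaping &, < and > is an assumption made here so that the result
    # stays well-formed; this page does not describe MIK's own escaping.
    return snippet.replace("%value%", escape(value))


print(fill_snippet("<titleInfo><title>%value%</title></titleInfo>", "Photograph of a dog"))
# <titleInfo><title>Photograph of a dog</title></titleInfo>
```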
All three sample mapping files below were created in Google Sheets and then exported using the File/Download as/Comma-separated values menus.
The first describes a collection of audio interviews. It is fairly simple, since the MODS snippets don't contain any attributes:
Title,<titleInfo><title>%value%</title></titleInfo>
Creator,<name><namePart>%value%</namePart></name>
Subject,<subject><topic>%value%</topic></subject>
Description,<abstract>%value%</abstract>
Date,<originInfo><dateIssued>%value%</dateIssued></originInfo>
Language,<language><languageTerm>%value%</languageTerm></language>
Duration,<physicalDescription><extent>%value%</extent></physicalDescription>
Rights,<accessCondition>%value%</accessCondition>
Type,<genre>Interviews (Sound recordings)</genre>
This example builds on the first one by adding attributes. Notice that quotation marks around attribute values are escaped with an additional set of quotation marks, which were added by Google Sheets as part of the CSV export. This is not necessary, but it is allowed:
Title,<titleInfo><title>%value%</title></titleInfo>
Creator,"<name><namePart>%value%</namePart><role><roleTerm type=""text"" authority=""marcrelator"">creator</roleTerm><roleTerm type=""code"" authority=""marcrelator"">cre</roleTerm></role></name>"
Subject,<subject><topic>%value%</topic></subject>
Description,<abstract>%value%</abstract>
Date,"<originInfo><dateIssued encoding=""w3cdtf"" keyDate=""yes"">%value%</dateIssued></originInfo>"
Language,"<language><languageTerm type=""code"" authority=""iso639-2b"">%value%</languageTerm></language>"
Duration,<physicalDescription><extent>%value%</extent></physicalDescription>
Rights,"<accessCondition type=""use and reproduction"">%value%</accessCondition>"
Type,"<genre authority=""lcsh"">Interviews (Sound recordings)</genre>"
Type of resource,<typeOfResource>sound recording-nonmusical</typeOfResource>
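To see why the doubled quotation marks are harmless, it can help to read one exported row back with a standard CSV parser. The sketch below uses Python's csv module purely as an illustration (MIK itself is not written in Python); the variable names are made up for the example.

```python
import csv
import io

# One row as exported by a spreadsheet: the snippet is wrapped in double
# quotes and each inner quotation mark is doubled.
exported = 'Rights,"<accessCondition type=""use and reproduction"">%value%</accessCondition>"\n'

# A CSV parser strips the wrapping quotes and collapses each doubled
# quotation mark back into a single one.
source_field, snippet = next(csv.reader(io.StringIO(exported)))
print(source_field)
print(snippet)
```

The parsed snippet has ordinary single quotation marks around the attribute value, which is the form to use when checking well-formedness.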
This last example is a bit more complex than the first two. It illustrates how to deal with source metadata that doesn't map to MODS elements. You can wrap your non-MODS snippet elements in MODS' <extension>
element, as illustrated here:
Calendar name,<titleInfo><title>%value%</title></titleInfo>
School name,"<name type=""corporate""><namePart>%value%</namePart></name>"
Medium,<physicalDescription><form>%value%</form></physicalDescription>
Work Measurements,<physicalDescription><note>%value%</note></physicalDescription>
Publisher,<originInfo><publisher>%value%</publisher></originInfo>
Year,<originInfo><dateIssued>%value%</dateIssued></originInfo>
Format type,<genre>%value%</genre>
President,"<extension><president type=""ECU custom metadata for the ecucals collection"">%value%</president></extension>"
Board members,"<extension><board_members type=""ECU custom metadata for the ecucals collection"">%value%</board_members></extension>"
Administrators,"<extension><administrators type=""ECU custom metadata for the ecucals collection"">%value%</administrators></extension>"
Instructors,"<extension><instructors type=""ECU custom metadata for the ecucals collection"">%value%</instructors></extension>"
"Staff(technicians,support staff)","<extension type=""ECU custom metadata for the ecucals collection""><staff>%value%</staff></extension>"
Degree/Diplomas/Programs,<subject><topic>%value%</topic></subject>
Majors/Concentration,<subject><topic>%value%</topic></subject>
Honorary Degree Recipients,"<extension><honorary_degree_recipients type=""ECU custom metadata for the ecucals collection"">%value%</honorary_degree_recipients></extension>"
Scholarships/Awards Recipients,"<extension><scholarship_award_recipients type=""ECU custom metadata for the ecucals collection"">%value%</scholarship_award_recipients></extension>"
Notes,<note>%value%</note>
If a source metadata field is not represented in your mappings file, MIK ignores it. So for example, if:
- your mappings file contains no XML snippet that corresponds to a source metadata field
- your mappings file contains a row that has a source field name but an empty (blank) snippet
- your mappings file contains a row with a misspelled source field name
MIK doesn't add that metadata to the MODS documents it creates.
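The three cases above all reduce to "no usable snippet was found for this field name". A minimal sketch of that rule follows, assuming the mappings have already been loaded into a dictionary; the function name mapped_snippets and the sample data are hypothetical, and this is not MIK's actual code.

```python
def mapped_snippets(mappings: dict[str, str], record: dict[str, str]) -> list[str]:
    """Return filled-in snippets for the record fields that have a usable mapping."""
    output = []
    for field, value in record.items():
        snippet = mappings.get(field, "").strip()
        if not snippet:
            # No row for this field name, or a row with a blank snippet: skip it.
            continue
        output.append(snippet.replace("%value%", value))
    return output


mappings = {
    "Title": "<titleInfo><title>%value%</title></titleInfo>",
    "Description": "",  # blank snippet
    "Creater": "<name><namePart>%value%</namePart></name>",  # misspelled field name
}
record = {"Title": "Interview 1", "Description": "An interview.", "Creator": "Jane Doe"}

# Only Title produces markup; Description and Creator are silently dropped.
print(mapped_snippets(mappings, record))
```

Because the misspelled row fails silently, comparing the field names in your mappings file against your source metadata is worthwhile when expected elements are missing from the output.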
You can add target elements to your Islandora metadata that don't have a corresponding source element by using "null mappings". To do so, use null
plus an integer as the source field name placeholder, as illustrated below:
null0,<accessCondition>This resource is in the Public Domain.</accessCondition>
null1,<note type="additional physical form">Also available on microfiche.</note>
null3,<identifier type="uuid"/>
The first two rows in the mappings file will add the <accessCondition>
and <note type="additional physical form">
elements to the MODS file included in each ingest package. Some metadata manipulators use the null mappings to define a template used in adding dynamically generated values (like UUIDs) to MODS documents; the third row in the example above illustrates this.
The markup created by mapping from null source elements cannot contain the special %value%
placeholder used in other mappings - the markup is the same for every XML file. If you want to modify the markup based on some property of the object being created, you will need to use a metadata manipulator.
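The difference between null mappings and ordinary mappings can be sketched as follows. This is an illustration under stated assumptions, not MIK's implementation: the function name build_snippets and the pattern used to recognize null keys ("null" followed by digits) are inferred from the description above.

```python
import re

# Assumed form of a null mapping key: "null" followed by an integer.
NULL_KEY = re.compile(r"^null\d+$")


def build_snippets(rows, record):
    """Fill field-mapped snippets from the record and copy null-mapped snippets verbatim."""
    output = []
    for source, snippet in rows:
        if NULL_KEY.match(source):
            # Null mapping: the same markup is added for every object.
            output.append(snippet)
        elif source in record:
            output.append(snippet.replace("%value%", record[source]))
    return output


rows = [
    ("Title", "<titleInfo><title>%value%</title></titleInfo>"),
    ("null0", "<accessCondition>This resource is in the Public Domain.</accessCondition>"),
]
print(build_snippets(rows, {"Title": "Interview 1"}))
```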
Another way to use null mappings is to replace source metadata fields. For example, even though each of your source metadata records may have a field corresponding to the MODS <accessCondition>
snippet, you may want to not populate <accessCondition>
with the object-level values using the %value%
placeholder. Instead, you may choose to populate every object's MODS record with the same (new, improved, consistent) value. Doing this works the same way it does for other null mappings:
null0,<accessCondition>This resource is in the Public Domain.</accessCondition>
The mappings illustrated here relate the value of a single source metadata field to a single top-level MODS snippet. In this sense they are one-to-one mappings. More complex mappings that take the values of multiple source fields and map them to a MODS structure are also possible via a special metadata manipulator, which uses a template to generate MODS structures. These templates have access to all of the source metadata for an object, and can include internal logic. If you need more than one-to-one mappings, check out the documentation on the InsertXmlFromTemplate metadata manipulator.
If your needs are more complex than those that can be handled by MIK's mappings files, you may want to consider using the Templated metadata parser, which works with CONTENTdm and CSV toolchains.
MIK ignores any row in a mappings file that starts with a hash mark (#
). For example, in the following mappings file:
Title,<titleInfo><title>%value%</title></titleInfo>
Creator,<name><namePart>%value%</namePart></name>
Subject,<subject><topic>%value%</topic></subject>
Description,<abstract>%value%</abstract>
Date,<originInfo><dateIssued>%value%</dateIssued></originInfo>
Language,<language><languageTerm>%value%</languageTerm></language>
# Duration,<physicalDescription><extent>%value%</extent></physicalDescription>
Rights,<accessCondition>%value%</accessCondition>
Type,<genre>Interviews (Sound recordings)</genre>
the Duration mapping is ignored. This might be useful if you want to add comments to a mappings file, or temporarily remove a row while debugging or testing.
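The comment rule amounts to skipping any row whose first column begins with a hash mark. A minimal sketch, assuming a two-column file with no header row as described earlier; the function name load_mappings is hypothetical and this is not MIK's own loader.

```python
import csv
import io


def load_mappings(csv_text: str) -> list[tuple[str, str]]:
    """Parse mappings CSV text, skipping blank rows and rows that start with '#'."""
    rows = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row or row[0].startswith("#"):
            continue  # comment or empty line
        # Treat a missing second column as a blank snippet.
        snippet = row[1] if len(row) > 1 else ""
        rows.append((row[0], snippet))
    return rows


sample = (
    "Title,<titleInfo><title>%value%</title></titleInfo>\n"
    "# Duration,<physicalDescription><extent>%value%</extent></physicalDescription>\n"
    "Rights,<accessCondition>%value%</accessCondition>\n"
)
# Only the Title and Rights rows are returned.
print([source for source, _ in load_mappings(sample)])
```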
Content on the Move to Islandora Kit wiki is licensed under a Creative Commons Attribution 4.0 International License.