Showing 1 changed file with 114 additions and 10 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,13 +1,14 @@ | ||
# OAI-PMH | ||
|
||
OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is a low-barrier mechanism for repository interoperability. | ||
OAI-PMH (Open Archives Initiative Protocol for Metadata Harvesting) is a low-barrier mechanism for repository interoperability. [Documentation](http://www.openarchives.org/OAI/openarchivesprotocol.html) | ||
**Data Providers** are repositories that expose structured metadata via OAI-PMH. **Service Providers** then make OAI-PMH service requests to harvest that metadata. OAI-PMH is a set of six verbs or services that are invoked within HTTP. | ||
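
As a minimal sketch of what "invoked within HTTP" means in practice, the snippet below builds a request URL for one of the six verbs. The helper name `build_oai_request` and the base URL are assumptions made for illustration only; use the example URLs from the endpoint tables for the real address.

```python
from urllib.parse import urlencode

# The six verbs defined by the OAI-PMH 2.0 specification.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_request(base_url, verb, **params):
    """Return the URL for an OAI-PMH request (hypothetical helper)."""
    if verb not in OAI_VERBS:
        raise ValueError(f"Unknown OAI-PMH verb: {verb}")
    # The verb and its arguments are passed as ordinary query parameters.
    query = urlencode({"verb": verb, **params})
    return f"{base_url}?{query}"


# Assumption: the base URL below is a placeholder, not taken from this document.
BASE_URL = "https://127.0.0.1:5000/oai2d"
print(build_oai_request(BASE_URL, "ListRecords", metadataPrefix="oai_dc"))
```

A harvester would issue a GET request against the resulting URL and parse the XML response.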
|
||
|
||
## Endpoints | ||
|
||
### Public | ||
The following endpoints are public. The full documentation of the verbs can be found here: [Link](https://www.openarchives.org/OAI/openarchivesprotocol.html#ProtocolMessages) | ||
These endpoints/verbs are used by harvesters (other repositories) to request metadata of available records. Additionally, information about the repository and the available data formats is provided. | ||
|
||
|Verb |Description |Example URL| | ||
--- | --- |--- | ||
|
@@ -20,7 +21,7 @@ The following endpoints are public. The full documentation of the verbs can be f | |
|
||
|
||
### Non Public | ||
The following endpoints are only available to admins and all deal with OAI-PMH sets. | ||
The following endpoints are only available to admins and all deal with OAI-PMH sets. These endpoints will be revamped in a later update. | ||
|
||
|Request Methods |Description |Example URL| | ||
--- | --- |--- | ||
|
@@ -35,12 +36,115 @@ The following endpoints are only available as admin and all deal with sets of OA | |
|GET, HEAD, OPTIONS |Main page for sets |[https://127.0.0.1:5000/admin/oaiset/](https://127.0.0.1:5000/admin/oaiset/)| | ||
|
||
|
||
## Issues | ||
## Configuration | ||
|
||
- Records | ||
- Missing the `"_oai"` field necessary for record and identifier retrieval | ||
- elasticsearch index | ||
- It is possible to define the index used by elasticsearch for all relevant verbs via a config variable `OAISERVER_RECORD_INDEX='records'` | ||
- Currently, it is set to use the 'records' index which does not exist | ||
- The correct index for records on my machine looks like this `rdmrecords-records-record-v2.0.0-1621247047` | ||
- Available indices can be retrieved by visiting [http://localhost:9200/_cat/indices?v&health=yellow&pretty](http://localhost:9200/_cat/indices?v&health=yellow&pretty) | ||
#### Elasticsearch index | ||
The Elasticsearch index to be used. This will be set by `invenio-rdm-records`: | ||
```conf | ||
OAISERVER_RECORD_INDEX='records' | ||
``` | ||
|
||
#### OAI ID Prefix | ||
The prefix that will be applied to the generated OAI-PMH IDs. It should be the address of the repository (e.g. repository.tugraz.at): | ||
```conf | ||
OAISERVER_ID_PREFIX = 'repository.tugraz.at' | ||
``` | ||
|
||
#### Admin Emails | ||
The e-mail addresses of the repository's administrators: | ||
```conf | ||
OAISERVER_ADMIN_EMAILS = [ | ||
'[email protected]', | ||
] | ||
``` | ||
|
||
#### Available Metadata Formats | ||
Define the metadata formats available from a repository. These can be completely redefined or modified (extended, replaced, removed) as need be. | ||
```conf | ||
import pkg_resources  # needed for resource_filename below | ||
OAISERVER_METADATA_FORMATS = { | ||
    'oai_dc': { | ||
        'serializer': ( | ||
            'invenio_oaiserver.utils:dumps_etree', { | ||
                'xslt_filename': pkg_resources.resource_filename( | ||
                    'invenio_oaiserver', 'static/xsl/MARC21slim2OAIDC.xsl' | ||
                ), | ||
            } | ||
        ), | ||
        'schema': 'http://www.openarchives.org/OAI/2.0/oai_dc.xsd', | ||
        'namespace': 'http://www.openarchives.org/OAI/2.0/oai_dc/', | ||
    }, | ||
    'marc21': { | ||
        'serializer': ( | ||
            'invenio_oaiserver.utils:dumps_etree', { | ||
                'prefix': 'marc', | ||
            } | ||
        ), | ||
        'schema': 'http://www.loc.gov/standards/marcxml/schema/MARC21slim.xsd', | ||
        'namespace': 'http://www.loc.gov/MARC21/slim', | ||
    } | ||
} | ||
``` | ||
|
||
The serializer is defined by a function and additional arguments. The function will be called with two fixed arguments (pid, record) and additional specified arguments. | ||
It must return an LXML element instance, to be used in the response of the OAI server. | ||
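
To make the calling convention concrete, here is a minimal sketch of a serializer that takes the two fixed arguments plus one extra keyword argument and returns an element. The function name, the `root_tag` argument and the `title` key inside `record['_source']` are hypothetical. The fallback to the standard library's `xml.etree.ElementTree` exists only so the sketch runs without `lxml` installed; the OAI server itself expects an LXML element.

```python
try:
    from lxml import etree  # the element type the OAI server expects
except ImportError:
    # Fallback for illustration only; offers the same Element/SubElement calls.
    import xml.etree.ElementTree as etree

DC_NS = "http://purl.org/dc/elements/1.1/"


def title_only_serializer(pid, record, **kwargs):
    """Hypothetical serializer: (pid, record) are fixed, kwargs come from the config tuple."""
    # 'root_tag' is an invented extra argument, not an invenio-oaiserver option.
    root = etree.Element(kwargs.get("root_tag", "record"))
    title = etree.SubElement(root, "{%s}title" % DC_NS)
    # Assumption: the record data under '_source' contains a 'title' key.
    title.text = record["_source"]["title"]
    return root


element = title_only_serializer(
    None, {"_source": {"title": "Example"}}, root_tag="my_record"
)
print(etree.tostring(element))
```

In a configuration entry, the extra argument would sit in the dictionary next to the import path, for instance `('my_module.ext:title_only_serializer', {'root_tag': 'my_record'})`, where `my_module` is again a placeholder.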
|
||
Adding another format can be achieved by putting the following code in a module's `init_app` function, which resides in the module's `ext.py` file: | ||
```python | ||
|
||
def my_metadata_serializer(pid, record, **kwargs): | ||
    # record['_source'] will hold the record data | ||
    return MyMetadataFormat().dump_xml(record['_source']) | ||
|
||
def init_app(self, app): | ||
    # Register the new format under the prefix 'my_metadata_format' | ||
    app.extensions['invenio-oaiserver']['OAISERVER_METADATA_FORMATS'].set('my_metadata_format', { | ||
        'serializer': ('my_module.ext:my_metadata_serializer', {}), | ||
        'schema': 'link_to_schema_definition_file', | ||
        'namespace': 'link_to_schema_definition', | ||
        } | ||
    ) | ||
``` | ||
|
||
After this, verbs supporting the `metadataFormat` attribute will be able to pick up the metadata format. | ||
|
||
|
||
#### Record Search Class | ||
Record search class for the `ListRecords` verb. This is an Elasticsearch class by default, and it should return harvestable records. | ||
```conf | ||
OAISERVER_SEARCH_CLS = 'invenio_oaiserver.query:OAIServerSearch' | ||
``` | ||
|
||
#### OAI ID Fetcher Function | ||
Function that returns the OAI ID of a record. It takes two arguments: `function(record_uuid, record_as_dict)`. | ||
```conf | ||
OAISERVER_ID_FETCHER = 'invenio_oaiserver.fetchers:oaiid_fetcher' | ||
``` | ||
|
||
#### Record Updated Key | ||
Record dictionary key for information on when the record was last updated. Set by `invenio-rdm-records`. | ||
```conf | ||
OAISERVER_LAST_UPDATE_KEY = "_updated" | ||
``` | ||
|
||
#### Record Created Key | ||
Record dictionary key for information on when the record was created. | ||
```conf | ||
OAISERVER_CREATED_KEY = "_created" | ||
``` | ||
|
||
#### Record's Sets Fetcher Function | ||
Function to fetch the sets a record belongs to as a list. Takes one argument: `function(record_as_dict)` | ||
```conf | ||
OAISERVER_RECORD_SETS_FETCHER = 'invenio_oaiserver.utils:record_sets_fetcher' | ||
``` | ||
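
A custom sets fetcher only has to honour the one-argument signature and return a list. The sketch below assumes that set membership is stored under the record's `_oai` field as a `sets` list; that layout and the function name are assumptions, so check them against your own records.

```python
def example_record_sets_fetcher(record):
    """Return the list of OAI set specs a record belongs to (empty if none)."""
    # Assumption: membership lives under record["_oai"]["sets"].
    return record.get("_oai", {}).get("sets", [])


print(example_record_sets_fetcher({"_oai": {"sets": ["theses"]}}))
print(example_record_sets_fetcher({"title": "record without OAI data"}))
```

Returning an empty list rather than raising keeps records without OAI data from breaking a harvest.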
|
||
#### Record Retrieval Class | ||
Used when an `identifier` parameter is used in a verb and `OAISERVER_GETRECORD_FETCHER` is not overridden. | ||
```conf | ||
OAISERVER_RECORD_CLS = 'invenio_records.api:Record' | ||
``` | ||
|
||
#### Single Record Fetcher Function | ||
Function to fetch a record and return it as a dict. Takes one argument: `function(record_uuid)` | ||
```conf | ||
OAISERVER_GETRECORD_FETCHER = 'invenio_oaiserver.utils:getrecord_fetcher' | ||
``` |