PicaReader provides classes for reading Pica+ records encoded in PicaXML and PicaPlain.
PicaReader is copyright (c) 2012-2016 by Herzog August Bibliothek Wolfenbüttel and released under the terms of the GNU General Public License v3.
You can install PicaReader via Composer.
composer require hab/picareader
All readers adhere to the same interface. Open the reader with a string of input data by calling Reader::open(), then call Reader::read() to read the next record in the input data. If the input contains no more records, Reader::read() returns FALSE. Otherwise it returns a record object created with PicaRecord’s Record::factory() function.
$reader = new \HAB\Pica\Reader\PicaXmlReader();
$reader->open(file_get_contents('http://unapi.gbv.de?id=opac-de-23:ppn:635012286&format=picaxml'));
$record = $reader->read();
$reader->close();
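Because Reader::read() returns FALSE once the input is exhausted, reading every record comes down to a loop. The following is a minimal sketch, not part of the library: readAllRecords() is a hypothetical helper name, and it assumes only that the reader has already been opened and exposes read() as described above.

```php
<?php
/**
 * Hypothetical helper (not part of PicaReader): collect all records
 * from a reader that has already been opened.
 */
function readAllRecords($reader)
{
    $records = array();
    // read() yields the next record, or FALSE when no records remain.
    while (($record = $reader->read()) !== false) {
        $records[] = $record;
    }
    return $records;
}
```

With a reader opened as in the example above, you would call readAllRecords($reader) before $reader->close(). The strict comparison against false keeps the loop from stopping early on a value that merely evaluates to false.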
To filter out records or fields, attach a filter to the reader via Reader::setFilter(). A filter is any valid PHP callback that takes an associative array representing the record as its argument and returns either a possibly modified array, or FALSE if the entire record should be skipped.
The array representation of a record is defined as follows:
RECORD   := array('fields' => array(FIELD, …))
FIELD    := array('tag' => TAG, 'occurrence' => OCCURRENCE, 'subfields' => array(SUBFIELD, …))
SUBFIELD := array('code' => CODE, 'value' => VALUE)
Where TAG, OCCURRENCE, CODE, and VALUE are the respective properties of a Pica+ field or subfield.
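To make the array representation concrete, here is a sketch of one record in array form together with a filter callback that skips records lacking a given field. The names $record, $filter, and skipWithoutTag() are illustrative and not part of the library. The field 001A and its subfield value are taken from the PicaXML sample in this document, and the NULL occurrence is an assumption about how a field without an occurrence is represented.

```php
<?php
// One record in the array representation: a single field 001A with one subfield.
// The NULL occurrence is an assumption for a field without an occurrence.
$record = array(
    'fields' => array(
        array(
            'tag' => '001A',
            'occurrence' => null,
            'subfields' => array(
                array('code' => '0', 'value' => '0001:14-09-10'),
            ),
        ),
    ),
);

// Illustrative filter logic: return the record unchanged if it has a field
// with the given tag, or FALSE to signal that the whole record is skipped.
function skipWithoutTag(array $r, $tag)
{
    foreach ($r['fields'] as $field) {
        if ($field['tag'] === $tag) {
            return $r;
        }
    }
    return false;
}

// Wrap it in a closure so it matches the one-argument filter signature.
$filter = function (array $r) {
    return skipWithoutTag($r, '001A');
};
```

A closure like $filter could then be passed to Reader::setFilter(). Keeping the tag check in a separate function makes it easy to try the logic on plain arrays without a reader.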
For example, if your source delivers malformed PicaXML records like so:
<?xml version="1.0" encoding="UTF-8"?>
<record xmlns="info:srw/schema/5/picaXML-v1.0">
<datafield tag="">
</datafield>
<datafield tag="001A">
<subfield code="0">0001:14-09-10</subfield>
</datafield>
…
</record>
you can attach a filter function that removes the fields with an invalid tag:
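use HAB\Pica\Reader\PicaXmlReader;

$reader = new PicaXmlReader();
// Keep only the fields that carry a valid Pica+ field tag.
$reader->setFilter(function (array $r) {
    return array('fields' => array_filter($r['fields'],
        function (array $f) {
            return isset($f['tag']) && \HAB\Pica\Record\Field::isValidFieldTag($f['tag']);
        }));
});
// Input data is passed to open(); read() then returns the next record.
$reader->open(…);
$record = $reader->read();
$reader->close();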
Large parts of this package would not have been possible without studying the source of Pica::Record, an open source Perl library for handling Pica+ records by Jakob Voß, and the practical knowledge of our library’s catalogers.