No ocr filename for some pages #111

Open
techgique opened this issue Oct 22, 2020 · 2 comments
@techgique
Member

I skimmed the output of the most recent ingests for greatauk, fourtusker, and haydenstopdog and saw "No ocr filename" messages. Is anything missing or wrong with our batches?

View this text file to see the messages:
newspapers-no-ocr.txt

@apeders
Contributor

apeders commented Oct 22, 2020

I believe it's because those image files are missing, which reflects pages that are missing from the original microfilm. So I think it's okay?

@apeders
Contributor

apeders commented Oct 23, 2020

<dmdSec ID="pageModsBib2">
  <mdWrap LABEL="Page metadata" MDTYPE="MODS">
    <xmlData>
      <mods:mods>
        <mods:part>
          <mods:extent unit="pages">
            <mods:start>2</mods:start>
          </mods:extent>
        </mods:part>
        <mods:note displayLabel="University of Nebraska-Lincoln, Lincoln, NE" type="agencyResponsibleForReproduction">nbu</mods:note>
        <mods:note type="noteAboutReproduction">Not digitized, published</mods:note>
      </mods:mods>
    </xmlData>
  </mdWrap>
</dmdSec>

is the page metadata for this log message:

INFO:core.batch_loader:No ocr filename for issue: Danskeren. [1915-09-01 00:00:00] page: Danskeren. (Neenah, Wis.) 1892-1920, September 01, 1915, Image 2
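
A minimal sketch of how this check could be scripted rather than done by eye, assuming each issue's METS file declares the standard METS and MODS namespaces and that the "Image N" number in the log line matches the `mods:start` page value (true for the example above, but not verified for every batch). The function names and the sample data are hypothetical and are not part of the batch loader.

```python
import re
import xml.etree.ElementTree as ET

# Standard METS and MODS namespace URIs (assumed to be declared in the issue file).
NS = {
    "mets": "http://www.loc.gov/METS/",
    "mods": "http://www.loc.gov/mods/v3",
}
NOT_DIGITIZED = "Not digitized, published"

# Matches the "No ocr filename" message format quoted in this thread.
LOG_RE = re.compile(
    r"No ocr filename for issue: .+? \[(?P<date>\d{4}-\d{2}-\d{2}) [^\]]*\] "
    r"page: .*, Image (?P<seq>\d+)$"
)


def undigitized_pages(mets_xml):
    """Return the page numbers whose MODS note says the page was not digitized."""
    root = ET.fromstring(mets_xml)
    pages = set()
    for mods in root.iter("{%s}mods" % NS["mods"]):
        note = mods.find("mods:note[@type='noteAboutReproduction']", NS)
        start = mods.find("mods:part/mods:extent[@unit='pages']/mods:start", NS)
        if note is None or start is None:
            continue
        if (note.text or "").strip() == NOT_DIGITIZED:
            pages.add(int(start.text))
    return pages


def parse_log_line(line):
    """Return (issue date, image number) from a log line, or None if it does not match."""
    match = LOG_RE.search(line.strip())
    if match is None:
        return None
    return match.group("date"), int(match.group("seq"))


def is_expected_gap(line, mets_xml):
    """True if the page named in the log line is marked as not digitized."""
    parsed = parse_log_line(line)
    return parsed is not None and parsed[1] in undigitized_pages(mets_xml)


# Hypothetical sample: the fragment above wrapped in a minimal METS root.
SAMPLE_METS = """
<mets xmlns="http://www.loc.gov/METS/" xmlns:mods="http://www.loc.gov/mods/v3">
  <dmdSec ID="pageModsBib1"><mdWrap MDTYPE="MODS"><xmlData><mods:mods>
    <mods:part><mods:extent unit="pages"><mods:start>1</mods:start></mods:extent></mods:part>
  </mods:mods></xmlData></mdWrap></dmdSec>
  <dmdSec ID="pageModsBib2"><mdWrap MDTYPE="MODS"><xmlData><mods:mods>
    <mods:part><mods:extent unit="pages"><mods:start>2</mods:start></mods:extent></mods:part>
    <mods:note type="noteAboutReproduction">Not digitized, published</mods:note>
  </mods:mods></xmlData></mdWrap></dmdSec>
</mets>
"""
SAMPLE_LOG = (
    "INFO:core.batch_loader:No ocr filename for issue: Danskeren. "
    "[1915-09-01 00:00:00] page: Danskeren. (Neenah, Wis.) 1892-1920, "
    "September 01, 1915, Image 2"
)
```

Running `is_expected_gap` over each line of newspapers-no-ocr.txt against the matching issue file would separate the pages that are knowingly absent from any that are missing OCR for some other reason.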
