Latest commit

f05d85a · Dec 9, 2018

scan_util

Utility for managing the movement of film scanner files to photo galleries

I wrote this when faced with boxes of 35mm slides that needed to be scanned, categorized, and moved to offline storage and photo galleries. This was a big job given that I had been taking 35mm slides and negatives since 1964. Yes, that turned out to be 18,000+ images and 1.8TB of archive.

The photo process is simple. Your scanner may or may not have an autofeed but no matter. You spend days at this so anything that helps the workflow is a win. The typical scanner program (I used Vuescan) will have some scheme to create new files using filenames of a form that preserves the order in which they were scanned. They will also add some EXIF metadata related to the scanner but this is not as helpful as it first seems. Entering extra metadata such as when the photograph was taken or some indication of a slide number is a manual process. Think of doing that 18,000 times...

How it works

This process assumes that Vuescan is being used to scan the negatives or slides. Other scanning packages do similar things. The key requirement is that the filenames sort lexically so that the scanning order is preserved. The scanning program also uses a file extension (the name part after the '.') to identify the image type. We also note that a slide has a slide number and processing date stamped on the paper or plastic holder. 35mm negative film usually has a frame number along the edge as well. We use that data in the process.

My goal was to name and archive the image files in such a way that I could either use the file name or the image's metadata and find the slide or negative it came from. To do this, I spent a lot of time sorting the slides and images by slide or frame number and box or date. Once they were sorted, I scanned them in order using scan_util to copy and annotate them into the gallery archive directories.

The program has the following command line options:

Usage: scan_util [-hJv] [-b value] [-c value] [-D value] [-d value] [-i value] [-M value] [-o value] [-p value] [-S value] [-T value] [parameters ...]

-b, --batch=value

Processing box or batch number. This is useful if you have a pile of slide boxes that were processed at the same time. Think of wedding pictures or photos from a vacation.

-c, --comment=value

Free-form comment. If it contains multiple words, use quotes.

-D, --date=value Original processing date as 'mon-year' or 'mon day, year'

This is the date stamped on the slide holder. It is used for the file name and date fields in the EXIF data. The copyright date is the current date with the copyright holder being the author/artist.

-d, --description=value

This is similar to a comment. EXIF examples would put "1989 Backpack trip" as a description and use comments for things like, "Best fishing on the trip".

-h, --help

Display the list of options.

-i, --src_dir=value

Source directory where the scanner program wrote its output. Note that the files in the source directory are copied and not deleted.

-J, --jpeg Generate a jpeg format image file. Off by default

This generates a high quality jpeg file. Use this option if you are scanning as raw data and at highest resolution. Raw files are huge but retain the maximum capability of the scanner. A jpeg is more useful for email or gallery images.

-M, --max-procs=value

Number of concurrent jobs. By default, the Go runtime sets this value to the number of CPUs in the machine. Making this a big number does not speed things up, especially if the destination is a single disk.

-o, --dst_dir=value

Destination directory. This is separate in order to recover from mistakes without cluttering up the source directory. Typically, this would be a USB disk that ends up in a safe place...

-p, --photographer=value

Name of original photographer/artist. This is an EXIF field. Give credit where credit is due.

-S, --suffix=value

File name suffix to match. Default = dng. A '.dng' file is a raw image in Adobe's Digital Negative format. All of the files with this extension in the source directory are used for input. File names are lexically sorted for processing order.

-T, --film_type=value

Film type: TR = slide, CN = color neg, BW = B/W neg. All images are "color" but the media isn't.

-v, --show-progress

Show a series of dots, one printed as each file is done.

[parameters ...]

This is where you specify the slide/frame numbers, as a series of numbers and number ranges. For example, "1 2 4 7-12 15" indicates slides 1 and 2, then 4 (skipping 3), then 7 through 12 (skipping 5 and 6), and finally 15 (skipping 13 and 14). Some slides/frames get lost or are over/under-exposed. Remember those? List these parameters in the same order as the images were scanned.

The filename generated in the destination contains the date, batch number, and film type. This will ensure that image files will not be overwritten. My purpose was to also match the physical media. I have retained the media in archive boxes numbered and labeled to match the files.
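The README does not give the exact file name layout, so the following Go sketch uses a hypothetical one (`year-month_batch_filmtype_frame.suffix`) purely to show how the date, batch number, and film type can combine into a name that does not collide. The function names, the three-letter month abbreviation, and the zero-padded frame number are all assumptions.

```go
package main

import (
	"fmt"
	"time"
)

// parseProcessingDate accepts the two forms described for the date option:
// 'mon-year' (e.g. "Jul-1989") or 'mon day, year' (e.g. "Jul 4, 1989").
// Three-letter English month abbreviations are an assumption.
func parseProcessingDate(s string) (time.Time, error) {
	for _, layout := range []string{"Jan-2006", "Jan 2, 2006"} {
		if t, err := time.Parse(layout, s); err == nil {
			return t, nil
		}
	}
	return time.Time{}, fmt.Errorf("unrecognized date %q", s)
}

// archiveName builds a destination file name from the processing date,
// batch number, film type, and frame number. The layout is hypothetical.
func archiveName(date time.Time, batch, filmType string, frame int, suffix string) string {
	return fmt.Sprintf("%s_%s_%s_%03d.%s", date.Format("2006-01"), batch, filmType, frame, suffix)
}

func main() {
	date, err := parseProcessingDate("Jul-1989")
	if err != nil {
		panic(err)
	}
	fmt.Println(archiveName(date, "3", "TR", 12, "dng")) // 1989-07_3_TR_012.dng
}
```

Putting the date first means a plain directory listing groups the files by processing date, then by batch, which mirrors how the physical boxes would be labeled.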

There are no build dependencies, but there are runtime dependencies on dcraw and ImageMagick's convert to generate a JPEG from the raw scan file. The metadata is copied and generated by exiv2. At the time the program was written, the only Go libraries available were thin cgo shims around the C libraries that did little more than read the files. The simplest solution was, and still is, to manage the execution of these commands.