Archival Digitization Workflow for AV, Textual and Still Images

1. Material Selection

i. Intellectual Appraisal

Define the objective of the digitization project and select materials to be digitized.

ii. Physical Appraisal

Study the physical attributes of the set of documents to be digitized.

Is the original of sufficient quality to be scanned? Which digitization method suits the physical condition of the documents, e.g. a flatbed scanner or a scanner with an automatic document feeder? Are any protective measures needed before digitization starts?

iii. Legal Appraisal

Are there any specific legal implications of copying the collection and disseminating it online?

Are there any access restrictions that should be considered?

2. Administration

i. Appoint Curator

A curator is appointed by the Senior Archivist or project lead.

The curator’s responsibilities:

  1. Select materials for digitization
  2. Develop digitization work plan
  3. Write digitization instructions and workflow
  4. Oversee production of digital content
  5. Develop dissemination plan

ii. Draft Work Plan (establish format requirements for the digitized objects)

  1. Assign tasks and responsibilities to participants of the workflow (e.g. scanning, online publication)
  2. Identify the tools required for the project (scanner, software)
  3. Develop mutually agreed upon time schedule for the project
  4. Define project aims and technical specifications for the digital product
  5. Specify the required metadata according to OSA Data Model
  6. Secure Storage Space

iii. Legal Clearance

If needed, initiate a legal arrangement with the creator/originator/collector.

iv. Contact Staff/Externals

If external participants facilitate or conduct the digitization project, establish clear communication with them and agree on the distribution of tasks and responsibilities.

Establish and communicate the project timeline and milestones.

3. Production

The information package created during the course of digitization consists of two main components:

  1. Digital product/content
  2. Metadata

These components have to be managed together in order to preserve not only the digital surrogates of physical documents but also their context. This means that, in parallel with digitization, the metadata related to each object has to be recorded in a standardized and mutually agreed upon manner.

3.a Digital Content

3.b Metadata Management

i. Preparations

Format and Technical Metadata Specifications

  1. Format and other characteristics of the product to be detailed and agreed upon
  2. Specify/identify the technical metadata to be captured
  3. Provide training for personnel responsible for digitization
  4. Sample scanning
  5. Quality assurance: Test samples for completeness and correctness

Metadata Specifications and Entry Guidelines

  1. Define metadata entry rules according to the manual’s Appendix C Metadata Style Guide

Database/Template Design

  1. Create a template for metadata entry
  2. Provide training for personnel responsible for metadata entry

Preliminary Cataloging and Review

  1. Test cataloging
  2. Review test cataloging and usability of the entry template

Finalize Template and Metadata Entry Guidelines

Finalize both based on the results of the preliminary cataloging and review.

ii. Digitization and Cataloging

Preservation Copy (D1) Production

  1. Creation of lossless preservation copies of physical documents
  2. Format requirements for preservation copies and further derivatives can be found in the ‘Formatting Requirements for Digitization Objects’ section below
  3. Quality assurance: Test samples for completeness and correctness
  4. Name preservation copies according to the naming conventions of digital objects

Cataloging

Entry of information required by the metadata template

  1. System will create permanent identifiers (PID) and technical metadata
  2. Record event metadata
  3. Record descriptive metadata

Derivatives Production

  1. Creation of access copies (D2)
  2. Creation of previews and/or thumbnails (D3)
  3. Format requirements for derivatives can be found in the ‘Formatting Requirements for Digitization Objects’ section below
  4. Name derivatives according to the naming conventions of digital objects
  5. Quality assurance: Test samples for completeness and correctness
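When planning previews (D3), it can help to estimate their pixel size in advance. The sketch below assumes that a "72 DPI" preview is produced by downsampling the preservation scan so that the physical size of the page stays the same; the source does not state how previews are resampled, and the function name is hypothetical.

```python
from typing import Tuple


def preview_dimensions(width_px: int, height_px: int,
                       source_dpi: int, target_dpi: int = 72) -> Tuple[int, int]:
    """Scale pixel dimensions from source_dpi to target_dpi.

    Assumption: the preview keeps the physical size of the scan,
    so pixel counts shrink by the ratio target_dpi / source_dpi.
    """
    if source_dpi <= 0 or target_dpi <= 0:
        raise ValueError("DPI values must be positive")
    new_width = max(1, round(width_px * target_dpi / source_dpi))
    new_height = max(1, round(height_px * target_dpi / source_dpi))
    return new_width, new_height


# A 3000 x 2400 pixel scan made at 300 DPI, reduced to 72 DPI.
print(preview_dimensions(3000, 2400, source_dpi=300))
```

The actual resampling would be done by the imaging software chosen in the work plan; this only computes the target size.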

File Technical/Event Metadata Capture

  1. Capture technical and event metadata using dedicated tools
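As an illustration of what a capture tool records, the following minimal sketch reads a few basic technical properties of a file using only the Python standard library. The field names are hypothetical and are not taken from the OSA Data Model; a production workflow would normally rely on a dedicated characterization tool instead.

```python
import mimetypes
from datetime import datetime, timezone
from pathlib import Path


def basic_technical_metadata(path: Path) -> dict:
    """Collect a few file-level properties (hypothetical field names)."""
    stat = path.stat()
    # The MIME type is guessed from the file extension only, not from content.
    mime_type, _ = mimetypes.guess_type(path.name)
    modified = datetime.fromtimestamp(stat.st_mtime, tz=timezone.utc)
    return {
        "filename": path.name,
        "size_bytes": stat.st_size,
        "mime_type": mime_type,
        "modified_utc": modified.isoformat(),
    }
```

Calling `basic_technical_metadata(Path("00001.tiff"))` on an existing file returns a dictionary that can be stored alongside the catalog entry.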

iii. Data Review

  1. Quality assurance: Test samples of digital copies and related metadata for completeness and correctness
  2. Normalization, if needed
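The sample-based quality assurance step could be supported by a small script that picks files for manual review. This is a sketch only: the sampling fraction and minimum sample size are assumptions, since the workflow does not prescribe them, and the function name is hypothetical.

```python
import math
import random
from typing import List, Optional, Sequence


def qa_sample(items: Sequence[str], fraction: float = 0.1,
              minimum: int = 5, seed: Optional[int] = None) -> List[str]:
    """Pick a random subset of items for manual review.

    fraction and minimum are assumed values, not project requirements.
    A fixed seed makes the selection reproducible for the QA record.
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    size = min(len(items), max(minimum, math.ceil(len(items) * fraction)))
    rng = random.Random(seed)
    return sorted(rng.sample(list(items), size))


files = [f"{number:05d}.tiff" for number in range(1, 41)]
to_review = qa_sample(files, fraction=0.25, seed=2016)
```

Recording the seed together with the review results lets a second reviewer reproduce the same sample.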

v. Additional Metadata

Legal Metadata

Add metadata describing access, use and reproduction

Collection Description

Describe the digital collection based on the Digital Repository Metadata Archival Schema

Data Translation

If the collection is not in English, it should be translated.

  1. Catalog entries and collection description also have to be translated
  2. Proofread and copyedit
  3. Implement or discuss the changes recommended by the editor

iv. File Storage - Preservation Master to Tape

  1. Migrate master files, accompanied by all metadata, to a safe storage location with limited access
  2. Attach checksum information to each master file
  3. Migrate access copies, accompanied by reference metadata, to the shared Research Drive
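Checksum information for a master file could be produced as in the following sketch. The choice of SHA-256 and the sidecar file layout are assumptions, as the workflow does not name a checksum algorithm or storage format; the function names are hypothetical.

```python
import hashlib
from pathlib import Path


def file_checksum(path: Path, algorithm: str = "sha256",
                  chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.new(algorithm)
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_checksum_sidecar(path: Path, algorithm: str = "sha256") -> Path:
    """Write '<digest>  <filename>' to a sidecar file next to the master file.

    Assumed layout: 00001.tiff gets a sidecar named 00001.tiff.sha256.
    """
    sidecar = path.with_name(f"{path.name}.{algorithm}")
    line = f"{file_checksum(path, algorithm)}  {path.name}\n"
    sidecar.write_text(line, encoding="utf-8")
    return sidecar
```

Recomputing the digest after migration and comparing it with the stored value shows whether the master file was copied without alteration.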

4. Dissemination

Dissemination means the delivery of digital records to the audience. This may happen in two ways:

  1. Internal dissemination
  2. External dissemination

Internal Dissemination

Reference Services makes the digital collection (stored on the shared drive) and related metadata available to researchers in the Research Room.

External Dissemination

OSA makes the collection accessible electronically.

Formatting Requirements for Digitization Objects

Textual files

  • Derivative 1 (Preservation Copy): file format: .tiff; compression: LZW; tonal depth: 8 bit grayscale; resolution: 300 DPI
  • Derivative 2 (Access Copy): file format: .pdf/a multipage; compression: jpeg
  • Derivative 3 (Previews, Thumbnails): file format: .jpeg or .png; color settings: grayscale; resolution: 72 DPI

Still images

  • Derivative 1 (Preservation Copy): file format: .tiff; compression: none; tonal depth: 8 bit grayscale/24 bit color or deeper; resolution: 600 DPI
  • Derivative 2 (Access Copy): file format: .jpeg; compression: jpeg; tonal depth: 8 bit grayscale/24 bit color; resolution: 1024*768 pixels
  • Derivative 3 (Previews, Thumbnails): file format: .jpeg or .png; color settings: 4 bit grayscale or 8 bit color; resolution: 72 DPI

Video

  • Derivative 1 (Preservation Copy): file format: .avi; compression: none
  • Derivative 2 (Access Copy): file format: .mp4; compressed
  • Derivative 3 (Previews, Thumbnails): screenshot or excerpt file; file format: .mp4; compressed; length: shortened

Audio

  • Derivative 1 (Preservation Copy): file format: .wav or .flac; compression: none
  • Derivative 2 (Access Copy): file format: .mp3; compressed
  • Derivative 3 (Previews, Thumbnails): excerpt; file format: .mp3; compressed; length: shortened
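The file format requirements in this section can be expressed as a lookup table for automated checks. The sketch below is illustrative: the material type keys and the function name are hypothetical, and it assumes that only the exact extensions listed in this section are accepted (so `.tif` or `.jpg` would be rejected).

```python
from pathlib import PurePath

# Allowed extensions per material type and derivative, as listed in this section.
# The dictionary keys are hypothetical labels chosen for this sketch.
FORMAT_REQUIREMENTS = {
    "textual": {"D1": {".tiff"}, "D2": {".pdf"}, "D3": {".jpeg", ".png"}},
    "still_image": {"D1": {".tiff"}, "D2": {".jpeg"}, "D3": {".jpeg", ".png"}},
    "video": {"D1": {".avi"}, "D2": {".mp4"}, "D3": {".mp4"}},
    "audio": {"D1": {".wav", ".flac"}, "D2": {".mp3"}, "D3": {".mp3"}},
}


def extension_allowed(material: str, derivative: str, filename: str) -> bool:
    """Check a filename's extension against the table above (case-insensitive)."""
    allowed = FORMAT_REQUIREMENTS.get(material, {}).get(derivative, set())
    return PurePath(filename).suffix.lower() in allowed


print(extension_allowed("textual", "D1", "00001.tiff"))
print(extension_allowed("audio", "D1", "interview.mp3"))
```

A check like this covers only the file extension; compression, tonal depth and resolution would have to be verified from the technical metadata.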

Naming Digitized Objects

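Filenames are composed of the following parts: Fonds, Sub_fonds, Series, Box Number, Seq. Number, Sub-Seq. Number, Extension.

  • D1: filename 00001.tiff. Folder structure represents the archival hierarchy; Seq. Number: 00001; Extension: .tiff
  • D2: filename 300_15_1_0001.pdf. Fonds: 300; Sub_fonds: 15; Series: 1; Seq. Number: 0001; Extension: .pdf
  • D3: filename 300_15_0001_a.png. Fonds: 300; Sub_fonds: 15; Series: 1; Seq. Number and Sub-Seq. Number: 0001_a; Extension: .png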

  • Use ‘Box Number’ where appropriate, e.g. for a cassette number
  • Use ‘Sub-Sequence Number’ where appropriate, e.g. for multi-part digitized objects
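The naming scheme can be sketched as a pair of helper functions. Several details here are assumptions rather than documented rules: the parts are joined with underscores in the column order shown above, the series segment is always included, the sequence number is zero-padded to four digits for derivatives and five for preservation copies, and the function names are hypothetical.

```python
from typing import Optional


def preservation_filename(sequence: int, extension: str = "tiff") -> str:
    """D1 name: sequence number only, since folders carry the archival hierarchy."""
    return f"{sequence:05d}.{extension.lstrip('.')}"


def derivative_filename(fonds: int, subfonds: int, series: int, sequence: int,
                        extension: str, box: Optional[int] = None,
                        sub_sequence: Optional[str] = None) -> str:
    """D2/D3 name: Fonds_Subfonds_Series[_Box]_Seq[_SubSeq].ext (assumed order)."""
    parts = [str(fonds), str(subfonds), str(series)]
    if box is not None:
        # Assumption: the box (e.g. cassette) number sits between series and sequence.
        parts.append(str(box))
    sequence_part = f"{sequence:04d}"
    if sub_sequence:
        sequence_part = f"{sequence_part}_{sub_sequence}"
    parts.append(sequence_part)
    return "_".join(parts) + "." + extension.lstrip(".")


print(preservation_filename(1))
print(derivative_filename(300, 15, 1, 1, "pdf"))
```

Generating names from catalog data in this way, rather than typing them by hand, reduces the risk of inconsistent filenames across derivatives.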


Written by Emily Hanlon on Thursday August 25, 2016 - updated on Monday August 29, 2016