
JScholarship Digital Preservation Policy


About Digital Preservation

Digital preservation consists of managing the elements and activities necessary to maintain the access and functionality essential to the purposes for which the original digital material was created or acquired. A well-managed digital preservation program will:

  1. ensure continual access (ongoing usability of a digital resource)
  2. retain all qualities of authenticity (that the digital material is the same as when it was first created unless accompanied by metadata indicating any changes) in JScholarship in perpetuity

Elements of the Digital Preservation Policy

Standards and Best Practices

In order to ensure discovery and interoperability among diverse systems, the Digital Preservation Policy will follow standards where they exist. Where standards do not exist, our policy will be to share in best practices. These will include all standards employed from the moment of capture through the life-cycle of the digital object and comprising all of the managed activities listed below.

Preservation Guarantees

JScholarship will provide the following levels of archival guarantee:

  1. that each bitstream can be stored and retrieved
  2. that a mechanism is available to determine whether or not the bitstream has changed
  3. that a given bitstream can be understood
  4. that each bitstream can be rendered for presentation for users

Guarantees one (1) and two (2) will be supported for ALL bitstreams, but not all guarantees will apply to all bitstreams. The level of support will depend heavily on the file format, the application used to generate the content, and the set of features.
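As an illustration of guarantee two (2), one common mechanism for detecting change is to record a checksum when a bitstream is ingested and recompute it later. The sketch below is a minimal example under stated assumptions, not a description of how JScholarship implements the guarantee: the function names `checksum` and `has_changed` are hypothetical, and SHA-256 is an assumed choice of algorithm that the policy does not name.

```python
import hashlib
from pathlib import Path


def checksum(path: Path, algorithm: str = "sha256", chunk_size: int = 1 << 20) -> str:
    """Return the hex digest of a file, reading it in chunks to bound memory use.

    The algorithm (SHA-256 by default) is an assumption; the policy does not
    specify one.
    """
    digest = hashlib.new(algorithm)
    with path.open("rb") as stream:
        # iter() with a sentinel stops when read() returns an empty bytes object.
        for chunk in iter(lambda: stream.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def has_changed(path: Path, recorded_digest: str, algorithm: str = "sha256") -> bool:
    """Report whether the file's current digest differs from the recorded one."""
    return checksum(path, algorithm) != recorded_digest
```

In use, the digest returned by `checksum` would be stored alongside the bitstream's metadata at ingest, and `has_changed` would be run during later audits. A mismatch shows only that the bytes differ; it does not say why or how they changed.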

For example, because of the large number of documents created in Microsoft Word, it is very likely that all four guarantees will apply to most Word documents. If, however, a bitstream contains a document that relies on a very obscure feature of Word that almost no one uses, it is more likely that at least some features will not be supported, in terms of both understanding (3) and rendering (4).

Guarantees three (3) and four (4) will require a set of management activities to maintain the abilities described. Migration and emulation are two possible approaches to this problem. Migration describes the act of transforming one file format to another, while emulation describes the simulation on current computer equipment of an environment which supports the original computer software and/or data formats associated with understanding and providing access to content. The current data management plans will enumerate the formats for which these guarantees apply and describe which method(s) (e.g. format migration or emulation) will be used to support each format. The level of support that will be provided for a given collection should be assessed and agreed at the front end of the process, during negotiation with the owner or curator of the content. It is useful to codify these agreements into service level agreements (SLAs), where appropriate.


Digital Preservation Standards

Sources of digital content include:

Digital Content Format Types Accepted into JScholarship*

Content Type              Format Example
Text                      ASCII (ANSI X3.4, ECMA-6, ISO 646), UTF-8 Unicode
Audio (voice or music)    AIFF, Wave, MP2
Video                     MPEG, MPEG-2, AVI, MPEG-7 for associated textual metadata
Image                     GIF (GIF87a, GIF89a), JPEG, JPEG2000, JFIF, TIFF, ITU-T.6, (TIFF4.0, TIFF5.0, TIFF6.0)

*Content contributor(s) are responsible for providing preservation file formats for ingestion. For content that does not meet preservation standards, content contributor(s) must indicate file formats when known so that JScholarship may advise on how to resolve possible consequences for preservation guarantees.


Image Capture Standards: Converting from Analog to Digital

The Johns Hopkins JScholarship Digital Preservation Policy acknowledges that collection items should be subjected to the stress of digitization only once. Objects will be digitized at archival quality when converting from analog to digital. Because objects vary in their nuances, many characteristics of an original will affect imaging decisions. These include the quality or condition of the original, the type of original, the dimensions of the original, the elements of the original that must be conveyed, the type of scanning equipment needed to capture the original, and how the item or collection may be used in the future by scholars. Since each item or collection may present different needs, our practice is to follow best practices that most faithfully render the underlying source document. The criteria for faithful rendering include completeness, image quality (tonality and color), and the ability to reproduce pages in their correct (original) sequence. As a faithful rendering, a digital master will also support production of a printed page facsimile that is legible when produced at the same size as the original (that is, 1:1).

Material Type              Recommended Image Parameters
Text Documents
Clean, high-contrast black and white documents, text and/or graphic illustrations, artwork/originals, maps, plans, and oversized documents with printed type, e.g. laser printed, typeset pages with typeface of 7 pt and above

1-bit bitonal mode - adjust scan resolution to produce a quality index (QI) of 8 for smallest significant character

or

1-bit bitonal mode - 600 ppi for capture resolution; less for oversize documents (see alternative minimum in the note below)

or

8-bit grayscale mode - 400 ppi for documents with smallest significant character of 1.0 mm or larger

NOTE: regardless of the approach used, adjust scan resolution to produce a minimum pixel measurement across the long dimension of 6,000 lines for 1-bit files and 4,000 lines for 8-bit files.

Documents with poor legibility or diffuse characters (e.g. carbon copies, Thermofax/Verifax, etc.), handwritten annotations or other markings, low inherent contrast, staining, fading, halftone illustrations, or photographs.

Do not include color or sepia toned inks.

8-bit grayscale mode - adjust scan resolution to produce a QI of 8 for smallest significant character

or

8-bit grayscale mode - 400 ppi for documents with smallest significant character of 1.0 mm or larger

NOTE: regardless of the approach used, adjust scan resolution to produce a minimum pixel measurement across the long dimension of 4,000 lines

Manuscripts/Artifactual Text
Documents as described for grayscale scanning and/or where color is important to the interpretation of the information or content, or where the most accurate representation is desired

24-bit RGB - 400 ppi to 600 ppi for documents with smallest significant character of 1.0 mm or larger

NOTE: regardless of the approach used, adjust scan resolution to produce a minimum pixel measurement across the long dimension of 4,000 lines for 24-bit files.

Negatives/Transparencies
Black & White

8-bit grayscale mode - 4,000 pixels on the long side of image area, excluding mounts and borders (for 35mm and medium format up to 4" x 5")

8-bit grayscale mode - 6,000 pixels on the long side of image area, excluding mounts and borders (for 8" x 10" or larger)

Color/Monochrome (e.g. collodion wet-plate negative, pyro developed negatives, stained negatives, etc.)

24-bit RGB color - 4,000 to 6,000 pixels on the long side of image area, excluding mounts and borders

Photographs - prints reflective
Black & White

8-bit grayscale mode - 4,000 pixels across the long dimension of image area (8 1/2" x 10" or smaller)

8-bit grayscale mode - 6,000 pixels across the long dimension of image area (larger than 8 1/2" x 10")

Color & Monochrome (including albumen prints and other historic prints)

24-bit RGB mode - 4,000 pixels across the long dimension of image area (8 1/2" x 10" or smaller)

24-bit RGB mode - 6,000 pixels across the long dimension of image area (larger than 8 1/2" x 10" but smaller than 11" x 14")

24-bit RGB mode - 8,000 pixels across the long dimension of image area (larger than 11" x 14")

Audio

Audio masters at 96 kHz and 24-bit word length

Video

High quality audio (linear PCM) at a minimum of 44.1 to 48 kHz; camera set to the same rate

Film

See the NFPF Web site for standards

While there are no universal image capture standards for digitization, there are a number of well-documented and supported best practices and guidelines that have been established in the digital library community. JScholarship will adhere to best practices adopted by recognized leading institutions. See the Digital Library Federation's Benchmark for Faithful Digital Reproductions of Monographs and Serials for a detailed discussion. Further details and specifications can be found in the US National Archives and Records Administration's Technical Guidelines for Digitizing Archival Materials for Electronic Access.
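The NOTE entries for text documents and manuscripts in the table above set a minimum pixel count across the long dimension (6,000 for 1-bit files, 4,000 for 8-bit and 24-bit files). The arithmetic linking that minimum to scan resolution can be sketched as follows. This is only an illustration: the function names and the `MIN_LONG_SIDE_PIXELS` mapping are hypothetical, dimensions are assumed to be in inches, and the sketch does not cover the QI calculation or the separate pixel targets given for negatives and photographs.

```python
import math

# Minimum pixels across the long dimension, taken from the NOTE entries for
# text documents and manuscripts. The mapping name and keys are illustrative.
MIN_LONG_SIDE_PIXELS = {"1-bit": 6000, "8-bit": 4000, "24-bit": 4000}


def pixels_on_long_side(long_side_inches: float, ppi: int) -> int:
    """Pixel count across the long dimension when scanning at the given ppi."""
    if long_side_inches <= 0 or ppi <= 0:
        raise ValueError("dimension and resolution must be positive")
    return round(long_side_inches * ppi)


def ppi_for_minimum(long_side_inches: float, mode: str) -> int:
    """Smallest whole-number ppi that reaches the minimum pixel count for a mode."""
    if long_side_inches <= 0:
        raise ValueError("dimension must be positive")
    return math.ceil(MIN_LONG_SIDE_PIXELS[mode] / long_side_inches)


def meets_minimum(long_side_inches: float, ppi: int, mode: str) -> bool:
    """True if a scan at the given ppi satisfies the minimum for the mode."""
    return pixels_on_long_side(long_side_inches, ppi) >= MIN_LONG_SIDE_PIXELS[mode]
```

For example, an 11-inch page scanned at 400 ppi in 8-bit grayscale yields 4,400 pixels on the long side and meets the 4,000-pixel minimum, while a 7-inch original at 400 ppi yields only 2,800 pixels; `ppi_for_minimum(7.0, "8-bit")` gives 572 ppi as the resolution needed to reach the minimum.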


Institutional and Operational Roles and Responsibilities

Successful preservation of JScholarship hinges on clear definition and coordination of a complex array of distributed responsibilities. The preservation policy ensures that preservation measures are properly documented and taken in the proper sequence, and that no steps are left out by JScholarship working groups.

Sustainability

Preservation of a digital repository implies sustainability of content and all the resources necessary throughout the digital lifecycle. Sustainability must include a framework by which the long-term viability of ingested content is ensured and its proper disposition in the event of a change in resources or institutional commitment is addressed. The Johns Hopkins University Libraries Council is the administrative guarantor that digital preservation will remain an institutional commitment. A combination of Service Level Agreements and sustainability models will be used to respond to financial or other resource limitations. The responsibility for review and assessment belongs to the JScholarship Oversight Committee and its working groups. Enduring preservation of digital resources will require substantial and ongoing assessment of resource commitments and the creation of sustainability models to monitor and periodically assess the following resource commitments.


Managed Activities

Metadata

The metadata librarian(s) is (are) responsible for metadata compliance. Content contributor(s) is (are) responsible for providing selected metadata at the time of submission as described in the implementation plan. Metadata is fundamental to preserving digital resources in that it enables digital repository users to find, evaluate, navigate, and manage digital objects. Metadata will comply with all standards and best practices as embodied, for example, in PREMIS and JHOVE. All repository collections will include the most complete and appropriate metadata schemes, and require the following types of metadata: administrative, technical, structural, and preservation. The nature and amount of more complex metadata will be explicitly laid out in Service Agreements.

File Management

The Sheridan Libraries Library Digital Program and the Welch Advanced Technology Information Service are responsible for file management. The need to manage the disparate requirements of different formats requires that acceptable formats for data content be determined prior to accession and ingestion. A variety of preservation strategies including replication, migration, emulation, and refreshment will be used according to evolving technical and professional standards, for example ISO 14721:2003, Space data and information transfer systems (OAIS). Migration describes the process of converting a digital object (or component thereof) from one data format to another, usually because the original format is or will soon be unsupported. Emulation describes the act of simulating the environment or context in which a particular digital object (or component thereof) is accessed when the native environment or context is not available.

Repository Backup

The Sheridan Libraries Library Digital Program and IT@JH are responsible for JScholarship backup. Backup describes the copying of repository data (administrative and content) to a storage system other than that which supports the repository itself. To help assure the maintenance of viable and authentic digital content, a well-documented backup program should be instituted. One full copy of all content, associated metadata, and systems specifications will be stored at a secure, geographically distant location in keeping with Crisis Management, Disaster Recovery, and other relevant technical and professional standards. Cooperative agreements with other organizations (academic or commercial) may provide a suitable means of achieving remote backup storage. Additionally, at least one local copy (primary replica) may be kept onsite to allow for faster recovery after partial losses. Formats and media chosen for both of the aforementioned purposes should reflect evolving standards, needed speed of access, reliability, and function of the copy.

Repository Audit

The Sheridan Libraries Library Digital Program and the Sheridan Libraries Systems Department are responsible for repository audits. The repository audit establishes confidence in the authenticity and completeness of digital content. All managed activities will be documented according to evolving standards so as to provide an audit trail that meets the criteria described in the Implementation Plan and required by projects such as the Center for Research Libraries' Trustworthy Repositories Audit and Certification (TRAC): Criteria and Checklist.

Risk Assessment

The Sheridan Libraries Preservation Department, Library Digital Program, and Libraries Systems Department share responsibility for risk assessment. Because of the ever-changing nature of threats to infrastructure, facilities, systems, and data, a regular program of risk assessment and mitigation should be conducted in keeping with evolving professional and technical standards. Systems security, infrastructure, financial resources, organizational commitment, and numerous other factors all pose potential risks to the viability of an institutional repository. Risks of natural and human origin, direct and indirect, targeted and peripheral, purposeful and consequential are to be assessed on a regular basis, and the steps to be taken to mitigate each identified threat type will be described in a Risk Assessment Plan.


Update 1/2008
