Reagan Moore's article addresses the problems associated with databases, as opposed to simple documents. Not only must each document be converted to work with current technology, but the catalog must also be converted and updated. Moore writes, “The organization of the data into collections must also be preserved in the face of rapidly changing database technology. Thus each collection must be migrated forward in time onto new data management systems, simultaneously with the migration of the individual data objects onto new media.” This adds a further task for those updating files, or for those working to prevent digital obsolescence in the first place. Organization and databases are essential parts of any library, digital or otherwise, and keeping old digital files readable accomplishes little if those files are not sufficiently archived and indexed so that future researchers can find and read them. The article identifies three aspects of a digital collection that must be kept up to date: the digital object, data collection, and presentation representations. The object representation comprises the unique data regarding the format and context of each digital item. The data collection representation is “typically a subset of the attributes associated with the digital objects.” These attributes should be stored as metadata, so that their associations can be reproduced in future versions of the database. The presentation representation concerns the interface a researcher uses to access the information in the digital objects. According to Moore, “[r]e-creation of the original view of a collection is a typical archival requirement.” Data should not be changed much from its original form, or it will lose almost as much as a translation or reprint of a book would. For this reason, preservation of a document's original form is a significant issue in the field of digital preservation.
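To make these three representations concrete, the sketch below models them as simple Python data structures. The field names (identifier, media_format, and so on) are illustrative assumptions, not Moore's actual schema.

```python
# Illustrative sketch of Moore's three representations; all field names
# here are hypothetical, chosen only to make the distinctions visible.
from dataclasses import dataclass, field
from typing import List

@dataclass
class DigitalObject:
    """Object representation: format and context unique to one item."""
    identifier: str
    media_format: str    # e.g. "TIFF 6.0" -- drives future conversions
    context: str         # provenance notes, creating application, etc.

@dataclass
class Collection:
    """Collection representation: a subset of the objects' attributes,
    stored as metadata so the organization can be rebuilt later."""
    name: str
    attribute_names: List[str]   # which object attributes are indexed
    objects: List[DigitalObject] = field(default_factory=list)

@dataclass
class Presentation:
    """Presentation representation: the researcher's view of the data."""
    collection: Collection
    sort_key: str
    display_template: str   # description of the original interface/view
```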
One major problem with nearly any solution to technological obsolescence is that there are no national standards for the preservation of digitally stored materials. In the words of Hedstrom, “digital library research has focussed on architectures and systems for information organization and retrieval, presentation and visualization, and administration of intellectual property rights (Levy and Marshall). The critical role of digital libraries and archives in ensuring the future accessibility of information with enduring value has taken a back seat to enhancing access to current and actively used materials. As a consequence, digital preservation remains largely experimental and replete with the risks associated with untested methods; and digital preservation requirements have not been factored into the architecture, resource allocation, or planning for digital libraries.” Some libraries have tried to set standards for software and storage formats, such as the use of common forms for databases and images. Hedstrom declares, “The strategy rests on the assumption that software products which are either compliant with widely adopted standards or are widely dispersed in the marketplace are less volatile than the software market as a whole. Most common commercial products today provide utilities for backward compatibility and for swapping documents, databases, and more complex objects between software systems.” This strategy does not, however, eliminate the need to migrate files; they must still be transferred as standards change and newer, better programs come into existence. For this reason, standards should remain fairly constant, or at least change in a way that allows methods such as emulation and encapsulation to remain effective as programs continue to evolve.
Along the same lines as making hard copies is another procedure, one that retains the digital nature of electronic documents but simplifies them. This, as Hedstrom states, is to save files in “the simplest possible digital formats in order to minimize the requirements for sophisticated retrieval software.” The cost of this process is low, and data can easily be transferred from one program or system to another without significant loss of content. The main problem is that many digital materials, especially multimedia files, are not simply textual and numeric data, and require a richer format than an ASCII text file or another simple form can supply. Hedstrom even states explicitly that the solution will only be effective “where retaining the content is paramount, but display, indexing, and computational characteristics are not critical.” An interesting point made by Rothenberg, which adds to the argument against this method, is that digital preservation methods cannot be limited to text. Many multimedia documents are being created today. Rothenberg's article declares that “the generation of multimedia records has increased rapidly in recent years, to include audio recordings, graphical charts, photographic imagery, and video presentations, among others,” and that “multimedia and hypermedia records are likely to become ever more popular and may well become dominant in the near future.” To be truly useful, these multimedia and hypermedia documents must be preserved, which would require a more complex and universal solution than one that simply captures text.
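As an illustration of the simplest-format strategy, the sketch below reduces an HTML document to plain ASCII text using only the Python standard library. It deliberately discards markup, images, and layout, which is precisely the trade-off Hedstrom describes.

```python
# Minimal sketch of "simplest possible format" preservation: content
# survives; display and structure are deliberately thrown away.
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect only the text nodes of an HTML document."""
    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        self.chunks.append(data)

def to_plain_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    text = " ".join(self.strip() for self in parser.chunks if self.strip())
    # Encode to ASCII, dropping anything a "simplest form" cannot carry.
    return text.encode("ascii", errors="ignore").decode("ascii")

print(to_plain_text("<html><body><h1>Report</h1><p>Main text.</p></body></html>"))
# -> "Report Main text."
```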
Hedstrom mentions the possibility of simply retaining old computer hardware and software, so that updating files becomes unnecessary. She states that this “would support replay of original sources and contribute to the preservation of software as a significant cultural and intellectual resource in its own right.” Hedstrom makes clear that feasibility studies should be done before putting this into practice, as it would certainly cause problems. It could be expensive, and would leave libraries holding an enormous amount of hardware and software. Maintenance would also be difficult: computer hardware and storage media have limited lifespans, and finding someone to repair an obsolete computer system is no easy task. Rothenberg and Granger (“Emulation”) agree; Granger notes that Rothenberg cited the cost, the need to keep building new device interfaces, and the limited lifetime of computer microchips as arguments against the idea.
Perhaps the three most significant strategies for digital preservation are migration, emulation, and encapsulation. Migration is, in essence, copying data onto a new set of computer hardware and software. One disadvantage of this technique, according to the Preserving Access to Digital Information (PADI) website, is that “[m]igration to new operating environments often means that the copy is not exactly the same as the original piece of information.” Moore's article addresses the problem of time consumption in migration: “[t]he concern is that when the data storage technology becomes obsolete, the time needed to migrate to new technology may exceed the lifetime of the hardware and software systems that are being used. This is exacerbated by the need to be able to retrieve information from the archived data.” Not only is copying documents time-consuming, it is also far from foolproof. As Rothenberg states, “the copy process must avoid corrupting documents via compression, encryption, or changing data formats.” It is a difficult process, and not one that should have to be repeated every few years. Methods of migration include transferring data to non-digital media, using standard digital formats (a somewhat risky choice in a world of ever-changing standards), and using software that can decode data from older programs or from earlier versions of the same program (as recent versions of Microsoft Word can do). Backward compatibility is not always possible, though. The Commission on Preservation and Access makes it clear that “copying depends either on the compatibility of present and past versions of software and generations of hardware or the ability of competing hardware and software product lines to interoperate. In respect of these factors -- backward compatibility and interoperability -- the rate of technological change exacts a serious toll on efforts to ensure the longevity of digital information.” The report goes on to note that “it is costly and difficult for vendors to assure that their products are either ‘backwardly compatible’ with previous versions or that they can interoperate with competing products.” Despite its flaws, however, migration is probably the most feasible solution using modern technology.
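The corruption risk Rothenberg raises can at least be checked mechanically. The sketch below shows one common safeguard, verifying a checksum after a bit-for-bit copy; it addresses only the copying step of migration, not the much harder problem of format conversion.

```python
# Hedged sketch of a migration safeguard: copy a file bit-for-bit to new
# storage and verify a checksum, so the copy step itself cannot silently
# corrupt the document. Paths and names are illustrative.
import hashlib
import shutil
from pathlib import Path

def sha256(path: Path) -> str:
    """Hash a file in blocks so large objects do not exhaust memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for block in iter(lambda: handle.read(65536), b""):
            digest.update(block)
    return digest.hexdigest()

def migrate(source: Path, destination: Path) -> None:
    original = sha256(source)
    shutil.copy2(source, destination)   # copy2 also preserves timestamps
    if sha256(destination) != original:
        raise IOError(f"migration of {source} corrupted the copy")
```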
Emulation is defined by the PADI website as “the process of mimicking, in software, a piece of hardware or software so that other processes think the original equipment/function is still available in its original form.” The site cites the fact that the data need not be changed as a major advantage of this method. Problems with emulation include its cost, which might be prohibitive because of issues with intellectual property rights (Granger, “Emulation”). David Bearman also objects to emulation, suggesting that it “would not preserve electronic records as evidence even if it could be made to work and is serious overkill for most electronic documents where preserving evidence is not a requirement.” He argues that Rothenberg, a major proponent of emulation, concentrates too much on the functionality of computer systems and not enough on the actual content of records.
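The idea is easiest to see in miniature. The toy interpreter below “emulates” an invented three-instruction legacy machine; the point is that the preserved program is never altered, and only the environment that runs it is rebuilt. The machine itself is a fabrication for illustration, not any real system.

```python
# Toy illustration of emulation: a software stand-in for a hypothetical
# obsolete machine. The "old" program bytes stay in their original form.
def emulate(program):
    """Run (op, arg) instructions for an imaginary legacy stack machine
    with three operations: PUSH, ADD, and PRINT."""
    stack, output = [], []
    for op, arg in program:
        if op == "PUSH":
            stack.append(arg)
        elif op == "ADD":
            stack.append(stack.pop() + stack.pop())
        elif op == "PRINT":
            output.append(stack.pop())
    return output

# The preserved artifact is interpreted as-is, never converted.
legacy_program = [("PUSH", 2), ("PUSH", 3), ("ADD", None), ("PRINT", None)]
assert emulate(legacy_program) == [5]
```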
The PADI website identifies encapsulation as “a technique of grouping together a digital object and anything else necessary to provide access to that object.” The information in an encapsulation package should include “the representation information used to interpret the bits appropriately for access; the provenance to describe the source of the object; the context to describe how the object relates to other information outside the container; reference to provide one or more identifiers to uniquely identify the object; and fixity to provide evidence that the object has not been altered.” This information makes it much easier for later computer programs to interpret the original data in its original form. Still, the technique is not foolproof, as it requires future computers to be able to interpret the data in the old package. For obvious reasons, capsules can only be based on present technology, not on whatever might be used in the future. As long as each new generation of information on standards and programs is added to the package, it can remain usable for a long time. While not as time-consuming as repeatedly copying the documents, this process could still require a significant amount of extra work, as well as some extra cost.
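A minimal encapsulation package might look like the sketch below, which bundles the five elements the PADI site lists into a JSON manifest alongside the object. The manifest layout and the sample values are assumptions for illustration, not a published standard.

```python
# Sketch of an encapsulation manifest covering PADI's five elements:
# representation information, provenance, context, reference, and fixity.
# All concrete values are hypothetical examples.
import hashlib
import json
from pathlib import Path

def encapsulate(object_path: Path, manifest_path: Path) -> None:
    payload = object_path.read_bytes()
    manifest = {
        "representation": "plain ASCII text, one record per line",
        "provenance": "scanned and OCR'd from the 1987 print run",
        "context": "item 12 of 40 in the annual-reports collection",
        "reference": f"urn:example:{object_path.stem}",  # hypothetical URN
        "fixity": {
            "algorithm": "sha256",
            "digest": hashlib.sha256(payload).hexdigest(),
        },
    }
    # The manifest travels with the object so future systems can verify
    # and interpret it without outside knowledge.
    manifest_path.write_text(json.dumps(manifest, indent=2))
```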
Granger (“Emulation”) advises using a combination of these methods. For instance, if the storage media become obsolete but the document formats do not, migration would be the most sensible strategy, while emulation could be used when the software and operating systems are in danger of becoming obsolete. In this way, the advantages of each method can be exploited while its disadvantages are avoided.
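Granger's advice reduces to a simple decision rule, sketched below; the two boolean inputs are, of course, drastic simplifications of a real appraisal process.

```python
# Compact restatement of Granger's combined strategy as a decision rule.
def choose_strategy(media_obsolete: bool, software_obsolete: bool) -> str:
    if software_obsolete:
        return "emulation"   # rebuild the environment, keep the bits intact
    if media_obsolete:
        return "migration"   # recopy the bits; the formats are still readable
    return "monitor"         # neither is at risk yet; keep watching
```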
Metadata are also an important aspect of digital preservation. As Titia van der Werf-Davelaar indicates, such data will be necessary to future researchers trying to convert previously stored information to more modern forms. The metadata should include information about the format of the document in question, the requirements for accessing the data, and how the information has changed from its earliest form. As the article states, “[t]he parts that need to be emulated need to be specified in detail (metadata) in a high-level language and the user needs to be educated to ‘use’ the digital original -- as future generations will not know how to interact with obsolete IT-based end user environments.” Regardless of which conversion method is used, those making conversions will find metadata useful in determining exactly what needs to be done, how to do it, and what differences might exist between the new version and the original. Unfortunately, metadata standards may be slow to emerge. According to Charles Thomas and Linda Griffin, neither businesses nor educational institutions want to implement metadata standards, because creating the metadata would be costly (with no immediate benefit visible to those supplying the money) and time-consuming. They suggest that a metadata standard will only come into existence when it becomes profitable. Metadata standards matter for preservation, however. Stewart Granger (“Metadata”) writes, “We should recognise and accept that art historians, say, will have special and different requirements from, say, mathematicians, and vice versa. But what we should resist as far as possible is the situation where metadata can meet the needs for, say, resource discovery perfectly but does nothing for preservation or rights management - and vice versa.” While not every field would create and use metadata in the same manner, there should certainly be standards for creation, discovery, and preservation.
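A preservation-metadata record of the kind van der Werf-Davelaar describes might look like the following sketch; the field names and values are hypothetical, not drawn from any published standard.

```python
# Illustrative record covering the three things converters will need:
# the document's format, what is required to access it, and how it has
# changed since its earliest form. All values are invented examples.
preservation_metadata = {
    "format": "WordPerfect 5.1 document",
    "access_requirements": ["MS-DOS 5.0", "WordPerfect 5.1 or compatible"],
    "change_history": [
        {"date": "1992-03-01", "event": "created on original system"},
        {"date": "2001-06-15", "event": "migrated from 5.25-inch floppy "
                                        "to server storage; bits unchanged"},
    ],
}
```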
Commission on Preservation and Access (1 May 1996). “Preserving Digital Information.” Available: http://www.rlg.org/ArchTF/tfadi.index.htm
Granger, Stewart (October 2000). “Emulation as a Digital Preservation Strategy.” D-Lib Magazine 6(10). Available: http://www.dlib.org/dlib/october00/granger/10granger.html
Granger, Stewart. “Metadata and Digital Preservation: A Plea for Cross-Interest Collaboration.” Available: http://dspace.dial.pipex.com/stewartg/metpres.html
Hedstrom, Margaret. “Digital Preservation: A Time Bomb for Digital Libraries.” Available: http://www.uky.edu/~kiernan/DL/hedstrom.html
Moore, Reagan, et al. (March 2000). “Collection-Based Persistent Digital Archives—Part 1.” D-Lib Magazine 6(3). Available: http://www.dlib.org/dlib/march00/moore/03moore-pt1.html
“PADI—Preserving Access to Digital Information” (15 November 1999). Available: http://www.nla.gov.au/padi/
Rothenberg, Jeff (January 1998). “Avoiding Technological Quicksand: Finding a Viable Technical Foundation for Digital Preservation.” Council on Library and Information Resources. Available: http://www.clir.org/pubs/reports/rothenberg/contents.html
Thomas, Charles F., and Linda S. Griffin (1998). “Who Will Create the Metadata for the Internet?” First Monday. Available: http://www.firstmonday.dk/issues/issue3_12/thomas/
Werf-Davelaar, Titia van der (September 1999). “Long-Term Preservation of Electronic Publications.” D-Lib Magazine 5(9). Available: http://www.dlib.org/dlib/september99/vanderwerf/09vanderwerf.html