Digitization for Genealogists

Why genealogists should digitize local history

Copyright
Copyright is a concern, but it should not paralyze you. Some important points: copyright is not forever; it does NOT exist to protect the author; copying or digitizing does not confer copyright; and people acting in good faith have some safeguards. It is important to be polite. Do not use others' work without at least giving them credit.

Standards and Best Practices
The goal of standards and best practices is to handle the item once. The problems with standards are: they will change, they offer only temporary protection and they are beyond the means of most individuals. The answer may be responsible minimalism. Don't let the perfect be the enemy of the possible. 

Access and Preservation
Digitizing a document is different from preserving it. The goal of preservation is to provide access to an original item. Digitization complements preservation by protecting the original and providing far superior access. Genealogists should focus on access.

Some Concepts, Definitions and Jargon

Graphics (photographs, etc.)

Text

Scanner jargon

Other definitions can be found at TechEncyclopedia.

Types of Digital Projects

An Existing Digital Document. Local historians, genealogists and others often have interesting documents already in electronic format. These documents can be easily converted to HTML for web publication. The Index to Elsie's Scrapbook and Lincoln High School Graduates 1904-2000 are examples of this kind of document.

Retyping an Older Document. An older document can simply be entered into a word processing program, checked for accuracy, converted to HTML and then web published. This biography and pension request are examples of this method. This method is suitable for text-only documents, where the original format is less important than the content.
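As a rough illustration of the "converted to HTML" step, the following Python sketch wraps retyped plain text in a minimal HTML page. It assumes paragraphs are separated by blank lines; the function name text_to_html and the sample biography text are invented for this example and do not come from the documents mentioned above.

```python
import html


def text_to_html(title, text):
    """Wrap plain text in a minimal HTML page.

    Assumption: paragraphs in the retyped text are separated by blank lines.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    # Collapse line breaks inside a paragraph and escape &, < and >.
    body = "\n".join(
        "<p>" + html.escape(" ".join(p.split())) + "</p>" for p in paragraphs
    )
    safe_title = html.escape(title)
    return (
        "<html>\n<head><title>" + safe_title + "</title></head>\n"
        "<body>\n<h1>" + safe_title + "</h1>\n" + body + "\n</body>\n</html>\n"
    )


# Invented sample text, for illustration only.
sample = (
    "John Smith was born in 1840.\n\n"
    "He served in Co. A & was\ndischarged in 1865."
)
page = text_to_html("Biography of John Smith", sample)
print(page)
```

Most word processors can also save a document as HTML directly. A small script like this is mainly useful when the plainest possible file is wanted.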

Scan and OCR. Text can be digitized using OCR. The resulting text will need to be proofed thoroughly, especially if the original is not laser quality. This proofing can be more time consuming than simply re-entering the document. Graphics can be scanned separately and combined with the text. If the text is used to create HTML, the resulting files are small and can be viewed in any browser. Since this method changes the format of the document, it is best used when the format is relatively unimportant. Centennial Story 1890-1990 is an example of OCR text with a separate graphic section, which reflects the original format. Each chapter of the book has been placed in a separate file for ease of access. The Appendix was updated to include additional information.
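Proofing OCR text is mostly careful reading, but a simple script can point out likely trouble spots. The Python sketch below flags words that mix letters and digits, a common sign that the OCR software confused characters such as "l" and "1". This is only one heuristic chosen for illustration: it will miss many errors and will flag legitimate items such as "2nd". The function name suspicious_tokens and the sample text are invented.

```python
import re


def suspicious_tokens(text):
    """Return (line number, word) pairs for words mixing letters and digits."""
    flagged = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for token in re.findall(r"[A-Za-z0-9]+", line):
            has_letter = any(ch.isalpha() for ch in token)
            has_digit = any(ch.isdigit() for ch in token)
            if has_letter and has_digit:
                flagged.append((line_no, token))
    return flagged


# Invented sample of OCR output, for illustration only.
sample = "The vi11age was founded in 1890.\nThe rnill opened in l892."
for line_no, token in suspicious_tokens(sample):
    print("line", line_no, "check:", token)
```

Note that "rnill" (a misread "mill") is not flagged, which is why a full read-through is still needed.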

Page views. When the format or feel of the original document is important, page views allow that to be replicated. Since the master TIFF files are difficult to display, smaller GIF files or files in some other format are used for display. The Making of America site uses page views, with a large custom database to manage them. This is beyond the means or needs of most genealogists.

Adobe Acrobat. If the original format of a document is important, Acrobat (not Acrobat Reader) can be used to replicate it. A page is scanned, usually creating a TIFF file. Acrobat converts the TIFF file into a PDF file. Acrobat can also take a series of scanned pages and combine them in one document (file), with important advantages in terms of display. Acrobat sells for about $250 and is a sophisticated program, but not beyond the ability of a dedicated amateur.

Scan and Post. If an item is mainly graphical (such as photographs), the graphics can be scanned and web published. Scanners are inexpensive and usually include graphic software that will help clean up the files and save them in the best file format. The Young Postcards are an example of this method. Thumbnails (small graphics linked to the larger version) have been used to make it easy to browse. 
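A thumbnail page is just a list of small images, each linked to its larger version. The Python sketch below builds such a page from a list of file names. It assumes the thumbnails have already been made with your graphic software and saved next to the full-size files with "_thumb" added to the name (for example card01.jpg and card01_thumb.jpg). That naming scheme, the function name thumbnail_index and the sample file names are assumptions made for this example.

```python
import html
from pathlib import PurePosixPath


def thumbnail_index(title, image_names, thumb_suffix="_thumb"):
    """Build an HTML page of thumbnails, each linked to its full-size image.

    Assumption: the thumbnail for card01.jpg is named card01_thumb.jpg.
    """
    links = []
    for name in image_names:
        full = PurePosixPath(name)
        thumb = full.with_name(full.stem + thumb_suffix + full.suffix)
        links.append(
            '<a href="{0}"><img src="{1}" alt="{2}"></a>'.format(
                html.escape(str(full)),
                html.escape(str(thumb)),
                html.escape(full.stem),
            )
        )
    safe_title = html.escape(title)
    return (
        "<html>\n<head><title>" + safe_title + "</title></head>\n"
        "<body>\n<h1>" + safe_title + "</h1>\n"
        + "\n".join(links)
        + "\n</body>\n</html>\n"
    )


# Invented file names, for illustration only.
gallery = thumbnail_index("Postcards", ["card01.jpg", "cards/card02.jpg"])
print(gallery)
```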

Databases. While databases can be used to access large numbers of records or photographs, this is beyond the means of most individuals. It is relatively easy to generate HTML pages from most database or spreadsheet programs, though that makes updating harder. If you have a database, consider partnering with someone who can handle loading it.
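Most database and spreadsheet programs can export records as a comma-separated (CSV) file. As one possible way to generate a static HTML page from such an export, the Python sketch below turns CSV text into an HTML table. It assumes the first row holds the column headings; the function name csv_to_html_table and the sample records are invented for this example.

```python
import csv
import html
import io


def csv_to_html_table(csv_text):
    """Convert CSV text to an HTML table.

    Assumption: the first non-empty row contains the column headings.
    """
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    if not rows:
        return "<table></table>"
    header, records = rows[0], rows[1:]
    lines = ["<table>"]
    lines.append(
        "<tr>" + "".join("<th>" + html.escape(c) + "</th>" for c in header) + "</tr>"
    )
    for record in records:
        lines.append(
            "<tr>"
            + "".join("<td>" + html.escape(c) + "</td>" for c in record)
            + "</tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Invented sample records, for illustration only.
sample_csv = 'Name,Class\n"Smith, John",1904\nJones & Brown,1905\n'
table = csv_to_html_table(sample_csv)
print(table)
```

Because the page is static, it has to be regenerated whenever the records change, which is the updating cost mentioned above.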

Creating a Web Site

Now that you have a group of digital documents, post them to your web site. Web space is usually included with your Internet account. You can also use a site at Geocities or another "free" site provider. This provides a stable address when you change Internet providers, but involves other inconveniences. For advice on designing a genealogy-related site, check Cyndi's List or these other sites. Then make yourself known.

Recommended Internet Sites

Digitization in Public Libraries Web Site - Handouts and advice from the 2001 PLA Spring Symposium on digitization. Be sure to check the Resource list.

This page was prepared by Andy Barnett, a genealogist and Assistant Director of McMillan Memorial Library. Many examples used are from the Library's collection of digital historical titles. His home e-mail address is hof_1991 (at) hotmail.com.

This page is located at http://www.oocities.org/onelibrarian.geo/digitize_gen.html

Last updated April 14, 2002