Why genealogists should digitize local history
Copyright
This is a concern, but it should not paralyze you. Some important points: copyright does not last forever; it does NOT
exist to protect the author; copying or digitizing a work does not confer copyright;
and people acting in good faith have some safeguards. It is important to
be polite. Do not use others' work without at least giving them credit.
Standards and Best Practices
The goal of standards and best practices is to handle the item once. The
problems with standards are that they will change, they offer only temporary
protection, and they are beyond the means of most individuals. The answer may
be responsible minimalism. Don't let the perfect be the enemy of the possible.
Access and Preservation
Digitizing a document is different from preserving it: the goal of
preservation is to provide access to an original item. Digitization
complements preservation by protecting the original and providing far superior
access. Genealogists should focus on access.
Graphics (photographs, etc.)
Text
Scanner jargon
Other definitions can be found at TechEncyclopedia.
An Existing Digital Document. Local historians, genealogists and others often have interesting documents already in electronic format. These documents can be easily converted to HTML for web publication. The Index to Elsie's Scrapbook and Lincoln High School Graduates 1904-2000 are examples of this kind of document.
Retyping an Older Document. An older document can simply be entered into a word processing program, checked for accuracy, converted to HTML and then web published. This biography and pension request are examples of this method. This method is suitable for text-only documents, where original format is less important than the content.
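As a rough illustration of the "converted to HTML" step, the sketch below wraps retyped text in a bare-bones web page using only Python's standard library. The function name text_to_html and the sample text are hypothetical, and this is not necessarily how the examples above were produced; most word processors have their own HTML export.

```python
import re
from html import escape


def text_to_html(title, text):
    """Wrap plain text in a minimal HTML page.

    Paragraphs are assumed to be separated by blank lines.
    """
    paragraphs = [p for p in re.split(r"\n\s*\n", text) if p.strip()]
    body = "\n".join(
        # Collapse line breaks inside a paragraph and escape &, < and >.
        f"<p>{escape(' '.join(p.split()))}</p>"
        for p in paragraphs
    )
    return (
        "<!DOCTYPE html>\n"
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f"{body}\n"
        "</body>\n"
        "</html>\n"
    )


# Hypothetical sample text, for illustration only.
sample = "First line\nwrapped.\n\nSmith & Sons"
print(text_to_html("Pension Request", sample))
```

Escaping matters for older documents, since characters such as "&" are common in business names and would otherwise be misread by a browser.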
Scan and OCR. Text can be digitized using optical character recognition (OCR). The resulting text will need to be proofread thoroughly, especially if the original is not laser quality. This proofreading can be more time-consuming than simply re-entering the document. Graphics can be scanned separately and combined with the text. If the text is used to create HTML, the resulting files are small and can be viewed in any browser. Since this method changes the format of the document, it is best used when the format is relatively unimportant. Centennial Story 1890-1990 is an example of OCR text with a separate graphic section, which reflects the original format. Each chapter of the book has been placed in a separate file for ease of access. The Appendix was updated to include additional information.
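Part of the proofing can be narrowed down with a small script. The sketch below is only one possible heuristic, not a substitute for reading the text against the original: it flags words that mix letters and digits, a common sign that OCR has confused characters such as "l" and "1" or "O" and "0". The function name flag_suspect_tokens and the sample lines are made up for illustration.

```python
import re

# A word containing both a letter and a digit (e.g. "l890" or "W00d").
# Legitimate words such as "1st" are flagged too, so treat hits as hints.
MIXED = re.compile(r"\b(?=\w*[A-Za-z])(?=\w*\d)\w+\b")


def flag_suspect_tokens(text):
    """Return (line_number, word) pairs worth checking against the original."""
    suspects = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for word in MIXED.findall(line):
            suspects.append((line_no, word))
    return suspects


# Hypothetical OCR output with two misread words.
ocr_text = "Centennial Story l890-1990\nW00d County place names"
for line_no, word in flag_suspect_tokens(ocr_text):
    print(f"line {line_no}: check '{word}'")
```

A check like this catches only one class of error. Misread letters that still form a plausible word will pass unnoticed, which is why thorough proofing is still needed.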
Page views. When the format or feel of the original document is important, page views allow that to be replicated. Since the master TIFF files are difficult to display, smaller GIF files, or files in some other format, are used for display. The Making of America site uses page views, with a large custom database to manage them. This is beyond the means or needs of most genealogists.
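For a short document, page views do not strictly require a database. As a minimal sketch, assuming the display images already exist and that each page view is saved under a hypothetical name such as page001.html, a few lines of Python can generate one HTML page per image with Previous and Next links. This is not how Making of America works; it only shows the idea at a small scale.

```python
from html import escape


def page_view_html(title, images, index):
    """Return HTML for one page view with Previous/Next navigation.

    images is the ordered list of display image file names;
    index is the 0-based position of the page to show.
    Page views are assumed to be saved as page001.html, page002.html, ...
    """
    nav = []
    if index > 0:
        nav.append(f'<a href="page{index:03d}.html">Previous</a>')
    nav.append(f"Page {index + 1} of {len(images)}")
    if index < len(images) - 1:
        nav.append(f'<a href="page{index + 2:03d}.html">Next</a>')
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<p>{' | '.join(nav)}</p>\n"
        f'<img src="{escape(images[index])}" alt="Page {index + 1}">\n'
        "</body>\n"
        "</html>\n"
    )


# Hypothetical file names for a three-page pamphlet.
pages = ["scan1.gif", "scan2.gif", "scan3.gif"]
print(page_view_html("Sample pamphlet", pages, 1))
```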
Adobe Acrobat. If the original format of a document is important, Acrobat (not Acrobat Reader) can be used to replicate it. A page is scanned, usually creating a TIFF file. Acrobat converts the TIFF file into a PDF file. Acrobat can also take a series of scanned pages and combine them into one document (file), which has important advantages in terms of display. Acrobat sells for about $250 and is a sophisticated program, but not beyond the ability of a dedicated amateur.
- Wood County place names is a 130-page book scanned bi-tonally at 600 dpi. It is available as a 21 MB file or broken into sections. Acrobat gathers the scanned pages together and retains the flavor of the original, but the resulting file is much larger than an unformatted HTML file of the text would be.
- Rules and Regulations of the T. B. Scott Free Public Library, a four-page pamphlet, was scanned in grayscale because it was originally printed on colored paper. The format of the original is an important part of the charm of this document, warranting its publication in PDF format. The text of this document would require only 17K in plain HTML, but takes 1.7 MB in grayscale TIFF or PDF.
- Official Historical Program - Wood County Centennial is a 28-page document with dozens of photographs scattered throughout the text. Its small size made it appropriate for in-house digitization. The text was scanned bi-tonally, with grayscale photographs pasted in. Acrobat was used to gather the scanned pages.
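The file sizes in these examples follow from simple arithmetic. The sketch below estimates the uncompressed size of a scanned page and works out two figures from the numbers quoted above. The page dimensions (8.5 x 11 inches) are an assumption for illustration, since the actual page sizes are not given, and compressed files such as PDFs will be much smaller than the raw estimate.

```python
def uncompressed_scan_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw (uncompressed) size of one scanned page in bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    # Assume each row of pixels is padded up to a whole number of bytes.
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Hypothetical letter-size page scanned at 600 dpi.
bitonal_600 = uncompressed_scan_bytes(8.5, 11, 600, 1)  # 1 bit per pixel
gray_600 = uncompressed_scan_bytes(8.5, 11, 600, 8)     # 8 bits per pixel

# Figures quoted above: 21 MB for 130 pages, and 1.7 MB versus 17K.
mb_per_page = 21 / 130
pdf_to_html_ratio = (1.7 * 1000) / 17  # treating 1 MB as 1000 K

print(f"Bi-tonal page, raw: {bitonal_600 / 1_000_000:.1f} MB")
print(f"Grayscale page, raw: {gray_600 / 1_000_000:.1f} MB")
print(f"Wood County place names: about {mb_per_page:.2f} MB per page")
print(f"Page images versus plain HTML: about {pdf_to_html_ratio:.0f} times larger")
```

Under these assumptions a grayscale scan holds eight times as much raw data as a bi-tonal scan of the same page, which is one reason to reserve grayscale for items where the look of the paper matters.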
Scan and Post. If an item is mainly graphical (such as photographs), the graphics can be scanned and web published. Scanners are inexpensive and usually include graphics software that will help clean up the files and save them in the best file format. The Young Postcards are an example of this method. Thumbnails (small graphics linked to the larger version) have been used to make the collection easy to browse.
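A thumbnail page is just a set of small images, each wrapped in a link to its larger version. The sketch below builds such a page, assuming the scanner software has already saved both a full-size file and a thumbnail for each item. The function name gallery_html and the file names are hypothetical, and this is not necessarily how the Young Postcards pages were built.

```python
from html import escape


def gallery_html(title, items):
    """Build an HTML page of thumbnails, each linked to its larger image.

    items is a list of (full_image, thumbnail, caption) tuples.
    """
    links = []
    for full_image, thumbnail, caption in items:
        links.append(
            f'<a href="{escape(full_image)}">'
            f'<img src="{escape(thumbnail)}" alt="{escape(caption)}"></a>'
        )
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        + "\n".join(links)
        + "\n</body>\n</html>\n"
    )


# Hypothetical file names and captions.
cards = [
    ("card01.jpg", "card01_thumb.jpg", "Main Street & 2nd"),
    ("card02.jpg", "card02_thumb.jpg", "Court House"),
]
print(gallery_html("Postcards", cards))
```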
Databases. While databases can be used to provide access to large numbers of records or photographs, putting one online is beyond the means of most individuals. It is relatively easy to generate static HTML pages from most database or spreadsheet programs, though static pages are harder to update. If you have a database, consider partnering with someone who can handle loading it.
- Index to the 1928 Standard Atlas of Wood County, Wisconsin / compiled by Marlys Manley Steckler. This was originally a database created with Excel. Simply using SAVE AS HTML created the static pages.
- Stevens Point Area Obituary Index. This on-line database can be searched and easily updated.
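If a program has no SAVE AS HTML option, a similar static page can be produced from an exported file. The sketch below is one way to do it, assuming the index has been exported as comma-separated text with a header row. The function name csv_to_html_table and the sample rows are invented for illustration and are not taken from either index above.

```python
import csv
import io
from html import escape


def csv_to_html_table(csv_text):
    """Convert CSV text (first row = column headings) into an HTML table."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return "<table></table>"
    header, *records = rows
    lines = ["<table>"]
    lines.append(
        "<tr>" + "".join(f"<th>{escape(h)}</th>" for h in header) + "</tr>"
    )
    for record in records:
        lines.append(
            "<tr>" + "".join(f"<td>{escape(c)}</td>" for c in record) + "</tr>"
        )
    lines.append("</table>")
    return "\n".join(lines)


# Hypothetical index entries.
exported = "Name,Page\nSmith & Co.,12\n\"Doe, Jane\",40\n"
print(csv_to_html_table(exported))
```

The trade-off is the one noted above: the page is easy to create and host, but it has to be regenerated whenever the underlying data changes.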
Now that you have a group of digital documents, post them to your web site. Web space is usually included with your Internet account. You can also use a site at Geocities or another "free" site provider. This provides a stable address when you change Internet providers, but involves other inconveniences. For advice on designing a genealogy-related site, check Cyndi's List or these other sites. Then make yourself known.
Digitization in Public Libraries Web Site - Handouts and advice from the 2001 PLA Spring Symposium on digitization. Be sure to check the Resource list.
This page was prepared by Andy Barnett, a genealogist and Assistant Director of McMillan Memorial Library. Many examples used are from the Library's collection of digital historical titles. His home address is hof_1991 (at) hotmail.com.
This page is located at http://www.oocities.org/onelibrarian.geo/digitize_gen.html.
Last updated April 14, 2002