Archiving digital data

We recommend that production master image files be stored on hard drive systems with a level of data redundancy, such as RAID drives, rather than on optical media, such as CD-R. An additional set of images with metadata stored on an open standard tape format (such as LTO) is recommended (CD-R as backup is a less desirable option), and a backup copy should be stored offsite. Regular backups of the images from the RAID drives onto tape are also recommended. A checksum should be generated and stored with the image files.
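As a minimal sketch of the checksum recommendation above: the text does not name an algorithm or a file layout, so SHA-256 and a manifest file called `checksums.sha256` stored next to the images are assumptions made here for illustration, as are the function names.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def write_manifest(image_dir, manifest_name="checksums.sha256"):
    """Write one '<digest>  <filename>' line per file in image_dir.

    The manifest name is a hypothetical convention, not taken from the text.
    """
    image_dir = Path(image_dir)
    lines = []
    for path in sorted(image_dir.iterdir()):
        # Skip sub-directories and the manifest itself.
        if path.is_file() and path.name != manifest_name:
            lines.append(f"{sha256_of(path)}  {path.name}")
    manifest = image_dir / manifest_name
    manifest.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return manifest
```

Keeping the manifest in the same directory means it travels with the images onto tape and offsite copies, so any later copy can be checked against it.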

Currently, we use CD-ROMs for distribution of images to external sources, not as a long-term storage medium. However, if images are stored on CD-ROMs, we recommend using high quality or "archival" quality CD-Rs (such as Mitsui Gold Archive CD-Rs). The term "archival" indicates that the materials used to manufacture the CD-R (usually the dye layer where the data is recorded, a protective gold layer to prevent pollutants from attacking the dye, or a physically durable top-coat to protect the surface of the disk) are reasonably stable and have good durability, but this will not guarantee the longevity of the media itself. All disks need to be stored and handled properly. We have found files stored on brand name CD-Rs that we could not open less than a year after they were written to the media. We recommend not using inexpensive or non-brand name CD-Rs, because generally they will be less stable, less durable, and more prone to recording problems. Two (or more) copies should be made; one copy should not be handled and should be stored offsite. Most importantly, a procedure for migration of the files off the CD-ROMs should be in place. In addition, all copies of the CD-ROMs should be periodically checked for data integrity using a metric such as a CRC (cyclic redundancy check). For large-scale projects or for projects that create very large image files, the limited capacity of CD-R storage will be problematic. DVD-Rs may be considered for large projects; however, DVD formats are not as standardized as the lower-capacity CD-ROM formats, and compatibility and obsolescence in the near future are likely to be a problem.
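The periodic CRC check mentioned above could look roughly like the sketch below. It assumes that a CRC-32 value was recorded for every file when the disk was written; the `recorded` dictionary and both function names are hypothetical, chosen only for this example.

```python
import os
import zlib


def crc32_of(path, chunk_size=1 << 20):
    """Return the CRC-32 of a file as an unsigned integer."""
    crc = 0
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            crc = zlib.crc32(chunk, crc)
    return crc & 0xFFFFFFFF


def find_damaged(recorded, copy_dir):
    """Compare files in copy_dir with recorded {filename: crc} values.

    Returns the sorted names of files that are missing, unreadable
    or whose contents no longer match the recorded CRC.
    """
    damaged = []
    for name, expected in recorded.items():
        path = os.path.join(copy_dir, name)
        try:
            actual = crc32_of(path)
        except OSError:
            # A read error on aging media counts as damage too.
            damaged.append(name)
            continue
        if actual != expected:
            damaged.append(name)
    return sorted(damaged)
```

Running `find_damaged` against each copy of a disk on a schedule gives an early signal that it is time to migrate the files to fresh media.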

28-1-06
On the longevity of burned CDs (not to be confused with commercially stamped CDs such as music or software):

Factory-pressed CDs are totally different from recordable CDs. In a pressed CD, the data is literally "molded into" (actually pressed into) the media and will not disappear unless the CD is physically damaged. Recordable CDs use a dye that changes color or reflectivity when heated. There are different dye types commonly used in recordable CDs--phthalocyanine, azo, and cyanine, in particular--and they do not all have the same life expectancy and stability...

All of the studies that I have seen except one suggest that properly burned one-time media (-R media, but not -RW media; see below) has an expected life of decades to possibly even centuries. There was a study by NIST (a U.S. government agency, used to be the National Bureau of Standards) on the relative stability of different media here:

StabilityStudy.pdf

You can see some comparisons in the NIST study of the different dye types. But this study did not attempt to extrapolate the data to a life expectancy, although it did provide data about the relative stability of the different dyes and reflection layers behind them.

However, opinions still differ as to how long such media will last. The OSTA (Optical Storage Technology Association), in a report here:

cdqa13.htm

suggests that optical recordable media will last 50 to 200 years. This observation is backed by quite a number of studies that I have seen done both by the media makers and others. However, some storage experts suggest numbers more in line with your question, for example the expert in this report suggests a life of only 2 to 5 years:

life_expectancy.html (I have a suspicion that this is the article that you read).

The bottom line is that you are not going to get one single answer that everyone agrees on, although I personally am confident that properly recorded CD-R media can last decades if not a century or two. These 3 articles provide a good starting point for understanding some of the variables involved, which include:

-Dye type
-Physical construction of the media
-Storage conditions (temperature, humidity, light exposure, mechanical stress, chemical exposure and air quality)
-Manufacturing conditions (can vary from batch to batch in otherwise identical media of the same brand)

Now let us mention some other things that are relevant and important:

-The quality of the burner. A borderline defective burner can "under expose" the media to the laser beam, producing a seemingly good recording (at the time of burning) that will "fade" over time (failing weeks, months, years or decades sooner than it would have if the laser beam intensity had been correct).

-Recording speed. Fast burns (52x) are probably less stable than somewhat slower burns (say 16x to 32x), but you can also burn media too slowly. There is a very good analogy here to photographic film and exposure levels. The dyes on a given media have a certain range of acceptable "exposures", and outside of that range you can either under or over expose the media to the laser beam. However, mechanical jitter and certain other variables (largely a function of the quality of the drive) generally will be unconditionally worse at faster speeds.

-Your own handling and storage practices. On a CD, the data "exists" in a dye layer on the label side of the media. This can be scratched from the back (from the label side), which will literally and directly destroy the data. The front side is clear plastic but can also be scratched. While front side damage may make the data less readable or completely unreadable, the data is still intact and undamaged on the label side, and the scratches on the front can normally be removed by polishing the plastic. On recordable DVDs, the data is on a layer "inside" the media, but the media is a laminate of several layers and can delaminate, destroying the data. Flexing, even VERY minor flexing, is particularly bad at causing such damage. Also, recordable DVDs tend to fail from the outside in, so you can reduce the incidence of failures by not recording such media beyond 80% to 90% of capacity. That leaves blank the outside edge, where the failure rate is greatest and failure occurs first.

-Labeling: The glues in adhesive labels, or the solvents in pen-type markers, both applied to the label side (the side containing the data) can SLOWLY penetrate the reflective backing and dye layers and destroy the data. Therefore, for archival media, the safest policy is to not label the CD or DVD itself at all. If you do label it, with either a label or a pen, you are, at best, taking a chance with your data. Hint: it is safe to write on the clear inner hub (where there is no data at all) with a suitable pen that will not rub off.

And, finally, I would be remiss if I did not mention one other factor which is really huge: Erasable "RW" media is FAR less stable than one-time "R" media and should absolutely not be used for any permanent recordings of any kind whatsoever. There is no question that RW media can and does "fade". Although I have never seen failure of "R" media that I could attribute with absolute certainty to dye instability, I routinely see "RW" recordings that are unreadable after periods of months to a year or two when there is really no other explanation for the failure. I see this both on CD-RW and DVD + / - RW media, and I advise people in the strongest possible terms not to use "RW" media for anything that they want to consider permanent. Since RW media is also both more expensive (a lot more expensive) and slower, from my perspective the decision to never even buy RW media at all is an easy one.

Submitted by: Barry W. of North Canton, OH

9-2-06
The licence agreements with manufacturers that want to use Philips' CD-R technology will from now on contain more information. The standard is being improved, which improves compatibility with other systems. In addition, the price drops from 4.5 to 2.5 dollars per disc.

11-2-06 Question:
How long do digital pictures last?

Answer:
Given the likely future of digital data storage, your best preservation for color photo images is in just that form: stored data. Barring the total breakdown of society and human civilization, there will continue to be institutional data preservation facilities available to everyone.
Putting your precious digital photo files "into the system" is likely to be the most surefire, long-term way to preserve them. Of course, you will always be able to produce physical media prints by whatever process is extant at the time you want to view the images, using whatever is the best cost/performance technology you wish to pay for at that time.
It is convenient to keep a backed up, working copy of your digital photo files in your own possession for the near term; this can be on the best media currently available, optical or magnetic (see previous articles about the longevity of CD media), but remember that any physical media is subject to deterioration, so you want redundancy. Keep more than one copy and keep the copies on different types of media.
Be careful to use data formats conformant to widely accepted, non-proprietary standards. Do not count on your digital camera's native format to be around for long. Get those images copied from the camera manufacturer's file format into some well-known open-standard format.
For physical prints, you have many choices, but it's a fact that all the color processes create prints that suffer from deterioration over the years. Most color dyes are not stable, especially when exposed to light. You can consult photographic experts to find the best "archival quality" print media and inks. Whether the color image is infused into the media, or layered onto the surface is not as important as the fundamental permanence of the chemistry used.
I personally know that Kodachrome transparencies, protected from light and humidity, last in excellent condition at least 60 years. Maybe longer - time will tell. I can scan my Kodachrome transparencies made around WW2 era and get good prints by various processes. Then I gently put the transparencies back in their special containers, out of the light. Even still, I know that those transparencies are aging - the dye particles are being hit by a few random bits of radiation and breaking down, from year to year. In a few centuries, these images will probably lose detail and color.
For color prints, there are a number of high quality processes from the photographic industry as well as the commercial printing world. These prints can be made from digital files just as easily as from traditional photographic negatives or positives.
In recent years, makers of inkjet, dye sublimation, and color laser printers have claimed archival permanence for their inks (toners). It remains to be proven, but such prints might be a good, low-cost way to keep your photographs at least for a decade or two.
If you can live with monochrome prints, things get more interesting. Various old photographic processes create images that are made up of very small particles of noble metals. Gold, silver, platinum (and other) processes create prints that, on archival quality papers, seem to be able to last for over a hundred years, perhaps much more if ambient conditions are controlled. And some of these old processes yield prints that are highly regarded aesthetically for their resolution and tonality. Simple carbon based inks are very stable - that's why you can still view old prints made from printing inks hundreds of years ago.
But the bottom line is this: Get your photos into open-standard-format digital files. Put those files on servers operated by reliable companies. Make some more temporal copies for your own use (magnetic and optical media) and keep those as working copies. Make prints from time to time, when you (or your descendants, or the future legal owners of the images) need them.
Let's hope that there will be humans around in a few hundred years who have leisure time (or work need) to enjoy your images.
Submitted by: Dion J.

Everything is going digital. Digital music, digital photos, digital movies. Is that a dangerous trend?

In 100 years, anything we put on electronic media will not exist. Yet anything published will still be around.

Someday flash memory will take over. But drives get so big, and they're so inexpensive, and so fast that memory hasn't been able to catch up. 200 gigs of flash memory would cost quite a bit.

19:18 2-3-06
Of course, each computer user in the family will always want their own local storage, but it may also be convenient and secure to have a central server where archives and backups live permanently.

When it comes to maintenance and storage, another wise investment is an uninterruptible power supply, both for that basement network server and for each computer in the house. When the power fails, the battery backup gives you at least a few minutes to save your work before the screen goes dark.

For larger systems, old-line UPS manufacturer APS offers a 1200VA system for a street price of about $149; it provides 8 outlets, surge protection, and battery runtime of well over an hour.

Companies like Intermatic, Leviton and Panamax all make high capacity surge protectors that mount right where your electrical service enters the house, and thus protect everything in your home from surge damage.

3-3-06
With government records, reports and documents increasingly being created and stored in digital form, there is a software threat to electronic access to government information and archives. The problem is that public information can be locked in proprietary software whose document formats become obsolete or cannot be read by people using software from another company.

To cope with the problem, 30 companies, trade groups, academic institutions and professional organizations are announcing today the formation of the OpenDocument Format Alliance, which will promote the adoption of open technology standards by governments.

But Microsoft supports another open standard for documents, called OpenXML Document Format. In Office 2007, which Microsoft will ship in the second half of the year, OpenXML will be the default format for saving documents instead of Microsoft's proprietary formats, said Alan Yates of the company's Office division.

The OpenXML format is supported by Intel, Apple, Toshiba, BP and the British Library, among others, Mr. Yates said. Microsoft submitted OpenXML to Ecma International, a standards body in Geneva, last year.

Open Archives Initiative (OAI)
OAI is an initiative to develop and promote interoperability standards that aim to facilitate the efficient dissemination of content.

Archive
The term "archive" in the name Open Archives Initiative reflects the origins of the OAI in the e-prints community where the term archive is generally accepted as a synonym for repository of scholarly papers. Members of the archiving profession have justifiably noted the strict definition of an "archive" within their domain; with connotations of preservation of long-term value, statutory authorization and institutional policy. The OAI uses the term "archive" in a broader sense: as a repository for stored information. Language and terms are never unambiguous and uncontroversial and the OAI respectfully requests the indulgence of the professional archiving community with this broader use of "archive".
(OAI definition quoted from FAQ on OAI Web site)

OAI Protocol for Metadata Harvesting (OAI-PMH)
OAI-PMH is a lightweight harvesting protocol for sharing metadata between services.

 

Protocol
A protocol is a set of rules defining communication between systems. FTP (File Transfer Protocol) and HTTP (Hypertext Transfer Protocol) are examples of other protocols used for communication between systems across the Internet.

Harvesting
In the OAI context, harvesting refers specifically to the gathering together of metadata from a number of distributed repositories into a combined data store.

Data Provider
A Data Provider maintains one or more repositories (web servers) that support the OAI-PMH as a means of exposing metadata.
(OAI definition quoted from FAQ on OAI Web site)

Service Provider
A Service Provider issues OAI-PMH requests to data providers and uses the metadata as a basis for building value-added services.
(OAI definition quoted from FAQ on OAI Web site)
A Service Provider in this manner is "harvesting" the metadata exposed by Data Providers.
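To make the harvesting idea concrete, here is a small sketch of how a Service Provider might build OAI-PMH request URLs. The six verbs and the `metadataPrefix=oai_dc` argument are part of OAI-PMH 2.0; the base URL `http://repository.example.org/oai` and the function name are made up for this example, and no request is actually sent.

```python
from urllib.parse import urlencode

# The six request types defined by OAI-PMH 2.0.
OAI_VERBS = {
    "Identify",
    "ListMetadataFormats",
    "ListSets",
    "ListIdentifiers",
    "ListRecords",
    "GetRecord",
}


def build_oai_request(base_url, verb, **arguments):
    """Return an OAI-PMH request URL for the given verb and arguments."""
    if verb not in OAI_VERBS:
        raise ValueError(f"unknown OAI-PMH verb: {verb}")
    query = {"verb": verb}
    query.update(arguments)
    return base_url + "?" + urlencode(query)


# Ask a (hypothetical) Data Provider for all records as Dublin Core.
url = build_oai_request(
    "http://repository.example.org/oai",
    "ListRecords",
    metadataPrefix="oai_dc",
)
print(url)
```

A harvester would fetch this URL over HTTP, parse the XML response, and repeat the request for each Data Provider it aggregates.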

PDFs can be a valid choice as long-term accessible documents. (Work is being done on a PDF variant based on PDF 1.4. The PDF/A or PDF-Archive is specifically scaled down for archival purposes.)

Microsoft Word documents can be converted into accessible PDFs, but only if the Word document is written with accessibility in mind - for example, using styles, correct paragraph mark-up and "alt" (alternative) text for images, and so on.

PDF on the Web

Documents described in markup languages such as HTML/XHTML delegate responsibility for many display decisions to the renderer. This means that an XHTML document can render quite differently across various web browser platforms. While the end user experience of an XHTML document can vary significantly depending on browser, platform, and screen resolution, a PDF file can be reasonably expected to look exactly the same to every viewer. The desire for greater control over user experience has led many authors to use the PDF format to publish online content. This is particularly true for order forms, catalogues, brochures, and other documents which are primarily formatted for printing. The ubiquity of Adobe Reader and wide corporate availability of easy to use WYSIWYG PDF authoring have further enticed many (mostly corporate) web authors to publish a wider variety of information as PDF.

Critics of this practice cite several reasons for avoiding it. The major one is that the inflexibility of PDF rendering makes it difficult to read on screen: it adapts neither to the window size nor to the reader's preferred font size and font family, as a classic XHTML web page does. PDF files tend to be significantly larger than XHTML/SVG files presenting the same information, making it difficult or impossible for users with low-bandwidth connections to view them. Adobe Reader, the de facto standard PDF viewer, has historically been slow to start and has caused browser instability, particularly when run alongside other browser plugins (Adobe Reader 7 addressed many of these concerns, but is not available under Windows 98/ME). Adobe Reader is also unavailable in current versions on many alternative operating systems and is distributed under a proprietary license unacceptable to some users. With each major release of Adobe (Acrobat) Reader, the installer package gets significantly larger to support extra features, but users are left without means to selectively install components.

Archival Gold CD-R's are made with a patented Phthalocyanine dye. When compared to Cyanine and Azo dyes found in the majority of CD-R's on the market, Phthalocyanine dye lasts significantly longer when subjected to the harmful effects of UV light, heat, and humidity. This combined with the non-corroding effects of gold make Archival Gold CD-R's the 300 year disc. Archival Gold DVD-R's have also been tested to last well over 100 years; both Archival Gold formats outlast their competitors by decades.

Mitsui 650MB Archive Gold CD-R
All of the main components in the MAM Gold CD-R are very stable in the environment; plastic (polycarbonate), Phthalocyanine dye and gold (it NEVER oxidizes).

ATIP - absolute time in pre-groove: a code embedded in all blank CDs that gives the burner important information about the disc, such as its manufacturer, type (data or music), capacity, supported writing speeds, absolute lead-in time and last possible position where data can be written.

Safe archiving

Everyone has files that he or she wants to keep safe for the future. In Windows XP that is no problem.

Proceed as follows:

1. Open Windows Explorer and right-click the desired file or group of files. (Hold down the Control key to select multiple files.)
2. Choose Send To > Compressed (zipped) Folder (in Dutch Windows: Kopiëren naar > Gecomprimeerde map). Windows now copies and compresses (zips) the files into a new folder. By default this file takes the name of the file, or of one of the files.
3. To change the name, right-click the current name. Choose Rename from the context menu and enter a new name. The .zip suffix must remain in place.
4. To protect the folder, open this new folder and choose File > Add a Password (Bestand > Wachtwoord toevoegen).
5. In the new window, enter a password and confirm it to be sure.
6. Click OK and close the folder.

The files now take up much less space on your hard disk, and they are protected against "strangers".
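For readers who would rather script the compression steps above, this is a rough Python equivalent, offered as a sketch and not as what Windows does internally. The function name `zip_files` is invented for the example. Note that Python's standard `zipfile` module can read password-protected archives but cannot create them, so the password step has no counterpart here.

```python
import zipfile
from pathlib import Path


def zip_files(paths, archive_path):
    """Copy the given files into a new compressed ZIP archive."""
    with zipfile.ZipFile(archive_path, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for path in paths:
            path = Path(path)
            # Store only the file name, not the full directory path.
            archive.write(path, arcname=path.name)

    # Re-open the archive and check every member's CRC before trusting it.
    with zipfile.ZipFile(archive_path) as archive:
        bad_member = archive.testzip()
        if bad_member is not None:
            raise IOError(f"corrupt member in archive: {bad_member}")
    return archive_path
```

The originals are left untouched, so the archive is an extra copy and not a replacement until you decide to delete the source files.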


3-4-2008
The Dutch government wants digital documents to be freely exchangeable, regardless of the software package used. That way data should also remain readable in the long term, which matters for the archives.

With the ISO stamp of approval, OpenXML offers an alternative to ODF, the open document format to which the Dutch central government services are going to switch. ODF was developed by Microsoft's competitor Sun Microsystems and already obtained ISO approval in 2006. Sun, together with IBM and critics from the open-source world among others, protested against the approval of OpenXML.

According to the action plan "Nederland Open in Verbinding" of the Ministry of Economic Affairs, ODF will for the time being be used "alongside and in addition to" the .doc and .pdf formats. Microsoft does take ODF seriously, however: there is a plug-in for all recent Office packages that makes exchange with ODF documents possible.


23-5-2008
ODF, PDF and XPS
Besides the open document format, which is supported by IBM and Sun among others, the new service pack will also add support for PDF/A and XPS. At the moment special plug-ins are still needed to make Office work with these formats. PDF/A is the special 'archive version' of the Portable Document Format and will be supported indefinitely. Users will soon be able to save documents as PDF, XPS or ODF files without extra software.

No Open XML yet
Remarkably, there will be no support for Open XML, Microsoft's own format that ISO recently adopted as a standard. Only from the next version of Office onward will the program support Open XML. At the moment Office 2007 saves files in an older version of Office XML. That format was modified somewhat for the ISO certification.

ODF
The Open Document Format has been an ISO standard for much longer. Many critics therefore did not understand why Microsoft wanted to establish its own OXML as a standard. The software company does seem to be taking an increasingly open stance, though. That is largely due to pressure from, for example, the European Commission. In addition, more and more companies and governments want to work with open file formats to avoid vendor lock-in.


21-6-2008
Billie Walsh talked about the rapid advance of computer technology making your documents unreadable. I completely agree with his remarks. Over the years I've seen the demise of cassette tape storage and of 8 inch, 5¼ inch, 3 inch and now 3½ inch floppy disc drives; 12 inch laser discs were replaced by CDs and then DVDs. HD discs made a brief appearance, and Blu-ray drives have now become the standard. Like Billie I see no future in any form of mechanical storage in the long term. It's a matter of time before CDs, DVDs, Blu-ray discs and hard drives are replaced by solid state storage with no mechanical parts to wear out.

As genealogists we need to consider long term storage - 100s of years not 10s! I used to think that CD-ROM was the solution to long term storage but now many of the early CDs I produced can't be read due to oxidation of the reflective layer in them. It seems that even five years for a CD is pushing your luck unless the reflective layer is gold rather than aluminium. I can still read a 'gold' CD I made in 1995.

Storing on hard drive isn't a long term solution either since I've lost count of the number of hard drives I've had which failed. How about USB memory then? - not the answer since again I have had loads fail. I still however can retrieve the information I put on the Web in 1995. I suspect archiving your data on a site such as Rootsweb is the safest method for most of us. No doubt Rootsweb servers will become obsolete but as they do the data will be transferred to new equipment and backups will be made in case of a server disaster. The only worry is that we have been for a number of years in a period of very quiet solar activity. A period of intense activity could seriously affect computers, wiping out much of the Internet.

Oh heck! Maybe we should make paper copies then? You just need to consider fire, flood, fading ink, mildew, bookworms and crumbling paper:(

Future-proof formats

The answer is different for different types of documents. In general, though, use the simplest and most popular standards-based format that you can.

For text, use ASCII if possible. If you need formatted text, use PDF, which can be readily converted to PostScript. If you use special fonts, be sure to archive them as well as your document. Once you get it to PostScript, it can be manipulated and printed with any number of standard tools, e.g. LaTeX, which have been around for decades already and are likely to still be around in decades to come.

For images, stick to RAW or TIFF if you can afford the storage space (RAW is better since TIFF is, technically, a proprietary Adobe format). If not, stick to free, non-proprietary compressed standards such as JPEG or PNG. Avoid GIF unless you need animation - it's technically a proprietary format, and limited to 256 colors. BMP is probably OK, but its close association with Microsoft suggests there may be better alternatives.

For video, use MPEG, which is another free, non-proprietary standard.

The same rationale applies to music, where the safest alternatives are either WAV, if you can afford the storage space, or MP3 (part of the MPEG standard) if you need compression.

This brings me to the last point - compressed storage. Data compression is a wonderful thing for archival storage, but you need to avoid niche or proprietary tools. I use RAR all the time on my Windows machines, but whenever I need something to be archived forever, I stick with ZIP which is free, standardized, and universally available. On my Linux machines, I'll use either zip or gzip.
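As an illustration of the gzip option mentioned above, the sketch below compresses a file and then confirms that decompressing it reproduces the original byte for byte before the compressed copy is trusted as an archive. The function names are invented for this example, and using SHA-256 for the comparison is a choice made here, not something the post prescribes.

```python
import gzip
import hashlib
import shutil


def gzip_file(source_path, target_path=None):
    """Compress source_path into a .gz file and return the new path."""
    target_path = target_path or str(source_path) + ".gz"
    with open(source_path, "rb") as source, gzip.open(target_path, "wb") as target:
        shutil.copyfileobj(source, target)
    return target_path


def _digest(stream, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of everything left in a binary stream."""
    digest = hashlib.sha256()
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()


def round_trip_ok(source_path, gz_path):
    """Return True if decompressing gz_path reproduces source_path exactly."""
    with open(source_path, "rb") as source, gzip.open(gz_path, "rb") as packed:
        return _digest(source) == _digest(packed)
```

Only after `round_trip_ok` returns True would it be reasonable to treat the compressed file as the archival copy.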

A subsequent post compels me to add an explicit caveat. Regardless of what type of computer or what software you use, when I say to avoid proprietary formats, that especially includes _any_ Microsoft formats. Even if you use MS Office, there are options to save documents in formats other than Microsoft's native formats. Some are already on the "Save as" or "Export" menus. Others may require the use of 3rd-party software (e.g. Acrobat or one of the several free PDF writers).

Why? Microsoft has a terrible history of releasing new versions of software which have various levels of incompatibility with previous versions. Right now, I can use a Microsoft-supplied filter to provide interoperability between Word 2003 and Word 2007 documents, but will such filters be available for Word 2020 when I have a Word 97 document? Using PDF/PostScript, I can be reasonably confident I won't be stuck.

While historically I would agree with you, Microsoft now has a published standard for OOXML, so in theory there should be viewers for it as long as there is demand for them and someone willing to write one.

That being said, my gut instincts tell me that ODF has a better long term outlook, but Microsoft has a lot of money to try to ensure that their format doesn't get replaced.

PDF/A is the best approach for long term archival. It is an ISO standard and the format used by the US National Archives and Records Administration (NARA). The advantage to PDF/A is that it will retain the formatting and presentation of the existing documents in a format that is meant for long term archival - 100+ years. If you save the text in ASCII you lose all formatting.

The only minor disadvantage to PDF/A is that you must embed all fonts, which can make the files large. There are some licensing issues with certain fonts that you also need to be aware of (some font producers prohibit embedding). Aside from being able to archive textual information in PDF/A, you can also use it to store graphic and image formats. There are a number of tools on the market that enable creation of PDF/A compliant documents.

For more info on PDF/A, look it up on Wikipedia.

I am in complete agreement with Rbsjrx when it comes to sticking with standard data formats for archival purposes. However, you need to be aware that RAW formats for digital images are proprietary to each camera manufacturer and widely different. The closest to a standard RAW format is the Digital Negative (DNG). Although this format was developed by Adobe, it has been made freely available to all software and camera makers. In addition, Adobe offers software for no charge which can be used to convert the various camera RAW formats to DNG. Today, not only Adobe image editing software can handle DNG, but that of most other software makers as well. Finally, since the introduction of the DNG specification, many cameras have been introduced using it as their RAW format.

Migration

Migration is the transferring of data to newer system environments (Garrett et al., 1996). This may include conversion of resources from one file format to another (e.g., conversion of Microsoft Word to PDF or OpenDocument), from one operating system to another (e.g., Windows to Linux) or from one programming language to another (e.g., C to Java) so the resource remains fully accessible and functional. Resources that are migrated run the risk of losing some type of functionality since newer formats may be incapable of capturing all the functionality of the original format, or the converter itself may be unable to interpret all the nuances of the original format. The latter is often a concern with proprietary data formats.

The National Archives Electronic Records Archives and Lockheed Martin are jointly developing a migration system that will preserve any type of document, created on any application or platform, and delivered to the archives on any type of digital media. In the system, files are translated into flexible formats, such as XML; they will therefore be accessible by technologies in the future. Lockheed Martin argues that it would be impossible to develop an emulation system for the National Archives ERA because the volume of records and cost would be prohibitive.

9-5-2009 Scanning: Google has come up with a system that uses two cameras and infrared light to automatically correct for the curvature of pages in a book. By constructing a 3D model of each page and then "de-warping" it afterward, Google can present flat-looking pages online without having to slice books up or mash them onto a flatbed scanner.