Text Summarization


Text Summarization is the process of distilling the most important information from a source to produce an abridged version for a particular user and task (Chinchor 2000).

The ability of systems to produce effective and useful summaries of texts was first evaluated in the TIPSTER evaluations funded by DARPA. Systems produced summaries of various lengths, beginning with a summary fixed at 10% of the size of the original document. Variable-length summaries were also created and compared with the 10% summaries to determine the most useful compression ratio. The "ideal" variable length appears to be approximately 20% of the original (Chinchor 2000).
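The compression ratio discussed above is simply summary size divided by source size. The following is a minimal Python sketch of that arithmetic, not the procedure used in the TIPSTER evaluations. It assumes size is counted in whitespace-separated words (the source says only "size"), and the function names compression_ratio and target_length are hypothetical.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated tokens (an assumed unit of 'size')."""
    return len(text.split())


def compression_ratio(source: str, summary: str) -> float:
    """Return summary size as a fraction of source size."""
    source_len = word_count(source)
    if source_len == 0:
        raise ValueError("source document is empty")
    return word_count(summary) / source_len


def target_length(source: str, ratio: float) -> int:
    """Return the word budget for a summary at the given ratio (at least 1)."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError("ratio must be in (0, 1]")
    return max(1, round(word_count(source) * ratio))


if __name__ == "__main__":
    document = "word " * 100   # stand-in for a 100-word document
    summary = "word " * 20     # stand-in for a 20-word summary
    print(compression_ratio(document, summary))
    print(target_length(document, 0.10))
```

Under these assumptions, a 100-word document gets a 10-word budget at the fixed 10% length and a 20-word budget at the roughly 20% ratio reported as most useful.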
Summarization in the form of creating document abstracts has traditionally been performed by humans. "A person knowledgeable in the subject matter of the document reads it and then writes a short, typically one paragraph, summary of the document" (TIPSTER Summarization 2000).
Human abstractors often reach a level of only 85% agreement on the content of an abstract (TIPSTER Summarization 2000). By comparison, the best system in the TIPSTER SUMMAC evaluation achieved an F-score of 72% (Chinchor 2000).
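For readers unfamiliar with the F-score, it combines precision and recall into one number. The Python sketch below shows the standard weighted harmonic-mean formula. It assumes the balanced form (beta = 1), since the source does not say which weighting SUMMAC used, and the function name f_score is hypothetical.

```python
def f_score(precision: float, recall: float, beta: float = 1.0) -> float:
    """Weighted harmonic mean of precision and recall.

    beta = 1 weights the two equally (an assumption here); beta > 1
    favors recall and beta < 1 favors precision.
    """
    if precision < 0 or recall < 0:
        raise ValueError("precision and recall must be non-negative")
    if precision == 0 and recall == 0:
        return 0.0  # avoid division by zero when nothing was retrieved correctly
    beta_sq = beta * beta
    return (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)


if __name__ == "__main__":
    # Illustrative values only, not figures from the evaluation.
    print(f_score(0.8, 0.6))
```

Because the harmonic mean is pulled toward the lower of its two inputs, a system cannot earn a high F-score by doing well on precision or recall alone.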
Examples of automatic summarization systems can be found on the Web. For instance, Columbia University has a page on its website with links to three different summarization systems. Example pages for each system show the types of output it produces. The University of Surrey also has a demo system available, written in Java. The Computing Research Laboratory (CRL) at New Mexico State University has a system (MINDS) that uses summarization technology as one of its components.

This page last modified November 13, 2006 by Erica Brown.
http://www.oocities.org/ejb_wd/Summarization.html
© 2000-2006, Erica Jean Lindsey Brown, All rights reserved