Description of the summarization tag
Summarization is the process of identifying the most important information in one or more sources and presenting it in a shorter form. Automatic Summarization produces summaries by automatic techniques, avoiding the cost and time required to produce them manually (e.g. by professional human abstractors).
The need for Automatic Summarization is motivated by the problem of information overload: the amount of available information is constantly increasing, while the time users can afford to spend scanning through information remains constant (or is even decreasing).
Basic definitions
Depending on the function, a summary can be:
- Indicative: indicates whether reading the full text in depth is worthwhile;
- Informative: covers all the salient aspects of the source;
- Critical: provides a critique of the source and expresses opinions about the source material.
Depending on the user, a summary can be:
- Static / Generic: not personalized for a particular user;
- Dynamic / User-oriented: tailored for a specific user, depending on a user profile or a user query.
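As a rough illustration of the generic versus user-oriented distinction, the sketch below reorders candidate sentences by how many words they share with a user query. The function names `words` and `rank_for_query` are hypothetical, and word overlap is only an assumed stand-in for a real relevance model. With an empty query the original order is kept, which loosely corresponds to the generic case.

```python
import re

def words(text):
    """Lowercased set of alphabetic tokens in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))

def rank_for_query(sentences, query):
    """Order sentences by word overlap with the query.

    Python's sort is stable, so sentences with equal overlap keep
    their original relative order; an empty query changes nothing.
    """
    q = words(query)
    return sorted(sentences, key=lambda s: len(words(s) & q), reverse=True)

sentences = [
    "The budget was approved on Monday.",
    "The new stadium will seat 40,000 fans.",
    "Critics say the stadium costs too much.",
]
top = rank_for_query(sentences, "stadium costs")[0]
```

A user profile could be handled the same way by treating the profile's keywords as the query.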
Depending on the source, the summarization process can be:
- Single-document: the summary provides information from a single source;
- Multi-document: the summary provides information from a number of sources which discuss a particular topic, possibly providing overlapping information.
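Because multiple sources on the same topic may overlap, a multi-document summarizer usually needs some way to avoid repeating itself. The following is a minimal sketch, assuming each document is already split into sentences and that Jaccard similarity over word sets is an acceptable proxy for overlap. The names `merge_without_overlap` and `threshold` are illustrative assumptions, not part of any particular library.

```python
import re

def word_set(sentence):
    """Lowercased set of alphabetic tokens in the sentence."""
    return set(re.findall(r"[a-z]+", sentence.lower()))

def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def merge_without_overlap(documents, threshold=0.5):
    """Collect sentences from all documents, skipping any sentence that is
    too similar to one already kept."""
    kept = []
    for doc in documents:
        for sentence in doc:
            ws = word_set(sentence)
            if all(jaccard(ws, word_set(k)) < threshold for k in kept):
                kept.append(sentence)
    return kept

docs = [
    ["The storm hit the coast on Friday.", "Thousands lost power."],
    ["The storm hit the coast Friday night.", "Schools will reopen Monday."],
]
merged = merge_without_overlap(docs)
```

Here the second report of the storm is dropped because it shares most of its words with the first, while the sentence about schools is kept as new information.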
Depending on the summarization technique, a summary can be called:
- Abstract: if some of the material does not appear verbatim in the original source, e.g. because some rephrasing is involved;
- Extract: if all of the material appears verbatim in the original source.
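To make the notion of an extract concrete, here is a minimal sketch of one simple extractive approach: score each sentence by the average frequency of its words in the source and keep the top-scoring sentences in their original order. The stopword list, the naive sentence splitter, and the names `extract_summary` and `max_sentences` are all assumptions made for illustration; real systems use more careful preprocessing and scoring.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not a standard resource).
STOPWORDS = {"the", "a", "an", "of", "to", "in", "is", "and", "it", "that", "for", "on", "as", "are"}

def split_sentences(text):
    """Naive splitter: breaks after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Lowercased word tokens with stopwords removed."""
    return [w for w in re.findall(r"[a-z']+", sentence.lower()) if w not in STOPWORDS]

def extract_summary(text, max_sentences=2):
    """Return up to max_sentences sentences copied verbatim from the text."""
    sentences = split_sentences(text)
    freq = Counter(w for s in sentences for w in tokenize(s))

    def score(s):
        toks = tokenize(s)
        # Average word frequency, so long sentences are not favored by length alone.
        return sum(freq[w] for w in toks) / len(toks) if toks else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])  # restore source order
    return [sentences[i] for i in chosen]

text = "Cats sleep a lot. Cats hunt mice at night. Dogs bark."
summary = extract_summary(text, max_sentences=2)
```

Every returned sentence is a substring of the input, which is what makes the result an extract. Producing an abstract would additionally require generating or rephrasing text, which this sketch does not attempt.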
Readings
- Automatic Summarization by Ani Nenkova and Kathleen McKeown http://repository.upenn.edu/cgi/viewcontent.cgi?article=1749&context=cis_papers
- http://en.wikipedia.org/wiki/Automatic_summarization