Thursday, April 2, 2009

The Automatic Summarizer and Deadwood Expressions

Deadwood expressions and some adverbs fill a text with unnecessary words. They can be removed without a significant loss of information. Sentences between brackets provide extra information or explanations, but the redundancy they add is not necessary in a summary.

- The next version of the summarizer TexLexan will include a function to replace deadwood expressions with a single word or a simpler expression. The gain will not be extraordinary in terms of compression (about 5 to 10%), but it will make the summaries easier to read.

- Sentences between brackets will be removed too.

- Some combinations of adverbs or adjectives will be simplified. For instance, the adverb 'very' can often be omitted without changing the meaning of a sentence.
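
The three steps listed above can be sketched in a few lines of Python. This is only an illustration of the idea, not the TexLexan code: the phrase table `DEADWOOD` and the function names are hypothetical, and the real list of expressions is not given here.

```python
import re

# Hypothetical lookup table (an assumption for this sketch):
# deadwood expression -> shorter replacement.
DEADWOOD = {
    "due to the fact that": "because",
    "in order to": "to",
    "at this point in time": "now",
}

def replace_deadwood(text):
    """Replace each known deadwood expression with its shorter form."""
    for phrase, short in DEADWOOD.items():
        # Case-insensitive match on word boundaries; the replacement is
        # always lowercase, so capitalization at sentence start is lost.
        pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
        text = pattern.sub(short, text)
    return text

def drop_bracketed(text):
    """Remove non-nested (...) and [...] spans and the whitespace before them."""
    return re.sub(r"\s*[\(\[][^()\[\]]*[\)\]]", "", text)

def drop_intensifiers(text):
    """Remove the adverb 'very' when it is followed by another word."""
    return re.sub(r"\bvery\s+", "", text, flags=re.IGNORECASE)

def simplify(text):
    """Apply the three steps in order, then tidy up repeated spaces."""
    text = replace_deadwood(text)
    text = drop_bracketed(text)
    text = drop_intensifiers(text)
    return re.sub(r"\s{2,}", " ", text).strip()

if __name__ == "__main__":
    print(simplify("The test failed due to the fact that the input (see above) was very large."))
```

A lookup table keeps the sketch easy to extend, but a real implementation would probably need to handle capitalization, nested brackets, and cases where 'very' does change the meaning.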

