Journal of Information Technology & Software Engineering

Open Access

ISSN: 2165-7866


CONDENZA: A System for Extracting Abstract from a Given Source Document

Mgbeafulike IJ and Christopher Ejiofor

Despite the increasing availability of documents in electronic form and of desktop publishing software, abstracts continue to be produced manually. CONDENZA is a system for extracting an abstract from a given source document, and this paper describes its automatic methods for obtaining abstracts. The rationale for abstracts is to facilitate quick and accurate identification of the topic of published papers, saving a prospective reader time and effort in finding useful information in a given article or report. The system generates a shorter version of a given sentence while attempting to preserve its meaning. This task is carried out using summarization techniques. CONDENZA implements a method that combines the Apriori algorithm for keyword frequency detection with a clustering-based approach for grouping similar sentences together. Results from the system show that our approach summarizes text documents efficiently by avoiding redundancy among the words in the document while ensuring the highest relevance to the input text. The guiding factor of our results is the ratio of input to output sentences after summarization.
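
The combination described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract does not give the tokenizer, the support threshold, the similarity measure, or the clustering algorithm, so the stopword list, the Jaccard similarity, the greedy single-pass clustering, and every function name and parameter below are assumptions chosen for illustration. Each sentence is treated as a transaction, an Apriori-style level-wise search finds frequent keyword sets, sentences with similar keyword sets are grouped, and one representative per group is kept.

```python
import re
from collections import Counter
from itertools import combinations

# Tiny illustrative stopword list (an assumption; a real system would use a fuller one).
STOPWORDS = {"a", "an", "the", "of", "to", "in", "on", "at", "and", "or", "is", "are", "for", "with"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Return the set of lowercase non-stopword terms in a sentence."""
    return {w for w in re.findall(r"[a-z]+", sentence.lower()) if w not in STOPWORDS}


def frequent_itemsets(transactions, min_support=2, max_size=2):
    """Apriori-style level-wise search: a k-term candidate is counted only if
    all of its (k-1)-term subsets were frequent at the previous level."""
    counts = Counter(w for t in transactions for w in t)
    frequent = {frozenset([w]): c for w, c in counts.items() if c >= min_support}
    result = dict(frequent)
    k = 2
    while frequent and k <= max_size:
        items = sorted(set().union(*frequent))
        candidates = [
            frozenset(c) for c in combinations(items, k)
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        ]
        frequent = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                frequent[cand] = support
        result.update(frequent)
        k += 1
    return result


def jaccard(a, b):
    """Jaccard similarity of two sets (0.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_sentences(keyword_sets, threshold=0.5):
    """Greedy single-pass clustering: a sentence joins the first cluster whose
    seed sentence is similar enough, otherwise it starts a new cluster.
    Sentences without any frequent keyword are skipped."""
    clusters = []
    for i, keys in enumerate(keyword_sets):
        if not keys:
            continue
        for cluster in clusters:
            if jaccard(keys, keyword_sets[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def summarize(text, min_support=2, threshold=0.5):
    """Keep one representative sentence per cluster, in original order."""
    sentences = split_sentences(text)
    transactions = [tokenize(s) for s in sentences]
    keywords = set().union(*frequent_itemsets(transactions, min_support))
    keyword_sets = [t & keywords for t in transactions]
    clusters = cluster_sentences(keyword_sets, threshold)
    # Representative: most frequent keywords, ties broken by earliest position.
    chosen = sorted(max(c, key=lambda i: (len(keyword_sets[i]), -i)) for c in clusters)
    return [sentences[i] for i in chosen]
```

Calling `summarize(text)` returns the retained sentences, and dividing `len(split_sentences(text))` by the length of that result gives the input-to-output sentence ratio mentioned above. Capping itemsets at two terms (`max_size=2`) is a simplification that keeps the sketch short; the level-wise loop works for larger sizes as well.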